Thanks for the feedback here! I'd like to split up this discussion into a few different areas, and thus will open a few different threads here:
-
Data Explorer
-
@jthomasmock
To summarize, my points are:
In short, while previews are not essential, the stability of the data being processed should always be the top priority. Before IDEs like this existed, we mostly did data analysis in terminals without these features, or in Jupyter Lab, so we are already used to working without the extra features that Positron provides. The functionality Positron adds should not come at the cost of a stable Console or Session. Thank you for your work.

Another point concerns your reply: for example, if you have a data table with cells containing lengthy information, you might expand a cell (taking up half of the IDE window) to view its contents. When you then want to look back at the previous columns or check the columns ahead, intermittent scrolling is not well suited to the situation. Continuous scrolling has been adopted in

However, this issue isn't actually that important in this discussion thread. Everyone can keep their own preferences; there's no right or wrong. What I'm more interested in is how Positron behaves under heavy load. From a software engineering perspective, rendering large-scale data is undoubtedly challenging. You really don't have to render it (I feel it's too difficult), but it is crucial to keep the working session in the Console stable. It's really not necessary to compromise stability just to render every preview.

Thank you for your work @jthomasmock. The reason I'm bringing this up is that RStudio is incredibly stable, no matter the workload. As the next-generation IDE introduced by Posit PBC, I genuinely hope Positron can keep this characteristic.
-
This might be a topic composed of several issues. I'm raising it as a discussion mainly out of concern that the development team lacks test scenarios for heavy data loads. (Sorry, but please allow me to say this.) This may have been overlooked throughout development, making Positron extremely fragile when dealing with "large" data. The RPC frequently reports errors, and the connection between the backend and frontend keeps failing, similar to the previous issue #3628. #3628 was also triggered by the need to preview relatively large tables, which led to the crash of the entire R session.
However, this kind of fragility isn't limited to the R environment; Python is affected too. Positron, built on VSCode, should ideally support such analytical needs very well, as VSCode does. Yet, as a dedicated data science IDE, Positron actually performs worse than VSCode, and VSCode isn't even specifically developed for data science. To this day, VSCode provides features such as line-by-line execution for Python, code autocompletion, image preview and saving, a variables pane, integration with Jupyter notebooks, and data table previews through Data Wrangler. These are the features that Positron is supposed to have.
Given this, what exactly makes Positron unique?
This might sound a bit harsh, but I have immense respect for the work you do. Since the first day I stepped into data science, I have greatly benefited from your company's outstanding products, especially RStudio, ggplot2, and Reticulate, and RStudio and Reticulate most of all. Moreover, Posit is selflessly developing open-source software, and we really shouldn't be making so many critical demands.
I'm concerned that the communication method between Positron's frontend and backend may have systematically overlooked scenarios with heavy data preview loads, leading to issues like #3628 and a series of related problems. That's why I want to point this out early in the project's development. I may be overstepping here, so please forgive me.
System details:
Positron and OS details:
Positron Version: 2024.08.0 (Universal) build 48
Code - OSS Version: 1.91.0
Commit: ed616b3
Date: 2024-08-19T04:26:51.868Z
Electron: 29.4.0
Chromium: 122.0.6261.156
Node.js: 20.9.0
V8: 12.2.281.27-electron.0
OS: Darwin arm64 23.5.0
Interpreter details:
Python 3.9.18
Example 1
When creating multi-faceted scatter plots with Matplotlib containing more than 40k points: for figure previewing, Positron's RPC may intermittently report timeouts, especially when commands are submitted multiple times. This could be because the images are sometimes too large and exceed the rendering wait time limit. When changing the size of the plot pane, with or without switching images, or even when just switching images, the entire console can freeze. In this case, RStudio's default handling is preferable: the lag only occurs the first time, after which resizing the image and re-rendering are separated from code execution, allowing the code to run smoothly.
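
As a rough illustration of what "separating rendering from code execution" could look like, here is a minimal Python sketch. It is not how Positron or RStudio are actually implemented; `PreviewRenderer`, `render_preview`, the `seconds_per_point` cost model, and the timeout value are all hypothetical names and numbers made up for the example. The idea is only that a slow preview is abandoned after a deadline instead of blocking the session.

```python
import concurrent.futures
import time


def render_preview(n_points, seconds_per_point=0.0):
    """Hypothetical stand-in for an expensive plot render."""
    time.sleep(n_points * seconds_per_point)
    return f"rendered {n_points} points"


class PreviewRenderer:
    """Runs renders on a worker thread and gives up on slow ones.

    Assumption for this sketch: losing a preview is acceptable,
    blocking the caller (the console) indefinitely is not.
    """

    def __init__(self, timeout=0.2):
        self._pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
        self._timeout = timeout

    def request(self, n_points, seconds_per_point=0.0):
        future = self._pool.submit(render_preview, n_points, seconds_per_point)
        try:
            # Wait at most `timeout` seconds for the preview.
            return future.result(timeout=self._timeout)
        except concurrent.futures.TimeoutError:
            # The caller gets control back; the worker thread keeps
            # running in the background because Python threads cannot
            # be cancelled once started.
            return "preview unavailable (render timed out)"

    def close(self):
        # Waits for any render that is still running to finish.
        self._pool.shutdown(wait=True)
```

With these definitions, `PreviewRenderer(timeout=0.2).request(10)` returns the rendered string, while a request that takes longer than the timeout returns the fallback message. Because the sketch uses a single worker, a later request still queues behind a render that has already timed out; a real implementation would also need a way to cancel or discard stale renders.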
Code to reproduce:
Example 2
Also, when trying to open a large pandas table, Positron usually displays an error like this:
VSCode's Data Wrangler enables smooth scrolling even when viewing large tables with 20k+ rows, without any errors.
Example 3
Positron's Data Viewer can only scroll one column at a time (as in the video below), and it is sometimes slow.
PositronBehavior.mp4
VSCode's Data Wrangler handles this much better:
VscodeDW.mp4
With Positron, every time you click on any preview, you worry about an RPC crash freezing the entire IDE. What I’m trying to say is, Positron, based on VSCode and supposedly designed for data science, doesn’t seem more suitable for it than VSCode itself—maybe even less so.
Thank you again for your work. I hope Positron continues to improve and wish you all the best.