Lazy Loading Overhaul #651
base: master
If this option is set to False, then do not save generated thumbnails to the disk.
In order to reduce memory usage, the new 'thumbnail.max_ahead' and 'thumbnail.max_behind' settings limit the number of thumbnails after and before the current selection that can be loaded. Thumbnails outside of this range are unloaded and are loaded again when they re-enter the range. Setting either of these variables to 0 removes the loading limit, restoring the previous behavior.
The 'thumbnail.max_count' configuration setting prevents thumbnails from being unloaded while the number of loaded thumbnails is under this count, even if they fall outside the range specified by 'thumbnail.max_ahead' and 'thumbnail.max_behind'. If this value is 0, it is ignored.
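As a rough illustration of how the three settings could interact, here is a minimal Python sketch. The function names are hypothetical and not taken from the vimiv code base. It assumes that a limit of 0 disables the limit only in its own direction, and that nothing is unloaded while the number of loaded thumbnails is at or below `max_count`; the actual PR may differ on both points.

```python
def loaded_range(current: int, total: int, max_ahead: int, max_behind: int) -> range:
    """Return the indices that should stay loaded around the current selection.

    Assumption: a limit of 0 means "no limit" in that direction.
    """
    first = 0 if max_behind == 0 else max(0, current - max_behind)
    last = total - 1 if max_ahead == 0 else min(total - 1, current + max_ahead)
    return range(first, last + 1)


def indices_to_unload(loaded: set[int], keep: range, max_count: int) -> set[int]:
    """Return the loaded indices that may be unloaded.

    Assumption: while the number of loaded thumbnails does not exceed
    max_count, nothing is unloaded. A max_count of 0 disables this check,
    so out-of-range thumbnails are unloaded immediately.
    """
    if max_count and len(loaded) <= max_count:
        return set()
    return {index for index in loaded if index not in keep}
```

With `current=10`, `max_ahead=5` and `max_behind=3`, the sketch keeps indices 7 through 15 loaded; any other loaded index becomes a candidate for unloading once more than `max_count` thumbnails are held.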
For sort 'image_order' and 'directory_order', the passthrough setting can be used to sort images in the order they were first encountered by the software, whether passed from the command line, or from directory monitoring.
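One possible way to picture "the order they were first encountered" is a small registry that remembers when each path was first seen. This is only a sketch with a hypothetical class name, not the implementation from this PR; it assumes paths that were never registered should sort after all known ones.

```python
class EncounterOrder:
    """Remember the position at which each path was first encountered."""

    def __init__(self) -> None:
        self._first_seen: dict[str, int] = {}

    def register(self, paths: list[str]) -> None:
        """Record new paths; paths seen before keep their original position."""
        for path in paths:
            self._first_seen.setdefault(path, len(self._first_seen))

    def sort(self, paths: list[str]) -> list[str]:
        """Return paths in first-encountered order, unknown paths last."""
        unknown = len(self._first_seen)
        return sorted(paths, key=lambda path: self._first_seen.get(path, unknown))
```

For example, after registering `["b", "a"]` from the command line and later `["c", "a"]` from directory monitoring, sorting `["a", "c", "b"]` yields `["b", "a", "c"]`.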
vimiv typically uses imghdr to scan input files for valid images. This potentially results in a great deal of startup disk IO for large lists of files. When `image.id_by_extension` is enabled, the extension of the file will naively be believed to represent the correct image format.
When this option is set to true and loading an image with the reader chosen by file extension fails, imghdr is used in its place as a last-ditch effort to find a working reader.
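The extension-first lookup with a header-based fallback could look roughly like the sketch below. Everything here is illustrative: the lookup tables, `format_by_header` (a stand-in for what imghdr does), the `decode` callable and the `load_image` signature are assumptions, not vimiv's actual API.

```python
from pathlib import Path
from typing import Callable, Optional

# Hypothetical lookup tables covering only a few formats.
EXTENSION_FORMATS = {".jpg": "jpeg", ".jpeg": "jpeg", ".png": "png", ".gif": "gif"}
MAGIC_BYTES = {
    b"\xff\xd8\xff": "jpeg",
    b"\x89PNG\r\n\x1a\n": "png",
    b"GIF87a": "gif",
    b"GIF89a": "gif",
}


def format_by_extension(path: str) -> Optional[str]:
    """Trust the file extension without touching the disk."""
    return EXTENSION_FORMATS.get(Path(path).suffix.lower())


def format_by_header(data: bytes) -> Optional[str]:
    """Stand-in for header sniffing: inspect the leading bytes."""
    for magic, image_format in MAGIC_BYTES.items():
        if data.startswith(magic):
            return image_format
    return None


def load_image(
    path: str,
    data: bytes,
    decode: Callable[[Optional[str], bytes], object],
    id_by_extension: bool = True,
    header_fallback: bool = True,
) -> object:
    """Decode with the guessed format, retrying once with the sniffed format.

    Assumption: decode(image_format, data) raises ValueError on failure.
    """
    guessed = format_by_extension(path) if id_by_extension else format_by_header(data)
    try:
        return decode(guessed, data)
    except ValueError:
        if not (id_by_extension and header_fallback):
            raise
        return decode(format_by_header(data), data)
```

A PNG file misnamed `photo.jpg` would first be handed to the JPEG reader, fail, and then be retried with the format found in its header.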
Wow, thanks a bunch for the PR and all the work that went into this! Would you mind splitting this into four distinct PRs for discussion purposes, as I do see parts being merged quickly, others needing more thoughts / discussion / time? Some initial thoughts:
@@ -161,14 +164,18 @@ def _on_monitor_fs_changed(self, value: bool) -> None:
    def _load_directory(self, directory: str) -> None:
        """Load supported files for new directory."""
        self._dir = directory
        self._images, self._directories = self._get_content(directory)
Did you remove the assignment of self._directories on purpose? For me (i.e. without modifying the config or anything), this leads to no directories getting listed in the library.
I reproduced your issue; this was most certainly a careless deletion on my part. Thank you for catching it. Of all the changes, those to 'working_directory.py' will probably need the most careful scrutiny. For whatever reason, I had the most difficult time here.
Sure thing! And, I don't mind at all. The whole point of this draft pull request was to gauge sentiments towards these changes and figure out how to structure them.
Thanks for your explanations!
Sure thing! I have nothing to add concerning points 1, 2, and 4. For number 3 yeah, I see what you mean now. According to the Python docs, And yeah, I would have passthrough/none set by default. Arguably, none/passthrough should just naively take whatever existing order, so it'd make sense for changing the sort mode to wipe out the original command line order. With the present behavior, it could be called A tangentially-relevant question to the matter of thumbnail loading: Do you know if there are any memory leaks associated with creating QIcon, QPixmap, and QListWidgetItem? When scrolling to the end and then to the beginning again with the max_ahead and max_behind settings, it seems that at least for me, the memory usage will slowly climb upwards as QIcons are created and assigned to ThumbnailItems. It probably isn't too much of an issue, but it's frustrating to see that something is being allocated and not freed, and not being certain of what that is, and I've tried various methods to 'ensure' objects are deallocated, but to no avail. |
While I agree that the option to reset the sorting would be more consistent, I would definitely argue that the added complexity is not worth it. Yet another option could be calling it I don't know of any, but am certainly not an expert on the matter. Would be a bit surprising to me though if this is an upstream issue. I can reproduce this, but playing around with the memory usage on |
It really is an edge case with a relatively complex implementation that goes against how Vimiv generally handles arguments. So, I could also imagine it potentially being a pain to work around in the future. Let's just do the sorting option without keeping a record of the initial order, then? Part of me just wants to keep the name of the sorting mode simple with 'none'. It seems like the first thing that'd come to mind with most people? But, I will defer to you, if you have a preference there. Concerning the 'memory leak', I honestly was imagining this might be something from the C++ side of QT, or perhaps it's a quirk in how Python's GC works. It doesn't seem like something in vimiv itself, though I could be wrong on all those counts. This may be one of those behaviors that is technically a feature, that there's some reasoning for this extra memory being allocated. Anyway, I think it's a little quirk to keep an eye on. Finally is the name of And otherwise, this should be four separate pull requests for
If this all looks good, I think I have what I need to move forward and will close this draft. Thank you for considering my ideas and taking the time to talk them out with me.
Perfect, sounds good to me 👍 I am also totally fine with naming it Definitely worth keeping an eye on the memory issue, especially if this is an area where you have expertise. Personally, I probably won't be able to spend much time and extra thoughts on it given the general time limitations. Happy to help if anything specific comes up though of course. The four PRs sound good, although the Concerning the naming of Thanks for your ideas and taking your time to share and discuss them! |
I am a bit late to the party (did not have time yesterday to answer), but I also want to add a few comments 😊
I also have an issue with the way thumbnails are generated. While my problem is different from yours, my solution may also (partially) solve your issue (or a combination of both solutions). I deal with many RAW images. As the extraction of the embedded thumbnail takes quite some time, and as vimiv generates previews starting from the top, it takes an eternity (up to several minutes) until the thumbnails of the last images are loaded. Those are the ones I am concerned with (when dealing with an SD card, the newest images are at the bottom by default). While your solution may (partially) solve my issue, I still want all thumbnails to be generated in the end. To solve my issue, I thought it would be nice if priority were given to the thumbnails displayed in the current view. As soon as the view changes, the new set of images is prioritized. Once all images in the view are generated, vimiv deals with the generation of all other, not yet generated, thumbnails (in a somehow "smart" (load what is probably used next) way). How this addresses your issue. If we add a boolean setting that specifies that only images in the current view should be loaded, it would kind of have the effect you wanted to achieve with The Would that also solve your issue?
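To make the proposed prioritization concrete, here is a minimal Python sketch of one possible ordering: thumbnails in the current view first, then everything else by distance from the view. The function name and the "nearest first" heuristic are assumptions standing in for the "smart" ordering mentioned above; they are not taken from vimiv.

```python
def generation_order(total: int, visible: range) -> list[int]:
    """Return thumbnail indices in the order they should be generated.

    Indices in the visible range come first. The remaining indices follow,
    nearest to the view first (an assumed heuristic for "probably used next").
    """
    in_view = list(visible)
    if not in_view:
        return list(range(total))
    first, last = in_view[0], in_view[-1]
    rest = [index for index in range(total) if index not in visible]
    rest.sort(key=lambda index: min(abs(index - first), abs(index - last)))
    return in_view + rest
```

With eight images and the view showing indices 5 and 6, this gives `[5, 6, 4, 7, 3, 2, 1, 0]`, so the newest images at the bottom of an SD card listing would be handled first when the view starts there.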
Alright, I'll try to respond to everything roughly in order. The sort type will be named I don't know if my expertise is Qt5's memory allocation in particular, but I'll most certainly keep it in the back of my head, and let you know if I come up with any tangible leads on it. Definitely still waiting for the The changes to Anyway! Now to @jcjgraf Sorry, I might be missing something. I don't think there will be two options to that effect. Originally, they were Already generated thumbnails will not be deleted from the disk by any of this code. Only potentially unloaded from memory, so they'd need to be reopened, and that can be turned off. One option adds the ability to not write new thumbnails to the disk, so the original will be reopened instead. For specifics on Changing the options during runtime is definitely something I should test, though tentatively, I think they should probably work. That's a great idea about generating thumbnail files in the background after the current range is displayed, and I'm totally happy to implement it. Finally, using a single option would be cleaner as you mention. My preference for specifying a number ahead and behind is that it can be tuned to slower hardware, where one would want to preemptively load more thumbnails. Could make So to be clear, it should go like this? Generate and display thumbnails in the range specified by Definitely happy about the file extension detection. The trickier part of it is trying to use the tentatively named Sorry for the length of this post.
Ah, now I get it. Thanks for taking the time to write such a detailed answer.
Sure thing! And hey, your feedback still has a lot of great ideas in it. I can definitely incorporate the background thumbnail file generation into my changes, so that the files are still made, even if they aren't loaded yet. I think this should just be added to the thumbnail.save? No need to make another config setting, I think. On that note, merging Will also definitely test changing these settings while the software is running. |
I agree the points @jcjgraf made are useful, I see two points here:
Personally, I won't be using the |
I agree, it is less related than I thought (well, code-wise they are, but not functionality-wise) and both changes are definitely complex enough on their own. I can deal with my idea once your PR is done.
I actually think the first point can readily be implemented by changing the sorting of the thumbnail list that's fed into the manager. The machinery to pass thumbnail paths in arbitrary order with their intended index is already there. What I mean by the second point is, with Honestly, I've never tried setting |
Well, yes and no, depends on how far we want to go. Think of the following process:
If we know in advance, we want to start at the end, passing the images in reverse to image and using the new Anything in between just becomes a messy async communication, not sure we want to go into that rabbit hole. I will have to take a closer look if the new EDIT: |
One config option it is! And, I get what you're saying, now. Basically, off the top of my head, I would probably try to implement the "create all thumbnail files but not load them" functionality by making an option for the thumbnail manager to not send the signal that new images have been loaded. So, the thumbnail will be created on the disk, but the load signal will only be sent when the image comes into range. I think the question to figure out is: What happens if the image comes into range while a background thumbnail creation job is running? Could alternately do it by having the thumbnail GUI component keep track of the paths it has sent for thumbnail file creation. If an image comes into range while its thumbnail is being made, it just lets the old job finish instead of starting a new one. All jobs send the creation signal back, but the signal is ignored unless it's in range. Yeah, you're right, this is separate pull request material, though. It's not trivial.
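The second idea, where the GUI component tracks the paths it has sent off for creation, could be sketched as follows. This is a plain-Python illustration with hypothetical names; `create_async` stands in for whatever starts a background creation job, and `on_created` stands in for the slot connected to the creation signal.

```python
from typing import Callable, Iterable


class ThumbnailTracker:
    """Track pending thumbnail jobs and ignore results that are out of range."""

    def __init__(self, create_async: Callable[[str], None]) -> None:
        self._create_async = create_async  # hypothetical: starts a background job
        self._pending: set[str] = set()
        self._in_range: set[str] = set()
        self.displayed: list[str] = []

    def request(self, path: str) -> bool:
        """Start a creation job unless one is already running for this path."""
        if path in self._pending:
            return False  # let the old job finish instead of starting a new one
        self._pending.add(path)
        self._create_async(path)
        return True

    def set_range(self, paths: Iterable[str]) -> None:
        """Update the visible range and request anything not already pending."""
        self._in_range = set(paths)
        for path in self._in_range:
            self.request(path)

    def on_created(self, path: str) -> None:
        """Handle the creation signal; only in-range thumbnails are displayed."""
        self._pending.discard(path)
        if path in self._in_range:
            self.displayed.append(path)
```

In this sketch a thumbnail that finishes while out of range exists only on disk; when it later enters the range, a new request is started, which is assumed to be cheap because the file is already there.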
Definitely extra PR material, agreed 😄 @jcjgraf could you clarify: I expect the extraction takes long, i.e. the generation of the initial |
@karlch Yes, the extraction is what is expensive. I am not at all familiar with this part of the code. I thought it was somewhat related to the generation of the thumbnails, but maybe not. I will have a look at the relevant part of the code and propose something, sometime in the future.
Thanks for clarifying, I think the current #665 actually fixes a large part of your problem, as the thumbnails are created from the current position, and any not yet running "creation" threads are actually stopped.
Included in this pull request are a number of changes to the way images are loaded by vimiv, with special focus on lazy loading. The code is most certainly not ready to be merged with the upstream project. Tests need to be written for it, the code generally needs to be cleaned up and the style made consistent, and any ideas that don't work well with the vision for upstream vimiv must be removed.
The rationale for these changes is primarily to accommodate the author's niche and specific use case: browsing sets of thousands of images while minimizing memory usage and redundant disk IO. These images are typically sorted externally and fed in through stdin, so their order must be maintained.
The intention was to make all changes opt-in by means of configuration variables, the default values of which would maintain the original behavior of the software. These changes are already seeing use in my personal fork, so I will not be offended if they are turned down.
The following configuration variables are added to change the behavior of the software:
- `thumbnail.save`: If set to False, thumbnails will not be cached to the drive.
- `thumbnail.max_ahead`: Only this number of thumbnails after the current selection will be loaded. Changing the selection will modify the range of loaded thumbnails accordingly. If zero, load unlimited, as is the default.
- `thumbnail.max_behind`: Like `thumbnail.max_ahead`, but for thumbnails preceding the current selection.
- `thumbnail.max_count`: Thumbnails will not be unloaded unless the number of them exceeds this. Note that this does not cause thumbnails to be forcibly unloaded, so this variable should probably be renamed. Anyway, set to 0 to immediately unload out of range thumbnails.
- `image.id_by_extension`: If set to True, use the extension to assume the filetype, instead of reading with `imghdr`. This prevents large sets of images from being all opened when starting.
- `image.imghdr_fallback`: If set to True and an image fails to load by a reader guessed by `image.id_by_extension`, try using imghdr to find an appropriate reader. In light of the pending replacement of imghdr, this should perhaps be renamed.
- `sort.image_order` and `sort.directory_order` also now have a `passthrough` sort option, which will maintain the order that images were provided to vimiv from the command line or stdin. Will probably remove `sort.directory_order`, since it's unlikely to be very useful. And I don't think it works properly.