Image scales catalog metadata #3521
@cekk thanks for creating this Pull Request and helping to improve Plone! TL;DR: Finish pushing changes, pass all other checks, then paste a comment:
To ensure that these changes do not break other parts of Plone, the Plone test suite matrix needs to pass, but it takes 30-60 min. Other CI checks are usually much faster and the Plone Jenkins resources are limited, so when done pushing changes and all other checks pass either start all Jenkins PR jobs yourself, or simply add the comment above in this PR to start all the jobs automatically. Happy hacking! |
Is this implying that we generate all the scales whenever a new image is uploaded? |
Isn't this already done when we create new content in Volto? When you get its data from restapi, you also get its miniatures. |
@ale-rt @plone/framework-team if this is ok for you and you don't have any complaints, i will split that pr into the right packages |
@jensens this pr needs also an upgrade-step because we're adding a new metadata to the catalog and catalog needs to be updated. |
IIRC, @datakurre made a PR to some package that enables that behavior by setting some env variable, but with a quick search I was not able to find it. Anyway this is not the general case and I would like to keep it like that. Also I think that the upgrade step that adds the I am not deeply into your use case and this might be the perfect solution, but I would ask you to think more about this. |
I also created a branch with async image scaling back then https://github.com/plone/plone.namedfile/tree/datakurre-image-scaling-queue (in practice it was found to lack optimistic savepoint support) That branch generated all scale metadata immediately, but responded with a temporary redirect to the original until the scale was really available. Therefore an additional patch for Volto was required to retry the scale request as long as the response had that status code. All the code is probably bitrotten by now, but the concept could still work. But agreed, it was a bit complex. |
Ok, i've found some problems with this implementation. Probably adding two event handlers that listen to object added and modified events can have the right blobs, but i need a way to check into my indexer (actually into the field adapter) if the current image is temporary or not. @datakurre @ale-rt any suggestions? I don't know if this is the perfect solution also because we have to re-index the catalog. Let me better explain our use-case: when dealing with listing blocks (and other blocks) we could have a list of results with images. An alternative solution that we discussed was to create a separate catalog (maybe just a BTree into the site root) that we are going to populate with the same data that i'm trying to store in metadata, and when we serialize the brains, we'll get scales infos from there without touching real objects. |
I do not know if I completely understand your issue and your "frontend optimizations". When working with a brain usually the URL to access the scale looks like: Knowing the scale you are pretty sure to know the width of the image but not the height, given that usually the scale sizes look like You already see that this puts emphasis on the width rather than the height, and in my opinion this is a good thing. That said, with a brain in your hands there are good chances that you already know the width of your scale and the URL to display it. That should be enough info to build a web page that can be rendered nicely across different browsers. If your "frontend optimizations" desperately need to know the height of an image, I strongly doubt they are fit for the Plone core. |
@ale-rt A couple explanations are probably due here:
|
@cekk Ouch. The rabbit hole gets deeper. A thing that has troubled me forever is that scale uids are not deterministic, but just random uuid4s: https://github.com/plone/plone.scale/blob/09b3598b3843363260a263c65435bd2f463249ab/plone/scale/storage.py#L198 If we were able to use deterministic uids, this should be among the issues that get fixed. For example, a hash of the original file by its contents and the full scale configuration. (I have had a similar patch for that for JYU for years, but don't have access to the code until next year. We did it for better caching, because the builtin 24h scale invalidation on any scale modification with continuously changing scale URLs was bad.) |
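To make the deterministic uid idea concrete, here is a minimal sketch of hashing the original file contents together with the scale configuration. It is not the plone.scale implementation: the function name `deterministic_scale_uid`, the parameter names, and the choice of SHA-256 truncated to 32 hex characters are all assumptions made for illustration.

```python
import hashlib
import json


def deterministic_scale_uid(image_bytes, **scale_params):
    """Return a stable hex id for (image contents, scale parameters).

    Hypothetical helper: same bytes and same parameters always give the
    same id, so the scale URL would not change between regenerations.
    """
    digest = hashlib.sha256()
    digest.update(image_bytes)
    # Separator so that image bytes and parameters cannot run together.
    digest.update(b"\0")
    # sort_keys makes the serialization independent of keyword order.
    digest.update(json.dumps(scale_params, sort_keys=True).encode("utf-8"))
    return digest.hexdigest()[:32]


uid_a = deterministic_scale_uid(b"fake image data", width=400, height=400, mode="scale")
uid_b = deterministic_scale_uid(b"fake image data", mode="scale", height=400, width=400)
uid_c = deterministic_scale_uid(b"fake image data", width=800, height=800, mode="scale")
```

Here `uid_a` and `uid_b` are equal because only the keyword order differs, while `uid_c` differs because the scale parameters changed.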
This is super true, having the possibility to get the proper URL out of some brain metadata would be really awesome. About the hashing, I think it is enough to consider the |
|
And if you want to know the height given the width you can just get it knowing the original images and doing a proportion or if you prefer you can divide it by the aspect ratio |
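The proportion mentioned above can be sketched in a few lines. The function name `scaled_height` is made up for this example; it only assumes that the original width and height are known.

```python
def scaled_height(original_width, original_height, scale_width):
    """Height of a scale that keeps the aspect ratio of the original.

    Equivalent to dividing scale_width by the aspect ratio
    (original_width / original_height), rounded to whole pixels.
    """
    return round(scale_width * original_height / original_width)


height_for_400 = scaled_height(1600, 900, 400)
height_for_600 = scaled_height(1200, 800, 600)
```

For a 1600x900 original, a 400 pixel wide scale comes out 225 pixels high.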
Ok, so we could get rid of real scales when storing the data (also because i suspect that if we call the view on temp files when we are uploading new image, we'll get blobs twice). I think that we should store something in the catalog anyway, because we need to know at least original image sizes. The big question now is: WHERE and HOW? In catalog as proposed? In a separate tree? An alternative solution could be: in summary serializer, if you are requesting an image-ish field (we need to define a way to set that list) with additional_metadata, we wake up the brain and get proper scales (as we already do when serializing the object). |
Hi! this is a bug, because when indexing the object on save, the object should already be finalized. I think it has to do with the uncommitted state of the blob: can you check if the Also, @1letter had this:
# first attempt
(Pdb++) pp blob._p_blob_committed
None
(Pdb++) dir(blob)
['__class__', '__delattr__', '__dict__', '__doc__', '__format__', '__getattribute__', '__getstate__', '__hash__', '__implemented__', '__init__', '__module__', '__new__', '__providedBy__', '__provides__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setstate__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_create_uncommitted_file', '_p_activate', '_p_blob_committed', '_p_blob_ref', '_p_blob_uncommitted', '_p_changed', '_p_deactivate', '_p_delattr', '_p_estimated_size', '_p_getattr', '_p_invalidate', '_p_jar', '_p_mtime', '_p_oid', '_p_serial', '_p_setattr', '_p_state', '_p_status', '_p_sticky', '_uncommitted', 'closed', 'committed', 'consumeFile', 'open', 'opened', 'readers', 'writers']
# second attempt
(Pdb++) pp blob._p_blob_committed
u'/Plone5/py2/var/blobstorage/0x00/0x00/0x00/0x00/0x00/0xaa/0x68/0xbc/0x03d9906455146bcc.blob'
He solved it with:
blobfile = obj.file
blob = blobfile._blob
blob._p_activate()
Hope this can help. |
@cekk |
@pnicolli we need to generate them in the backend because otherwise in brains you don't know what are the actual image sizes, and you need both if i understood correctly. |
@cekk I just meant the url of the image, which in the frontend could be generated like |
From what I learned and proposed last Friday at the Beethoven sprint, my current understanding is that the most useful and probably only thing to store in the catalog, and what is needed to create the correct image sizes in the srcset, is the width of the original image stored on the plone.namedfile field or of the images available on a context. What you don't want to do is to generate a srcset with for example image widths (300w, 600w, 1200w, 1600w) when the stored image is only 980 pixels wide. All other information can be generated without waking up the image object. And with the current knowledge, for every site there will be a limited identical set of 'fixed' image widths that can be used for all images that are served and rendered in the frontend, either Volto or classic. So then you could filter out the image widths that are not available (for example for hidpi usage), because otherwise the image would get upscaled when it is pulled from the srcset. |
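The filtering described above could look roughly like this. It is only a sketch: `available_widths`, `build_srcset`, and the `url_for` callable are hypothetical names, and the URL layout in the example is invented, not the one Plone generates.

```python
def available_widths(original_width, candidate_widths):
    """Keep only the widths that do not require upscaling the original."""
    return [w for w in sorted(candidate_widths) if w <= original_width]


def build_srcset(original_width, candidate_widths, url_for):
    """Build a srcset string; url_for(width) is supplied by the caller."""
    widths = available_widths(original_width, candidate_widths)
    return ", ".join(f"{url_for(w)} {w}w" for w in widths)


# Example from the comment: the stored image is only 980 pixels wide.
widths = available_widths(980, (300, 600, 1200, 1600))
srcset = build_srcset(980, (300, 600, 1200, 1600), lambda w: f"/photo/scale-{w}")
```

Only the original width has to be known up front, which is why it is the one value worth storing in the catalog metadata.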
One problem that we never had to deal with, but that became relevant with plone.restapi and a decoupled frontend, is opening up the server to Denial of Service attacks by requesting thousands of flexible image scales. When the templates are processed only server side and a uuid url is generated this is not possible. Some standalone image servers that were looked at for inspiration or integration (also in 2020 at the Dresden Plonetagung) allow you to 'sign' an image request with a public/private key, effectively generating a signed 'ticket' with which the frontend is allowed to request a certain image scale/format/quality/crop. @datakurre I'm not saying you are proposing this: you are only discussing deterministic scales, but if one more step would be to generate the image based on decoding the information in the url without any security checks, we need to be aware of DoS. I'm surprised to learn that scales are 24h lived. My assumption until now was that they are there permanently until the image data is refreshed or, in the current implementation, the scale definition is altered. I base that on the stale image scale annotations that build up and for which we run a purge image scales script on Plone sites (https://github.com/zestsoftware/plonescripts/blob/master/purge_image_scales.py). What I missed in a solution for the problems we encountered in the last few years with plone.restapi image responses triggering generation of all image scales at once, because returning the 'stable' url with uuid needs the generation of the image: why did nobody until now propose to make the generation of the image scaled to a certain width lazy by:
This avoids rendering scales that are not requested but still provides some safety measure against DoS. |
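As an illustration of the signed 'ticket' idea, here is a minimal sketch. Note that it swaps in an HMAC with a shared secret instead of the public/private key scheme mentioned above, simply because that fits in the standard library. The names `SECRET_KEY`, `sign_scale_request`, and `is_valid_scale_request`, and the message layout, are assumptions and not part of any Plone package.

```python
import hashlib
import hmac

# Hypothetical shared secret; a real site would load this from configuration.
SECRET_KEY = b"change-me"


def sign_scale_request(path, width, secret=SECRET_KEY):
    """Return a hex signature covering the image path and requested width."""
    message = f"{path}:{width}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def is_valid_scale_request(path, width, signature, secret=SECRET_KEY):
    """Check a signature in constant time before generating any scale."""
    expected = sign_scale_request(path, width, secret)
    return hmac.compare_digest(expected, signature)


ticket = sign_scale_request("/news/item/image", 600)
accepted = is_valid_scale_request("/news/item/image", 600, ticket)
rejected = is_valid_scale_request("/news/item/image", 601, ticket)
```

The server would only generate a scale when the check passes, so a client cannot request arbitrary widths by editing the URL.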
I don't propose decoding scale from the URL. Only that the scale URL remains the same as long as the input and parameters are the same. In this thread it would fix the weird blob related issues. Elsewhere it would fix that when any update to scales after 24h clears old scales, the new scales would have the same URLs. |
@jensens and I were discussing this also today and came to the same conclusion, that we would like to generate the URL in the front-end. Because then the browser will only trigger scaling for individual images and only for one scale at a time. @@images/image/TIMESTAMP/large I also did some testing earlier today, to research what information we need. If we later set the width in CSS, that is fine. |
Asking for the huge scale on a 900x900 image would just return the same 900x900 image.
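The no-upscaling behaviour can be sketched as follows. The helper name `fit_without_upscaling` is hypothetical, and the 1600x65536 bounding box used for the huge scale is an assumption about the configured scale size, not something stated in this thread.

```python
def fit_without_upscaling(original, box):
    """Fit (width, height) into a bounding box, never enlarging the image."""
    original_width, original_height = original
    box_width, box_height = box
    # A factor capped at 1.0 means small originals are returned unchanged.
    factor = min(box_width / original_width, box_height / original_height, 1.0)
    return (
        max(1, round(original_width * factor)),
        max(1, round(original_height * factor)),
    )


HUGE = (1600, 65536)  # assumed bounding box of the "huge" scale
small = fit_without_upscaling((900, 900), HUGE)
large = fit_without_upscaling((3200, 2400), HUGE)
```

With these assumptions the 900x900 image stays at 900x900, while a 3200x2400 image is reduced to 1600x1200.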
(I removed WIP from title: We have labels and draft state, it's enough). |
I think this is ready for review. I have started Jenkins jobs for this PR together with plone/plone.namedfile#121 which uses the new catalog metadata in the What is missing mostly is an upgrade step. Add the new metadata column. And I fear we need to wake up all objects and update the metadata. Do we have a volunteer? Are we missing anything else? |
LGTM! Although I might also be missing something, I've been looking at this from the fence. |
After moving all parts to their respective packages as planned, I would give green light for merge here.
from zope.interface import Interface


class IImageScalesAdapter(Interface):
    """
    Return a list of image scales for the given context
    """

    def __init__(context, request):
        """Adapts context and the request."""

    def __call__():
        """ """


class IImageScalesFieldAdapter(Interface):
    """ """

    def __init__(field, context, request):
        """Adapts field, context and request."""

    def __call__():
        """Returns JSON compatible python data."""
The interfaces here need to go to plone.base before merge. @cekk put them here to simplify prototyping. It was never planned to keep them here.
This is ready for review, with code in the new places.
Please review. I think this is the last blocker for a first beta of Plone 6! |
Let this replace the merged image srcsets PLIP. See plone/Products.CMFPlone#3521
I think I already wrote that somewhere else but:
That's a +1 from my point of view. It would be cool to have some other @plone/framework-team member or core contributor to say something about that. |
Also a 👍🏼 from me |
Branch: refs/heads/main
Date: 2022-06-21T11:44:12+02:00
Author: Maurits van Rees (mauritsvanrees) <[email protected]>
Commit: plone/plone.base@0cfbfd2
Add images interface with IImageScalesAdapter and IImageScalesFieldAdapter. See plone/Products.CMFPlone#3521
Files changed:
A news/3521.feature
A src/plone/base/interfaces/images.py
M src/plone/base/interfaces/__init__.py
Repository: plone.base
Branch: refs/heads/main
Date: 2022-06-22T22:13:22+02:00
Author: Jens W. Klein (jensens) <[email protected]>
Commit: plone/plone.base@d8f67b6
Merge pull request #13 from plone/image_scales_metadata
Add images interface with adapters for image scales
Files changed:
A news/3521.feature
A src/plone/base/interfaces/images.py
M src/plone/base/interfaces/__init__.py
I merged all except the upgrade step. See my comment there. plone/plone.app.upgrade#292 |
As discussed today at Beethoven sprint.
We need a way to pass image scales to Volto without waking up objects and calculating them every time.
This is a PoC to see if we can store them in the catalog.
We store a PersistentDict in the metadata so it is only loaded from the db if accessed.
There are two new adapters: IImageScalesAdapter and IImageScalesFieldAdapter.
The first one iterates over the context's schema and tries to call the second one on each schema field.
If there is an adapter registered for that field, it can return the proper scales.
This allows future customizations, for example if miniatures are stored in an external service and not in Plone.
I copied the code for getting scales from plone.restapi utils (we could move that code from restapi to here if needed).
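The interplay of the two adapters can be sketched in plain Python. This is not the code from the PR: a simple dict stands in for the zope.component adapter registry, and the `Image`, `NewsItem`, `ImageFieldScales`, and `ImageScales` classes, as well as the shape of the returned data, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Image:
    """Stand-in for an image field value."""
    filename: str
    width: int
    height: int


class ImageFieldScales:
    """Plays the role of an IImageScalesFieldAdapter implementation."""

    def __init__(self, field_name, context, request):
        self.field_name = field_name
        self.context = context
        self.request = request

    def __call__(self):
        image = getattr(self.context, self.field_name, None)
        if image is None:
            return None
        # JSON compatible data describing the original image.
        return [{"filename": image.filename, "width": image.width, "height": image.height}]


# Hypothetical registry keyed by field type, replacing the adapter lookup.
FIELD_ADAPTERS = {Image: ImageFieldScales}


class ImageScales:
    """Plays the role of an IImageScalesAdapter implementation."""

    def __init__(self, context, request):
        self.context = context
        self.request = request

    def __call__(self):
        scales = {}
        for name, field_type in self.context.schema.items():
            factory = FIELD_ADAPTERS.get(field_type)
            if factory is None:
                continue  # no adapter registered for this kind of field
            data = factory(name, self.context, self.request)()
            if data:
                scales[name] = data
        return scales


class NewsItem:
    schema = {"title": str, "image": Image}

    def __init__(self, title, image):
        self.title = title
        self.image = image


item = NewsItem("Hello", Image("photo.jpg", 900, 900))
result = ImageScales(item, request=None)()
```

Registering a different factory for a field type is the customization hook: an integration that keeps miniatures in an external service would only replace the field adapter.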