-
Notifications
You must be signed in to change notification settings - Fork 209
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support for Progressive AVIF encoding #761
base: main
Are you sure you want to change the base?
Support for Progressive AVIF encoding #761
Conversation
tongyuantongyu
commented
Sep 21, 2021
- Add support to write a1lx box for progressive AVIF, and interleave each layer of alpha and color AV1 payload when writing mdat.
- Add layer config to avifEncoder and avifCodec impl to encode scalable AV1 stream. (aom only, rav1e and SVT-AV1 simply fail.)
- New avifenc args: --progressive.
First, thank you for your continued contributions to libavif. This PR is going to take some time to think about. I haven't thought at all about how to expose progressive encoding as an API or how it impacts libavif's internals. This PR is quite large and is making some fairly large user-facing decisions (in both the API and in This PR needs much more time to digest than I can give it in the near future. I'm curious as to @wantehchang's first impressions. This PR needs a high-level design / API review before we can even get into the review of parts of the implementation. It's a lot. |
f3adf8f
to
7e82a52
Compare
I understand your concern, and I agree the final interface will be quite different from the current state of this PR. Progressive is an important feature for AVIF to have a wider adoption (at least on the web), so I wrote this PR in the hope of making it available sooner. I'm also writing this to have some early experiments on the possible usages. Meanwhile, it also spots two (probably) bugs in the current decoder implementation: #762, #763. I wroted this implementation in quite short time, and I agree there's quite some places my designs are not very good. I've changed some of the points you mentioned:
I've changed it to allow arbitrary ratios. Current API aom provides is somewhat awkward: AV1E_SET_SVC_PARAMS can set any scale ratio but can't set horizontal and vertical ratios to different values; AOME_SET_SCALEMODE can only set a limited set of scale ratios but allow horizontal and vertical ratios to be different.
I'm not a fan of extra "script" files (like what
I've changed them to named enums.
I've changed it to the |
I realized that we can provide different inputs for each layer to encoder, and claim them as different layers of one image. We can, for example, pass a blurred version as first layer to hide unpleasant compression artifacts. My current design can't support this usage. It requires specifying input image for each layer (and consequently, call But doing this loses the ability to specify different layer configs for color and alpha. From my test, adding layers do increase the file size (please point out if it there's config to avoid it), so we may want to reduce or disable progressive on alpha to save some bytes. Hiding the detail that color and alpha are encoded separately seems resulting in this awkward situation: we have to give up one of the two abilities above. |
…ing grid functionality)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for this PR, I used it recently and it worked as is. I believe this change should be merged without too many modifications. It can be improved if needed later on and it would make progressive encoding available in the meantime (unless it generates ill-encoded files).
Specifically, the following should be enough to submit, in my opinion:
- Add
// WARNING: Experimental
above the new API in avif.h. - Postpone
avifenc
changes to another PR. - Apply nits.
If you do not plan to fix these in the coming weeks/months, that is alright; we may be able to take the PR over, with your permission.
There should be at least one simple test exercising this new API but adding it in another PR is fine.
apps/avifenc.c
Outdated
printf(" Color planes have 2 layers, alpha plane is not layered.\n"); | ||
printf("\n"); | ||
printf(" ;30,1/2:10\n"); | ||
printf(" Color planes is not layered, alpha plane have 2 layers.\n"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is quite different from the current syntax for encoding regular single-layered images, as Joe pointed out. Similar to what you did with avifEncoder
, I would suggest --progressive
only adds layers and does not replace the "default" layer nor override --min
and --max
values.
Going even further, we could reuse the current API for layers: instead of defining layers within --progressive
and its own syntax, use a new "separator" flag. For example:
avifenc \
--min ColorLayer0MinQ --max ColorLayer0MaxQ \
--add-color-layer \
--min ColorLayer1MinQ --max ColorLayer1MaxQ \
--add-alpha-layer \
--min AlphaLayer0MinQ --max AlphaLayer0MaxQ
This new flag (--progressive
or any) should be tagged as "experimental" and prone to future changes.
By the way, the current avifenc
flags are currently being redesigned (see PR #669 and #955).
Also, I remember someone arguing that avifenc
should stay simple and that ffmpeg
should be considered/improved instead, so I am not so sure about modifying avifenc
at all actually. In any case, this file can be modified in another PR in order to move the API changes forward faster.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that break what current --progressive
does into multiple per-layer flags seems to be a better design.
@y-guyon Thanks for your interest on this! I have some further developments on progressive avif, but not on this branch:
There are two bugs in libaom handling resizing of oddish dimension frames: https://bugs.chromium.org/p/aomedia/issues/detail?id=3203 and https://bugs.chromium.org/p/aomedia/issues/detail?id=3210. I've written patches for both, but forgot to push them forward, due to me being busy until very recent. Probably get them fixed first, or limited to even dimension input for now. I'd like to continue working on this. Can you have a look of the two branches above, and decide if we want them included in this PR? I'll update this PR then. |
Both solutions seem ok to me. If the patches are trivial enough, it probably is the simplest option. Otherwise focus on pushing this PR, returning UNIMPLEMENTED in the cases where it would generate an image that cannot be decoded.
Thanks!
My current feeling is that we should decide which API change we would like to make right now, even if it is not final, since you researched several options. I would vote for the new |
7e82a52
to
c2d3328
Compare
c2d3328
to
757271e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated this PR and fixed the simple changes. Github lost tracking of some reviews, so I quoted them.
src/codec_aom.c
Outdated
@@ -712,23 +748,115 @@ static avifResult aomCodecEncodeImage(avifCodec * codec, | |||
return AVIF_RESULT_UNKNOWN_ERROR; | |||
} | |||
} | |||
|
|||
if (layerCount > 1) { | |||
#if defined(AVIF_AOM_LAYER_CONFIG_PREFER_SVC_PARAMS) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm experimenting here, and actually I haven't figure out how to make AV1E_SET_SVC_PARAMS
method working.
Are you familiar, or can invite someone who is familiar with aom's API, to review my usage here?
Please avoid to force-push after a code review; now I cannot see the diff between the current version and the last time I read this pull request. If you still have the previous version somewhere, could you force-push it and send a separate commit with your recent changes? Otherwise I will reread all lines. |
757271e
to
2016e2e
Compare
Done. I prefer rebase to keep the tree clean, but if it makes bad review experience, I'll avoid that. Can you "squash and merge" this PR when it's finished to make it cleaner? |
That is the plan, yes. |
src/codec_aom.c
Outdated
@@ -712,23 +748,115 @@ static avifResult aomCodecEncodeImage(avifCodec * codec, | |||
return AVIF_RESULT_UNKNOWN_ERROR; | |||
} | |||
} | |||
|
|||
if (layerCount > 1) { | |||
#if defined(AVIF_AOM_LAYER_CONFIG_PREFER_SVC_PARAMS) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you share more details? Is aom_codec_control(AV1E_SET_SVC_PARAMS)
returning an error or is the encoding broken later on?
@jzern may have some insight on this or know who to ping.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the delay.
src/codec_aom.c
Outdated
@@ -712,23 +748,115 @@ static avifResult aomCodecEncodeImage(avifCodec * codec, | |||
return AVIF_RESULT_UNKNOWN_ERROR; | |||
} | |||
} | |||
|
|||
if (layerCount > 1) { | |||
#if defined(AVIF_AOM_LAYER_CONFIG_PREFER_SVC_PARAMS) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AV1E_SET_SVC_PARAMS
produces valid bitstream, but scaling_factor_num
and scaling_factor_den
values in avifLayerConfig
are not honored.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed, I think we should revert the changes in avifenc.c
and keep them for another cl. I also believe avifScalingMode
and avifLayerConfig
are overkill in the public API. avifEncoderAddImageProgressive()
and avifEncoderAddImageProgressiveGrid()
should be enough.
Also, adding a simple test in this PR would be reassuring, such as the following tests/gtest/avifprogressivetest.cc
draft:
TEST(ProgressiveTest, EncodeDecode) {
testutil::avifImagePtr image = testutil::createImage(width, height etc.);
ASSERT_NE(image, nullptr);
testutil::fillImageGradient(image.get());
// Encode
testutil::avifEncoderPtr encoder(avifEncoderCreate(), avifEncoderDestroy);
ASSERT_NE(encoder, nullptr);
encoder->speed = AVIF_SPEED_FASTEST;
ASSERT_EQ(avifEncoderAddImageProgressive(encoder.get() etc.), AVIF_RESULT_OK);
ASSERT_EQ(avifEncoderAddImageProgressive(encoder.get() etc.), AVIF_RESULT_OK);
testutil::avifRWDataCleaner encodedAvif;
ASSERT_EQ(avifEncoderFinish(encoder.get(), &encodedAvif), AVIF_RESULT_OK);
// Decode
testutil::avifImagePtr decoded(avifImageCreateEmpty(), avifImageDestroy);
ASSERT_NE(image, nullptr);
testutil::avifDecoderPtr decoder(avifDecoderCreate(), avifDecoderDestroy);
ASSERT_NE(decoder, nullptr);
ASSERT_EQ(avifDecoderSetIOMemory(decoder.get(), encodedAvif.data, encodedAvif.size), AVIF_RESULT_OK);
ASSERT_EQ(avifDecoderNextImage(decoder.get()), AVIF_RESULT_OK);
// Check decoder->image
ASSERT_EQ(avifDecoderNextImage(decoder.get()), AVIF_RESULT_OK);
// Check decoder->image
}
More extensive testing can be done in following PRs.
Done
We need to set |
namespace libavif { | ||
namespace { | ||
|
||
class ProgressiveTest : public testing::Test {}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unnecessary line
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was talking about class ProgressiveTest : public testing::Test {};
which can be removed, not about the empty line. Sorry for the confusion.
src/codec_aom.c
Outdated
uint8_t layerCount = alpha ? encoder->layerCountAlpha : encoder->layerCount; | ||
avifLayerConfig * layers = alpha ? encoder->layersAlpha : encoder->layers; | ||
if (layerCount > 1) { | ||
addImageFlags &= ~AVIF_ADD_IMAGE_FLAG_SINGLE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I meant was: I wonder if it would be safer to return an error if addImageFlags & AVIF_ADD_IMAGE_FLAG_SINGLE
rather than silently setting the bit to 0.
src/codec_aom.c
Outdated
@@ -712,23 +748,115 @@ static avifResult aomCodecEncodeImage(avifCodec * codec, | |||
return AVIF_RESULT_UNKNOWN_ERROR; | |||
} | |||
} | |||
|
|||
if (layerCount > 1) { | |||
#if defined(AVIF_AOM_LAYER_CONFIG_PREFER_SVC_PARAMS) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you have a short snippet to reproduce the issue, I suggest filing a bug.
Done. Now the code no longer tweaks
I will try to do that. For now I removed the usage of |
aa15ef0
to
bc71300
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your patience.
I removed the usage of
AV1E_SET_SVC_PARAMS
.
This is simpler, thanks.
Encoder will still produces an valid avif anyway
This may be tricky for the user: AVIF_ADD_IMAGE_FLAG_SINGLE
will have an effect only without extra layers. This is already the case for grids, so I guess it is consistent, even if it is the other way around:
avifEncoderAddImageGrid()
will always forceAVIF_ADD_IMAGE_FLAG_SINGLE
, which should be at least as good as omitting it (it "upgrades" the encoder settings so no penalty for the user).avifEncoderAddImageProgressive()
will always ignoreAVIF_ADD_IMAGE_FLAG_SINGLE
, but for the main image it may result in worse results. As an example, a 4k main image with a 1x1 layer will not benefit fromAVIF_ADD_IMAGE_FLAG_SINGLE
, but the encoder silently proceeds (it "downgrades" the encoder settings so there is a penalty for the user, compared with the same 4k main image without the 1x1 layer).
avifLayerConfig
holds quality settings, so I put it inavifEncoder
like all other quality settings, instead of making it a paramter inavifEncoderAddImage*
.We need to set
AOME_SET_NUMBER_SPATIAL_LAYERS
before we start encoding, so we need to get all layers at once, instead of adding one layer for each call like encoding an animation.
I still think extraLayerCount*
and layers*
in avifEncoder
are misleading and obscure. All other fields of avifEncoder
are used, irrelevant of how many times and which avifEncoderAddImage*()
function is called.
I believe there are two options here:
- Get rid of
avifEncoderAddImageProgressive()
and callavifEncoderAddImage()
for each layer.- The downside is 8 new
avifAddImageFlag
s:AVIF_ADD_IMAGE_FLAG_LAYER
to indicate a new layer.AVIF_ADD_IMAGE_FLAG_LAYER_SCALE_H_ONETWO
,_ONEFOUR
,_ONEEIGHT
to specify the optional horizontal scaling. This limits scaling to only three possibilities, but users can scale input images themselves if they need more flexibility.- Same for vertical scaling.
AVIF_ADD_IMAGE_FLAG_PERSISTENT_IMAGE
to indicate that there is no need to create an internal temporary copy of the input image untilavifEncoderFinish()
is called. This is necessary becauseAOME_SET_NUMBER_SPATIAL_LAYERS
must be set before encoding, as you said. Alternatively to adding this new flag, it could be written in the comment that input must persist until encoding is done, but this is dangerous.
- The downside is 8 new
- Move
extraLayerCount*
andlayers*
fromavifEncoder
fields toavifEncoderAddImageProgressive()
arguments.- The downside is the complexity if we get to layered grids one day, but this is another issue.
In order to move forward, I would suggest the following:
- Return
AVIF_RESULT_INVALID_LAYERS
whenAVIF_ADD_IMAGE_FLAG_SINGLE
is passed toavifEncoderAddImageProgressive()
. It can still be allowed later on (the other way around, permitting it now and then forbidding it in a later PR, is riskier). - Return
AVIF_RESULT_INVALID_LAYERS
when layers and grids are mixed. The code at head forces single-frame for grid images, and layers require several frames. I find it hard to grasp all the implications and corner cases of having grid layered images, so I would prefer to forbid them for now, and investigate that topic further in another PR, with a lot more test coverage alongside. - I still believe the current public API should be reworked. Maybe there are other possibilities than the options discussed so far, if they do not satisfy everyone. Any thought @wantehchang?
avifEncoderAddImageProgressive(encoder.get(), layer_image_ptrs.data(), | ||
AVIF_ADD_IMAGE_FLAG_SINGLE), | ||
AVIF_RESULT_OK); | ||
avifImage* layer_image_ptrs[2] = {image.get(), image.get()}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const avifImage* layer_image_ptrs[2]
namespace libavif { | ||
namespace { | ||
|
||
class ProgressiveTest : public testing::Test {}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was talking about class ProgressiveTest : public testing::Test {};
which can be removed, not about the empty line. Sorry for the confusion.
@@ -1041,6 +1041,8 @@ typedef struct avifEncoder | |||
uint64_t timescale; // timescale of the media (Hz) | |||
|
|||
// Layers (used by progressive rendering) | |||
// * Note: libavif currently can only properly decode images without alpha, | |||
// or images whose extraLayerCount == extraLayerCountAlpha, if progressive decode is enabled. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding this warning.
What happens exactly with alpha and extraLayerCount != extraLayerCountAlpha
? Does libavif return an error or does it proceed with the decoding but in some wrong way? I cannot tell from your comment.
@@ -879,13 +880,16 @@ avifResult avifEncoderAddImageGrid(avifEncoder * encoder, | |||
avifAddImageFlags addImageFlags) | |||
{ | |||
avifDiagnosticsClearError(&encoder->diag); | |||
return avifEncoderAddImageInternal(encoder, gridCols, gridRows, cellImages, 1, addImageFlags | AVIF_ADD_IMAGE_FLAG_SINGLE); // only single image grids are supported | |||
if (encoder->extraLayerCount == 0 && encoder->extraLayerCountAlpha == 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if ((encoder->extraLayerCount == 0) && (encoder->extraLayerCountAlpha == 0)) {
Sorry for the late reply. I'm having less spare time than I expected. The progressive encoding API indeed need a rework. Let's decide what the API would be like first. I have some ideas and considerations.
I do agree
The main reason for There's difference between "progressive image" in traditional formats and "layer image" in AVIF. "Progressive image" only encodes one image, but managed to reconstruct image in lower quality with half the encoded stream. "Layer image", in contrast, encodes multiple layers (that can be arbitrary images), and half the encoded stream contains data of layers at front, so it 's more like an animation. So Now I prefer using We also need to set different quality settings for each layer, and this requires update quality settings from
From comments in /*!\brief Width of the frame
*
* This value identifies the presentation resolution of the frame,
* in pixels. Note that the frames passed as input to the encoder must
* have this resolution. Frames will be presented by the decoder in this
* resolution, independent of any spatial resampling the encoder may do.
*/
unsigned int g_w; By the way, in
I'd like to point out that layered images with alpha is also hard to deal with. libavif now only correctly decodes those having same number of layers in their color and alpha sub image. So here is my design:
Would like to know your opinions. |
Thanks for your detailed answer.
I agree.
Yeah, me neither. It is convenient for users that just want to scale down a layer a bit, but that's it.
Indeed libaom's
I would have preferred a way for the user to provide their own custom-sized layers as
That sounds reasonable to me.
I approve that plan. |
Clean up the changes made in the following two pull requests to support updating encoder settings during encoding: AOMediaCodec#1033 AOMediaCodec#1058 In particular, restore the aomCodecEncodeImage() function in src/codec_aom.c to its original structure, plus a new block of code to handle encoder changes. Rename some functions and data members. Edit some comments and messages. In the avifEncoderChange enum, left-shift the unsigned int constant 1u because if we left-shift the signed int constant 1 by 31 bits, it will be shifted into the sign bit. Other miscellaneous cosmetic changes. AOMediaCodec#761
Clean up the changes made in the following two pull requests to support updating encoder settings during encoding: AOMediaCodec#1033 AOMediaCodec#1058 In particular, restore the aomCodecEncodeImage() function in src/codec_aom.c to its original structure, plus a new block of code to handle encoder changes. Rename some functions and data members. Edit some comments and messages. In the avifEncoderChange enum, left-shift the unsigned int constant 1u because if we left-shift the signed int constant 1 by 31 bits, it will be shifted into the sign bit. Other miscellaneous cosmetic changes. AOMediaCodec#761
Clean up the changes made in the following two pull requests to support updating encoder settings during encoding: AOMediaCodec#1033 AOMediaCodec#1058 In particular, restore the aomCodecEncodeImage() function in src/codec_aom.c to its original structure, plus a new block of code to handle encoder changes. Rename some functions and data members. Edit some comments and messages. In the avifEncoderChange enum, left-shift the unsigned int constant 1u because if we left-shift the signed int constant 1 by 31 bits, it will be shifted into the sign bit. Other miscellaneous cosmetic changes. AOMediaCodec#761
Clean up the changes made in the following two pull requests to support updating encoder settings during encoding: #1033 #1058 In particular, restore the aomCodecEncodeImage() function in src/codec_aom.c to its original structure, plus a new block of code to handle encoder changes. Rename some functions and data members. Edit some comments and messages. In the avifEncoderChange enum, left-shift the unsigned int constant 1u because if we left-shift the signed int constant 1 by 31 bits, it will be shifted into the sign bit. Other miscellaneous cosmetic changes. #761
Is progressive compression more or less dense than the sequential compression? |
Here is an out-of-date data point (using libavif 0f85943):
*SSIM according to https://chromium.googlesource.com/codecs/libwebp2#get_disto The settings were arbitrarily picked so this is not a fair comparison. Does it still answer your question? Could you give more details on the reason of it? To engage in a longer conversation, please open a separate issue. |
Now when the progressive decoding in #640 is merged, what needs to be done to merge this PR and when can we expect to support progressive AVIF in libavif and web browsers? |
#640 is already merged. Are you talking about another issue or the issue 640 of another repository perhaps?
Progressive AVIF decoding is already supported in libavif and Chrome. This PR is about encoding support. |
Sorry about that. The question about web browser support made me think you were asking about the decoding side.
The author of this PR @tongyuantongyu is the main contributor of this feature. At the time of writing this comment, before moving forward with this #761, they sent three prior patches:
Unfortunately at this point I cannot give any timed road map for this feature. |