Proposal: File Format Versions 1, 2 and Beyond #4
Comments
I would discourage the use of version numbers. They prevent a format from being forward-compatible. Better to define error handling behaviour for all possible error conditions (unknown op codes, etc) and then add features in a backwards-compatible manner, IMHO. |
(Breaking with FFV0 is fine, I'm just saying to avoid FFV2 being incompatible with FFV1. Consider for example how animated GIFs fall back to non-animated GIFs in legacy software, or how APNG is just PNG with extra data, so it similarly falls back to a non-animated version in legacy software, etc. Most successful formats follow this pattern.) |
The intention is for FFV 2 to be a superset of FFV 1. It will just define new opcodes and new metadata chunks. I think it's perfectly feasible for FFV 1 decoders to simply ignore opcodes and metadata that they do not recognize. I agree that the "APNG falls back to PNG" model is worth mimicking. I still think that it's potentially useful to be able to distinguish FFVs 1 and 2. For example, the memory allocation requirements for an animated WxH graphic will be higher than for a static WxH graphic. Some decoders might wish to do all their allocation up front (after decoding Width and Height, available early in the decode process). They might like to know (potential) animated-ness just from parsing a few opening bytes rather than having to go arbitrarily deep into the file. Metadata chunks already explicitly list the chunk length in bytes. We'd have to ensure that any new (FFV 2+) opcodes also do so (so that you can skip over them). Thanks for the feedback.
I forgot to mention... another distinction will be that FFV 1 only requires sequential access (not random access) to the IconVG file. Again, that might be useful to know up front rather than having to go arbitrarily deep into the file. |
Based on my recent experience implementing the spec in Dart, I have the following opinions:
No objection.
No objection, but anyone who is only looking at the first byte is doing themselves and their users a disservice. See also: https://mimesniff.spec.whatwg.org/
I would recommend against this, as discussed above.
I would recommend against this, as it makes implementations more complicated and does not seem to solve any immediate issues. If the concern is being able to read the metadata section without a full decoder, parsing the metadata section is already pretty trivial. I don't think it's worth making that use case simpler at the cost of making a full decoder more complicated (since now it would need yet another way to decode numbers, this one just for metadata blocks).
No objection. I'm not really sure why the order is required here though. The only benefit I see is that it makes catching duplicates easier, but in practice I found it useful to have out-of-band flags for both of the existing metadata blocks anyway (viewBox because in languages with write-once-only fields you only want to set the viewBox fields once, so the default is set after reading metadata, not before; palette because I wanted to avoid copying into CREG if I didn't see a custom palette).
No objection. Would you also prohibit +/- infinity?
No objection. If I could be allowed to make some suggestions of my own:
Having metadata chunks appear in strictly-increasing MID order means that I can guarantee that e.g. the ViewBox (MID=8) chunk is in the first N bytes for some value of N, if it's present at all. That's assuming that every earlier chunk (lower MID) has an upper bound on how long it can be. It's not a must-have feature, but I think it's not onerous and it might be nice to be able to say "if you can give me the first 128 bytes of the IconVG file then I can definitely tell you its (explicit or implicit) viewbox".
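To make the ordering argument concrete, here is a minimal Python sketch of how a reader could use strictly-increasing MIDs to stop early. It is not taken from any IconVG implementation: the function name `find_chunk` and the already-parsed `(mid, payload)` pairs are hypothetical, and the wire-level parsing of chunks is deliberately left out.

```python
from typing import Iterable, Optional, Tuple

VIEWBOX_MID = 8  # MID value quoted in the comment above


def find_chunk(chunks: Iterable[Tuple[int, bytes]], wanted_mid: int) -> Optional[bytes]:
    """Return the payload for wanted_mid, or None if the chunk is absent.

    `chunks` is a hypothetical, already-parsed stream of (mid, payload)
    pairs in file order. Because MIDs must be strictly increasing, the
    scan can stop as soon as it sees a MID larger than the wanted one.
    """
    previous_mid = -1
    for mid, payload in chunks:
        if mid <= previous_mid:
            raise ValueError("MIDs must be strictly increasing")
        if mid == wanted_mid:
            return payload
        if mid > wanted_mid:
            return None  # the wanted chunk cannot appear later in the file
        previous_mid = mid
    return None
```

The same check that enforces the ordering also catches duplicate chunks, since a repeated MID is not strictly greater than the previous one.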
I'd be happy to have your editor's hand... but I think that'd go best if you went to work after the spec gets 'upgraded' to at least FFV 1. |
For the record, the #18 thread also discusses dropping the smooth ops |
That's only true currently because it's MID 0, right? I don't see anything in the format that would prevent unknown metadata blocks from being arbitrarily large. |
Preliminary thoughts on FFV 2. Very preliminary.
Collections
Let a single MID 0 (in FFV 1 MID numbering) holds a map (wire format TBD) from string to FileSegment. FileSegment is a
Palette Names, Parameter Names
A new (optional) metadata chunk that gives names to CREG indices. For example, 0:"skin", 1:"hair", etc. Might include the reverse map too: {"hair": 1, "skin": 0}. Also have another metadata chunk (call it "Parameter Names") that does this for NREG instead of CREG. For example, NREG[32] could conventionally be called t, an animation time parameter. Also allow Suggested and Custom Parameters, which do to NREG what the Suggested and Custom Palette do to CREG. The (human-readable) names are for use 'externally', by implementations or libraries that consume IconVG. 'Within' the IconVG itself, things are identified by an integer ID or by a FileSegment.
Hit Testing
Answers "what part of the graphic did I just click on"? Add a new 6-bit HITTEST register and a new styling opcode to copy NSEL to HITTEST. Current value of HITTEST is passed to callbacks when exiting drawing mode (i.e. filling a path), augmenting the paint attributes (RGBA flat color, gradient, etc) that's already passed at the same time.
Compound Graphics
Let multiple graphics (within a single file) share common elements. Allow a collection to hold "foo" and "foo-with-bar-badge" graphics. Allow a collection to hold "qux_en", "qux_de", "qux_zh_Hant_HK" graphics that re-use a base "qux" graphic. Note that text in general is out of scope due to its enormous complexity. Authors/tools are expected to 'flatten' the "en", "de" etc glyphs as simple paths. New opcode (or opcodes?) in styling mode to do a 'function call': play another headless IconVG graphic, again identified by a
The callee has their own register state (CREG, NREG, etc) which is copied from the caller (possibly 'rotated' by the caller's CSEL/NSEL so that the caller's
Transformation Matrix Support
Styling opcodes to manipulate
Reserved Opcodes
These need to encode "skip the next N bytes if the (older) implementation doesn't support this opcode" somehow. This might be on a per-opcode basis, or perhaps an overall "SkipLT(N, V)" opcode to skip the next N bytes if the library doesn't support File Format Version V. For drawing opcodes, we might also need to say whether unsupported opcodes should be replaced by
Or maybe a single "IfElseV(M, N, V)" opcode that:
Control Flow Opcodes
Add "JumpXX(N)" opcodes to skip the next N bytes if
Add an explicit "Return" opcode?? EOF (End-of-File) or End-of-FileSegment is still end of graphic. Might not be necessary if equivalent to an (unconditional?) jump to the end.
Arithmetic Opcodes
Crazy (??) idea: just embed an eBPF interpreter (constrained similar to what the Linux kernel does, e.g. runtime verification of no backwards branches) and let authors/tools write their own ease-in ease-out curves or generally go wild. One complication is that IconVG speaks
Tweening
Like the 'function call' opcode, but with twice the number of args (TBD: is "twice" necessary if matrix lerping can be done by Transformation Matrix Support and Arithmetic Opcodes??). The two separate graphics are tweened according to a zero-to-one blend argument (in
Animation
Animation comes from combining almost all of the above. User program passes t and other parameters (e.g. if various UI buttons are clicked), various FileSegment sub-graphics are programmatically transformed, composed, tweened or skipped. We might also need new metadata chunk for animation length and loopiness.
The following is hand-wavy, but the intention is for 'leaf nodes' (which don't make 'function calls', they're just a filled path) to be 'compilable' / uploadable to GPU-friendly formats and uniquely identified by their
Consider restricting nodes to hold either 'function call' ops or 'drawing mode' ops but not both: nodes are either a (pure) branch or a leaf.
TBD / Punted to FFV 3??
Still Out Of Scope
If it's MID 8, we could constrain every earlier MID to be e.g. at most 16 bytes long, which should be enough for a redirect-pointer if necessary. |
It's hard for me to provide feedback on these because I don't know what the problem domain is. I'm guessing from the list of features that it's substantially different from FFV0's problem domain, which seemed to be "format to allow the material icons to be rendered faithfully at any size from tiny files" (which explained the custom palette, the set of drawing features, gradients as a primitive, and the focus on small file sizes). |
It's my attempt at solving flutter/flutter#1831 and if I understand correctly, FFV 0 / FFV 1 isn't feature-rich enough (e.g. animation). |
I should make my work-in-progress doc for that effort public, but I think you may have seen it. It lists some of the criteria for what such a format would need to address. One of the highlights which seems relevant here is that the top priority is render speed, with file size being somewhat low on the list; ideally one should be able to get relatively close to just copying significant chunks of the raw data into a shader to draw most of the image. I don't know if the opcode-based approach of IconVG can achieve that. |
(By which I mean I literally don't know. There's an effort underway to provide arbitrary SPIR-V shader support for Flutter, and once that is landed I hope to experiment with it and see what kind of vector graphics renderer one can build directly into a shader.) |
Some more thinking out loud: if we had a "JumpLOD(H0, H1, N)" opcode, that skipped the next N bytes if the height-in-pixels H was outside the H0..H1 range, then we wouldn't need the LOD registers. Skipping N bytes in one motion would also be simpler and faster than decoding one opcode at a time until we're back in LOD range. Or maybe we add the JumpXX opcodes and also another one to set |
Typo spotted?
Should be 0x08 and 0x10? |
No, it's 0x10 and 0x20. MIDs are encoded as Natural Numbers and the IconVG spec says "For a 1 byte encoding, the remaining 7 bits form an integer value in the range [0, 1<<7). For example, 0x28 encodes the value 0x14 or, in decimal, 20". |
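As a worked check of that arithmetic, here is a small Python sketch of a natural-number decoder. Only the 1-byte rule is quoted in this thread; the 2-byte and 4-byte branches follow my reading of the IconVG spec (little-endian, with the low bits as a length tag) and should be treated as an assumption to verify against the spec. The name `decode_natural` is made up for illustration.

```python
from typing import Tuple


def decode_natural(buf: bytes, pos: int = 0) -> Tuple[int, int]:
    """Decode one natural number at buf[pos:]; return (value, bytes_used)."""
    if pos >= len(buf):
        raise ValueError("unexpected end of input")
    first = buf[pos]
    if first & 0x01 == 0:
        # 1-byte encoding: the remaining 7 bits are the value.
        return first >> 1, 1
    # Assumption: low bits 0b01 mean 2 bytes, 0b11 mean 4 bytes.
    size = 2 if first & 0x03 == 0x01 else 4
    if pos + size > len(buf):
        raise ValueError("truncated natural number")
    return int.from_bytes(buf[pos:pos + size], "little") >> 2, size
```

With this reading, MID 8 is written as the byte 0x10 and MID 16 as the byte 0x20, which matches the values in the proposal.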
Another update summarizing my current thinking, in case anyone's interested.
Goals
I still like the "mission statement" at the top of the main README file. "A compact, binary format for simple vector graphics: icons, logos, glyphs and emoji." Longer term, maybe animation or security would also gain an explicit mention. Compactness is a goal, but it's not the only goal. The aim isn't compactness at any cost. "Just use gzipped SVG" might be competitive in terms of compactness, but a very different story from a security and implementation complexity perspective. Simplicity is also a goal, but again, it's not the only goal. There's usually also a trade-off between simplicity and feature richness.
Changes
Dropping features from FFV 0
Future-proofing
New features
Unless carefully specified, this would mean that jumping into the middle of another operation is possible. I don't think this is desirable for a number of reasons, including the security implications. I expect that the parsing cost is not very high, so I think this should be "decode but ignore next N instructions" instead.
Well, this requires deciding how long (in bytes) each reserved opcode will be. Specifying that today could be awkward if we want to eventually have some sort of scripting or general computing (to support animation), but we haven't concluded yet how that'll be implemented or represented on the wire. |
Some more thinking out loud... The way that gradients are encoded in the unused parts of alpha-premultiplied RGBA space is clever. But somebody (I forget who) once told me that a difference between programming and software engineering is whether "clever" is a compliment or a pejorative. I can't find the link, but I do remember @Hixie saying at some point that this cleverness makes it hard, in the future, if we want to add different sorts of paints. For example, blend modes (color dodge), effects (blurs) or something something hit-testing. @lifthrasiir also made the point in #31 that a lot of a gradient's description could be "opcode arguments" instead of being cleverly squeezed into the CREGs. Perhaps we should split the paint ops (what's currently
"Explicit opcode arguments" means that a "special paint" opcode is followed by a number of extra bytes, the way that an Afterwards, If we also encourage a 'stack' model per #31, so that assigning Overall, changing how gradients are represented would make it a little harder to upgrade FFV 0 to FFV 1 automatically, if the graphic uses gradients, but it's probably still doable. |
That's true, but it is not much different from putting the length information for any subsequent opcode (unless multiple such opcodes in a run are frequent). I think the "special paint" opcode you've mentioned is a good candidate to include the explicit length, for example. As noted by Hixie in #11, we need to explicitly decide whether two unrelated pieces of data can overlap or not. I prefer overlap to be impossible, mainly because it would be easier to control the interpretation than otherwise. If overlap is possible we risk diverging interpretations. Consider the following:
The opcode Y overlaps with the arguments to X in this example, and this desynchronization can result in wildly different interpretations or (more usually) an invalid image only when X is supported. Ideally we want this situation to be impossible altogether. One alternative is the following:
All implementations since FFV 1 can determine the entire structure, but only those supporting X can execute the opcode X. No byte can be interpreted in multiple ways. This is not the only way to do that, but it seems that encoding the length of arguments right into all future opcodes is necessary. |
Another way to do this would be to split the opcode space by number of arguments. For example, opcodes 0x00 .. 0x1F have zero arguments, opcodes 0x20 .. 0x7F have 6 arguments, opcodes 0x80 .. 0xDF have 12 arguments, opcodes 0xE0 .. 0xFF have 16 arguments. Or whatever. Or equivalently, opcodes could be two bytes long, with one byte always coincidentally giving the length of arguments. The point is that you decouple the parsing from the interpreting, so that parsing is future-proof.
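A rough Python sketch of that decoupling, using the example ranges from the comment above. The ranges are only the "or whatever" illustration, not a proposed layout, and the sketch additionally assumes that each argument occupies exactly one byte. The function names are hypothetical.

```python
from typing import Iterator, Tuple


def argument_count(opcode: int) -> int:
    """Number of arguments implied by the opcode's position in the space."""
    if not 0 <= opcode <= 0xFF:
        raise ValueError("opcode must fit in one byte")
    if opcode <= 0x1F:
        return 0
    if opcode <= 0x7F:
        return 6
    if opcode <= 0xDF:
        return 12
    return 16


def split_instructions(code: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (opcode, argument_bytes) without interpreting any opcode.

    Assumption for illustration: every argument is exactly one byte long.
    """
    pos = 0
    while pos < len(code):
        opcode = code[pos]
        end = pos + 1 + argument_count(opcode)
        if end > len(code):
            raise ValueError("truncated instruction")
        yield opcode, code[pos + 1:end]
        pos = end
```

A decoder that does not recognize an opcode can still step over it, because the instruction boundaries never depend on what the opcode means.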
Some more thoughts. They're not final, I just want to write down some ideas-in-progress before I forget.
Ring-Stack Registers
The first 128 opcodes (4+2+1 bits) set REG values
The next 54 (48 + 6) opcodes specify path geometry
The low 4 bits form a number
Processing the ellipse or parallelogram opcodes requires knowing the 'current point' to start from, also known as the 'pen location'. This is just the last coordinate pair of a LineTo, QuadTo, CubeTo or MoveTo op. For example, after 5 consecutive CubeTo operations, the current point is set to the last of the 15 coordinate pairs (5 * 3 = 15).
The next 10 opcodes are miscellaneous / reserved
The next 32 opcodes specify path fills
Fills close any in-progress path.
TBD: complexity 0 might be repurposed for hit-testing: filling rough paths with multiple invisible-but-different colors.
The next 4 opcodes specify control flow
The first three are followed by a natural number
It's invalid to jump past the end of the file or macro segment.
The next 4 opcodes specify sub-routines
They are typically followed by an 8 byte FileSegment (40 bit file offset, 24 bit file length) and possibly further arguments.
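For illustration, here is a Python sketch of unpacking such an 8 byte FileSegment and checking that it stays inside the file. The comment above only gives the field widths (40 bit offset, 24 bit length); the byte order and the placement of the offset in the low 40 bits are assumptions made for this sketch, as are the names `FileSegment`, `decode_file_segment` and `in_bounds`.

```python
from typing import NamedTuple


class FileSegment(NamedTuple):
    offset: int  # 40-bit file offset
    length: int  # 24-bit length in bytes


def decode_file_segment(raw: bytes) -> FileSegment:
    """Unpack 8 bytes into (offset, length).

    Assumption: one little-endian 64-bit word, offset in the low 40 bits
    and length in the high 24 bits. The real wire layout may differ.
    """
    if len(raw) != 8:
        raise ValueError("a FileSegment is exactly 8 bytes")
    word = int.from_bytes(raw, "little")
    return FileSegment(offset=word & ((1 << 40) - 1), length=word >> 40)


def in_bounds(segment: FileSegment, file_size: int) -> bool:
    """True if the segment lies entirely within a file of file_size bytes."""
    return segment.offset + segment.length <= file_size
```

A decoder would presumably reject a sub-routine call whose segment fails this bounds check, in the same spirit as rejecting a jump past the end of the file.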
The macro opcodes
All four opcodes are a single instruction for "jump past the next
The last 24 opcodes are reserved
More thoughts...
FileSegments
FileSegments are tweaked. There's a
There's also a
IconVG files can be larger than 2 GiB. The redirect bit being set on an Absolute FileSegment means that the 31+24=55 middle bits are a file offset for another 16 bytes:
Opcodes
54 Path Geometry Opcodes
The low 4 bits form a number
Processing the ellipse or parallelogram opcodes requires knowing the 'current point' to start from, also known as the 'pen location'. See Three Points (Two Opposing) Define an Ellipse. For example, after 5 consecutive CubeTo operations, the current point is set to the last of the 15 coordinate pairs (5 * 3 = 15).
2 Miscellaneous Opcodes
4 Jump / Return Opcodes
The first three are followed by a natural number
It's invalid to jump past the end of the file or sub-routine FileSegment.
4 Call Sub-routine Opcodes
If the opcode
The ATM (or lack of it) is followed by a 4 byte Inline FileSegment (e.g. 'switch to scripting mode') or 8 byte Absolute FileSegment (e.g. 're-use shared paths and fills', 're-use shared scripts'), depending on the opcode
These four opcodes are only valid when executing 'at the top level'. They're invalid if encountered when already in a sub-routine call.
64 Set Register Opcodes
64 ring-stack registers
For the first 48 opcodes, the low 4 bits give an
For the last 16 opcodes, let
"Sets the low/high 32 bits" means that the opcode is followed by a
Low 32 bits are interpreted as unsigned 16.16 fixed point when used as gradient stops (e.g.
High 32 bits are interpreted as alpha-premultiplied RGBA colors. Alpha less than any of Red, Green or Blue has special meaning, as they would otherwise be invalid alpha-premultiplied colors. That special meaning is either a blend (Alpha is zero) or a 'discriminated transparent black' (Alpha is non-zero). A blend is what FFV0 calls a 3-byte indirect color. G and B give 1-byte colors
1-byte colors are similar to but tweaked from FFV0. A 'discriminated transparent black' means that the paint is a no-op, in terms of modifying pixel colors, but having multiple 'transparent black' values can be useful for hit-testing: this shape is 'transparent black number 1', this other shape is 'transparent black number 2', etc.
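The classification rule described above can be sketched in a few lines of Python. This only restates the rule from this comment (alpha below any color channel is otherwise invalid for premultiplied colors, so it is given a special meaning); the function name and the returned labels are hypothetical, and how the four channels are packed into the high 32 bits is not assumed here.

```python
def classify_paint(red: int, green: int, blue: int, alpha: int) -> str:
    """Classify an 8-bit-per-channel RGBA value under the rule above."""
    for channel in (red, green, blue, alpha):
        if not 0 <= channel <= 0xFF:
            raise ValueError("channels must be in the range 0..255")
    if alpha >= max(red, green, blue):
        # A valid alpha-premultiplied color, including plain transparent black.
        return "color"
    if alpha == 0:
        return "blend"
    return "discriminated transparent black"
```

For hit-testing, two shapes painted with different 'discriminated transparent black' values would render identically (as nothing) while remaining distinguishable to the caller.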
64 Fill Opcodes
The opcode's low 4 bits give an
For the
64 Reserved Opcodes
They were generated by a simple program using https://go-review.googlesource.com/c/exp/+/332989 in the golang.org/x/exp/shiny/iconvg repository. Issue #4 discusses File Format Versions 0 and 1 (FFV0 and FFV1). The new Ellipse opcodes in FFV1 mean that there is no longer any distinction between "low" and "high" resolution forms of the action-info icon. Issue #30 discusses changing the file extension from ivg (for FFV0) to iconvg (for FFV1).
The iconvg Go implementation started at golang.org/x/exp/shiny/iconvg in 2016, speaking a file format retroactively named FFV0 (File Format Version 0). The github.com/google/iconvg/src/go/* packages were created more recently in 2021 and will speak a revised format, FFV1. The long term plan is for the github.com/... packages to *only* speak FFV1. The golang.org/... package (this package) will continue to speak FFV0 directly and will additionally delegate decoding FFV1 to the github.com/... packages. FFV0 will be considered a deprecated experiment (it was marked EXPERIMENTAL ever since its inception). See google/iconvg#4 This commit, which decodes FFV0 and encodes FFV1, lives in the golang.org/... packages, since it involves FFV0. Testing it (beyond the basic consistency checks in upgrade_test.go) lives in the github.com/... packages, since that involves decoding FFV1. For example, some outputs of the UpgradeToFileFormatVersion1 function (added by this commit) were checked in as google/iconvg@c98b08c Change-Id: Ib5861ae97928cf31faf207b915568532e2624f09 Reviewed-on: https://go-review.googlesource.com/c/exp/+/332989 Trust: Nigel Tao <[email protected]> Reviewed-by: Andrew Gerrand <[email protected]>
Summary
I propose to:
Background
Since its inception in 2016, IconVG has always carried the caveat that "WARNING: THIS FORMAT IS EXPERIMENTAL AND SUBJECT TO INCOMPATIBLE CHANGES".
Issue #2 in this repository is about adding animation to IconVG graphics. Tweening would almost certainly involve transformations (in the "affine transformation" sense) and interpolation.
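As a minimal sketch of what tweening needs from the path model, the Python below interpolates two lists of (x, y) coordinate pairs and applies a 2x3 affine matrix to a point. It is an illustration of the general technique, not a proposed IconVG mechanism; the names `lerp_points` and `apply_affine` and the matrix layout are assumptions.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def lerp_points(a: Sequence[Point], b: Sequence[Point], t: float) -> List[Point]:
    """Blend two equally long lists of coordinate pairs; t=0 gives a, t=1 gives b."""
    if len(a) != len(b):
        raise ValueError("both paths need the same number of coordinate pairs")
    return [((1 - t) * ax + t * bx, (1 - t) * ay + t * by)
            for (ax, ay), (bx, by) in zip(a, b)]


def apply_affine(m: Sequence[float], p: Point) -> Point:
    """Apply the affine matrix (a, b, c, d, e, f) to p.

    Assumed layout: x' = a*x + b*y + c and y' = d*x + e*y + f.
    """
    a, b, c, d, e, f = m
    x, y = p
    return (a * x + b * y + c, d * x + e * y + f)
```

Both operations work on any path expressed purely as coordinate pairs, which is the property that a boolean flag such as large-arc-flag lacks.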
The original IconVG design took the entirety of the SVG path model, including elliptical arc segments. Unlike line_to, quad_to and cube_to, arc_to's parameterization is unique, not being a sequence of (x, y) coordinate pairs, and a boolean argument like large-arc-flag is impossible to interpolate smoothly.
Rasterization backends like Cairo and Skia also don't provide arc_to as a primitive, or if they do, not in the way that SVG parameterizes it. We usually approximate arcs as cubic splines.
Also recall that IconVG is a presentation format, not an authoring format, and it already isn't able to represent groups, strokes, text, etc 'natively'. Authoring tools like Illustrator or Inkscape, if they could export to IconVG, are expected to 'lower' e.g. stroked paths to more primitive operations (filled paths), the same way that they would 'flatten' layers if exporting to PNG. I'd expect such tools could also 'lower' arcs to cubic Béziers during export.
Thus, I'm considering removing arcs from the file format. This new version (File Format Version 1) would not be a superset of FFV 0 per se, but FFV 0 files could be converted in a straightforward way and the rasterizations would be equivalent. In essence, 'lowering' arcs becomes the responsibility of the authoring tools (which get more complicated) instead of the presentation tools (which get simpler).
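For a sense of what 'lowering' an arc involves, here is a Python sketch of the standard cubic Bézier approximation of a circular arc, using the well-known control-point distance k = (4/3) * tan(theta / 4). It covers only circular arcs of at most 90 degrees given by center, radius and angles; converting SVG's endpoint parameterization (with rotation and flags) into that form is a separate step not shown here. The function names are hypothetical.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def arc_to_cubic(cx: float, cy: float, r: float,
                 a0: float, a1: float) -> Tuple[Point, Point, Point, Point]:
    """Approximate the circular arc from angle a0 to a1 (radians) by one cubic."""
    if abs(a1 - a0) > math.pi / 2 + 1e-12:
        raise ValueError("split the arc into pieces of at most 90 degrees first")
    k = (4.0 / 3.0) * math.tan((a1 - a0) / 4.0)
    c0, s0 = math.cos(a0), math.sin(a0)
    c1, s1 = math.cos(a1), math.sin(a1)
    p0 = (cx + r * c0, cy + r * s0)
    p3 = (cx + r * c1, cy + r * s1)
    # Control points lie along the tangents at the two end points.
    p1 = (p0[0] - k * r * s0, p0[1] + k * r * c0)
    p2 = (p3[0] + k * r * s1, p3[1] - k * r * c1)
    return p0, p1, p2, p3


def cubic_point(p0: Point, p1: Point, p2: Point, p3: Point, t: float) -> Point:
    """Evaluate the cubic Bézier at parameter t."""
    u = 1.0 - t
    w0, w1, w2, w3 = u * u * u, 3 * u * u * t, 3 * u * t * t, t * t * t
    return (w0 * p0[0] + w1 * p1[0] + w2 * p2[0] + w3 * p3[0],
            w0 * p0[1] + w1 * p1[1] + w2 * p2[1] + w3 * p3[1])
```

The result is four ordinary coordinate pairs, so an exporter could emit it with the existing cube_to operation and the presentation side never needs to know an arc was involved.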
Separately, the original Go implementation (the golang.org/x/exp/shiny/iconvg package in a separate repository) was released as an interim milestone of the unfinished 'Shiny' Go GUI project. IconVG hasn't had much adoption so far, as the only implementation was in Go and so not usable from e.g. C++, Dart or Python GUI programs. In recent weeks, this repository has gained a brand new C implementation, but we still don't yet have a vast back-catalogue of existing IconVG files to constrain us.
Bringing all of the above together, if I were ever to make an IconVG FFV 1, especially one that isn't a superset of FFV 0 (because arcs), then now is the time to do it.
This issue is a place to discuss that process and what other features to add or warts to remove as part of FFV 1.
File Format Changes
See the spec for context.
The major change is:
A and a arc-related drawing opcodes.
Minor clean-up changes are:
0x89 to 0x8A, so that we can distinguish IconVG from PNG (from JPEG from WebP etc) just from the first byte of the file. https://en.wikipedia.org/wiki/List_of_file_signatures doesn't show any previous claims on 0x8A.
0x47 (ASCII 'G') to 0x31 (ASCII '1') for FFV 1, 0x32 (ASCII '2') for FFV 2, etc.
0x10 and 0x20). Since metadata is presented in increasing MID order, the gaps allow future extensions to insert (optional) metadata chunks before these existing ones.
Implementations
golang.org/x/exp/shiny/iconvg, will speak FFVs 0 and 1+, delegating the latter to the 'new' Go library.
Notably, any existing Go code (using the 'old' Go library) displaying existing (FFV 0) files will continue to work.
Timeline
FFV 1 should be finalized 'soon'. FFV 2 is more open ended and will require extensive prototyping.