Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New struct for image mode data #6547

Closed
Yay295 opened this issue Aug 29, 2022 · 9 comments
Closed

New struct for image mode data #6547

Yay295 opened this issue Aug 29, 2022 · 9 comments

Comments

@Yay295
Copy link
Contributor

Yay295 commented Aug 29, 2022

Currently all of the data about the format/mode of an image is stored in the image itself.

struct ImagingMemoryInstance {
/* Format */
char mode[IMAGING_MODE_LENGTH]; /* Band names ("1", "L", "P", "RGB", "RGBA", "CMYK",
"YCbCr", "BGR;xy") */
int type; /* Data type (IMAGING_TYPE_*) */
int depth; /* Depth (ignored in this version) */
int bands; /* Number of bands (1, 2, 3, or 4) */
int xsize; /* Image dimension. */
int ysize;
/* Colour palette (for "P" images only) */
ImagingPalette palette;
/* Data pointers */
UINT8 **image8; /* Set for 8-bit images (pixelsize=1). */
INT32 **image32; /* Set for 32-bit images (pixelsize=4). */
/* Internals */
char **image; /* Actual raster data. */
char *block; /* Set if data is allocated in a single block. */
ImagingMemoryBlock *blocks; /* Memory blocks for pixel storage */
int pixelsize; /* Size of a pixel, in bytes (1, 2 or 4) */
int linesize; /* Size of a line, in bytes (xsize * pixelsize) */
/* Virtual methods */
void (*destroy)(Imaging im);
};

I think it would be cleaner if this information was stored in a separate struct.

struct ImageMode {
    char mode[IMAGING_MODE_LENGTH]; /* Band names ("1", "L", "P", "RGB", "RGBA", "CMYK", "YCbCr", "BGR;xy") */
    int type;                       /* Data type (IMAGING_TYPE_*) */
    int depth;                      /* Depth (ignored in this version) */
    int bands;                      /* Number of bands (1, 2, 3, or 4) */
    int pixelsize;                  /* Size of a pixel, in bytes (1, 2 or 4) */
};

Furthermore, I think some more changes could be made that would simplify coding for some calculations, and allow for more image formats to be processed (#1888).

struct BandDataType {
    char dataType; /* 'u' = unsigned integer, 's' = signed integer, 'f' = floating point */
    UINT8 numBytes;
} BAND_DATA_TYPES[] = {
    {'u', 1}, /* UINT8 */
    {'u', 2}, /* UINT16 */
    {'s', 4}, /* INT32 */
    {'f', 4}, /* FLOAT32 */
};
/* Normal bands are treated normally */
/* Alpha bands may be treated specially by some calculations */
/* Extra bands are ignored (this could be used for padding, or maybe storing non-image data) */
enum BandType { NORMAL, ALPHA, EXTRA };
struct Band {
    char *name; /* "R", "G", "B", "Y", "Cb", "Cr", etc. */
    BandDataType dataType;
    BandType type;
};
struct ImageFormat {
    char *name;      /* Format name ("1", "L", "P", "RGB", "RGBA", "CMYK", "YCbCr", "BGR;xy") */
    Band *bands;     /* Array of bands in this format */
    UINT32 numBands; /* Number of bands in this format */
    UINT32 numBytes; /* Total size in bytes of one pixel in this format */
};
@radarhere
Copy link
Member

@wiredfool did you have any thoughts on this?

@wiredfool
Copy link
Member

I don't have strong opinions on it -- perhaps from not seeing how this reorg would simplify existing code.

The additional formats thing -- where we're looking at multiband > 8bit data -- That one really starts to look like we'd want to be storing the data at a band level, rather than interleaved at the pixel level. That's the only reasonably sensible way that I could see supporting all of the operations that we're doing -- essentially we'd be doing most of the operations N times, one for each band of the image. How that interacts with feeding into the encoders, any color space operations, and so on, is definitely up for discussion.

I come at this from a position of dealing with GeoTiff data, which is the poster child for difficult data formats -- multiband (potentially many), large, tiled, and potentially different data types per band. I wouldn't want to be changing or adding core storage that couldn't be extended to that sort of image.

On the other hand, there are already pretty decent tools for dealing with GeoTiffs in the python ecosystem, at a level of care and precision to geospatial metadata that we're effectively never going to match, so there's not a huge reason for us to be dealing with it on that front.

@Yay295
Copy link
Contributor Author

Yay295 commented Nov 3, 2022

perhaps from not seeing how this reorg would simplify existing code

The idea is that instead of seeing that an image is "RGBA" and having to know that "RGBA" images have four 8-bit bands with the last band being alpha, you would directly see that the image has four 8-bit bands with the last band being alpha. None of the calculations would have to know what the image mode is, because they could just look at the individual bands to see their data type.

The additional formats thing -- where we're looking at multiband > 8bit data -- that one really starts to look like we'd want to be storing the data at a band level, rather than interleaved at the pixel level. That's the only reasonably sensible way that I could see supporting all of the operations that we're doing -- essentially we'd be doing most of the operations N times, one for each band of the image. How that interacts with feeding into the encoders, any color space operations, and so on, is definitely up for discussion.

I don't think the storage format particularly matters as far as the calculations go; we would just have to use different offsets to access the values. Also some of the operations already do things one band at a time, like the ones that need to handle the alpha band differently, and all of the converters. I don't have a strong opinion either way, other than that the code is already written one way and it would have to be changed to work a different way. I think the code could be faster if the data was stored by band since we wouldn't have to skip over pixels while looping, but I don't know how much faster it would be.

@Yay295
Copy link
Contributor Author

Yay295 commented Nov 6, 2022

I was planning on creating a branch to work on this, but I haven't had time lately.

@radarhere
Copy link
Member

None of the calculations would have to know what the image mode is

Perhaps I'm not quite interpreting your meaning correctly, but I think this ignores the fact that both RGBA and RGBa exist. Pre-multiplied alpha means that channel data can't be considered independently when converting.

@Yay295
Copy link
Contributor Author

Yay295 commented Nov 26, 2022

Conversions would still have to be done knowing the mode, but I think almost everything else does already work channel-by-channel.

@Yay295
Copy link
Contributor Author

Yay295 commented Dec 1, 2022

I created a branch for this change and I've converted Fill.c to use these structs.

https://github.com/Yay295/Pillow/tree/mode_struct

I forgot about endianness though so it only works with native endian data currently.

@radarhere
Copy link
Member

Yay295/Pillow@72372ad...6a30bf0 is a diff between the work-in-progress branch and the previous main.

perhaps from not seeing how this reorg would simplify existing code.

Echoing this previous comment, I understand you might find it conceptually neater, but the resulting code doesn't look simpler to me?

@radarhere
Copy link
Member

This is problematic because only the OP perceives the change as advantageous in the current conceptual form, and the WIP doesn't demonstrate otherwise. It's a substantial rewrite without an unequivocal need.

I think this is a PR proposal, rather than an issue to be solved. If a PR is created that shows this approach would dramatically simplify the C code, that would be something to consider, but I don't think this needs to be on the list of things to be addressed in Pillow.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants