Use case: mounting a TAR that contains many compressed (single-block and therefore not parallelizable) xz archives. Such data arises, for example, when archiving logfiles that have been logrotated and compressed with single-block xz or gzip, formats whose decompression is either inherently sequential or has not been parallelized yet.
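For reproducing this, here is a minimal sketch that builds such a test archive with Python's standard library. The file names and sizes are made up for illustration; `lzma.compress` produces a single-block `.xz` stream by default, which matches the non-parallelizable case described above:

```python
import io
import lzma
import tarfile

# Build an uncompressed outer TAR containing many single-block
# xz-compressed "logrotated" files.
with tarfile.open("logs.tar", "w") as tar:
    for i in range(16):
        # Single-block xz stream: cannot be decompressed in parallel.
        data = lzma.compress(f"log line {i}\n".encode() * 100_000)
        info = tarfile.TarInfo(name=f"app.log.{i}.xz")
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
```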
Because the outer layer is uncompressed, a plain folder, or compressed with bz2, which can be decompressed in parallel, reading the outer layer should be vastly faster than reading the inner xz files. Therefore, it would be helpful if, during recursive mounting, the nested archives could be analyzed in parallel.
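A minimal sketch of the proposed idea, assuming the nested archives are independently readable: `index_nested_archive` is a hypothetical stand-in for the real per-archive analysis step, and the point is only that each nested archive is single-threaded to decode, but many of them can be decoded at the same time:

```python
import lzma
from concurrent.futures import ProcessPoolExecutor

def index_nested_archive(path: str) -> tuple[str, int]:
    # Placeholder for the real work: decompress the single-block xz
    # file sequentially and gather whatever metadata the index needs.
    with lzma.open(path, "rb") as f:
        size = sum(len(chunk) for chunk in iter(lambda: f.read(1 << 20), b""))
    return path, size

def index_all(paths: list[str]) -> dict[str, int]:
    # Analyze the nested archives in parallel; each worker is bound
    # by single-core decompression, but the archives do not contend.
    with ProcessPoolExecutor() as pool:
        return dict(pool.map(index_nested_archive, paths))
```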
Note that the performance improvements from this would be moot if every backend could be, and had been, parallelized: there would then always be a bottlenecking layer hogging all processing cores, and additionally parallelizing over multiple archives would gain nothing or might even make things worse. However, some formats, such as single-block xz and zstd files, are very hard to parallelize. I started a parallelized gzip decoder prototype and am fairly close to getting it working, but it might turn out to be more difficult than expected, and its single-core performance is worse than that of other implementations, which is demotivating. There should be enough edge cases for this feature to still make sense even after gzip has been parallelized, and it should also be much easier to implement.
It basically optimizes the same use cases as #79 and therefore might yield even smaller benefits once #79 has been implemented, namely only for the first mounting.