I'm working on HDF5Arrays.jl, for arrays that have their data stored on disk in HDF5 format. I'm thinking of working on something like a TiledIndex or a ChunkedIndex (with corresponding TiledIndices and IndexTiled <: IndexStyle). With those, efficient functions that iterate tile by tile could be written for everything from eachindex to broadcasting to reduce.
The start of a TiledIndex implementation might look something like:
    struct TiledIndex{N}
        tile::NTuple{N, Int}   # which tile
        size::NTuple{N, Int}   # size of tile
        index::NTuple{N, Int}  # index within tile
    end

    # Convert to a global index: offset of the tile plus the position inside it.
    Base.Tuple(t::TiledIndex) = t.index .+ (t.tile .- 1) .* t.size
    Base.CartesianIndex(t::TiledIndex) = CartesianIndex(Tuple(t))

    struct TiledIndices{N}
        tiles::NTuple{N, UnitRange{Int}}  # cartesian indices of tiles
        size::NTuple{N, Int}              # size of each tile
        indices::NTuple{N, CustomRange}   # represents indices within tiles
    end

    # iteration of custom range with start = 2, stop = 1, repeat = 2, length = 3 yields the following
    # 2, 3, 1, 2, 3, 1, 2, 3, 1
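To make the conversion concrete, here is a small worked example. It is only a sketch: `TiledIndexSketch`, `to_tuple` and `to_cartesian` are hypothetical names used so the snippet runs on its own without extending `Base`, and the 4×4 tile size is an arbitrary assumption.

```julia
# Hypothetical standalone copy of the proposed struct, so this runs on its own.
struct TiledIndexSketch{N}
    tile::NTuple{N, Int}   # which tile
    size::NTuple{N, Int}   # size of tile
    index::NTuple{N, Int}  # index within tile
end

# Global position = position inside the tile + offset of the tile's origin.
to_tuple(t::TiledIndexSketch) = t.index .+ (t.tile .- 1) .* t.size
to_cartesian(t::TiledIndexSketch) = CartesianIndex(to_tuple(t))

# Element (3, 2) of tile (2, 1), with 4x4 tiles (assumed sizes):
# (3, 2) .+ (1, 0) .* (4, 4) == (7, 2)
t = TiledIndexSketch((2, 1), (4, 4), (3, 2))
c = to_cartesian(t)
```

Here `c` is `CartesianIndex(7, 2)`: the tile offset `(4, 0)` added to the within-tile position `(3, 2)`.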
DiskArrays.jl has already done some of this work on the iteration front, though it's missing the concept of a chunked or tiled index. I'm also not sure whether forcing users to subtype something like an AbstractTiledArray is the best path forward. Hence, I'm thinking about using the IndexStyle trait.
One issue is that functions like _mapreducedim! don't currently dispatch on the IndexStyle trait. This is an issue I'm not sure how to deal with.
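For illustration, this is roughly what a trait-dispatched reduction could look like if it lived outside Base. Everything here is an assumption for the sake of the sketch: `TiledWrapper`, `tilesize` and `tiled_sum` are hypothetical names, and `IndexTiled` is only the trait proposed above, not something Base knows about. Generic Base code that branches on `IndexStyle` expects `IndexLinear` or `IndexCartesian`, so it may not work on this wrapper; the sketch only calls methods it defines itself.

```julia
# Hypothetical trait value; not part of Base.
struct IndexTiled <: IndexStyle end

# Hypothetical wrapper pairing an in-memory array with a tile size.
struct TiledWrapper{T, N, A <: AbstractArray{T, N}} <: AbstractArray{T, N}
    data::A
    tilesize::NTuple{N, Int}
end
Base.size(a::TiledWrapper) = size(a.data)
Base.getindex(a::TiledWrapper{T, N}, I::Vararg{Int, N}) where {T, N} = a.data[I...]
Base.IndexStyle(::Type{<:TiledWrapper}) = IndexTiled()
tilesize(a::TiledWrapper) = a.tilesize

# Entry point: dispatch on the trait instead of on an abstract supertype.
tiled_sum(A::AbstractArray) = tiled_sum(IndexStyle(A), A)

# Fallback for ordinary arrays.
tiled_sum(::IndexStyle, A) = sum(A)

# Tiled path: visit every tile, then every element inside that tile.
function tiled_sum(::IndexTiled, A)
    ts = tilesize(A)
    total = zero(eltype(A))
    for tile in CartesianIndices(cld.(size(A), ts))
        lo = (Tuple(tile) .- 1) .* ts .+ 1
        hi = min.(Tuple(tile) .* ts, size(A))  # clip partial tiles at the edges
        for I in CartesianIndices(map((l, h) -> l:h, lo, hi))
            total += A[Tuple(I)...]
        end
    end
    return total
end

data = collect(reshape(1:35, 5, 7))
A = TiledWrapper(data, (2, 3));  # 2x3 tiles; edge tiles are partial
```

Calling `tiled_sum(A)` takes the tiled path, while `tiled_sum(data)` falls through to `sum`. This does not solve the problem for `_mapreducedim!` itself, since Base would still not route through the trait.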
I'd be happy to help work on this, but I'm not sure how best to proceed.
Should iterating over arrays tile by tile be implemented via something like an AbstractTiledArray, or via the existing IndexStyle trait?
If using traits does indeed make more sense, how do we make efficient reducing functions, given that the base functions don't dispatch on IndexStyle?
Edit: I think having the size of the tile be part of the index is a bad idea. When indices need to be converted to cartesian indices, you can get the tile size from the array, rather than the index.
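A minimal sketch of that revised layout, assuming a hypothetical `tilesize(A)` accessor on the array. `SlimTiledIndex`, `TiledArraySketch` and `to_cartesian` are made-up names for illustration only.

```julia
# Index that stores only the tile and the position inside it (hypothetical name).
struct SlimTiledIndex{N}
    tile::NTuple{N, Int}
    index::NTuple{N, Int}
end

# Hypothetical array type that knows its own tile size.
struct TiledArraySketch{T, N}
    data::Array{T, N}
    tilesize::NTuple{N, Int}
end
tilesize(A::TiledArraySketch) = A.tilesize

# The tile size is looked up on the array at conversion time.
to_cartesian(A, t::SlimTiledIndex) =
    CartesianIndex(t.index .+ (t.tile .- 1) .* tilesize(A))

Base.getindex(A::TiledArraySketch, t::SlimTiledIndex) = A.data[to_cartesian(A, t)]

A = TiledArraySketch(collect(reshape(1:64, 8, 8)), (4, 4));
x = A[SlimTiledIndex((2, 1), (3, 2))]  # same element as A.data[7, 2]
```

The index is now two tuples instead of three, and the same index value can be reused across arrays with different tile sizes.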
I think the solution to rerouting Base Julia functions is to add an AbstractTiledArray for now, for which we can reroute reduce, broadcast, and similar operations without breaking anything. Long term we'll see about generalising it further to any AbstractArray using traits like IndexStyle.