Jagged is a collection of wait-free concurrent data-structures.
The data-structures in this collection exploit the same central idea, the jagged array, and as a result exhibit the same guarantees, and limitations.
What Jagged intends to offer:
- A convenient & safe API.
- No synchronization between readers.
- Minimum synchronization overhead between reader and writer.
- Maximum portability -- including minimum dependencies.
What Jagged does NOT intend to offer:
- A hotbed of experimental features. No nightly here, and thus no const generics.
Unproven.
Subjective:
- Confident in correctness of memory ordering.
- Less confident in proper use of
unsafe
, and there's a LOT of it.
Objective:
- Well tested.
- Unit-tests cover over 99% of core code.
- Unit-tests run cleanly under MIRI.
- Unit-tests run cleanly under valgrind.
- Not fuzzed.
- Not audited.
- Not proven.
Use at your own risks, reviews are welcome.
This crate contains 3 sets of data-structures:
Vector<T>
, and its accompanyingReader
andSnapshot
.HashMap<K, V>
, and its accompanyingReader
andSnapshot
.HashSet<T>
, and its accompanyingReader
andSnapshot
.
Each set follows the same pattern:
- The main data-structure allows concurrent writes.
- The
Reader
allows concurrent reads, and reflects updates. - The
Snapshot
allows concurrent reads, but does not reflect updates.
All Jagged data-structures share the same core properties -- guarantees and limitations:
- Wait-free: insertion and reads are wait-free, and it possible to take a snapshot of the data-structure in O(1) to eschew part of or all of the synchronization based on the data-structure.
- Contention-free reads: readers only contend with the writer, and not with any other reader.
- Single writer: only the main data-structure allows concurrent insertions, and it cannot be accessed from multiple threads concurrently.
- Multiple readers: multiple references and snapshots can be created from the main data-structure, and allow wait-free concurrent read access.
Those properties are supported by memory stability. Once inserted into a Jagged data-structure, an element will never move. This allows keeping references to the element while simultaneously inserting more elements.
The Jagged library intends to be maximally portable, whilst still being convenient to use.
In order to so, two compilation modes are supported:
- By default, the Jagged library depends on
std
. - By NOT specifying the
with-std
feature, the Jagged library only depends oncore
.
Note: ideally, it would also optionally depend on alloc
, unfortunately this
crate is still nightly-only.
The Jagged library has no non-dev dependencies, and intends to remain lightweight.
Memory stability is a key concept for efficient sharing. While wrapping an
element in an Arc
allows sharing it effortlessly, it incurs a cost: an atomic
increment and decrement each time the Arc
is cloned and dropped.
While a single increment/decrement is relatively inexpensive, by itself, it does lead to contention on the cache-lines. Having readers contend between them is unfortunate, and limits scaling to multiple cores and, especially, multiple sockets.
The key insights behind the Jagged Array are:
- Memory stability: all the way down, even internals should be stable in memory.
- Amortized allocations: arrays are good for your cache.
This is achieved by using two layers of arrays:
- An outer array of static size. Something like:
[InnerArray<T>; N]
. - An inner array of fixed dynamic size. Something like:
&[UnsafeCell<MaybeUninit<T>>]
.
And having each new inner array doubling the capacity of the whole structure.
That is, the capacities of the inner arrays follow the pattern
[1, 1, 2, 4, 8, 16, 32, ...]
.
An N
of 22 means that the maximum capacity that can be achieved is 2**21
times the capacity of the first inner array, or about 2 millions.
The exact value of N
depends on the data-structure, and is an implementation
detail.
The representation of the memory layout is jagged.