persist: reintroduce in-mem blob cache
Originally introduced in MaterializeInc#19614 but reverted in MaterializeInc#19945 because we were
seeing segfaults in the lru crate this was using. I've replaced it with
a new simple implementation of an lru cache.

This is particularly interesting to revisit now because we might soon be
moving to a world in which each machine has attached disk and this is a
useful stepping stone to a disk-based cache that persists across process
restarts (and thus helps rehydration). The original motivation is as
follows.

A one-time (skunkworks) experiment showed that an environment
running our demo "auction" source + mv got 90%+ cache hits with a 1 MiB
cache. This doesn't scale up to prod data sizes and doesn't help with
multi-process replicas, but the memory usage seems unobjectionable
enough to have it for the cases that it does help.

Possibly, a decent chunk of why this is true is pubsub. With the low
pubsub latencies, we might write some blob to S3, then within
milliseconds notify everyone in-process interested in that blob, waking
them up and fetching it. This means even a very small cache is useful
because things stay in it just long enough for them to get fetched by
everyone that immediately needs them. 1 MiB is enough to fit things like
state rollups, remap shard writes, and likely many MVs (probably less so
for sources, but atm those still happen in another cluster).
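
For illustration only, here is a minimal sketch of what a small weight-bounded LRU cache could look like using just the standard library. It is not the implementation added in this commit: the `WeightedLru` type, its method names, and the tick-based recency tracking are all assumptions made for the example. The capacity is expressed in the same units as the per-entry weight (e.g. bytes, so 1 MiB would be `1 << 20`).

```rust
use std::collections::{BTreeMap, HashMap};
use std::hash::Hash;

/// Hypothetical weight-bounded LRU cache (a sketch, not the code from this commit).
pub struct WeightedLru<K, V> {
    capacity: usize,
    total_weight: usize,
    next_tick: u64,
    /// key -> (value, weight, tick of last use)
    entries: HashMap<K, (V, usize, u64)>,
    /// tick of last use -> key; the smallest tick is the least recently used.
    by_tick: BTreeMap<u64, K>,
}

impl<K: Hash + Eq + Clone, V> WeightedLru<K, V> {
    pub fn new(capacity: usize) -> Self {
        WeightedLru {
            capacity,
            total_weight: 0,
            next_tick: 0,
            entries: HashMap::new(),
            by_tick: BTreeMap::new(),
        }
    }

    /// Removes an entry, keeping both indexes and the total weight in sync.
    pub fn remove(&mut self, key: &K) -> Option<V> {
        let (val, weight, tick) = self.entries.remove(key)?;
        self.by_tick.remove(&tick);
        self.total_weight -= weight;
        Some(val)
    }

    /// Inserts an entry, evicting least recently used entries until the
    /// total weight fits. Entries heavier than the whole cache are dropped.
    pub fn insert(&mut self, key: K, val: V, weight: usize) {
        self.remove(&key);
        if weight > self.capacity {
            return;
        }
        let tick = self.next_tick;
        self.next_tick += 1;
        self.by_tick.insert(tick, key.clone());
        self.entries.insert(key, (val, weight, tick));
        self.total_weight += weight;
        while self.total_weight > self.capacity {
            let (_, oldest) = self.by_tick.pop_first().expect("over capacity implies non-empty");
            let (_, w, _) = self.entries.remove(&oldest).expect("indexes agree");
            self.total_weight -= w;
        }
    }

    /// Looks up an entry and marks it as most recently used.
    pub fn get(&mut self, key: &K) -> Option<&V> {
        let entry = self.entries.get_mut(key)?;
        let tick = self.next_tick;
        self.next_tick += 1;
        let old = std::mem::replace(&mut entry.2, tick);
        self.by_tick.remove(&old);
        self.by_tick.insert(tick, key.clone());
        Some(&entry.0)
    }
}
```

In this sketch a zero-weight entry is simply retained and can be read back, which is the shape of the edge case recorded in the proptest regression seed in this commit (an insert with `weight: 0` followed by a get of the same key). How the real cache treats that case is not shown here.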
danhhz committed Jan 5, 2024
1 parent e941223 commit fa64c99
Showing 2 changed files with 607 additions and 42 deletions.
7 changes: 7 additions & 0 deletions src/persist-client/proptest-regressions/internal/cache.txt
@@ -0,0 +1,7 @@
# Seeds for failure cases proptest has generated in the past. It is
# automatically read and these particular cases re-run before any
# novel cases are generated.
#
# It is recommended to check this file in to source control so that
# everyone who runs the test benefits from these saved cases.
cc 520a1ce380cba2b6a303454a884b5feecbf32e3628eae0f2840b793c9a75b78a # shrinks to state = [Insert { key: 235, weight: 0 }, Get { key: 235 }]