Add in-memory caching #994

Closed
Swatinem opened this issue Jan 27, 2023 · 0 comments · Fixed by #1028
Comments

@Swatinem
Member

#979 changes the internals of the Cacher to use moka.

That crate also allows us to keep a number of cache items in memory for some time, avoiding disk access and re-loading of cache items, which, depending on the item type, can be expensive.

To fully take advantage of this, we would need to add the following things:

  • Properly implement RFC: Re-design cache keys (filesystem paths) #983 so that our caches are truly immutable.
  • Figure out how to "weigh" cache items, especially those that are variable-size (see the sketch after this list).
  • Decide on how to do expiration. We currently expose the `expiration_time` of the underlying cache file to the `CacheItemRequest`. The original idea was to combine that with the different sources that lead to the cache. But with RFC: Re-design cache keys (filesystem paths) #983, that might not be necessary anymore, as caches will be truly immutable. Therefore, we can keep the `expiration_time` completely internal to the `Cacher` abstraction and use that.
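As a rough illustration of how weighing and expiration could be wired up with moka's cache builder, here is a minimal sketch; the key/value types, the weight function, and the one-hour TTL are assumptions for this example, not the actual Symbolicator configuration:

```rust
use std::sync::Arc;
use std::time::Duration;

use moka::sync::Cache;

/// Hypothetical in-memory representation of a loaded cache item.
#[derive(Clone)]
struct InMemoryItem {
    bytes: Arc<[u8]>,
}

fn build_in_memory_cache() -> Cache<String, InMemoryItem> {
    Cache::builder()
        // Budget the cache by total weight rather than a fixed entry count.
        .max_capacity(100 * 1024)
        // Report a per-entry weight; variable-size items use their payload size.
        .weigher(|_key: &String, item: &InMemoryItem| {
            item.bytes.len().try_into().unwrap_or(u32::MAX)
        })
        // A coarse cache-wide TTL, standing in for the internal expiration logic.
        .time_to_live(Duration::from_secs(60 * 60))
        .build()
}
```

Keeping expiration internal to the `Cacher` would then mean the TTL (or a per-entry expiry) is derived from the cache file's own `expiration_time`, rather than being exposed to the `CacheItemRequest`.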

To inform the "weight" of each cache item, and to give guidance on how to size these caches, here is a table of the different caches we have and a few of their properties:

| Cache | Accesses | Cost to load | Weight |
| --- | --- | --- | --- |
| object_meta | very high: on every cfi/symcache/sourcebundle request, for every candidate | low: parsing tiny JSON | low: just a tiny JSON |
| objects | medium: when needed for conversion, for sourcebundles | mixed: depending on type; zlib, parsing manifest JSON for sourcebundle | mixed: an open FD, parsed obj container, parsed manifest JSON for sourcebundle |
| cfi | medium: for minidumps that have cfi | high: parsing the breakpad format | high: in-memory structures of parsed breakpad |
| symcache | high: for every symbolication request | low: mmap, validating headers | medium: an open FD, parsed header |
| auxdifs | mixed: for every existing symcache, depending on candidates | mixed: does XML parsing on every request | medium: an open FD, only parsed when needed |
| il2cpp | mixed: for every existing symcache, depending on candidates | low: just mmap | medium: an open FD, only parsed when needed |
| portable pdb | medium: just for .NET stack traces | low: mmap, validating headers | medium: an open FD, parsed header |

Some conclusions from this table:

The items most worth caching are object_meta, as they are accessed for every cfi/symcache/sourcebundle request, for every candidate. They are relatively cheap to parse and cheap to hold in memory.

CFI items might be worth caching, mostly because parsing them is expensive. CFI for public symbols (Microsoft, Apple) is also accessed very frequently.

Objects might not be worth caching. Objects used for cfi/symcache conversion are accessed infrequently. Sourcebundles are used a lot, but they are project-specific, and public sourcebundles shared across all requests pretty much don’t exist; however, they are quite expensive to parse.

SymCache / PortablePdb are optimized for fast parsing. Both can be shared across all requests for public symbols though.

Auxdifs / il2cpp: This needs more investigation. Probably low priority?


We should definitely cache object_meta, as it has by far the most accesses and a low in-memory weight.
CFI is worth caching, but needs more work to measure / expose its weight.

@ashwoods added the enhancement label Jan 27, 2023
Swatinem added a commit that referenced this issue Feb 13, 2023
Fixes #994 by adding the infrastructure and defaults for in-memory caching.

This implements weighing of cache items (based on a bunch of `size_of`s),
and per-item TTL based on the `ExpirationTime`.

Each cache defaults to ~100k in-memory size, which should be roughly ~1k items,
except for `object_meta`, which defaults to ~100M in-memory and is very hot,
and `cficaches`, which defaults to ~400M and is expensive to parse.
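A `size_of`-based weigher like the one this commit describes could look roughly like the following; the trait name, the example item, and the weight formula are hypothetical and only illustrate summing the fixed struct size plus its heap allocations:

```rust
use std::mem;

/// Hypothetical trait for estimating an item's in-memory footprint;
/// the name and shape are illustrative, not the actual code from #1028.
trait MemoryWeight {
    fn memory_weight(&self) -> u32;
}

/// Example item: parsed object metadata kept as a JSON string.
struct ObjectMetaItem {
    raw_json: String,
}

impl MemoryWeight for ObjectMetaItem {
    fn memory_weight(&self) -> u32 {
        // Fixed struct size plus the heap allocation behind the string.
        let bytes = mem::size_of::<Self>() + self.raw_json.capacity();
        bytes.try_into().unwrap_or(u32::MAX)
    }
}
```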