Nested buckets vs key namespacing? #422

joyrexus · 2015-09-15T19:15:27Z

Are there any performance advantages to using nested buckets instead of a suitable key namespace for prefix/range scans?

Context: I'm working on a simple backend web service designed to store/retrieve JSON metadata associated with a set of hierarchically related resources (studies > trials > files). To streamline storing/retrieving this stuff in a bucket, I've started working on a little wrapper for bolt, mentioned in #421. The wrapper provides a PrefixScanner, which makes it relatively straightforward to work with a subset of prefixed keys. Anyway, wondering now if I would have been any better off utilizing nested buckets instead.
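For readers following along, a prefix scan over namespaced keys can be sketched roughly as below. This is a stdlib-only illustration over an in-memory sorted slice, not the wrapper from #421 and not Bolt's API: the `prefixScan` helper and the `study/trial/file` key layout are hypothetical. Against a real Bolt bucket the same idea would seek a cursor to the prefix and keep advancing while `bytes.HasPrefix` holds.

```go
package main

import (
	"bytes"
	"sort"
)

// prefixScan returns every key in sortedKeys that starts with prefix.
// sortedKeys must already be in ascending byte order, which is the order
// a B+tree store iterates its keys in. (Hypothetical helper, not Bolt's API.)
func prefixScan(sortedKeys [][]byte, prefix []byte) [][]byte {
	// Seek: binary-search for the first key >= prefix.
	start := sort.Search(len(sortedKeys), func(i int) bool {
		return bytes.Compare(sortedKeys[i], prefix) >= 0
	})

	var out [][]byte
	// Walk forward until a key no longer carries the prefix.
	for i := start; i < len(sortedKeys) && bytes.HasPrefix(sortedKeys[i], prefix); i++ {
		out = append(out, sortedKeys[i])
	}
	return out
}
```

With keys such as `study1/trial1/a`, `study1/trial1/b`, and `study1/trial2/a`, scanning for `study1/trial1/` yields only the first two.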

hryx · 2015-09-18T22:19:19Z

I'm also interested to know. Not so much for performance reasons, but I don't yet understand what the intended purpose of nesting buckets is. Maybe a one-liner in the readme would help clear that up?

In my case, I'm working on a CLI application that stores relatively little data (metadata caching, latest server response messages, etc.).

benbjohnson · 2015-09-20T19:43:07Z

@joyrexus If you have a large number of keys in the sub-bucket then you can save space by not having to prefix each one. Bolt doesn't do key compression like LSMs do. However, if you have a small number of keys then you'll likely have better performance with simple prefixing since Bolt won't have to traverse down another level into the sub-bucket's b-tree.
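To put a rough number on the space point: since keys are stored uncompressed, every prefixed key repeats the prefix bytes in full. The sketch below only counts those repeated key bytes. It is an illustrative estimate, with `prefixOverhead` a hypothetical helper, and it ignores whatever fixed cost a sub-bucket itself carries.

```go
package main

// prefixOverhead estimates the extra key bytes stored when each of
// numKeys keys repeats the same prefix, compared with storing the bare
// keys inside a sub-bucket. It counts key bytes only and ignores any
// per-bucket or per-page overhead. (Hypothetical helper for illustration.)
func prefixOverhead(prefix string, numKeys int) int {
	return len(prefix) * numKeys
}
```

For example, a 14-byte prefix such as `study1/trial1/` repeated across 1,000,000 keys comes to about 14 MB of repeated key bytes, while across 100 keys it is only 1,400 bytes, which is why the trade-off depends on how many keys share the prefix.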

@hryx The original use was essentially to allow sorting on two fields. We had an application that stored user events so we needed users grouped together (bucket level 1) and then their events sorted by timestamp (bucket level 2). The same could be accomplished with a single bucket and prefixing but nested buckets made the code a lot easier to read/write.
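The single-bucket alternative mentioned above could look something like the following sketch. It is not taken from the application described: `eventKey` and `sortKeys` are hypothetical helpers, and the layout assumes user names never contain a `0x00` byte. The timestamp is encoded big-endian so that plain byte ordering, which is how keys are sorted, matches numeric ordering.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"sort"
)

// eventKey builds "<user> 0x00 <8-byte big-endian timestamp>".
// Byte-wise ordering of such keys groups events by user first and then
// sorts each user's events by timestamp.
// Assumption: user never contains a 0x00 byte.
func eventKey(user string, unixNano uint64) []byte {
	key := make([]byte, 0, len(user)+1+8)
	key = append(key, user...)
	key = append(key, 0x00) // separator between the two fields
	var ts [8]byte
	binary.BigEndian.PutUint64(ts[:], unixNano)
	return append(key, ts[:]...)
}

// sortKeys orders keys by byte value, standing in for the order in which
// a byte-sorted key/value store would iterate them.
func sortKeys(keys [][]byte) {
	sort.Slice(keys, func(i, j int) bool {
		return bytes.Compare(keys[i], keys[j]) < 0
	})
}
```

With nested buckets, the user name would instead name the sub-bucket and only the 8-byte timestamp would be the key, which is the readability gain described above.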

Another common use case for nested buckets is for multi-tenancy. For example, if you want to help ensure that one customer doesn't accidentally see another customer's data, you can store each customer's data in a different sub-bucket.

hryx · 2015-09-20T19:51:19Z

That clears it up for me 120%. Megathanks @benbjohnson!

benbjohnson · 2015-09-20T19:54:04Z

👍

joyrexus · 2015-09-21T18:18:40Z

Got it. Thanks @benbjohnson!

benbjohnson closed this as completed Sep 20, 2015

sdboyer mentioned this issue Sep 20, 2017

gps: source cache: protobuf integration golang/dep#1127

Merged

xeoncross mentioned this issue Sep 11, 2018

Performance considerations of using multiple (nested) buckets? etcd-io/bbolt#120

Closed

vivekpatani mentioned this issue Jan 20, 2021

Namespace quota for multi-tenancy etcd-io/etcd#10084

Closed
