an example of adding a retire list and a recycle function #111

Open. oconnor663 wants to merge 1 commit into master.

oconnor663

This is a toy PR for discussion, not code ready to land. I've seen a couple threads (#52 (comment) and #86 (comment)) where you talked about "retiring" slots, and I wanted to flesh that out a bit.

The main downside of the retirement strategy seems to be that, in exchange for the promise that we'll never see any logic errors (which I really really want), we give up on the idea that our app can run forever. And since generations are included in serialization, this is true even if we save the world to disk and restart our app. Even though the leak is really slow, this is a bummer.

But if we include a "recycle" function, I think we can have the best of both worlds. The contract with the caller would be that they have to go through all the keys in their app and delete all the keys that are dangling (or replace them with null keys) before they call recycle, which resets the version of all free and retired slots to 0. If they get that wrong, then they're exposed to the same logic errors as in the wraparound strategy, just much sooner. Since the leak is so slow in practice, most applications might not bother to recycle at all, but the fact that they have the option could provide "peace of mind".
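To make the proposed contract concrete, here is a minimal sketch of a retire list plus a recycle function. It is not the code in this PR and not the slotmap crate's API: `ToyMap`, `Key`, `retired_count`, and the deliberately tiny `u8` version (chosen so retirement happens after 256 uses) are all hypothetical names and choices made for illustration.

```rust
/// Toy key: a slot index plus the version the slot had when the key was issued.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Key {
    index: usize,
    version: u8,
}

struct Slot<T> {
    version: u8,
    value: Option<T>,
}

/// Hypothetical toy map (an assumption for illustration, not the real crate).
pub struct ToyMap<T> {
    slots: Vec<Slot<T>>,
    free: Vec<usize>,    // vacant slots that may still be reused
    retired: Vec<usize>, // vacant slots whose version counter is exhausted
}

impl<T> ToyMap<T> {
    pub fn new() -> Self {
        ToyMap { slots: Vec::new(), free: Vec::new(), retired: Vec::new() }
    }

    pub fn insert(&mut self, value: T) -> Key {
        // Reuse a free slot if there is one, otherwise grow the storage.
        let index = match self.free.pop() {
            Some(i) => i,
            None => {
                self.slots.push(Slot { version: 0, value: None });
                self.slots.len() - 1
            }
        };
        let slot = &mut self.slots[index];
        slot.value = Some(value);
        Key { index, version: slot.version }
    }

    pub fn get(&self, key: Key) -> Option<&T> {
        let slot = self.slots.get(key.index)?;
        if slot.version == key.version { slot.value.as_ref() } else { None }
    }

    pub fn remove(&mut self, key: Key) -> Option<T> {
        let slot = self.slots.get_mut(key.index)?;
        if slot.version != key.version {
            return None;
        }
        let value = slot.value.take()?;
        match slot.version.checked_add(1) {
            // Normal case: bump the version so old keys stop matching.
            Some(v) => {
                slot.version = v;
                self.free.push(key.index);
            }
            // Version exhausted: retire the slot instead of wrapping around.
            None => self.retired.push(key.index),
        }
        Some(value)
    }

    pub fn retired_count(&self) -> usize {
        self.retired.len()
    }

    /// Caller contract: every dangling key must already be deleted or nulled.
    pub fn recycle(&mut self) {
        for &i in self.free.iter().chain(self.retired.iter()) {
            self.slots[i].version = 0;
        }
        // Retired slots become reusable again.
        self.free.append(&mut self.retired);
    }
}
```

In this sketch, a key that the caller forgot to purge before `recycle` can match a later insertion into the same slot, which is exactly the logic error the contract is meant to rule out.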

Aside: We could consider an even more aggressive recycle function, which in addition to setting all empty slots to version 0 also resets all occupied slots to version 1. The caller is already iterating over all their keys, so it might not be any extra work to fix up non-dangling keys at the same time. However, that would introduce a new restriction: if you loop over the same non-dangling key more than once, the first iteration might make that key appear to be dangling, and then the second iteration might incorrectly delete/null the key. So you need the additional guarantee that you're only going to touch each key once. I think most cycle-detection algorithms don't make that tight guarantee, and either way this seems like it would be super error-prone. The additional value of doing this over just recycling empty slots seems super low, so I don't think it's worth introducing the complexity, but it's interesting to think about. I guess it also competes with the more general "copy everything into a brand new slotmap" strategy, which fills holes in addition to resetting versions, and which doesn't require any special support from this crate.

@orlp
Owner

orlp commented Nov 3, 2023

> But if we include a "recycle" function, I think we can have the best of both worlds. The contract with the caller would be that they have to go through all the keys in their app and delete all the keys that are dangling (or replace them with null keys) before they call recycle, which resets the version of all free and retired slots to 0. If they get that wrong, then they're exposed to the same logic errors as in the wraparound strategy, just much sooner.

I don't agree that this is the best of both worlds. I think this would actually increase the risk significantly due to logic errors. I can't imagine any program that would call retire that wouldn't also be well-served by wrapping version numbers.

> The main downside of the retirement strategy seems to be that, in exchange for the promise that we'll never see any logic errors (which I really really want), we give up on the idea that our app can run forever.

Well, yes and no. The default I'm currently leaning towards is u32 index, u32 version (non-wrapping, so retiring). This means that you can use every memory location ~2.1 billion times before it leaks. I think this is a sane default for most programs.

However, this is only a default. The change would come with allowing you to set your own sizes. So you could use a 64-bit non-wrapping version field. Suppose the items you store in the slotmap are 64 bytes (a rather large type, much larger than usual), and you do, say, 100 billion insertions / removals per second (which, for the record, would mean you have 46 terabit/s throughput), then even in this absolutely ridiculous scenario you would be leaking 22 bytes / year worth of memory. Slot in a whopping 2KiB extra in your machine and you're good to go for another century.
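The arithmetic behind those figures can be checked with a short sketch. It assumes a slot is retired after 2^63 uses (one bit of the 64-bit version tracking occupancy, consistent with the ~2.1 billion figure for `u32`), counts only the 64-byte payload as leaked, and reads "terabit" as 2^40 bits; the function names are hypothetical.

```rust
/// Bytes leaked per year if every slot is retired after `uses_per_slot` uses.
fn leaked_bytes_per_year(ops_per_sec: f64, uses_per_slot: f64, bytes_per_slot: f64) -> f64 {
    const SECONDS_PER_YEAR: f64 = 365.25 * 24.0 * 3600.0;
    let slots_retired_per_year = ops_per_sec * SECONDS_PER_YEAR / uses_per_slot;
    slots_retired_per_year * bytes_per_slot
}

/// Throughput in units of 2^40 bits per second (assumed meaning of "terabit").
fn throughput_tebibits_per_sec(ops_per_sec: f64, bytes_per_slot: f64) -> f64 {
    ops_per_sec * bytes_per_slot * 8.0 / 2f64.powi(40)
}

/// Years until `budget_bytes` of extra memory has been consumed by the leak.
fn years_until_exhausted(budget_bytes: f64, leak_per_year: f64) -> f64 {
    budget_bytes / leak_per_year
}
```

With 100 billion operations per second, 64-byte slots, and 2^63 uses per slot, this gives roughly 22 bytes per year, about 46.6 × 2^40 bits per second, and a little over 93 years for a 2 KiB budget.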

@oconnor663
Author

> I can't imagine any program that would call [recycle] that wouldn't also be well-served by wrapping version numbers.

I think in order to get the same "no logic bugs" guarantee from wrapping version numbers, the program would have to make sure that it never retains dangling references. But if it can guarantee that (or if it can tolerate the logic errors), then I don't think it needs generational indexes at all, and it might be more efficient to just use a slab?

@orlp
Owner

orlp commented Nov 3, 2023

> I think in order to get the same "no logic bugs" guarantee from wrapping version numbers, the program would have to make sure that it never retains dangling references.

No, it simply needs to make sure to not retain dangling references for more than 2^31 operations.

I just can't imagine a program that would be interested in calling recycle, but would at the same time do it with gaps of more than 2^31 operations at a time.
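The 2^31 bound can be illustrated at a smaller scale. The sketch below models a single slot with a wrapping 8-bit version that is incremented on every insert and every remove, so that odd versions mean "occupied"; that layout is an assumption made for the example, and `WrapSlot` and `cycles_until_alias` are hypothetical names. Under that model a stale key aliases again after 2^7 insert/remove cycles, the 8-bit analogue of 2^31 for a `u32` version.

```rust
/// One slot with a wrapping 8-bit version; odd means occupied (assumed layout).
struct WrapSlot {
    version: u8,
}

impl WrapSlot {
    /// Occupy the slot and return the version stored in the new key.
    fn insert(&mut self) -> u8 {
        self.version = self.version.wrapping_add(1);
        self.version
    }

    /// Vacate the slot, invalidating the key that was just issued.
    fn remove(&mut self) {
        self.version = self.version.wrapping_add(1);
    }

    /// A key is considered live when its version matches the slot's.
    fn is_live(&self, key_version: u8) -> bool {
        self.version == key_version
    }
}

/// Count insert/remove cycles until a stale key looks live again.
fn cycles_until_alias() -> u32 {
    let mut slot = WrapSlot { version: 0 };
    let stale = slot.insert(); // key issued here...
    slot.remove(); // ...and immediately left dangling
    let mut cycles = 0;
    loop {
        slot.insert();
        cycles += 1;
        if slot.is_live(stale) {
            return cycles;
        }
        slot.remove();
    }
}
```

A program that purges dangling keys more often than once per that many reuses of a slot never observes the alias in this model.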
