Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: Add a buffered StorageAPI helper #5294

Merged
merged 6 commits into from
Jun 7, 2022

Conversation

gibson042
Copy link
Member

As suggested at #5258 (comment)

Description

This PR extends #5292, and lays some groundwork for #4558. c67df20 is the most relevant commit.

Security Considerations

None that I can see.

Documentation Considerations

As commented, I wonder if StorageAPI functions should be extracted into their own module.

Testing Considerations

makeBufferedStorage really should be tested. Any pointers on a good analog?

@gibson042 gibson042 added the SwingSet package: SwingSet label May 4, 2022
@gibson042 gibson042 requested review from warner, michaelfig and FUDCo May 4, 2022 23:48
@gibson042 gibson042 self-assigned this May 4, 2022
changedKeys = new Set();
// eslint-disable-next-line require-yield
*getKeys(_start, _end) {
throw new Error('not implemented');
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm expecting to introduce this for #4558.

Copy link
Contributor

@FUDCo FUDCo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems legit.

Copy link
Member

@warner warner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way swingset manages the kvStore is pretty old and creaky, so refactoring is welcome. That said, I'm slightly skeptical that swingset is a good source for a generic "buffered KV store" library (as your "should this live in a different package?" comment notes). And I don't really understand how cosmic-swingset uses its own buffering layer, so I can't effectively argue for or against sharing an implementation. I'm not excited about deep imports (it imposes hidden constraints on future internal refactoring). But most of those concerns might go away if this moved into a separate package some day.

So I guess I'm -0 on the change.. it's a good refactoring, it feels slightly weird, but it's probably a net win overall.

The activityhash write sequencing definitely needs to change to match the previous commit boundaries. Our tests on activityhash should be observing the writes made to the backing store.. if they aren't, we should improve them, and maybe they should also test two cranks in a row (which might have caught the inclusion of activityhash in the hash).

@@ -202,22 +90,15 @@ export function buildCrankBuffer(
hasher.add('\n');
hasher.add(crankhash);
hasher.add('\n');

// Store the new activityhash.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By using the buffered kvStore for this following kvStore.set('activityhash'), the hash gets put into the pending buffer, not the actual store. To match the previous behavior, I think this needs an additional commit() call afterwards, or perhaps it'd be safe to move the one commit() from the start of this method to the end.

Also, we haven't been including the set('activityhash') in the activityhash itself. I don't know how to best achieve that with this new layering.. making onPendingSet ignore the one key feels weird.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kvStore in this scope is the backing StorageAPI instance, not the crankBuffer wrapping it. These bindings are identical to those currently in buildCrankBuffer, and so is the call sequencing AFAICT.

Current: https://github.com/Agoric/agoric-sdk/blob/6a5f377c5ba115576a0e813cc6266039452732c0/packages/SwingSet/src/kernel/state/storageWrapper.js

  • crankBuffer.set() and crankBuffer.delete() update additions and deletions and feed consensus-relevant data to crankhasher.
  • commitCrank() sends a sequence of calls to the backing kvStore:
    1. kvStore.set(key, value) for each addition
    2. kvStore.delete(key) for each deletion
    3. oldActivityhash = kvStore.get('activityhash')
    4. kvStore.set('activityhash', activityhash), where activityhash is a digest of data including oldActivityhash and crankhasher.digest() (before the replacement of crankhasher between steps 2 and 3).

Proposed: https://github.com/Agoric/agoric-sdk/blob/45406c68018b36f94cfd5cd380be888ca31ed3ac/packages/SwingSet/src/kernel/state/storageWrapper.js

I've added clarifying comments:

// Get the old activityhash directly from (unbuffered) backing storage.
let oldActivityhash = kvStore.get('activityhash');
if (oldActivityhash === undefined) {
oldActivityhash = '';
}
// Digest the old activityhash and new crankhash into the new activityhash.
const hasher = createSHA256();
hasher.add('activityhash');
hasher.add('\n');
hasher.add(oldActivityhash);
hasher.add('\n');
hasher.add(crankhash);
hasher.add('\n');
// Store the new activityhash directly into (unbuffered) backing storage.
const activityhash = hasher.finish();
kvStore.set('activityhash', activityhash);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, gotcha, I was confused by the overlapping names (and I never remember which way around JS's destructuring assignment works). The extra comments helped a lot.

@gibson042 gibson042 requested a review from warner May 6, 2022 15:04
@gibson042 gibson042 force-pushed the gibson-2022-05-storageapi-buffering branch from 80afe86 to 5109cca Compare May 13, 2022 15:19
@gibson042 gibson042 requested review from warner and removed request for warner May 18, 2022 17:53
Copy link
Member

@warner warner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tentatively approving. Please get @michaelfig 's opinion on the chain-main.js changes to make sure I didn't miss something, and please find a way to write a test for it. packages/SwingSet/test/test-state.js has tests for buildCrankBuffer, maybe you could crib something from there to build a new packages/cosmic-swingset/test/test-chain-storage.js.

if (additions.has(key)) return additions.get(key);
if (deletions.has(key)) return undefined;
const value = kvStore.get(key);
if (onGet !== undefined) onGet(key, value);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there any particular reason to not use if (onGet) onGet(key, value) ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, the behavior would be equivalent either way. Is the coercing form preferred?

// Enqueue a write of any retrieved value, to handle callers like mailbox.js
// that expect local mutations to be automatically written back.
onGet(key, value) {
buffered.set(key, value);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't quite trace what's happening here. There's a weird property that I think we need to maintain: if the caller does x = s.get('foo'); x.prop = 'changed'; s.commit(), then we need to write back the modified .prop to the backing store. (I'm embarrassed that this seems to be necessary, and I'm sure that it's because I made a really weird API for the original mailbox structure, and my only defense is that this was some of the earliest Swingset code ever written).

Does this code maintain that property? I can tell there's a cache here at this level (storage), and a buffer of pending writes at the makeBufferedStorage level, but I'd like the expert to tell me if this buffered.set is sufficient to get the same value into both places.

I like makeBufferedStorage on the swingset side, it seems like a nice refactoring. I'm a little dubious about the utility using it here if maintaining that funky mutable-read-data behavior requires something weird. But if it works, and it supports the next phase of your work, I'm ok with it.

I would feel better if there were tests on this code, and I believe there are not. makeBufferedStorage should be testable within swingset, and this storage definition should be testable in isolation (from swingset and from the chain's backing store).

Copy link
Member Author

@gibson042 gibson042 Jun 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this code maintain that property?

The short answer is "yes". This onGet configuration for the makeBufferedStorage() call that wraps the chainSend()-backed storage interface results in every get(key) call on the returned storage interface being immediately followed with a set(key, fetchedValue) that adds a reference to its additions map (where commit() will find it and translate to a set(key, valueWithAnyMutations) against the chainSend()-backed storage—which performs its own translation to a stringified {method: 'set', key, value} message). All of which is intended to match current behavior, complete with idiosyncrasies.

@michaelfig
Copy link
Member

I'd like to test this PR against actual chain+ag-solo, but I think I have to wait for you, @gibson042, to resolve the conflicts and bring up-to-date with current master.

@gibson042 gibson042 force-pushed the gibson-2022-05-storageapi-buffering branch from 5109cca to 50e3d7d Compare June 6, 2022 23:02
Copy link
Member

@michaelfig michaelfig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great stuff!

@gibson042 gibson042 added the automerge:squash Automatically squash merge label Jun 7, 2022
@gibson042 gibson042 removed the automerge:squash Automatically squash merge label Jun 7, 2022
@gibson042 gibson042 added the automerge:squash Automatically squash merge label Jun 7, 2022
@mergify mergify bot merged commit 46bda4c into master Jun 7, 2022
@mergify mergify bot deleted the gibson-2022-05-storageapi-buffering branch June 7, 2022 03:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
automerge:squash Automatically squash merge SwingSet package: SwingSet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants