Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Statediff for full node #6

Merged
merged 9 commits into from
Feb 21, 2019
4 changes: 4 additions & 0 deletions cmd/geth/config.go
Original file line number Diff line number Diff line change
Expand Up @@ -155,6 +155,10 @@ func makeFullNode(ctx *cli.Context) *node.Node {
if ctx.GlobalIsSet(utils.ConstantinopleOverrideFlag.Name) {
cfg.Eth.ConstantinopleOverride = new(big.Int).SetUint64(ctx.GlobalUint64(utils.ConstantinopleOverrideFlag.Name))
}
if ctx.GlobalBool(utils.StateDiffFlag.Name) {
cfg.Eth.StateDiff = true
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would it make sense to combine this with the RegisterStateDiffService call, if both are based off of the call to ctx.GlobalBool(utils.StateDiffFlag.Name)?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, good call!

}

utils.RegisterEthService(stack, &cfg.Eth)

if ctx.GlobalBool(utils.DashboardEnabledFlag.Name) {
Expand Down
83 changes: 63 additions & 20 deletions core/blockchain.go
Original file line number Diff line number Diff line change
Expand Up @@ -71,10 +71,11 @@ const (
// CacheConfig contains the configuration values for the trie caching/pruning
// that's resident in a blockchain.
type CacheConfig struct {
Disabled bool // Whether to disable trie write caching (archive node)
TrieCleanLimit int // Memory allowance (MB) to use for caching trie nodes in memory
TrieDirtyLimit int // Memory limit (MB) at which to start flushing dirty trie nodes to disk
TrieTimeLimit time.Duration // Time limit after which to flush the current in-memory trie to disk
Disabled bool // Whether to disable trie write caching (archive node)
TrieCleanLimit int // Memory allowance (MB) to use for caching trie nodes in memory
TrieDirtyLimit int // Memory limit (MB) at which to start flushing dirty trie nodes to disk
TrieTimeLimit time.Duration // Time limit after which to flush the current in-memory trie to disk
ProcessStateDiffs bool
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What exactly does this flag entail? Hint hint, there are comments on the rest of them ;)

}

// BlockChain represents the canonical chain given a database with a genesis
Expand Down Expand Up @@ -136,6 +137,8 @@ type BlockChain struct {

badBlocks *lru.Cache // Bad block cache
shouldPreserve func(*types.Block) bool // Function used to determine whether should preserve the given block.

stateDiffsProcessed map[common.Hash]int
}

// NewBlockChain returns a fully initialised block chain using information
Expand All @@ -155,24 +158,26 @@ func NewBlockChain(db ethdb.Database, cacheConfig *CacheConfig, chainConfig *par
blockCache, _ := lru.New(blockCacheLimit)
futureBlocks, _ := lru.New(maxFutureBlocks)
badBlocks, _ := lru.New(badBlockLimit)

stateDiffsProcessed := make(map[common.Hash]int)
bc := &BlockChain{
chainConfig: chainConfig,
cacheConfig: cacheConfig,
db: db,
triegc: prque.New(nil),
stateCache: state.NewDatabaseWithCache(db, cacheConfig.TrieCleanLimit),
quit: make(chan struct{}),
shouldPreserve: shouldPreserve,
bodyCache: bodyCache,
bodyRLPCache: bodyRLPCache,
receiptsCache: receiptsCache,
blockCache: blockCache,
futureBlocks: futureBlocks,
engine: engine,
vmConfig: vmConfig,
badBlocks: badBlocks,
chainConfig: chainConfig,
cacheConfig: cacheConfig,
db: db,
triegc: prque.New(nil),
stateCache: state.NewDatabaseWithCache(db, cacheConfig.TrieCleanLimit),
quit: make(chan struct{}),
shouldPreserve: shouldPreserve,
bodyCache: bodyCache,
bodyRLPCache: bodyRLPCache,
receiptsCache: receiptsCache,
blockCache: blockCache,
futureBlocks: futureBlocks,
engine: engine,
vmConfig: vmConfig,
badBlocks: badBlocks,
stateDiffsProcessed: stateDiffsProcessed,
}

bc.SetValidator(NewBlockValidator(chainConfig, bc, engine))
bc.SetProcessor(NewStateProcessor(chainConfig, bc, engine))

Expand Down Expand Up @@ -922,6 +927,20 @@ func (bc *BlockChain) WriteBlockWithoutState(block *types.Block, td *big.Int) (e
return nil
}

func (bc *BlockChain) AddToStateDiffProcessedCollection(hash common.Hash) {
count, ok := bc.stateDiffsProcessed[hash]
if count > 1 {
log.Error("count is too high", "count", count, "hash", hash.Hex())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this error need to be handled in any additional capacity? Right now it still falls through to the ok check.

Copy link
Author

@elizabethengelman elizabethengelman Feb 14, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch. This log statement was more for me as informational to see if AddToStateDiffProcessedCollection would be called more than twice (once when the hash was current, and once when it was the parent). I don't think there is an if its called more than twice, it would mean that a state diff was processed using this hash more than twice. So, I may remove this if statement unless there's some an edge case that I'm missing.

}

if ok {
count++
bc.stateDiffsProcessed[hash] = count
} else {
bc.stateDiffsProcessed[hash] = 1
}
Copy link

@m0ar m0ar Feb 19, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is equivalent:

count, _ := bc.stateDiffsProcessed[hash]
bc.stateDiffsProcessed[hash] = count + 1

If !ok, count will have the zero value of int: 0 anyway. I assume this is run very frequently, would save some computation.
https://play.golang.org/p/EAba1v8NRcw

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch!

}

// WriteBlockWithState writes the block and all associated state to the database.
func (bc *BlockChain) WriteBlockWithState(block *types.Block, receipts []*types.Receipt, state *state.StateDB) (status WriteStatus, err error) {
bc.wg.Add(1)
Expand Down Expand Up @@ -994,6 +1013,30 @@ func (bc *BlockChain) WriteBlockWithState(block *types.Block, receipts []*types.
bc.triegc.Push(root, number)
break
}

if bc.cacheConfig.ProcessStateDiffs {
count, ok := bc.stateDiffsProcessed[root.(common.Hash)]
//if we haven't processed the statediff for a given state root and it's child, don't dereference it yet
if !ok {
log.Debug("Current root NOT found root in stateDiffsProcessed", "root", root.(common.Hash).Hex())
bc.triegc.Push(root, number)
break
}
if count < 2 {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe worth putting 1 and 2 behind variables/constants to clarify what's going on here? My understanding is that a hash needs to be added once as the current hash, and once as the parent hash - and then we're done. Is that right? Do reorgs potentially complicate that expectation?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a really good catch. Yeah, my intention was that when we processes a state diff for a given hash, it gets added to the stateDiffsProcessed collection. It then gets added again when the state diff is processed when it is the parent, so we'll need to make sure it isn't pruned before these diffs have been processed.

Considering reorgs is a really good thought, I'm honestly not sure how that would affect this expectation. I think I'd need to spend some time digging into this - do you think it makes sense to hold off merging this in? Or, merging it, and creating a new story to take a look in the future?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely sounds like another story to me, and a bit hard to simulate without getting to the head of the chain and watching it execute in the middle of one.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it definitely will be tricky. I've added another story so we can make sure to circle back: https://makerdao.atlassian.net/browse/VDB-393

log.Debug("Current root has not yet been processed for it's child", "root", root.(common.Hash).Hex())
bc.triegc.Push(root, number)
break
} else {
log.Debug("Current root found in stateDiffsProcessed collection with a count of 2, okay to dereference",
"root", root.(common.Hash).Hex(),
"blockNumber", uint64(-number),
"size of stateDiffsProcessed", len(bc.stateDiffsProcessed))

delete(bc.stateDiffsProcessed, root.(common.Hash))
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean we clean up the map when we're finished with a certain diff? Was just about to ask about memory leaks here, but probably okay if that's the case!

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, if a root is okay to be dereferenced, we removed it from the stateDiffsProcessed collection just to make sure that that collection doesn't get huge. And then it will fall through to triedb.Dereference(root.(common.Hash)) which removes it from the in-memory trie database.

}
}

log.Debug("Dereferencing", "root", root.(common.Hash).Hex())
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to retain all these logging statements when we merge?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably a good idea to remove some of them πŸ‘

triedb.Dereference(root.(common.Hash))
}
}
Expand Down
81 changes: 81 additions & 0 deletions core/blockchain_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -1483,3 +1483,84 @@ func BenchmarkBlockChain_1x1000Executions(b *testing.B) {

benchmarkLargeNumberOfValueToNonexisting(b, numTxs, numBlocks, recipientFn, dataFn)
}

func TestProcessingStateDiffs(t *testing.T) {
defaultTrieCleanCache := 256
defaultTrieDirtyCache := 256
defaultTrieTimeout := 60 * time.Minute
cacheConfig := &CacheConfig{
Disabled: false,
TrieCleanLimit: defaultTrieCleanCache,
TrieDirtyLimit: defaultTrieDirtyCache,
TrieTimeLimit: defaultTrieTimeout,
ProcessStateDiffs: true,
}
db := ethdb.NewMemDatabase()
genesis := new(Genesis).MustCommit(db)
numberOfBlocks := triesInMemory
engine := ethash.NewFaker()
blockchain, _ := NewBlockChain(db, cacheConfig, params.AllEthashProtocolChanges, engine, vm.Config{}, nil)
blocks := makeBlockChain(genesis, numberOfBlocks+1, engine, db, canonicalSeed)
_, err := blockchain.InsertChain(blocks)
if err != nil {
t.Fatalf("failed to create pristine chain: %v", err)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✨ pristine ✨

}
defer blockchain.Stop()

//when adding a root hash to the collection, it will increment the count
firstStateRoot := blocks[0].Root()
blockchain.AddToStateDiffProcessedCollection(firstStateRoot)
value, ok := blockchain.stateDiffsProcessed[firstStateRoot]
if !ok {
t.Error("state root not found in collection")
}
if value != 1 {
t.Error("state root count not correct", "want", 1, "got", value)
}

blockchain.AddToStateDiffProcessedCollection(firstStateRoot)
value, ok = blockchain.stateDiffsProcessed[firstStateRoot]
if !ok {
t.Error("state root not found in collection")
}
if value != 2 {
t.Error("state root count not correct", "want", 2, "got", value)
}

moreBlocks := makeBlockChain(blocks[len(blocks)-1], 1, engine, db, canonicalSeed)
_, err = blockchain.InsertChain(moreBlocks)

//a root hash can be dereferenced when it's state diff and it's child's state diff have been processed
//(i.e. it has a count of 2 in stateDiffsProcessed)
nodes := blockchain.stateCache.TrieDB().Nodes()
if containsRootHash(nodes, firstStateRoot) {
t.Errorf("stateRoot %s in nodes, want: %t, got: %t", firstStateRoot.Hex(), false, true)
}

//a root hash should still be in the in-mem db if it's child's state diff hasn't yet been processed
//(i.e. it has a count of 1 stateDiffsProcessed)
secondStateRoot := blocks[1].Root()
blockchain.AddToStateDiffProcessedCollection(secondStateRoot)
if !containsRootHash(nodes, secondStateRoot) {
t.Errorf("stateRoot %s in nodes, want: %t, got: %t", secondStateRoot.Hex(), true, false)
}

//the stateDiffsProcessed collection is cleaned up once a hash has been dereferenced
_, ok = blockchain.stateDiffsProcessed[firstStateRoot]
if ok {
t.Errorf("stateRoot %s in stateDiffsProcessed collection, want: %t, got: %t",
firstStateRoot.Hex(),
false,
ok,
)
}
}

func containsRootHash(collection []common.Hash, hash common.Hash) bool {
for _, n := range collection {
if n == hash {
return true
}
}
return false
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This diff shows how we should write our tests instead of that damn Ginkgo! Looks so clean, and probably runs way faster. 🐌 Why are we even using gomega assertions when we have good 'ol bool

Could probably be split up in smaller tests, but I figure the setup is a bit annoying.

9 changes: 5 additions & 4 deletions eth/backend.go
Original file line number Diff line number Diff line change
Expand Up @@ -157,10 +157,11 @@ func New(ctx *node.ServiceContext, config *Config) (*Ethereum, error) {
EVMInterpreter: config.EVMInterpreter,
}
cacheConfig = &core.CacheConfig{
Disabled: config.NoPruning,
TrieCleanLimit: config.TrieCleanCache,
TrieDirtyLimit: config.TrieDirtyCache,
TrieTimeLimit: config.TrieTimeout,
Disabled: config.NoPruning,
TrieCleanLimit: config.TrieCleanCache,
TrieDirtyLimit: config.TrieDirtyCache,
TrieTimeLimit: config.TrieTimeout,
ProcessStateDiffs: config.StateDiff,
}
)
eth.blockchain, err = core.NewBlockChain(chainDb, cacheConfig, eth.chainConfig, eth.engine, vmConfig, eth.shouldPreserve)
Expand Down
4 changes: 4 additions & 0 deletions eth/config.go
Original file line number Diff line number Diff line change
Expand Up @@ -59,6 +59,8 @@ var DefaultConfig = Config{
Blocks: 20,
Percentile: 60,
},

StateDiff: false,
}

func init() {
Expand Down Expand Up @@ -135,6 +137,8 @@ type Config struct {

// Constantinople block override (TODO: remove after the fork)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

πŸ˜‚ 🍴

ConstantinopleOverride *big.Int

StateDiff bool
}

type configMarshaling struct {
Expand Down
Loading