-
Notifications
You must be signed in to change notification settings - Fork 106
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
perf(node): caching implementation for node operations #830
base: master
Are you sure you want to change the base?
Conversation
- Add caching for high performance Sets/Gets. The CPU usage and performance of GetNode/SetNode calls are significantly reduced after this change. - Improve JSON marshalling performance using go-json. This helps squeeze out the last bit of performance particularly for embedded devices. - Shortcut expensive heap allocations. The repetitive heap allocations may increase GC pressure and increase CPU usage when being called frequently. For the node cache implementation, the applications that use the `ytypes` library maintain the cache(s) on their own. The applications can optionally pass in a pointer of the `node cache` and get a performance boost. Passing in a pointer of the `node cache` is relatively cheap and has a negligible performance impact. This implementation adds optional fields to the APIs and the node arguments and doesn't introduce breaking changes. **Note**: unit tests need to be added for `caching enabled` method calls.
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request. |
// - encoding.TextMarshalers are marshaled | ||
// | ||
// - integer keys are converted to strings | ||
func uniquePathRepresentation(path *gpb.Path) (string, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest using util.PathToString
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess it's ygot.PathToString
? We might need to compare the performance of these two if that matters. Previously path.String
(encoding/prototext/encode.go
) was used. Its performance was more than doubled after switching to json.Marshal
(json.Marshal
is deterministic so it's fine to use). After switching to go-json
its was again largely improved.
Because computing the unique path representation is one of the most expensive, if not the most expensive parts of the node cache, and it's used by all cache operations, the performance of this function affects the performance of the node cache.
default: | ||
// Only TypedValue is supported by the node cache for now. | ||
return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this return an error instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the comment. I think this type switch is used to check the type of val
and see if it's supported by the node cache. Maybe we don't want to return an error since that will turn into a SetNode
error and block the Set
.
The node cache doesn't support all types that the ordinary SetNode
supports so the intention is to let the SetNode
continue (which will still modify the node, without the performance boost) if the val
is not a supported type. Hope that makes sense.
ytypes/node_cache.go
Outdated
// When the node doesn't exist the unmarshal call will fail. Within the setNodeCache function this | ||
// should not return an error. Instead, this should return setComplete == false to let the ygot library | ||
// continue to go through the node creation process. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
perhaps check that parent is nil and have an explicit return for this condition, so that the not found error is not conflated into one here?
If the cached node is valid, there is no reason that the unmarshal should fail, so I think we can return an error here so that we catch any bugs from any unforseen behaviour.
ygot/struct_validation_map.go
Outdated
// mapPathsPool helps reduce heap allocations. | ||
var mapPathsPool = sync.Pool{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add more details on why this helps avoid heap allocations?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was an experiment in this frequently called function about how much GC load sync.Pool
can reduce, mostly using reference to the documentation in https://go.dev/src/sync/pool.go. In this use case I don't think there was significant GC pressure reduced (much less than the change from switching to go-json
). If this is overly complex we can revert the use of sync.Pool
for now. Thanks!
ygot/struct_validation_map.go
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this file and render.go be split into a separate PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, will do, that's a good idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes you mentioned were split into #831.
// IsGetOrCreateNodeOpt implements the GetOrCreateNodeOpt interface. | ||
func (*NodeCacheOpt) IsGetOrCreateNodeOpt() {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this missing IsDelNodeOpt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, thanks for catching this, I think the node.DeleteNode
should be fixed too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The initial cache implementation was a singleton in the ygot
library. When it was refactored I think some bugs were introduced too. It was used by a PoC earlier but I plan to do some major work soon to improve the tests for the node cache to help us catch such issues easier.
ytypes/gnmi_test.go
Outdated
@@ -18,6 +18,7 @@ func TestUnmarshalSetRequest(t *testing.T) { | |||
inUnmarshalOpts []UnmarshalOpt | |||
want ygot.GoStruct | |||
wantErr bool | |||
resetNodeCache bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of reusing existing tests that weren't designed for the cached node option, I think it might make for more effective testing if there were a dedicated nodeCache test that mixes Get/GetOrCreate/Set/Delete and to see it working.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, will be working on that over the weekend.
By the way thanks for posting this contribution! This looks like something that can benefit others and overall this implementation looks good. |
Thanks for supporting this change! Let's get the other 3 PRs completed and I'll use some after work time to clean up this PR and add better tests 😄 |
This is also a split change from the combined PR of node cache implementation (openconfig#830).
... and add initial `node_cache_test`.
…sues - Fixed the node cache not handling the container type `unmarshalGeneric` correctly - Added a temp node cache work around for memory address changes of slice heads when appending to a slice
Check nil for node cache stored `schema` and `parent`. An `Internal` error code is returned if either `schema` or `parent` is nil.
What
First of all, thank you for maintaining and sharing such an amazing and useful library, I hope we can give back to the community while making use of this project. This squashed commit was ported from our local fork after we made performance improvement changes for our use case. And my apologies if the size of the combined change is large for review. Please suggest any changes that could improve this patch, and ask for any clarifications.
Why
In one of our use cases of the
ygot
library, we found that there are significant performance benefits if we add a caching layer to thenode
calls. Most of the time spent on calling these methods, especially when the config tree is very large, is to perform tree traversal and find the nodes to operate on. This involves using relatively expensive calls toreflect
methods, string comparisons and manipulations, and is mostly repetitive computation, if the configuration tree's structure and paths are relatively static but the values may change.How (high-level)
The idea to implement the caching layer is to be able to use the unique node paths to locate a node immediately once its address is cached. This provides a shortcut to the tree-traversal method and brings in performance gains. In our use case, this feature addition along with the changes we made in other perf PRs brought the computation time down to <= 10ms from ~2000ms, under our benchmark tests with large config trees and high node modification stress.
There could be many other alternative implementations to achieve this performance gain, this is what we chose to implement and it worked well for our use case.
This optimization does come with a cost of complexity though, where the cache operations have to be managed properly. It also resulted in changes we had to make in
reflect.go
to make sure that the node's address doesn't change when its value is modified.In addition, we hope to make sure that the
ygot
library is stateless, and there's no breaking change to the exported APIs after adding the caching mechanism. As a result, the applications that use theytypes
library maintain the cache(s) on their own, and they can optionally pass in a pointer of thenode cache
to the call option struct and get a performance boost.Other Changes
JSON Marshaling Performance
When trying to squeeze the last bit of performance out, we decided to use
go-json
at a few places instead ofencoding/json
. There was a bit of hesitation in doing so because of adding a 3rd party dependency. But the performance boost was significant, especially on the hot paths.Procedure Shortcuts and
reflect
CallsAt the very last bit, but not least, we try to optimize the use of
reflect
as much as possible, and early return from continuing unnecessary heap allocations (GC pressure at scale) and computations. Both of which could be considered refactoring with performance gains so no breaking change was introduced. Such code changes can be found, for example, inygot/render.go/prependmodsJSON
.Summary of Change
Add caching for high performance Sets/Gets. The CPU usage and performance of GetNode/SetNode calls are significantly reduced after this change.
Improve JSON marshaling performance using go-json. This helps squeeze out the last bit of performance particularly for embedded devices.
Shortcut expensive heap allocations. The repetitive heap allocations may increase GC pressure and increase CPU usage when being called frequently.
Note: tests need to be added and completed for
caching enabled
method calls.Performance
Here are some rough reference numbers from e2e testing sending 2000 leaf-list updates in a SetRequest:
Limitations
Currently, inserting a new element to a list uses
reflect.Append
inutil/reflect.go
. This changes the memory address of the slice head which results in node cache synchronization issues. Because of this,list
andleaf-list
types are not handled by the node cache for now.