Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

goccy/go-json dependency causes unnecessary memory allocations #36765

Closed
ptodev opened this issue Dec 10, 2024 · 2 comments · Fixed by #36807
Closed

goccy/go-json dependency causes unnecessary memory allocations #36765

ptodev opened this issue Dec 10, 2024 · 2 comments · Fixed by #36807

Comments

@ptodev
Copy link
Contributor

ptodev commented Dec 10, 2024

Component(s)

No response

Describe the issue you're reporting

The github.com/goccy/go-json module contains an init() function which warms up a cache even if the module is never used. I believe this causes around 20 MB of memory per Collector instance. This is an issue for users who run many instances of the Collector. If you run hundreds of instances, 20 MB per instance adds up to a lot.

Currently, github.com/goccy/go-json seems to be used only by the Splunk HEC Exporter, Stanza, and OTTL. I suppose all other functionality doesn't need the cache.

There is a PR opened upstream to improve the cache so that it is loaded lazily - only if goccy/go-json is used.

There are a few solutions to this problem:

  • Wait for the upstream fix to be merged and update the github.com/goccy/go-json version which Collector uses.
  • Use a fork of goccy/go-json with a lazy cache, similar to grafana/grafana and grafana/alloy.
  • Add build directives so that github.com/goccy/go-json is not used for certain builds. This is similar to the goccy_gojson build tag used in datadog-api-client-go and the arrow_json_stdlib build tag. I suspect that's not an option for the Collector though. I suspect this is not an option for the collector though.

To reproduce the issue, run a configuration like this one:

extensions:
  pprof:
  
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: "0.0.0.0:4320"

exporters:
  loadbalancing:
    routing_key: traceID
    resolver:
      static:
        hostnames:
        - localhost:55690
    protocol:
      otlp:
        tls:
          insecure: true

service:
  extensions: [pprof]
  pipelines:
    traces:
      receivers: [otlp]
      processors: []
      exporters: [loadbalancing]

Then get a memory pprof:

curl localhost:1777/debug/pprof/heap -o heap.pprof

Then view it in Pyroscope:
Screenshot 2024-12-10 at 15 16 24

@atoulme
Copy link
Contributor

atoulme commented Dec 11, 2024

Upstream fix is merged. We will adopt it in the next release of goccy/go-json.

@toddtreece
Copy link
Contributor

@atoulme goccy/[email protected] was released, and i opened a PR with the update: #36807

sbylica-splunk pushed a commit to sbylica-splunk/opentelemetry-collector-contrib that referenced this issue Dec 17, 2024
…#36807)

<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
#### Description

Updates `github.com/goccy/go-json` with fix from
goccy/go-json#490

Additional details via
open-telemetry#36765:

> The github.com/goccy/go-json module contains an init() function which
warms up a cache even if the module is never used. I believe this causes
around 20 MB of memory per Collector instance. This is an issue for
users who run many instances of the Collector. If you run hundreds of
instances, 20 MB per instance adds up to a lot.
> 
> Currently, github.com/goccy/go-json seems to be used only by the
Splunk HEC Exporter, Stanza, and OTTL. I suppose all other functionality
doesn't need the cache.
> 
> There is a goccy/go-json#490 opened upstream
to improve the cache so that it is loaded lazily - only if goccy/go-json
is used.

<!-- Issue number (e.g. open-telemetry#1234) or full URL to issue, if applicable. -->
#### Link to tracking issue
Fixes
open-telemetry#36765
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants