Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable logging in ES client #2862

Merged
merged 5 commits into from
Mar 10, 2021
Merged

Conversation

albertteoh
Copy link
Contributor

@albertteoh albertteoh commented Mar 5, 2021

Signed-off-by: albertteoh [email protected]

Which problem is this PR solving?

Short description of the changes

  • Enables logging options in ES client.

Testing

Manually tested with the following combination of flags on jaeger-collector:

  • ES_LOG_LEVEL=debug only: logs ES queries as expected
  • ES_LOG_LEVEL=debug LOG_LEVEL=debug: logs both ES queries and usual collector debug logs as expected
  • ES_LOG_LEVEL=debug LOG_LEVEL=error: logs ES queries as expected

Manually tested with the following combination of flags on jaeger-query:

  • ES_LOG_LEVEL=debug only: only logs primary ES client queries.
  • ES_LOG_LEVEL=debug ES_ARCHIVE_LOG_LEVEL=debug: logs both primary ES client as well as archive ES client queries.

Example logs: jaeger-query
With ES_LOG_LEVEL=debug:

# Request
{"level":"debug","ts":1614933783.168494,"caller":"zapgrpc/zapgrpc.go:94","msg":"POST /jaeger-service-2021-03-05%2Cjaeger-service-2021-03-04%2Cjaeger-service-2021-03-03%2Cjaeger-service-2021-03-02/_search?ignore_unavailable=true&rest_total_hits_as_int=true HTTP/1.1\r\nHost: 127.0.0.1:9200\r\nUser-Agent: elastic/6.2.35 (darwin-amd64)\r\nTransfer-Encoding: chunked\r\nAccept: application/json\r\nContent-Type: application/json\r\nAccept-Encoding: gzip\r\n\r\n5d\r\n{\"aggregations\":{\"distinct_services\":{\"terms\":{\"field\":\"serviceName\",\"size\":9999}}},\"size\":0}\r\n0\r\n\r\n\n"}

# Response
{"level":"debug","ts":1614933783.1721869,"caller":"zapgrpc/zapgrpc.go:94","msg":"HTTP/1.1 200 OK\r\nContent-Type: application/json; charset=UTF-8\r\n\r\n{\"took\":2,\"timed_out\":false,\"_shards\":{\"total\":5,\"successful\":5,\"skipped\":0,\"failed\":0},\"hits\":{\"total\":9,\"max_score\":null,\"hits\":[]},\"aggregations\":{\"distinct_services\":{\"doc_count_error_upper_bound\":0,\"sum_other_doc_count\":0,\"buckets\":[{\"key\":\"jaeger-query\",\"doc_count\":9}]}}}\n"}

With ES_LOG_LEVEL=info:

# Request + response
{"level":"info","ts":1614933693.67656,"caller":"zapgrpc/zapgrpc.go:94","msg":"POST http://127.0.0.1:9200/jaeger-service-2021-03-05%2Cjaeger-service-2021-03-04%2Cjaeger-service-2021-03-03%2Cjaeger-service-2021-03-02/_search?ignore_unavailable=true&rest_total_hits_as_int=true [status:200, request:0.003s]"}

Example logs: jaeger-collector
With ES_LOG_LEVEL=debug:

# Request
{"level":"debug","ts":1614933923.268276,"caller":"zapgrpc/zapgrpc.go:94","msg":"POST /_bulk HTTP/1.1\r\nHost: 127.0.0.1:9200\r\nUser-Agent: elastic/6.2.35 (darwin-amd64)\r\nContent-Length: 3985\r\nAccept: application/json\r\nContent-Type: application/x-ndjson\r\nAccept-Encoding: gzip\r\n\r\n{\"index\":{\"_index\":\"jaeger-service-2021-03-05\",\"_id\":\"5d2c6ee5873dd4a6\"}}\n{\"serviceName\":\"jaeger-query\",\"operationName\":\"/api/services\"}\n{\"index\":{\"_index\":\"jaeger-span-2021-03-05\"}}\n{\"traceID\":\"667fd072e817ec39\",\"spanID\":\"667fd072e817ec39\",\"flags\":1,\"operationName\":\"/api/services\",\"references\":[],\"startTime\":1614933922387422,\"startTimeMillis\":1614933922387,\"duration\":3363,\"tags\":[{\"key\":\"sampler.type\",\"type\":\"string\",\"value\":\"const\"},...


# Response
{"level":"debug","ts":1614933923.308722,"caller":"zapgrpc/zapgrpc.go:94","msg":"HTTP/1.1 200 OK\r\nContent-Type: application/json; charset=UTF-8\r\n\r\n{\"took\":38,\"errors\":false,\"items\":[{\"index\":{\"_index\":\"jaeger-service-2021-03-05\",\"_type\":\"_doc\",\"_id\":\"5d2c6ee5873dd4a6\",\"_version\":2,\"result\":\"updated\",\"_shards\":{\"total\":2,\"successful\":2,\"failed\":0}...

With ES_LOG_LEVEL=info:

# Request + response
{"level":"info","ts":1614934150.339668,"caller":"zapgrpc/zapgrpc.go:94","msg":"POST http://127.0.0.1:9200/_bulk [status:200, request:0.011s]"}

Signed-off-by: albertteoh <[email protected]>
@albertteoh albertteoh requested a review from a team as a code owner March 5, 2021 08:52
@albertteoh albertteoh requested a review from vprithvi March 5, 2021 08:52
@albertteoh
Copy link
Contributor Author

cc @dekimsey

@codecov
Copy link

codecov bot commented Mar 5, 2021

Codecov Report

Merging #2862 (62cdf8b) into master (6072ef2) will decrease coverage by 0.02%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2862      +/-   ##
==========================================
- Coverage   95.98%   95.95%   -0.03%     
==========================================
  Files         223      223              
  Lines        9679     9685       +6     
==========================================
+ Hits         9290     9293       +3     
- Misses        322      325       +3     
  Partials       67       67              
Impacted Files Coverage Δ
plugin/storage/es/options.go 100.00% <100.00%> (ø)
cmd/collector/app/server/zipkin.go 73.07% <0.00%> (-3.85%) ⬇️
pkg/config/tlscfg/cert_watcher.go 92.20% <0.00%> (-2.60%) ⬇️
cmd/query/app/static_handler.go 96.77% <0.00%> (-1.62%) ⬇️
plugin/storage/badger/spanstore/reader.go 96.08% <0.00%> (+0.71%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6072ef2...62cdf8b. Read the comment docs.

case logger.Core().Enabled(zap.DebugLevel):
l = zapgrpc.NewLogger(logger, zapgrpc.WithDebug())
options = append(options, elastic.SetTraceLog(l))
case logger.Core().Enabled(zap.InfoLevel):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure this is a good idea. Our default log level is info, so we're going to be adding a log line for every span written in the collector.

Copy link
Contributor Author

@albertteoh albertteoh Mar 6, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, fair point about the collector. The motivation for this PR was to log ES queries so I think just showing the tracelog when debug logging is enabled is enough, and the rest, including info level in jaeger, should default to ErrorLevel logging in ES client.

It also brings up the question of whether the debug logs for collector would be too much as it would log the entire span's content which can be very large. The only log line the collector writes when log-level=debug is:

{"level":"debug","ts":1615007007.183977,"caller":"app/span_processor.go:149","msg":"Span written to the storage by the collector","trace-id":"4a9afb787bbb4f3f","span-id":"4a9afb787bbb4f3f"}

which I can see a use case for when debugging ingestion. With ES queries in the logging mix, it may be hard to see the above logs without grepping. As such, maybe we need to isolate this feature to jaeger-query and not jaeger-collector, though I'm not sure how easy this would be?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're probably better positioned to answer this since you're running this setup in prod.

Another option instead of piggy-backing on the main log level is to add just an --es. flag for log level, then people can go nuts if they want to, all the way to full payload.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, we don't use jaeger-collector with ES as the backing store; although I do like your suggestion of having a separate flag (I propose --es.log-level), which I'll aim to do in the coming day or so.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yurishkuro I've added the --es.log-level (and --es-archive.log-level) flag and tested it together with --log-level in different combinations of values. Both the --es and non---es flags are decoupled, working independently of one another.

@mergify mergify bot dismissed yurishkuro’s stale review March 6, 2021 05:44

Pull request has been modified.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Log ES queries
3 participants