
Add gRPC metrics to agent #1180

Merged
merged 4 commits into master on Nov 16, 2018

Conversation

pavolloffay
Member

@pavolloffay pavolloffay commented Nov 14, 2018

Part of #773

Add gRPC metrics to agent.

  • added reporter wrapper with metrics (see the sketch after this list)
  • added sampling/baggage manager with metrics
  • old metrics should be the same
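A minimal sketch of what such a metrics-wrapping reporter might look like, using simplified stand-in Reporter and Counter types for illustration (the actual wrapper added in cmd/agent/app/reporter/metrics.go builds on the agent's real Reporter interface and the jaeger-lib metrics factory, and, judging by the metric dumps below, also partitions counters by format):

```go
package reporterdemo

// Counter is a simplified stand-in for a jaeger-lib metrics counter (illustrative only).
type Counter interface {
	Inc(delta int64)
}

// Span and Reporter are hypothetical stand-ins for the agent's span batch and reporter types.
type Span struct{ OperationName string }

type Reporter interface {
	EmitBatch(batch []Span) error
}

// metricsReporter decorates a Reporter and counts submitted/failed batches and spans,
// mirroring the batches_submitted / batches_failures / spans_submitted / spans_failures
// counters listed later in this thread.
type metricsReporter struct {
	wrapped          Reporter
	batchesSubmitted Counter
	batchesFailures  Counter
	spansSubmitted   Counter
	spansFailures    Counter
}

// WrapWithMetrics returns a Reporter that records metrics around the wrapped one.
func WrapWithMetrics(r Reporter, batchesSubmitted, batchesFailures, spansSubmitted, spansFailures Counter) Reporter {
	return &metricsReporter{r, batchesSubmitted, batchesFailures, spansSubmitted, spansFailures}
}

func (m *metricsReporter) EmitBatch(batch []Span) error {
	if err := m.wrapped.EmitBatch(batch); err != nil {
		m.batchesFailures.Inc(1)
		m.spansFailures.Inc(int64(len(batch)))
		return err
	}
	m.batchesSubmitted.Inc(1)
	m.spansSubmitted.Inc(int64(len(batch)))
	return nil
}
```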

@codecov
codecov bot commented Nov 14, 2018

Codecov Report

Merging #1180 into master will not change coverage.
The diff coverage is 100%.


@@          Coverage Diff           @@
##           master   #1180   +/-   ##
======================================
  Coverage     100%    100%           
======================================
  Files         157     159    +2     
  Lines        7104    7126   +22     
======================================
+ Hits         7104    7126   +22
Impacted Files | Coverage | Δ
cmd/agent/app/httpserver/collector_proxy.go | 100% <100%> | (ø) ⬆️
cmd/agent/app/reporter/tchannel/builder.go | 100% <100%> | (ø) ⬆️
cmd/agent/app/reporter/tchannel/collector_proxy.go | 100% <100%> | (ø) ⬆️
cmd/agent/app/httpserver/config_mgr_metrics.go | 100% <100%> | (ø)
cmd/agent/app/reporter/grpc/collector_proxy.go | 100% <100%> | (ø) ⬆️
cmd/agent/app/reporter/metrics.go | 100% <100%> | (ø)
cmd/agent/app/reporter/tchannel/reporter.go | 100% <100%> | (ø) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 56ec296...1f13c3e.

Resolved review threads (outdated/collapsed, not shown) on cmd/agent/app/httpserver/manager_metrics.go and cmd/agent/app/reporter/metrics.go.
- reporter: NewReporter(conn, logger),
- manager: NewSamplingManager(conn)}, nil
+ reporter: aReporter.WrapWithMetrics(NewReporter(conn, logger), "grpc-reporter", mFactory),
+ manager: httpserver.WrapWithMetrics(NewSamplingManager(conn), mFactory)}, nil
Member

  • not sure why namespace is passed to reporter.WrapWithMetrics but not to httpserver.WrapWithMetrics
  • also, strictly speaking you can just do mFactory.Namespace("grpc-reporter", nil) here and not expose an extra parameter on WrapWithMetrics.

Member Author

> not sure why namespace is passed to reporter.WrapWithMetrics but not to httpserver.WrapWithMetrics

First of all, I wanted to keep the old metric names for the tchannel reporter. I am wondering whether scoping them based on the propag adds any value. We can make them generic or scope them; I am not sure which is better. The propag itself could also add its own specific metrics.

cc @objectiser

> also, strictly speaking you can just do mFactory.Namespace("grpc-reporter", nil) here and not expose an extra parameter on WrapWithMetrics.

Can do, passing a string namespace makes it clearer that it has to be scoped.

Contributor

Do you have examples of the metrics for reporter and manager to look at? I assume the reporter will be jaeger_agent_grpc_reporter... similar to jaeger_agent_tc_reporter...?

I think just passing the factory i.e. mFactory.Namespace("grpc-reporter", nil) is better.
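To make the two wiring options concrete, here is a rough sketch assuming a jaeger-lib-style factory with Counter(name, tags) and Namespace(name, tags) methods; the interface definitions and the batches_submitted metric name are simplified stand-ins, not the PR's actual code:

```go
package wiring

// Simplified stand-ins for the jaeger-lib metrics types (illustrative only).
type Counter interface {
	Inc(delta int64)
}

type Factory interface {
	Counter(name string, tags map[string]string) Counter
	Namespace(name string, tags map[string]string) Factory
}

// wrapWithNamespaceArg mirrors the diff above: the wrapper receives a namespace
// string plus the factory and scopes the factory internally.
func wrapWithNamespaceArg(namespace string, factory Factory) Counter {
	return factory.Namespace(namespace, nil).Counter("batches_submitted", nil)
}

// wrapWithScopedFactory mirrors the suggestion here: the caller scopes the factory
// up front via factory.Namespace("grpc-reporter", nil), so the wrapper needs no
// extra namespace parameter.
func wrapWithScopedFactory(scoped Factory) Counter {
	return scoped.Counter("batches_submitted", nil)
}
```

Both shapes produce the same metric names; the second just keeps the scoping decision at the call site, which is the direction taken later in this thread.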

Member Author

The reporter from master emits this:

jaeger_agent_tchannel_reporter_batch_size{format="jaeger"} 0
jaeger_agent_tchannel_reporter_batch_size{format="zipkin"} 0
jaeger_agent_tchannel_reporter_batches_failures
jaeger_agent_tchannel_reporter_batches_failures{format="jaeger"} 0
jaeger_agent_tchannel_reporter_batches_failures{format="zipkin"} 0
jaeger_agent_tchannel_reporter_batches_submitted
jaeger_agent_tchannel_reporter_batches_submitted{format="jaeger"} 0
jaeger_agent_tchannel_reporter_batches_submitted{format="zipkin"} 0
jaeger_agent_tchannel_reporter_spans_failures
jaeger_agent_tchannel_reporter_spans_failures{format="jaeger"} 0
jaeger_agent_tchannel_reporter_spans_failures{format="zipkin"} 0
jaeger_agent_tchannel_reporter_spans_submitted
jaeger_agent_tchannel_reporter_spans_submitted{format="jaeger"} 0
jaeger_agent_tchannel_reporter_spans_submitted{format="zipkin"} 0

manager:

jaeger_agent_collector_proxy{endpoint="baggage",result="err"} 0
jaeger_agent_collector_proxy{endpoint="baggage",result="ok"} 0
jaeger_agent_collector_proxy{endpoint="sampling",result="err"} 0
jaeger_agent_collector_proxy{endpoint="sampling",result="ok"} 0

Currently the manager does not scope it based on propagation. So do we want to have jaeger_agent_<propag>_reporter_spans or jaeger_agent_reporter_spans, and analogously the same for collector_proxy/manager (which might be subject to renaming)?

Contributor

I'd vote for making <propag> a label, so jaeger_agent_reporter_spans.
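Roughly, the two naming schemes differ only in where the protocol information ends up; a sketch with simplified stand-in types (the metric names here are illustrative, not necessarily the ones finally merged):

```go
package naming

// Counter and Factory are simplified stand-ins for the jaeger-lib metrics types.
type Counter interface {
	Inc(delta int64)
}

type Factory interface {
	Counter(name string, tags map[string]string) Counter
}

func newSpanCounters(f Factory) (perName, perLabel Counter) {
	// Option 1: protocol baked into the metric name,
	// e.g. jaeger_agent_grpc_reporter_spans_submitted.
	perName = f.Counter("grpc_reporter_spans_submitted", nil)

	// Option 2 (preferred above): one generic name with the protocol carried as a label,
	// e.g. jaeger_agent_reporter_spans_submitted{protocol="grpc"}, so dashboards can
	// aggregate or filter across reporter implementations.
	perLabel = f.Counter("reporter_spans_submitted", map[string]string{"protocol": "grpc"})
	return perName, perLabel
}
```

With the label approach, the tchannel and gRPC reporters share one metric family and differ only in the label value.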

Member Author

Thanks, then this seems like the right way to go.

Member Author

Changed, passing the factory is better 👍

Member Author

Instead of propag I have used protocol.

Member Author

@pavolloffay pavolloffay Nov 15, 2018

Although we already use the protocol tag to distinguish Thrift binary/compact.

jaeger_agent_thrift_udp_server_packet_size{model="jaeger",protocol="binary"} 0
jaeger_agent_thrift_udp_server_packet_size{model="jaeger",protocol="compact"} 0
jaeger_agent_thrift_udp_server_packet_size{model="zipkin",protocol="compact"} 0
jaeger_agent_thrift_udp_t_processor_close_time_bucket{model="jaeger",protocol="binary",le="0.25"}
...

We should find a different tag key. Any suggestions? rpc, propag, source?

Member

Here (UDP server) this refers to the inbound protocol. In the Reporter it's outbound, so it still works for me.

Resolved review thread on cmd/agent/app/httpserver/collector_proxy.go (not shown).
jaeger_agent_collector_proxy{endpoint="baggage",protocol="tchannel",result="err"} 0
jaeger_agent_collector_proxy{endpoint="baggage",protocol="tchannel",result="ok"} 0
jaeger_agent_collector_proxy{endpoint="sampling",protocol="tchannel",result="err"} 0
jaeger_agent_collector_proxy{endpoint="sampling",protocol="tchannel",result="ok"} 0
Member

nit: if we remove "EmitZipkinBatch" in #1193, we need to change the changelog since the new metrics won't be partitioned by the "format".
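As a rough illustration of how the endpoint/protocol/result counters shown above could be driven, here is a sketch of a sampling-manager wrapper; the SamplingManager interface, method signature, and wrapper names are hypothetical stand-ins, not the PR's actual types:

```go
package proxymetrics

// Simplified stand-ins for the jaeger-lib metrics types (illustrative only).
type Counter interface {
	Inc(delta int64)
}

type Factory interface {
	Counter(name string, tags map[string]string) Counter
}

// SamplingManager is a hypothetical stand-in for the agent's sampling manager.
type SamplingManager interface {
	GetSamplingStrategy(serviceName string) (string, error)
}

// managerWithMetrics counts ok/err results for the sampling endpoint, producing series
// shaped like jaeger_agent_collector_proxy{endpoint="sampling",protocol=...,result=...}.
type managerWithMetrics struct {
	wrapped    SamplingManager
	okCounter  Counter
	errCounter Counter
}

func wrapSamplingManager(m SamplingManager, f Factory, protocol string) SamplingManager {
	return &managerWithMetrics{
		wrapped:    m,
		okCounter:  f.Counter("collector_proxy", map[string]string{"endpoint": "sampling", "protocol": protocol, "result": "ok"}),
		errCounter: f.Counter("collector_proxy", map[string]string{"endpoint": "sampling", "protocol": protocol, "result": "err"}),
	}
}

func (m *managerWithMetrics) GetSamplingStrategy(serviceName string) (string, error) {
	resp, err := m.wrapped.GetSamplingStrategy(serviceName)
	if err != nil {
		m.errCounter.Inc(1)
		return "", err
	}
	m.okCounter.Inc(1)
	return resp, nil
}
```

A baggage-restriction manager would be wrapped the same way with endpoint="baggage".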

@yurishkuro yurishkuro merged commit 9f8d324 into jaegertracing:master Nov 16, 2018
@ghost ghost removed the review label Nov 16, 2018
@pavolloffay pavolloffay mentioned this pull request Nov 19, 2018