-
Notifications
You must be signed in to change notification settings - Fork 238
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
A79: Non-per-call Metrics Architecture #421
Conversation
65b9646
to
a50dd67
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks really good! Comments are mostly cosmetic things.
Please let me know if you have any questions. Thanks!
5722ac7
to
c2f3f3f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Yash!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only one substantive comment remaining. The rest are cosmetic.
Please let me know if you have any questions. Thanks!
* Metrics from load-balancing policies should be nested under `grpc.lb.`, e.g. | ||
`grpc.lb.pick_first.`. | ||
* Metrics from resolvers should be nested under `grpc.resolver.`, e.g. | ||
`grpc.resolver.dns.`. | ||
* Metrics from transports should be nested under `grpc.transport.`, e.g., | ||
`grpc.transport.http2.` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clearly by this you mean "Metrics from first-party gRPC ___s". Any guidance for third-party plugins for these components?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only recommendation from gRPC for third-party plugins is to NOT use the grpc.
namespace. In general, the same recommendations from OpenTelemetry (https://opentelemetry.io/docs/specs/semconv/general/metrics/#general-guidelines) should be followed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great!
Java APIs for gRPC Non Per Call Metric Architecture
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's what I have, before I get on a flight.
} | ||
|
||
interface MetricDescriptor { | ||
long index; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@DNVindhya, for the actual implementation, uf this has data, it can't be an interface. But conceptually this is fine. For the actual implementation we'd probably also move name, description, and other common aspects of the descriptors here. But that's all invisible to users.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My bad. Overlooked the part of data member being final and static.
How about I add common methods for descriptors like getIndex()
, getName()
in an interface and common data members like name, description and unit in an abstract class. Then LongCounterDescriptor
can implement the interface and extend the abstract class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having an interface with the methods is fine. We can still have an abstract class, but it can be an implementation detail (not public
).
public List<Object> getMetricsMeasures(); | ||
|
||
/** Records a value for a long counter measure. */ | ||
default void recordLongCounter(MetricDescriptor counterDescriptor, Long value, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@DNVindhya, should the first argument be LongCounterDescriptor
? Similar for the next method.
Should value
be a long
, not the boxed Long
?
New metrics should start as experimental. Once they are stable, they can | ||
optionally be made on-by-default. The distinction between on-by-default and | ||
off-by-default metrics allows gRPC to support metrics that may be desirable in | ||
some cases but too expensive to collect by default. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume "too expensive to collect" is about CPU cost. But there is a storage cost argument as well, after it is . Are we only concerned with CPU cost here? (There could be memory costs, but in general I assume we won't have metrics that are too expensive in memory.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, we are mostly worried about collection costs here which I assume would be mostly CPU, maybe memory as well, if, for example, the metric now requires to store some information for each RPC.
* Enables metrics specified in the set along with metrics that are enabled | ||
* by default. | ||
*/ | ||
public Builder enableMetrics(Set<String> enableMetrics); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@DNVindhya, users won't have a set for this already. We'll probably want to use Collection
or Iterable
instead.
/** Disable metrics specified in the set. */ | ||
public Builder disableMetrics(Set<String> disableMetrics); | ||
|
||
/** Disable all metrics that are enabled by default. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@DNVindhya, Does this only disable metrics disabled by default, or if someone had called enabledMetrics, would it remove those as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just checked C++, this will disable all metrics even the ones added by enableMetrics
. I will update the comment.
Update record APIs to take concrete Metric Descriptors
In the second approach, stats plugins that are to be scoped to a channel are | ||
registered with the corresponding channel directly, instead of the global stats | ||
plugin registry. On channel creation time, stats plugins from the registry are | ||
combined with the list of stats plugins registered directly on the channel to | ||
form the complete list of stats plugins for this channel. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have plans to enable the second approach for Core? If not can we mention that Core only support first approach?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is possible that we will support it in the future.
No description provided.