add document defining an OpenTelemetry Collector #4313

Open: wants to merge 23 commits into base: main

Conversation

codeboten
Contributor

Changes

Adds a definition of an OpenTelemetry Collector


- An OpenTelemetry Collector _MUST_ accept an OpenTelemetry Collector Config file.
- An OpenTelemetry Collector _MUST_ be able to be compiled with any and all
additional Collector plugins that the user wishes to include.
Member

Should this specification define what an OpenTelemetry Collector plugin is? Is it any component of type receiver, processor, exporter, extension, or config map provider?

Member

Specification needs to define all the terms used in this definition, otherwise it does not remove ambiguity.

Contributor Author

Renamed plugin to component and added a section to define OpenTelemetry Collector component.

Comment on lines 24 to 25
For a library to be considered an OpenTelemetry Collector component, it _MUST_
implement the [Component interface](https://github.com/open-telemetry/opentelemetry-collector/blob/main/component/component.go)
Member

The Collector also accepts confmap.Providers and confmap.Converters, which do not accept this interface. Do we consider those out of scope?

Member

I think they do need to be considered in scope. Interoperability of those components is important.

Contributor Author

> The Collector also accepts confmap.Providers and confmap.Converters, which do not accept this interface. Do we consider those out of scope?

I wonder if including them would allow us to avoid having to include a definition for a config file, wdyt?

Member

Providers handle configuration abstractly so they help remove the need for how configuration should be represented, but they don't solve the schema part (which I don't feel like we need to solve tbh)
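
To make the scope question in this thread concrete, here is a minimal Go sketch. It assumes the Component interface linked in the diff has the usual `Start`/`Shutdown` lifecycle shape. `Host`, `envProvider`, `noopReceiver`, and `IsComponent` are hypothetical stand-ins written for illustration, not the real collector or confmap types. The sketch shows why a provider-like type with no `Start` method would fall outside a definition that only requires the Component interface.

```go
package main

import (
	"context"
	"fmt"
)

// Host is a simplified, hypothetical stand-in for the collector's host type.
type Host interface{}

// Component mirrors the assumed lifecycle shape of the interface
// referenced in the diff: a Start method and a Shutdown method.
type Component interface {
	Start(ctx context.Context, host Host) error
	Shutdown(ctx context.Context) error
}

// noopReceiver is a hypothetical component that satisfies Component.
type noopReceiver struct{}

func (noopReceiver) Start(context.Context, Host) error { return nil }
func (noopReceiver) Shutdown(context.Context) error    { return nil }

// envProvider is a hypothetical configuration provider. It resolves
// configuration from a URI and has no Start method, so it does not
// satisfy Component.
type envProvider struct{}

func (envProvider) Scheme() string { return "env" }
func (envProvider) Retrieve(ctx context.Context, uri string) (string, error) {
	return "", nil
}
func (envProvider) Shutdown(context.Context) error { return nil }

// IsComponent reports whether v satisfies the Component interface.
func IsComponent(v any) bool {
	_, ok := v.(Component)
	return ok
}

func main() {
	fmt.Println("receiver is a Component:", IsComponent(noopReceiver{}))
	fmt.Println("provider is a Component:", IsComponent(envProvider{}))
}
```

Under these assumptions, a definition that wants providers and converters in scope would need to name their interfaces separately rather than rely on the Component interface alone.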

Comment on lines +9 to +12
The goal of this document is for users to be able to easily switch between
OpenTelemetry Collector Distros while also ensuring that components produced by
the OpenTelemetry Collector SIG are able to work with any vendor who claims
support for an OpenTelemetry Collector.
Member

I'm not sure I understand this goal. If a vendor produces a collector distribution that has a subset of available components because those are the components relevant to their service offerings and that they're willing to support, where do any other components (whether hosted in an OTel repo or not) fit into that picture? Do we mean that a distribution must offer end users the ability to modify its source and create their own build? We should be explicit about that if that is the case.

Given that the licensing of the collector's source code does not require that distribution of derivative works happen in source form I'm not sure that we have much ability here to enforce such a requirement. We can certainly try to use the "OpenTelemetry" mark as a cudgel, but I'm not sure it'll be as effective as may be desirable since the terms "collector" and "distribution" are very broad. It could perhaps be argued that "OpenTelemetry Collector" is a protectable mark and maybe even that "Collector" has acquired secondary meaning in this limited scope, but protecting such a mark against genericization is going to be a Sisyphean task.

Member

I see this definition as separate from the term Distribution defined below. A Distribution is a specific compiled OpenTelemetry Collector with a specific set of OpenTelemetry Collector Components that the maintainer (the user in this case) decided to add. It is an OpenTelemetry Collector because the maintainer was able to bring their chosen OpenTelemetry Collector components to it.

Something is not an OpenTelemetry Collector if it cannot support OpenTelemetry Collector Components. Maybe the word additional below is unnecessary and could be removed?

Member

> Given that the licensing of the collector's source code does not require that distribution of derivative works happen in source form I'm not sure that we have much ability here to enforce such a requirement

We potentially have leverage over:

  • Trademark usage if "OpenTelemetry Collector" becomes a trademark
  • What we list on our registry and website and what we promote
  • What wording can be used in 'official' OTel events

I think we have enough leverage here to make this worth it

to: collector/README.md
--->

# OpenTelemetry Collector
Member

OpenTelemetry Collector is never defined. Is it a source code artifact? A binary?

Member

That's one of the things I was trying to get at here. Since there's no binary plugin mechanism it seems that the source would need to be available for it to be extended in the manner contemplated, but that's not clear or explicit in the current state.

Comment on lines 16 to 18
- An OpenTelemetry Collector _MUST_ be able to include any and all
additional [Collector components](#opentelemetry-collector-components) that
the user wishes to include.
Member

What happens when the user wishes to include two different components that both claim the same type string? Does that then make every collector implementation non-compliant?

Contributor Author

I see this as an edge case that would be up to the components to sort out, unless there's a future where type is made a unique namespace somehow

Member

I'm not sure how this needs to be resolved, but I think it does need to be resolved since as written this would make it trivial to create a situation where there can be no OpenTelemetry Collectors at all.

Contributor Author

Added some details in the components section to provide a way to resolve conflicts for components, PTAL

Contributor

@jmacd jmacd left a comment

Raising a hypothetical case with a few suggestions that might help address it.

First, I found the expression "any and all" to be woeful-- with today's unstable Collector APIs it is not possible to compile just any component, you need a specific set of versions to make it work. Therefore, I think we want some kind of version-compatibility statement which enables future major versions and/or future breaking API changes. A collector only has to support the components that are compatible with the versions/interfaces it supports, I think.

The hypothetical-- What if someone decided to start a new OpenTelemetry Collector code base in another language and gained the blessing of the Collector SIG? To be concrete, there is definitely an interest in a Rust-based Collector that could take advantage of many exciting Rust libraries (e.g., DataFusion). If our goal is to specify what an OpenTelemetry Collector is, we should be able to re-implement a Collector that matches this specification in another language. (Of course, we expect it to follow the same configuration scheme at least when we refer to the YAML representation?)

Such a hypothetical Rust OpenTelemetry Collector could be created, I think, with a Rust-based component interface. Original components could be crafted against the Rust-based component interface: a Collector Distro derived from the Rust Collector would be required to support components that are compatible with its own component interface. As a matter of practicality, I would expect the Rust collector to support components depending on the Golang component interface, but that's a "MAY" requirement. The Golang collector would not be required to support the Rust component interfaces, but it could do so optionally.

@jpkrohling
Member

> a Collector Distro derived from the Rust Collector would be required to support components that are compatible with its own component interface

This goes against the spirit of this document, which is to ensure people making investments in creating OTel Collector components can easily port them among distributions. IIRC, a "Rust" Collector was even explicitly mentioned by @tedsuo as "something else", not OTel Collector.

@tedsuo
Contributor

tedsuo commented Dec 5, 2024

Yeah a codebase that cannot consume the Go plugins should not be a Collector. It would be extremely confusing to have multiple ecosystems implemented in multiple languages and call all of that the same thing.

Comment on lines 57 to 58
Distribution _SHOULD_ provide users with tools and/or documentation for adding
their own components to the Distribution.
Member

@reyang reyang Dec 10, 2024

This is confusing. Both this PR and https://opentelemetry.io/docs/concepts/distributions/ appear to treat a distro as a binary artifact. If so, adding a source component and recompiling the whole thing would produce a different distro, because it is a different binary artifact.

This is important from the security perspective as we need to have clear guidance on how to handle CVEs.

Comment on lines +40 to +43
Components require a unique identifier as a `type` string to be included in an OpenTelemetry
Collector. It is possible that multiple components use the same identifier, in which
case the two components cannot be used simultaneously in a single OpenTelemetry Collector. In
order to resolve this, the clashing components must use a different identifier.
Member

What is the mechanism for this? AIUI the collector framework does not provide a mechanism to dynamically change the config type used by a component. They're typically defined statically in the component's factory initializer.

Member

Right now the mechanism is: use a different name if you were late to the party. We can support a better mechanism if this becomes a concern (via ocb or via some configuration mechanism)

@jaronoff97 jaronoff97 left a comment

two minor suggestions, otherwise this looks great! Thank you Alex!

Comment on lines +25 to +32
```yaml
receivers:
processors:
exporters:
connectors:
extensions:
service:
```

Would it make sense to specify more about the structure for service? Everything else is very up to the user based on the components they include, but service has more specific requirements for the pipeline that may be worth calling out. That being said, maybe this is like this until v1 is complete?
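
As a rough illustration of the kind of `service` requirements this comment refers to, the Go sketch below validates pipelines against the declared components. It assumes two rules: every pipeline needs at least one receiver and one exporter, and every referenced component must be declared in the matching top-level section. The `Config` and `Pipeline` types and the `Validate` function are hypothetical simplifications, and connectors and extensions are left out.

```go
package main

import "fmt"

// Pipeline is a hypothetical, simplified view of one service pipeline.
type Pipeline struct {
	Receivers  []string
	Processors []string
	Exporters  []string
}

// Config is a hypothetical, simplified view of the top-level sections.
type Config struct {
	Receivers  map[string]struct{}
	Processors map[string]struct{}
	Exporters  map[string]struct{}
	Pipelines  map[string]Pipeline
}

// declared checks that every id used by a pipeline exists in known.
func declared(pipeline, kind string, ids []string, known map[string]struct{}) error {
	for _, id := range ids {
		if _, ok := known[id]; !ok {
			return fmt.Errorf("pipeline %q references undeclared %s %q", pipeline, kind, id)
		}
	}
	return nil
}

// Validate applies the two assumed rules to every pipeline.
func Validate(cfg Config) error {
	for name, p := range cfg.Pipelines {
		if len(p.Receivers) == 0 || len(p.Exporters) == 0 {
			return fmt.Errorf("pipeline %q needs at least one receiver and one exporter", name)
		}
		if err := declared(name, "receiver", p.Receivers, cfg.Receivers); err != nil {
			return err
		}
		if err := declared(name, "processor", p.Processors, cfg.Processors); err != nil {
			return err
		}
		if err := declared(name, "exporter", p.Exporters, cfg.Exporters); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	cfg := Config{
		Receivers: map[string]struct{}{"otlp": {}},
		Exporters: map[string]struct{}{"debug": {}},
		Pipelines: map[string]Pipeline{
			"traces": {Receivers: []string{"otlp"}, Exporters: []string{"missing"}},
		},
	}
	fmt.Println(Validate(cfg))
}
```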
