Weekly Report: 2024-05-29 - 2024-06-05 #1

Closed
LucaLanziani opened this issue Jun 5, 2024 · 0 comments

Format

  • {CATEGORY}: {COUNT} ({CHANGE_FROM_PREVIOUS_WEEK})
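A minimal sketch of how a line in this format could be rendered. It assumes the change is a signed difference against the previous week's count and is left out when there is no previous report (as in the entries below). The name `format_line` and its parameters are hypothetical and not part of the report tooling.

```python
from typing import Optional


def format_line(category: str, count: int, previous: Optional[int] = None) -> str:
    """Render one report line as `{CATEGORY}: {COUNT} ({CHANGE_FROM_PREVIOUS_WEEK})`.

    Assumption: the change is the signed difference from last week's count,
    and it is omitted when there is no previous report to compare against.
    """
    if previous is None:
        return f"{category}: {count}"
    return f"{category}: {count} ({count - previous:+d})"


print(format_line("New issues", 25))
# 131 is a made-up previous count, used only to show the change suffix.
print(format_line("Issues needing triage", 129, previous=131))
```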

Issues Report

  • New issues: 25
    Issues
    • Memory-leak related to the resourcetotelemetry codepath? (#33383)
    • TLS config in TA receiver doesn't take effect (#33370)
    • [pkg/stanza] container parser should add k8s metadata as resource attributes (#33341)
    • [receiver/hostmetricsreceiver] Gopsutil error on windows with multiple processor groups. (#33340)
    • [receiver/hostmetricsreceiver] Gopsutil error on windows with multiple processor groups. (#33339)
    • Filter Processor Not Working as per the Doc/TestData (#33333)
    • Weekly Report: 2024-05-26 - 2024-06-02 (#33329)
    • Not able to filter by nil timestamp (#33327)
    • [exporter/elasticsearch] Use a single instance of esExporter for all event types (#33326)
    • [prometheusremotewriteexporter] memory leak when downstream prometheus endpoint is slow/non-responsive leads to GC and "out of order" errors (#33324)
    • attributes processor bot worked (#33343)
    • OpenTelemetry Collector Prometheus exporter fails with "was collected before with the same name and label values" (#33310)
    • [receiver/kafkametrics] Kafka consumer offset gives a random value instead of an actual offset (#33309)
    • [processor/resource] k8s pod metadata not insert when using loki (#33307)
    • [receiver/dockerstats] not generating per container metrics (#33303)
    • [docs]: Is the healthcheckv2 extension actually active, or just baked into the release but not available? (#33301)
    • syslog exporter does not format structured data with multiple fields properly (#33300)
    • [processor/transform] Set attribute values using connection context (#33288)
    • Endpoint from file (#33287)
    • deltatocumulativeprocessor segfaults at stream limit (#33285)
    • syslog receiver Error expecting a Stamp timestamp [col 5] (#33344)
    • Unable to export traces to elastic search engine backend (#33294)
    • [attributeprocessor] How to create a new span attribute from span's resource attribute (#33279)
    • Specify a role_arn for awscloudwatchlogs exporter (#33278)
    • deltatocumulative: Number of buckets in exponential histograms should be capped (#33277)
  • Issues needing triage: 129
    Issues
    • Support macOS Metrics When CGO Is On (#33393)
    • [receiver/hostmetricsreceiver] Gopsutil error on windows with multiple processor groups. (#33340)
    • [prometheusremotewriteexporter] memory leak when downstream prometheus endpoint is slow/non-responsive leads to GC and "out of order" errors (#33324)
    • attributes processor bot worked (#33343)
    • OpenTelemetry Collector Prometheus exporter fails with "was collected before with the same name and label values" (#33310)
    • [receiver/kafkametrics] Kafka consumer offset gives a random value instead of an actual offset (#33309)
    • [receiver/dockerstats] not generating per container metrics (#33303)
    • syslog exporter does not format structured data with multiple fields properly (#33300)
    • Endpoint from file (#33287)
    • Unable to export traces to elastic search engine backend (#33294)
    • Specify a role_arn for awscloudwatchlogs exporter (#33278)
    • kafkaexporter - support injecting headers (#33260)
    • Enabling WAL is not exporting metrics to Mimir backend using Prometheus remote write exporter (#33238)
    • New component: eG Innovations Telemetry Exporter (#33219)
    • New component: X.509 Certificate Monitoring (#33215)
    • The Source property seems to be the wrong one for mapping client IP to (#33210)
    • [AWS components] aws-sdk-go v1 usage should be upgraded to v2 (#33208)
    • [receiver/kafka]: support receiving from multiple topics (#33204)
    • [exporter/datadog] invalid api token causes failure when using logs agent exporter (#33195)
    • testbed not working with transform processor in config (#33193)
    • experimental_metricsgeneration divide calculation is not correct (#33179)
    • [exporter/elasticsearch] Bulk indexer error: an id must be provided if version type or value are set (#33139)
    • Replace the RemoteWriteQueue and WAL with the exporterhelper queue (sending_queue) in Prometheusremotewriteexporter (#33137)
    • azureblobreceiver not reading the logs & traces (#33132)
    • Getting ERROR Could not get bootstrap info from the Collector: collector's OpAMP client never connected to the Supervisor (#33129)
    • [exporter/prometheus] Allow setting custom "job" and "instance" attributes (#33118)
    • Access to journal files running in container on k8s (#33104)
    • [exporter/splunkhec] Integration test failing: HTTP response to HTTPS client (#33097)
    • Keep aws cloudwatch metadata (#33080)
    • OpenTelemetry Contrib using Mongodbatlasreceiver prompts "server busy" (#33024)
    • [receiver/filestats] Size of folder should reflect actual used bytes of the folder containing files similar to du (#33016)
    • Add support for Windows Authentication for direct connection to SQL Server instance (#32986)
    • Otel Processor metric filter is not working as expected (#32982)
    • [extension/k8sobserver] ingress ressources (#32971)
    • jmx receiver autodiscover targets in kubernetes? (#32965)
    • Windows event_data format is difficult to consume (#32952)
    • [pkg/ottl] Split ConvertCase function to explicit functions for each case (#32942)
    • I would like to ask if there are any official plans to support the RocketMQ receiver? (#32938)
    • [exporter/prometheusremotewriteexporter] Allow to set batch_send_deadline (#32891)
    • Why trace clickhouse exporter always shows "Exporting Failed: The column Timestamp is not present" (#32886)
    • Easy scaling when using non push based receivers (#32869)
    • system.cpu.time and system.cpu.utilization metrics seem incorrect when running collector on a Windows operating system (#32867)
    • Add support for DigitalOcean droplets to resourcedetetorprocessor (#32858)
    • [CI/CD] Cache Go step failing on windows (#32844)
    • Proposal: Adaptive Filter Processor (#32841)
    • [prometheusreceiver] metric datapoints attributed to two different ServiceMonitor jobs (#32828)
    • spanmetrics: Add a default namespace (#32818)
    • Allow setting of storage policy for clickhouse exporter (#32816)
    • New component: Trace Reshape Processor (#32796)
    • Flaky test: sumologicextension/extension.go:810 (#32785)
    • Enable exporters as Azure Log Analytics Workspace or Azure Application Insight. (#32765)
    • exporter/clickhouse cannot operate without create database, create table permissions (#32738)
    • Exporter/clickhouse support for distributed table (#32736)
    • [receiver/kafka]: Replace "topic" setting by "traces_topic", "logs_topic" and "metrics_topic" (#32735)
    • How should we handle imports of time/tzdata by dependencies? (#32688)
    • [pkg/translator/prometheusremotewrite] Introduce API based around hash based metrics identifiers (#32666)
    • New component: processor for external/remote processing (#32664)
    • [receiver/mongodb] Failing integration test due to timeout (#32658)
    • [receiver/elasticsearch] TestIntegration test times out intermittently (#32656)
    • [receiver/kafka] Ability to provide custom encoders (#32633)
    • [extension/oauth2clientauth] Enable dynamically reading ClientID and ClientSecret from command (#32602)
    • [k8sattributesprocessor] The sources.from types add enum metric_attribute (#32596)
    • otlpjsonfilereceiver: support compressed files from fileexporter (#32565)
    • Prometheus receiver fails on federate endpoint when job and instance labels are missing (#32555)
    • [exporter/prometheus] does not show metrics from otlp receiver (#32552)
    • sqlqueryreceiver - create one metric per row returned (#32546)
    • [receiver/hostmetrics] Process scrape integration test failing (#32536)
    • [exporter/clickhouse] Integration test hits a panic (#32530)
    • [exporter/prometheus] Support Prometheus Created Timestamp feature (#32521)
    • prometheus exporter precision error with histogram bucket (#32514)
    • [exporter/prometheusremotewrite] Permanent error: Permanent error: context deadline exceeded (#32511)
    • [receiver/chronyreceiver] Receiver is not scraping dial unixgram /var/run/chrony/chronyd.sock (#32487)
    • Resource attribute "service.instance.id" is converted to label "instance", conflicting with auto-generated prometheus label (#32484)
    • loadbalancingexporter makes the collector accept data to produce a reject otelcol_receiver_refused_spans (#32482)
    • [azuremonitorexporter] Duplicate logs on Kubernetes (#32480)
    • Collector fails to restart with persistent queue and health check enabled (#32456)
    • [CI] Unit tests are failing due to timeout for setup-go (#32445)
    • New component: DNS Cache Extension for OpenTelemetry (#32410)
    • [receiver/googlecloudspanner] Test TestItemCardinalityFilter_Filter fails intermittently on Windows runs (#32397)
    • [connector/spanmetrics] Test TestConnectorConsumeTracesExpiredMetrics fails intermittently on actuated ARM runners (#32395)
    • error encoding and sending metric family: write tcp 172.31.204.123:8889->172.31.42.221:60282: write: broken pipe (#32371)
    • Support SNMP Traps in snmp receiver (#32358)
    • Custom Sampler (#32353)
    • Unable to get instance details through mongodb receiver (#32350)
    • [statsdreceiver] fail to parse payloads with empty tag data (#32337)
    • [statsdreceiver] no metamonitoring information emitted by receiver (#32335)
    • [receiver/windowsperfcounters] When collecting instances with multiple matches, data is lost (#32319)
    • [exporter/clickhouse] Integration test failing due to time out (#32275)
    • Why Does The Kafka Exporter's Raw Marshaler Marshal Everything Except Raw Bytes To JSON? (#32237)
    • the metric of target_info has too mach labels that i not need (#32235)
    • [receiver/awscloudwatch] Missing log stream events (#32231)
    • [Hostmetrics Receiver] Add read and write character values to the process scraper (#32218)
    • Azure monitor exporter authentication (#32163)
    • Failed to connect to opensearch in TLS mode (#32139)
    • Update module github.com/kineticadb/kinetica-api-go to v0.0.4 breaks tests (#32115)
    • Optimize OTEL agent memory usage (#32035)
    • awskinesisexporter: Add support for partitioning records by traceId (#32027)
    • [receiver/datadog] Grafana Cloud Operations table is not detailed by endpoint (#31938)
    • [receiver/httpcheck] Support log pipeline for httpcheck events (#31933)
    • [Makefile.Common] Files under submodule will cause the result of all-pkgs to be empty. (#31928)
    • Jaeger UI SPM spanmetrics not working in 0.95.0 (#31922)
    • prometheusremotewrite context deadline exceeded (#31910)
    • [connector/servicegraph] New labels for service disambiguation and identification (#31889)
    • [exporter/prometheus] Wait for final scrape during collector shutdown (#31887)
    • [connector/spanmetrics] Add maximum span duration metric (#31885)
    • Azure Monitor Exporter Role Name differences (#31884)
    • splunkhecexporter field extraction truncates at 1000 characters (#31817)
    • Support cross account log collection through IAM roles (#31810)
    • Add support for TLS in memcachedreceiver (#31729)
    • Generate gauge metrics from traces (#31696)
    • prometheus receiver: support collectd's binary network protocol (#31546)
    • Load client certificate from hardware security device with Pkcs11 protocol (#31536)
    • [receiver/awscontainerinsight] Gather instance metadata parameters from Kubernetes API when EC2 instance metadata is not accessible (#31511)
    • [exporter/azuremonitor] Forward net.* attributes to Application Insights (#31438)
    • Dynamic selection of log_group_name and log_stream_name in aws cloudwatch logs exporter (#31382)
    • Add Windows Service status metrics (#31377)
    • [processor/resourcedetection] AWS Lambda faas.instance and aws.log.* attributes not set (#31359)
    • Http semantic convention breaking changes in 1.23 (#30935)
    • Add support for Docker container health checks to the collector image (#30798)
    • Add metrics to understand cost of telemetry (#30729)
    • prometheusremotewrite exporter with histogram is causing metrics export failure due to high memory (90%) (#30675)
    • Exporter Feature: OpenSearch Metrics (#30556)
    • Generate logs from trace pipeline (#30459)
    • [receiver/redisreceiver] Flaky cluster integration test (#30411)
    • Otel-collector-contrib with prometheus exporter missing exemplars (TraceId and SpanId) (#30197)
    • get a full list of all attributes per resource with full qualified attribute name e.g. from metadata., auth. (#30180)
    • [exporter/clickhouse] exporter fails with IO timeout error under load (#30175)
    • Add 'memory request' feature (#29347)
    • Rogue Parent ID generate in Azure Container App (#28870)
  • Issues ready to merge: 2
    Issues
    • [exporter/kafkaexporter] added an option to disable kerberos PA-FX-FAST negotiation (#33086)
    • [connector/servicegraph] Fix failed label does not work leads to servicegraph metrics error (#32019)
  • Issues needing sponsorship: 20
    Issues
    • New component: eG Innovations Telemetry Exporter (#33219)
    • New component: X.509 Certificate Monitoring (#33215)
    • I would like to ask if there are any official plans to support the RocketMQ receiver? (#32938)
    • Proposal: Adaptive Filter Processor (#32841)
    • New component: AWS ApplicationSignals Processor (#32808)
    • New component: Trace Reshape Processor (#32796)
    • New component: processor for external/remote processing (#32664)
    • New component: DNS Cache Extension for OpenTelemetry (#32410)
    • slurm processor (#32312)
    • Data Quality Connector (#31909)
    • New component: DaprExporter and DaprReceiver (#31634)
    • New component: migratecheckpoint (#30656)
    • New component: Fluent Forward Exporter (#29413)
    • New component: IPFIX Lookup (#28692)
    • New component: AWS Lambda Telemetry API Receiver (#26254)
    • New component: Vault Config Source (#24173)
    • New component: Log-based metrics processor (#18269)
    • New component: crash report extension (#16598)
    • New component: AWS CloudWatch metrics receiver (#15667)
    • New component: prometheus remotewrite receiver (#14751)
  • New issues needing sponsorship: 0

Components Report

LucaLanziani reopened this Jun 5, 2024
LucaLanziani pushed a commit that referenced this issue Jun 6, 2024
…pen-telemetry#33353)

**Description:**
The container parser should add k8s metadata as resource attributes and not
as log record attributes.

**Link to tracking Issue:** Fixes open-telemetry#33341

**Testing:** Manual testing on a local k8s cluster:

```console
2024-06-04T06:40:08.219Z	info	ResourceLog #0
Resource SchemaURL: 
Resource attributes:
     -> k8s.pod.uid: Str(d5ecc924-e255-4525-b5be-6437939b1e4d)
     -> k8s.container.name: Str(busybox)
     -> k8s.namespace.name: Str(default)
     -> k8s.pod.name: Str(daemonset-logs-dhzcq)
     -> k8s.container.restart_count: Str(0)
ScopeLogs #0
ScopeLogs SchemaURL: 
InstrumentationScope  
LogRecord #0
ObservedTimestamp: 2024-06-04 06:40:08.007370503 +0000 UTC
Timestamp: 2024-06-04 06:40:07.855932421 +0000 UTC
SeverityText: 
SeverityNumber: Unspecified(0)
Body: Str(otel logs at 06:40:07)
Attributes:
     -> logtag: Str(F)
     -> key2: Map({"key_in":"val2"})
     -> log.file.path: Str(/var/log/pods/default_daemonset-logs-dhzcq_d5ecc924-e255-4525-b5be-6437939b1e4d/busybox/0.log)
     -> time: Str(2024-06-04T06:40:07.855932421Z)
     -> log.iostream: Str(stdout)
Trace ID: 
Span ID: 
Flags: 0
LogRecord #1
ObservedTimestamp: 2024-06-04 06:40:08.007451031 +0000 UTC
Timestamp: 2024-06-04 06:40:07.957875321 +0000 UTC
SeverityText: 
SeverityNumber: Unspecified(0)
Body: Str(otel logs at 06:40:07)
Attributes:
     -> log.file.path: Str(/var/log/pods/default_daemonset-logs-dhzcq_d5ecc924-e255-4525-b5be-6437939b1e4d/busybox/0.log)
     -> log.iostream: Str(stdout)
     -> time: Str(2024-06-04T06:40:07.957875321Z)
     -> key2: Map({"key_in":"val2"})
     -> logtag: Str(F)
Trace ID: 
Span ID: 
Flags: 0
```
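The resource attributes in the output above line up with the components of `log.file.path`. The sketch below shows that mapping under an assumed path layout inferred from the sample; the regular expression and the function name `k8s_resource_attributes` are illustrative and are not the container parser's actual implementation.

```python
import re

# Assumed layout, inferred from the sample path in the output above:
# /var/log/pods/<namespace>_<pod name>_<pod uid>/<container name>/<restart count>.log
POD_LOG_PATH = re.compile(
    r"^/var/log/pods/"
    r"(?P<namespace>[^_/]+)_(?P<pod_name>[^_/]+)_(?P<uid>[0-9a-f-]+)/"
    r"(?P<container_name>[^/]+)/(?P<restart_count>\d+)\.log$"
)


def k8s_resource_attributes(log_file_path: str) -> dict:
    """Derive k8s resource attributes from a pod log file path."""
    m = POD_LOG_PATH.match(log_file_path)
    if m is None:
        raise ValueError(f"not a pod log path: {log_file_path!r}")
    return {
        "k8s.namespace.name": m["namespace"],
        "k8s.pod.name": m["pod_name"],
        "k8s.pod.uid": m["uid"],
        "k8s.container.name": m["container_name"],
        # Kept as a string, matching Str(0) in the output above.
        "k8s.container.restart_count": m["restart_count"],
    }


path = (
    "/var/log/pods/default_daemonset-logs-dhzcq_"
    "d5ecc924-e255-4525-b5be-6437939b1e4d/busybox/0.log"
)
print(k8s_resource_attributes(path))
```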

**Documentation:** <Describe the documentation added.> ~

---------

Signed-off-by: ChrsMark <[email protected]>
LucaLanziani pushed a commit that referenced this issue Nov 9, 2024
…try.log_response_body` config (open-telemetry#33854)

**Description:**
- Add `telemetry.log_request_body` and `telemetry.log_response_body`
config for debugging. The debug log contains the `request_body` and/or
`response_body` fields in the same log line, rather than on separate
lines, to avoid interleaved log lines.
- Change the "Request failed" log level to debug.

Output:
```
2024-07-02T14:09:24.983+0100	debug	elasticsearchexporter/elasticsearch_bulk.go:67	Request roundtrip completed.	{"kind": "exporter", "data_type": "logs", "name": "elasticsearch", "response_body": "{\"version\":{\"number\":\"1.2.3\"}}\n", "path": "/", "method": "GET", "duration": 0.000865486, "status": "200 OK"}
2024-07-02T14:09:24.984+0100	debug	elasticsearchexporter/elasticsearch_bulk.go:67	Request roundtrip completed.	{"kind": "exporter", "data_type": "logs", "name": "elasticsearch", "request_body": "{\"create\":{\"_index\":\"logs-test-idx\"}}\n{\"@timestamp\":\"2024-07-02T13:09:24.970187592Z\",\"Attributes\":{\"a\":\"test\",\"b\":5,\"batch_index\":\"batch_1\",\"c\":3,\"d\":true,\"item_index\":\"item_1\"},\"Body\":\"Load Generator Counter #0\",\"Scope\":{\"name\":\"\",\"version\":\"\"},\"SeverityNumber\":11,\"SeverityText\":\"INFO3\",\"TraceFlags\":1}\n{\"create\":{\"_index\":\"logs-test-idx\"}}\n{\"@timestamp\":\"2024-07-02T13:09:24.970187592Z\",\"Attributes\":{\"a\":\"test\",\"b\":5,\"batch_index\":\"batch_1\",\"c\":3,\"d\":true,\"item_index\":\"item_2\"},\"Body\":\"Load Generator Counter #1\",\"Scope\":{\"name\":\"\",\"version\":\"\"},\"SeverityNumber\":11,\"SeverityText\":\"INFO3\",\"TraceFlags\":1}\n", "response_body": "{\"took\":0,\"errors\":false,\"items\":[{\"create\":{\"_index\":\"logs-test-idx\",\"_id\":\"\",\"_version\":0,\"result\":\"\",\"status\":201,\"_seq_no\":0,\"_primary_term\":0,\"_shards\":{\"total\":0,\"successful\":0,\"failed\":0},\"error\":{\"type\":\"\",\"reason\":\"\",\"caused_by\":{\"type\":\"\",\"reason\":\"\"}}}},{\"create\":{\"_index\":\"logs-test-idx\",\"_id\":\"\",\"_version\":0,\"result\":\"\",\"status\":201,\"_seq_no\":0,\"_primary_term\":0,\"_shards\":{\"total\":0,\"successful\":0,\"failed\":0},\"error\":{\"type\":\"\",\"reason\":\"\",\"caused_by\":{\"type\":\"\",\"reason\":\"\"}}}}]}\n", "path": "/_bulk", "method": "POST", "duration": 0.000539979, "status": "200 OK"}
```

Config required to log request and response bodies:
```yaml
exporters:
  elasticsearch:
    telemetry:
      log_request_body: true
      log_response_body: true

service:
  telemetry:
    logs:
      level: debug
```

For easier analysis, limit the request body size: set `num_workers` to 1
and lower `flush.bytes` and/or `flush.interval`.
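For analysis, the debug lines shown in the output above can be picked apart with a few lines of script. This is a sketch that assumes the layout seen in the sample (five tab-separated fields, the last one a JSON object) and that every bulk action in `request_body` is a `create` followed by one document line. The helper names `parse_debug_line` and `bulk_documents` are hypothetical.

```python
import json


def parse_debug_line(line: str) -> dict:
    """Split a tab-separated collector log line and decode its trailing JSON fields."""
    timestamp, level, caller, message, fields = line.rstrip("\n").split("\t", 4)
    return {
        "timestamp": timestamp,
        "level": level,
        "caller": caller,
        "message": message,
        "fields": json.loads(fields),
    }


def bulk_documents(request_body: str) -> list:
    """Pair each bulk action line with the document line that follows it.

    Assumption: actions and documents strictly alternate, as in the sample.
    """
    lines = [json.loads(raw) for raw in request_body.splitlines() if raw.strip()]
    return list(zip(lines[0::2], lines[1::2]))


# A shortened, hand-built line in the same shape as the sample output.
body = '{"create":{"_index":"logs-test-idx"}}\n{"Body":"Load Generator Counter #0"}\n'
line = "\t".join([
    "2024-07-02T14:09:24.984+0100",
    "debug",
    "elasticsearchexporter/elasticsearch_bulk.go:67",
    "Request roundtrip completed.",
    json.dumps({"path": "/_bulk", "status": "200 OK", "request_body": body}),
])

record = parse_debug_line(line)
for action, document in bulk_documents(record["fields"]["request_body"]):
    print(action["create"]["_index"], document["Body"])
```

Keeping request bodies small, as suggested above, keeps each decoded line manageable.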

**Link to tracking Issue:** <Issue number if applicable>

**Testing:** <Describe what testing was performed and which tests were
added.>

Manually verified with a modified integration test.

**Documentation:** <Describe the documentation added.>
LucaLanziani pushed a commit that referenced this issue Nov 9, 2024
… Histo --> Histogram (open-telemetry#33824)

## Description

This PR adds a custom metric function to the transformprocessor to
convert exponential histograms to explicit histograms.

Link to tracking issue: Resolves open-telemetry#33827

**Function Name**
```
convert_exponential_histogram_to_explicit_histogram
```

**Arguments:**

- `distribution` (_upper, midpoint, uniform, random_)
- `ExplicitBoundaries: []float64`

**Usage example:**

```yaml
processors:
  transform:
    error_mode: propagate
    metric_statements:
    - context: metric
      statements:
        - convert_exponential_histogram_to_explicit_histogram("random", [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0, 90.0, 100.0]) 
```

**Converts:**

```
Resource SchemaURL: 
ScopeMetrics #0
ScopeMetrics SchemaURL: 
InstrumentationScope  
Metric #0
Descriptor:
     -> Name: response_time
     -> Description: 
     -> Unit: 
     -> DataType: ExponentialHistogram
     -> AggregationTemporality: Delta
ExponentialHistogramDataPoints #0
Data point attributes:
     -> metric_type: Str(timing)
StartTimestamp: 1970-01-01 00:00:00 +0000 UTC
Timestamp: 2024-07-31 09:35:25.212037 +0000 UTC
Count: 44
Sum: 999.000000
Min: 40.000000
Max: 245.000000
Bucket (32.000000, 64.000000], Count: 10
Bucket (64.000000, 128.000000], Count: 22
Bucket (128.000000, 256.000000], Count: 12
        {"kind": "exporter", "data_type": "metrics", "name": "debug"}
```

**To:**

```
Resource SchemaURL: 
ScopeMetrics #0
ScopeMetrics SchemaURL: 
InstrumentationScope  
Metric #0
Descriptor:
     -> Name: response_time
     -> Description: 
     -> Unit: 
     -> DataType: Histogram
     -> AggregationTemporality: Delta
HistogramDataPoints #0
Data point attributes:
     -> metric_type: Str(timing)
StartTimestamp: 1970-01-01 00:00:00 +0000 UTC
Timestamp: 2024-07-30 21:37:07.830902 +0000 UTC
Count: 44
Sum: 999.000000
Min: 40.000000
Max: 245.000000
ExplicitBounds #0: 10.000000
ExplicitBounds #1: 20.000000
ExplicitBounds #2: 30.000000
ExplicitBounds #3: 40.000000
ExplicitBounds #4: 50.000000
ExplicitBounds #5: 60.000000
ExplicitBounds #6: 70.000000
ExplicitBounds #7: 80.000000
ExplicitBounds #8: 90.000000
ExplicitBounds #9: 100.000000
Buckets #0, Count: 0
Buckets #1, Count: 0
Buckets #2, Count: 0
Buckets #3, Count: 2
Buckets #4, Count: 5
Buckets #5, Count: 0
Buckets #6, Count: 3
Buckets #7, Count: 7
Buckets #8, Count: 2
Buckets #9, Count: 4
Buckets #10, Count: 21
        {"kind": "exporter", "data_type": "metrics", "name": "debug"}
```

### Testing

- Several unit tests have been created. We also tested by ingesting
and converting exponential histograms from the `statsdreceiver`, as well
as directly via the `otlpreceiver` over gRPC, for several hours with a
large amount of data.

- We have clients that have been running this solution in production for
a number of weeks.

### Readme description:

### convert_exponential_histogram_to_explicit_histogram

`convert_exponential_histogram_to_explicit_histogram(distribution, [ExplicitBounds])`

The `convert_exponential_histogram_to_explicit_histogram` function converts an
ExponentialHistogram to an explicit (_normal_) Histogram.

`distribution` is one of _upper_, _midpoint_, _uniform_, or _random_, as
listed under **Arguments** above.

`ExplicitBounds` is the list of bucket boundaries for the new
histogram. This argument is __required__ and __cannot be empty__.

__WARNING:__

The process of converting an ExponentialHistogram to an explicit
Histogram is not perfect and may result in a loss of precision. It is
important to define an appropriate set of bucket boundaries to minimize
this loss. For example, selecting boundaries that are too high or too
low may result in histogram buckets that are too wide or too narrow,
respectively.
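To make the precision loss concrete, here is a rough sketch of one plausible reading of the `midpoint` distribution: each exponential bucket's whole count is moved into the explicit bucket that contains the midpoint of its range. This is an assumption for illustration, not the processor's code, and `to_explicit_midpoint` is a hypothetical name. The input buckets are taken from the "Converts" example above.

```python
from bisect import bisect_left


def to_explicit_midpoint(exp_buckets, explicit_bounds):
    """Redistribute (lower, upper, count) buckets onto explicit bounds.

    Explicit bucket i covers (bounds[i-1], bounds[i]]; the final bucket is
    the overflow above the last bound, so len(bounds) + 1 counts are returned.
    """
    counts = [0] * (len(explicit_bounds) + 1)
    for lower, upper, count in exp_buckets:
        midpoint = (lower + upper) / 2
        counts[bisect_left(explicit_bounds, midpoint)] += count
    return counts


exp_buckets = [(32.0, 64.0, 10), (64.0, 128.0, 22), (128.0, 256.0, 12)]
fine_bounds = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0, 90.0, 100.0]
coarse_bounds = [100.0]

print(to_explicit_midpoint(exp_buckets, fine_bounds))
# [0, 0, 0, 0, 10, 0, 0, 0, 0, 22, 12]
print(to_explicit_midpoint(exp_buckets, coarse_bounds))
# [32, 12]
```

The total count (44) is preserved in both cases, but each source bucket collapses into a single target bucket, and with the coarse bounds two source buckets become indistinguishable. The counts differ from the "To" output above because that example uses `random`.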

---------

Co-authored-by: Kent Quirk <[email protected]>
Co-authored-by: Tyler Helmuth <[email protected]>
LucaLanziani pushed a commit that referenced this issue Nov 9, 2024
…etry#35544)

**Description:**

As described in
open-telemetry#35491,
it is useful to let users define
`receiver_creator` templates per container.

To that end, this PR introduces a new type of Endpoint called
`PodContainer` that matches the rule type `pod.container`. This Endpoint
is emitted for each container of the Pod, similar to how `Port`
Endpoints are emitted for each container that defines a port.

A complete example of how to use this feature to apply different parsing
to each of the Pod's containers is provided in the `How to test this
manually` section.
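As a rough model of the behaviour described above, the sketch below emits one endpoint per container and evaluates a rule of the shape `type == "pod.container" && pod.labels["otel.logs"] == "true" && container_name == "busybox"`. The class and function names are hypothetical stand-ins, not the observer's Go types; the endpoint ID shape (`<observer>/<pod uid>/<container name>`) follows the `endpoint_id` values in the manual-test logs of this commit message.

```python
from dataclasses import dataclass, field


@dataclass
class PodContainerEndpoint:
    """Hypothetical stand-in for a `pod.container` endpoint."""
    endpoint_id: str
    container_name: str
    pod_labels: dict = field(default_factory=dict)
    type: str = "pod.container"


def endpoints_for_pod(observer: str, pod_uid: str, labels: dict, containers: list) -> list:
    """Emit one endpoint per container of the Pod."""
    return [
        PodContainerEndpoint(f"{observer}/{pod_uid}/{name}", name, labels)
        for name in containers
    ]


def matches(endpoint: PodContainerEndpoint, container_name: str) -> bool:
    """Python rendering of the three-clause rule quoted in the lead-in."""
    return (
        endpoint.type == "pod.container"
        and endpoint.pod_labels.get("otel.logs") == "true"
        and endpoint.container_name == container_name
    )


endpoints = endpoints_for_pod(
    "k8s_observer",
    "01543800-cfea-4c10-8220-387e60f65151",
    {"otel.logs": "true"},
    ["lazybox", "busybox"],
)
for ep in endpoints:
    print(ep.endpoint_id, "-> busybox template:", matches(ep, "busybox"))
```

Because the rule includes `container_name`, each template starts exactly one receiver per matching container rather than one per Pod.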

**Link to tracking Issue:** <Issue number if applicable> Fixes
open-telemetry#35491

**Testing:** <Describe what testing was performed and which tests were
added.> TBA

**Documentation:** <Describe the documentation added.> TBA


### How to test this manually

1. Use the following values file to deploy the Collector's Helm chart
```yaml
mode: daemonset

image:
  repository: otelcontribcol-dev
  tag: "latest"
  pullPolicy: IfNotPresent

command:
  name: otelcontribcol

clusterRole:
  create: true
  rules:
   - apiGroups:
     - ''
     resources:
     - 'pods'
     - 'nodes'
     verbs:
     - 'get'
     - 'list'
     - 'watch'
   - apiGroups: [ "" ]
     resources: [ "nodes/proxy"]
     verbs: [ "get" ]
   - apiGroups:
       - ""
     resources:
       - nodes/stats
     verbs:
       - get
   - nonResourceURLs:
       - "/metrics"
     verbs:
       - get

extraVolumeMounts:
 - name: varlogpods
   mountPath: /var/log/pods
   readOnly: true

extraVolumes:
  - name: varlogpods
    hostPath:
      path: /var/log/pods

config:
  extensions:
    k8s_observer:
      auth_type: serviceAccount
      node: ${env:K8S_NODE_NAME}
      observe_nodes: true
  exporters:
    debug:
      verbosity: basic
  receivers:
    receiver_creator/logs:
      watch_observers: [ k8s_observer ]
      receivers:
        filelog/busybox:
          rule: type == "pod.container" && pod.labels["otel.logs"] == "true" && container_name == "busybox"
          config:
            include:
              - /var/log/pods/`pod.namespace`_`pod.name`_`pod.uid`/`container_name`/*.log
            include_file_name: false
            include_file_path: true
            operators:
              - id: container-parser
                type: container
              - type: add
                field: attributes.log.template
                value: busybox
        filelog/lazybox:
          rule: type == "pod.container" && pod.labels["otel.logs"] == "true" && container_name == "lazybox"
          config:
            include:
              - /var/log/pods/`pod.namespace`_`pod.name`_`pod.uid`/`container_name`/*.log
            include_file_name: false
            include_file_path: true
            operators:
              - id: container-parser
                type: container
              - type: add
                field: attributes.log.template
                value: lazybox
  service:
    extensions: [health_check, k8s_observer]
    pipelines:
      logs:
        receivers: [receiver_creator/logs]
        processors: [batch]
        exporters: [debug]
```
2. Follow the logs of the Collector's Pod, e.g. `kubectl logs -f daemonset-opentelemetry-collector-agent-2hrg5`
3. Deploy a sample DaemonSet whose Pod consists of 2 different containers:

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: daemonset-logs
  labels:
    app: daemonset-logs
spec:
  selector:
    matchLabels:
      app.kubernetes.io/component: migration-logger
      otel.logs: "true"
  template:
    metadata:
      labels:
        app.kubernetes.io/component: migration-logger
        otel.logs: "true"
    spec:
      tolerations:
        - key: node-role.kubernetes.io/master
          effect: NoSchedule
      containers:
        - name: lazybox
          image: busybox
          args:
            - /bin/sh
            - -c
            - while true; do echo "otel logs at $(date +%H:%M:%S)" && sleep 0.1s; done
        - name: busybox
          image: busybox
          args:
            - /bin/sh
            - -c
            - while true; do echo "otel logs at $(date +%H:%M:%S)" && sleep 0.1s; done
```

Verify in the logs that only 2 filelog receivers are started, one per
container:

```console
2024-10-02T12:05:17.506Z	info	[email protected]/observerhandler.go:96	starting receiver	{"kind": "receiver", "name": "receiver_creator/logs", "data_type": "logs", "name": "filelog/lazybox", "endpoint": "10.244.0.13", "endpoint_id": "k8s_observer/01543800-cfea-4c10-8220-387e60f65151/lazybox"}
2024-10-02T12:05:17.508Z	info	adapter/receiver.go:47	Starting stanza receiver	{"kind": "receiver", "name": "receiver_creator/logs", "data_type": "logs", "name": "filelog/lazybox/receiver_creator/logs{endpoint=\"10.244.0.13\"}/k8s_observer/01543800-cfea-4c10-8220-387e60f65151/lazybox"}
2024-10-02T12:05:17.508Z	info	[email protected]/observerhandler.go:96	starting receiver	{"kind": "receiver", "name": "receiver_creator/logs", "data_type": "logs", "name": "filelog/busybox", "endpoint": "10.244.0.13", "endpoint_id": "k8s_observer/01543800-cfea-4c10-8220-387e60f65151/busybox"}
2024-10-02T12:05:17.510Z	info	adapter/receiver.go:47	Starting stanza receiver	{"kind": "receiver", "name": "receiver_creator/logs", "data_type": "logs", "name": "filelog/busybox/receiver_creator/logs{endpoint=\"10.244.0.13\"}/k8s_observer/01543800-cfea-4c10-8220-387e60f65151/busybox"}
2024-10-02T12:05:17.709Z	info	fileconsumer/file.go:256	Started watching file	{"kind": "receiver", "name": "receiver_creator/logs", "data_type": "logs", "name": "filelog/lazybox/receiver_creator/logs{endpoint=\"10.244.0.13\"}/k8s_observer/01543800-cfea-4c10-8220-387e60f65151/lazybox", "component": "fileconsumer", "path": "/var/log/pods/default_daemonset-logs-sz4zk_01543800-cfea-4c10-8220-387e60f65151/lazybox/0.log"}
2024-10-02T12:05:17.712Z	info	fileconsumer/file.go:256	Started watching file	{"kind": "receiver", "name": "receiver_creator/logs", "data_type": "logs", "name": "filelog/busybox/receiver_creator/logs{endpoint=\"10.244.0.13\"}/k8s_observer/01543800-cfea-4c10-8220-387e60f65151/busybox", "component": "fileconsumer", "path": "/var/log/pods/default_daemonset-logs-sz4zk_01543800-cfea-4c10-8220-387e60f65151/busybox/0.log"}
```
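The watched paths in the log above are what the `include` template from the values file resolves to for each container. The sketch below imitates that expansion with a flat dictionary lookup for each backticked name; the real `receiver_creator` evaluates these expressions against the endpoint, so treat `expand` and the `env` dictionary as illustrative assumptions.

```python
import fnmatch
import re


def expand(template: str, env: dict) -> str:
    """Replace each backticked name with env[name]; unknown names raise KeyError."""
    return re.sub(r"`([^`]+)`", lambda m: env[m.group(1)], template)


template = "/var/log/pods/`pod.namespace`_`pod.name`_`pod.uid`/`container_name`/*.log"
env = {
    "pod.namespace": "default",
    "pod.name": "daemonset-logs-sz4zk",
    "pod.uid": "01543800-cfea-4c10-8220-387e60f65151",
    "container_name": "lazybox",
}

pattern = expand(template, env)
print(pattern)

# The path reported by "Started watching file" for the lazybox container.
watched = (
    "/var/log/pods/default_daemonset-logs-sz4zk_"
    "01543800-cfea-4c10-8220-387e60f65151/lazybox/0.log"
)
print(fnmatch.fnmatch(watched, pattern))
```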

In addition, verify that the proper attributes are added per container,
according to the 2 different filelog receiver definitions:


```console
2024-10-02T12:23:55.117Z	info	ResourceLog #0
Resource SchemaURL: 
Resource attributes:
     -> k8s.pod.name: Str(daemonset-logs-sz4zk)
     -> k8s.container.restart_count: Str(0)
     -> k8s.pod.uid: Str(01543800-cfea-4c10-8220-387e60f65151)
     -> k8s.container.name: Str(lazybox)
     -> k8s.namespace.name: Str(default)
     -> container.id: Str(63a8e69bdc6ee95ee7918baf913a548190f32838adeb0e6189a8210e05157b40)
     -> container.image.name: Str(busybox)
ScopeLogs #0
ScopeLogs SchemaURL: 
InstrumentationScope  
LogRecord #0
ObservedTimestamp: 2024-10-02 12:23:54.896772888 +0000 UTC
Timestamp: 2024-10-02 12:23:54.750904381 +0000 UTC
SeverityText: 
SeverityNumber: Unspecified(0)
Body: Str(otel logs at 12:23:54)
Attributes:
     -> log.iostream: Str(stdout)
     -> logtag: Str(F)
     -> log: Map({"template":"lazybox"})
     -> log.file.path: Str(/var/log/pods/default_daemonset-logs-sz4zk_01543800-cfea-4c10-8220-387e60f65151/lazybox/0.log)
Trace ID: 
Span ID: 
Flags: 0
ResourceLog #1
Resource SchemaURL: 
Resource attributes:
     -> k8s.container.restart_count: Str(0)
     -> k8s.pod.uid: Str(01543800-cfea-4c10-8220-387e60f65151)
     -> k8s.container.name: Str(busybox)
     -> k8s.namespace.name: Str(default)
     -> k8s.pod.name: Str(daemonset-logs-sz4zk)
     -> container.id: Str(47163758424f2bc5382b1e9702301be23cab368b590b5fbf0b30affa09b4a199)
     -> container.image.name: Str(busybox)
ScopeLogs #0
ScopeLogs SchemaURL: 
InstrumentationScope  
LogRecord #0
ObservedTimestamp: 2024-10-02 12:23:54.897788935 +0000 UTC
Timestamp: 2024-10-02 12:23:54.749885634 +0000 UTC
SeverityText: 
SeverityNumber: Unspecified(0)
Body: Str(otel logs at 12:23:54)
Attributes:
     -> log.file.path: Str(/var/log/pods/default_daemonset-logs-sz4zk_01543800-cfea-4c10-8220-387e60f65151/busybox/0.log)
     -> logtag: Str(F)
     -> log.iostream: Str(stdout)
     -> log: Map({"template":"busybox"})
Trace ID: 
Span ID: 
Flags: 0
```
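Note that the `add` operator's `field: attributes.log.template` appears in the output above as a nested `log: Map({"template": ...})` rather than as a flat `log.template` key. A small sketch of that behaviour, assuming the field path is split on dots and intermediate maps are created as needed (`set_field` is a hypothetical helper, not the stanza implementation):

```python
def set_field(entry: dict, field_path: str, value) -> None:
    """Set a dot-separated field path, creating nested maps along the way."""
    *parents, leaf = field_path.split(".")
    node = entry
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value


# Existing attributes already use literal dotted keys such as "log.iostream".
entry = {"attributes": {"log.iostream": "stdout", "logtag": "F"}}
set_field(entry, "attributes.log.template", "lazybox")
print(entry["attributes"])
# {'log.iostream': 'stdout', 'logtag': 'F', 'log': {'template': 'lazybox'}}
```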

Signed-off-by: ChrsMark <[email protected]>