Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Empty tag is being added after conversion from zipkin #2595

Closed
mzahor opened this issue Oct 28, 2020 · 4 comments · Fixed by #2596
Closed

Empty tag is being added after conversion from zipkin #2595

mzahor opened this issue Oct 28, 2020 · 4 comments · Fixed by #2596
Labels

Comments

@mzahor
Copy link
Contributor

mzahor commented Oct 28, 2020

Describe the bug
When receiving zipkin spans, a tag with empty tag name is being added.

To Reproduce
Steps to reproduce the behavior:

  1. Configure jaeger to use ElasticSearch as a storage backend. Make sure you set tags-as-fields: all: true.
  2. Send a zipkin span in json format. It not have a kind property (which happens for "internal" spans) and have a remoteEndpoint property != null. The's an example attached.
  3. As result, this span won't be saved in ElasticSearch, and you'll get an error in the logs, saying that "field name cannot be an empty string" (details below)

Expected behavior
Spans should be saved in ES, empty tags shouldn't be added.

I did some debugging, and seems like the issue is in spanV2ToThrift function, specifically here:
https://github.com/jaegertracing/jaeger/blob/8388432/cmd/collector/app/zipkin/jsonv2.go#L82
There's no default case in this switch in remoteEndpToThrift function:
https://github.com/jaegertracing/jaeger/blob/8388432/cmd/collector/app/zipkin/jsonv2.go#L108
So for "internal" spans it will be empty.
I'd suggest not to add tags with empty key name, maybe check it here: https://github.com/jaegertracing/jaeger/blob/8388432/cmd/collector/app/zipkin/jsonv2.go#L87

Version (please complete the following information):

  • OS: Linux (same on MacOS), issue isn't related to particular OS
  • Jaeger version: 1.20.0
  • Deployment: docker, bare-metal (doesn't really matter)

Additional context
Jaeger config:

es:
    server-urls: 'http://localhost:9200'
    use-aliases: true
    tags-as-fields:
        all: true

Sample zipkin span:

[
    {
      "timestamp": 1603846186206637,
      "duration": 948030,
      "traceId": "b67eec3d0a95860762b699bcca5b5b44",
      "id": "43ddcfd6bbb09064",
      "parentId": "17e9588ebf5e0c0d",
      "name": "get /alerts/definitions/",
      "localEndpoint": { "serviceName": "aspecto-alerts" },
      "remoteEndpoint": { "ipv6": "127.0.0.1" },
      "tags": {
        "express.runtime.layers": "{\"name\":\"root\",\"stack\":[{\"path\":\"/\",\"params\":{},\"name\":\"urlencodedParser\"},{\"path\":\"/\",\"params\":{},\"name\":\"jsonParser\"},{\"path\":\"/\",\"params\":{},\"name\":\"corsMiddleware\"},{\"path\":\"/alerts/definitions/\",\"params\":{},\"name\":\"bound dispatch\",\"stack\":[{\"name\":\"authMiddleware\",\"method\":\"get\"},{\"name\":\"asyncUtilWrap\",\"method\":\"get\"}]}]}",
        "http.flavor": "1.1",
        "http.host": "host.docker.internal:8083",
        "http.method": "GET",
        "http.path": "/alerts/definitions",
        "http.request.body": "",
        "http.scheme": "http",
        "http.status_code": "200",
        "http.target": "/alerts/definitions",
        "otlp.instrumentation.library.name": "@aspecto/opentelemetry-plugin-express",
        "otlp.instrumentation.library.version": "0.0.73",
        "span.kind": "internal",
        "status.code": "STATUS_CODE_OK",
        "telemetry.sdk.language": "nodejs"
      }
    }
  ]

Jaeger logs when trying to process such span:

{
   "msg"  :  "Elasticsearch part of bulk request failed" ,
   "caller"  :  "config/config.go:137" ,
   "map-key"  :  "index" ,
   "stacktrace"  :  "github.com/jaegertracing/jaeger/pkg/es/config.(*Configuration).NewClient.func2
github.com/jaegertracing/jaeger/pkg/es/config/config.go:137
github.com/olivere/elastic.(*bulkWorker).commit
github.com/olivere/[email protected]+incompatible/bulk_processor.go:588
github.com/olivere/elastic.(*bulkWorker).work
github.com/olivere/[email protected]+incompatible/bulk_processor.go:501" ,
   "level"  :  "error" ,
   "response"  : {
     "_index"  :  "jaeger-span-000027" ,
     "_type"  :  "_doc" ,
     "_id"  :  "qnXwanUBvSeIyqwub0QY" ,
     "status"  :  400 ,
     "error"  : {
       "type"  :  "mapper_parsing_exception" ,
       "reason"  :  "failed to parse" ,
       "caused_by"  : {
         "reason"  :  "field name cannot be an empty string" ,
         "type"  :  "illegal_argument_exception" 
      }
    }
  },
   "ts"  :  1603816943.3957188 
}
@yurishkuro
Copy link
Member

If this is internal span, why does it specify remote endpoint?

@yurishkuro
Copy link
Member

yurishkuro commented Oct 28, 2020

the simpler fix is to add default: return nil, nil to that switch statement

switch kind {
case models.SpanKindCLIENT:
key = zipkincore.SERVER_ADDR
case models.SpanKindSERVER:
key = zipkincore.CLIENT_ADDR
case models.SpanKindCONSUMER, models.SpanKindPRODUCER:
key = zipkincore.MESSAGE_ADDR
}

and handle nil in the caller

rAddrAnno, err := remoteEndpToThrift(s.RemoteEndpoint, s.Kind)
if err != nil {
return nil, err
}
tSpan.BinaryAnnotations = append(tSpan.BinaryAnnotations, rAddrAnno)

@mzahor
Copy link
Contributor Author

mzahor commented Oct 28, 2020

We're using opentelemetry collector with zipkin exporter, it gets spans from opentelemetry-js SDK. I think there can also be an issue in one of them (collector or js sdk). But in any case it worth fixing it here too. Took me quite some time to understand what's going on.
Regarding the fix (returning nil, nil), that's smth I imagined. I can create a PR, if you think it would be a good fix.

@yurishkuro
Copy link
Member

it's worth fixing here, a switch without default clause is not the most robust approach anyway.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants