Allow for kafka emitter producer secrets to be masked in logs #15485

Conversation

@TSFenwick TSFenwick commented Dec 4, 2023

This change allows Kafka producer config values that should be secrets to be kept out of the logs, which improves security for people who use the Kafka emitter and choose to enable it. The feature is opt-in and does not affect existing configs for this emitter.

Fixes an issue where Kafka producer configs used to secure the producer's connection to the Kafka topic were logged in plaintext.

Description

Adds a new config option to the Kafka emitter config that leverages DynamicConfigProvider.java to mask a map of configs for the Kafka producer.

If the same key appears in both the secrets config and the unmasked config, the value from the secrets config is the one used.

For example, when leveraging MapStringDynamicConfigProvider.java, the runtime.properties for the config can look like this:

druid.emitter.kafka.producer.config={"sasl.mechanism": "PLAIN", "security.protocol": "SASL_SSL"}
druid.emitter.kafka.producer.hiddenProperties={"config":{"sasl.jaas.config": "org.apache.kafka.common.security.plain.PlainLoginModule required username=\\"KV...NI\\" password=\\"gA3...n6a/\\";"}}
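
For reference, a minimal sketch of the merge semantics described above, assuming the DynamicConfigProvider interface's getConfig() method; the class and method names here are hypothetical and this is not the exact merged code:

import java.util.HashMap;
import java.util.Map;
import org.apache.druid.metadata.DynamicConfigProvider;

public class ProducerConfigMergeSketch
{
  // The plain config is applied first, then the resolved secrets, so a key present in both
  // ends up with the value from the secrets provider.
  public static Map<String, String> mergedProducerProperties(
      Map<String, String> kafkaProducerConfig,            // druid.emitter.kafka.producer.config
      DynamicConfigProvider<String> kafkaProducerSecrets  // druid.emitter.kafka.producer.hiddenProperties
  )
  {
    Map<String, String> props = new HashMap<>();
    if (kafkaProducerConfig != null) {
      props.putAll(kafkaProducerConfig);
    }
    if (kafkaProducerSecrets != null) {
      props.putAll(kafkaProducerSecrets.getConfig());
    }
    return props;
  }
}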

Also made a minor change to DynamicConfigProviderUtils to use putAll instead of iterating over the map and putting entries one at a time.
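
A small self-contained sketch of that kind of change, with placeholder names rather than the actual DynamicConfigProviderUtils code:

import java.util.HashMap;
import java.util.Map;

public class PutAllSketch
{
  // Old style: copy the provider-supplied entries one at a time.
  static Map<String, String> copyWithLoop(Map<String, String> source)
  {
    Map<String, String> target = new HashMap<>();
    for (Map.Entry<String, String> entry : source.entrySet()) {
      target.put(entry.getKey(), entry.getValue());
    }
    return target;
  }

  // New style: same result with a single bulk call.
  static Map<String, String> copyWithPutAll(Map<String, String> source)
  {
    Map<String, String> target = new HashMap<>();
    target.putAll(source);
    return target;
  }
}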

How the properties now look in the logs:

2023-12-05T15:58:28,661 INFO [main] org.apache.druid.cli.CliPeon - * druid.emitter: kafka
2023-12-05T15:58:28,661 INFO [main] org.apache.druid.cli.CliPeon - * druid.emitter.kafka.bootstrap.servers: localhost:9092
2023-12-05T15:58:28,662 INFO [main] org.apache.druid.cli.CliPeon - * druid.emitter.kafka.event.types: ["metrics"]
2023-12-05T15:58:28,662 INFO [main] org.apache.druid.cli.CliPeon - * druid.emitter.kafka.metric.topic: test_12_4_23
2023-12-05T15:58:28,662 INFO [main] org.apache.druid.cli.CliPeon - * druid.emitter.kafka.producer.config: {"sasl.mechanism": "PLAIN", "security.protocol": "SASL_SSL"}
2023-12-05T15:58:28,662 INFO [main] org.apache.druid.cli.CliPeon - * druid.emitter.kafka.producer.hiddenProperties: <masked>

Three test cases:
  • Valid config with the secret value in the secrets config, so it is masked. Worked fine.
  • Secrets left in the normal producer config and no line for the secrets config. Worked fine.
  • Secrets left in the normal producer config and an empty mapString for the secrets config. Worked fine.

Release note


Key changed/added classes in this PR
  • KafkaEmitterConfig
  • KafkaEmitter

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever it would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

@@ -83,7 +87,8 @@ public KafkaEmitterConfig(
      @Nullable @JsonProperty("request.topic") String requestTopic,
      @Nullable @JsonProperty("segmentMetadata.topic") String segmentMetadataTopic,
      @JsonProperty("clusterName") String clusterName,
-     @JsonProperty("producer.config") @Nullable Map<String, String> kafkaProducerConfig
+     @JsonProperty("producer.config") @Nullable Map<String, String> kafkaProducerConfig,
+     @JsonProperty("producer.secrets.config") @Nullable DynamicConfigProvider<String> kafkaProducerSecrets
@TSFenwick (author) commented on this diff:

property name could use some work imo

A reviewer (Contributor) replied:

Related configs that currently exist are druid.startup.logging.maskProperties and druid.server.hiddenProperties. Maybe something like producer.hiddenProperties?

@TSFenwick TSFenwick marked this pull request as ready for review December 5, 2023 16:07
@abhishekrb19

Thanks, @TSFenwick! Masking sensitive values makes sense to me. I think we can mask the sensitive keys/values directly in druid.emitter.kafka.producer.config instead of splitting up the config into secrets and non-secret configs. To do that, I think:

  • We can extend the masking logic in getMaskedCommand to also account for parsing value strings, i.e., splits[1] (as with sasl.jaas.config). To keep it simple, we can consider only values of type Map for now.
  • Then, recurse through the values in the map to see if a property can be masked.

This would more generally help mask any sensitive properties in the config. What do you think?
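
A rough, purely illustrative sketch of that suggestion (the key patterns and names below are hypothetical, and this is not what was ultimately merged):

import java.util.Map;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class RecursiveMaskSketch
{
  private static final String MASK = "<masked>";
  // Hypothetical set of key patterns to treat as sensitive.
  private static final Pattern SENSITIVE_KEY = Pattern.compile("(?i)password|sasl\\.jaas\\.config");

  @SuppressWarnings("unchecked")
  static Object mask(String key, Object value)
  {
    if (SENSITIVE_KEY.matcher(key).find()) {
      return MASK;
    }
    if (value instanceof Map) {
      // Recurse into map-typed values so nested entries such as sasl.jaas.config get masked too.
      return ((Map<String, Object>) value).entrySet().stream()
          .collect(Collectors.toMap(Map.Entry::getKey, e -> mask(e.getKey(), e.getValue())));
    }
    return value;
  }
}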

@TSFenwick TSFenwick commented Dec 7, 2023

@abhishekrb19

We can extend the masking logic in getMaskedCommand to also account for parsing value strings, i.e., splits[1] (as with sasl.jaas.config). To keep it simple, we can consider only values of type Map for now.

I like this idea, but I think it becomes more complicated once you need to handle generic cases, like how far down a map we should recurse. Also, the less handwritten parsing code I write, the better for everyone :) The other question is that defining the values to be masked in core Druid code probably isn't the best approach. Using the dynamic config provider means it can mask any and all values stored in the config provider's map.

I think the solution I have implemented in this PR suffices and moves the needle forward in protecting secrets. This change also follows the practice Druid uses elsewhere to secure maps of values, such as in the Kafka consumer: https://github.com/apache/druid/blob/master/extensions-core/kafka-indexing-service/src/main/java/org/apache/druid/indexing/kafka/KafkaRecordSupplier.java#L263

That being said, it would be an improvement to mask only the values of the map instead of the entire map, so it's easier to debug if an entry was missed entirely.
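
As a hypothetical illustration of that improvement, the log line could keep the key names and mask only the values, e.g. sasl.jaas.config=<masked>, instead of replacing the whole map:

import java.util.Map;
import java.util.TreeMap;

public class MaskValuesOnlySketch
{
  // Keep the keys visible so operators can confirm which properties were supplied,
  // while every value is replaced with a placeholder.
  static Map<String, String> maskValues(Map<String, String> secrets)
  {
    Map<String, String> loggable = new TreeMap<>();
    for (String key : secrets.keySet()) {
      loggable.put(key, "<masked>");
    }
    return loggable;
  }
}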

@abhishekrb19 abhishekrb19 left a comment

This approach is reasonable. I left a couple of tangential ideas as suggestions for plumbing in credentials that we can explore separately. In the meantime, let's keep the added property undocumented. Thanks, @TSFenwick!

@abhishekrb19 abhishekrb19 merged commit 901ebbb into apache:master Dec 15, 2023
83 checks passed
@TSFenwick TSFenwick deleted the feature-allow-for-masked-producer-configs-for-kafka-emitter branch January 11, 2024 22:34
@LakshSingla LakshSingla added this to the 29.0.0 milestone Jan 29, 2024