Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch async workflow request encoding from json to thrift #5907

Conversation

taylanisikdemir
Copy link
Member

@taylanisikdemir taylanisikdemir commented Apr 11, 2024

What changed?
While testing async workflows in dev environment I noticed the workflow input bytes are not carried over to the final sync call. It's because request object is json marshaled by async api handlers before sent to the queue. Certain fields in our request payloads such as this one are excluded in json output because they may contain PII and can get logged.

Request objects implement FilteredRequestBody interface which exposes SerializeForLogging() function to authorizers (example) and authorizer implementation can choose to log response of SerializeForLogging(). This is used in uber internally for audit purposes.

To avoid breaking that existing "PII hiding from logs of jsonified requests" flow, I am switching encoding of async workflow requests to thrift.

Also updating the cherry pick logic of headers by carrying over all original headers as is to avoid follow up work to support context/trace propagation work for async APIs.

Why?
Carry over request and headers of async call to sync call as is.

How did you test it?
Unit tests

@coveralls
Copy link

coveralls commented Apr 12, 2024

Pull Request Test Coverage Report for Build 018ed3ca-9639-46a5-bb67-485c3ea5d103

Details

  • 21 of 23 (91.3%) changed or added relevant lines in 2 files are covered.
  • 66 unchanged lines in 10 files lost coverage.
  • Overall coverage decreased (-0.02%) to 67.357%

Changes Missing Coverage Covered Lines Changed/Added Lines %
service/frontend/api/handler.go 8 10 80.0%
Files with Coverage Reduction New Missed Lines %
service/history/handler/handler.go 2 52.41%
service/matching/taskListManager.go 2 80.2%
service/history/queue/timer_queue_processor_base.go 3 77.82%
common/persistence/statsComputer.go 3 96.07%
service/history/task/fetcher.go 5 85.57%
service/history/task/transfer_standby_task_executor.go 8 87.01%
service/history/task/cross_cluster_task_processor.go 8 80.79%
service/frontend/api/handler.go 8 61.98%
service/history/engine/engineimpl/historyEngine.go 9 68.34%
service/history/execution/mutable_state_task_refresher.go 18 63.29%
Totals Coverage Status
Change from base Build 018ed376-cbc0-45ff-ac2f-730a1871d338: -0.02%
Covered Lines: 98459
Relevant Lines: 146174

💛 - Coveralls

Copy link

codecov bot commented Apr 12, 2024

Codecov Report

Merging #5907 (5eb0e66) into master (b258e62) will increase coverage by 0.00%.
The diff coverage is 82.60%.

❗ Current head 5eb0e66 differs from pull request most recent head 2cc4d24. Consider uploading reports for the commit 2cc4d24 to get more accurate results

Additional details and impacted files
Files Coverage Δ
...n/asyncworkflow/queue/consumer/default_consumer.go 92.51% <100.00%> (+0.10%) ⬆️
common/codec/version0_thriftrw.go 73.91% <ø> (ø)
...r/asyncworkflow/async_workflow_consumer_manager.go 91.26% <ø> (ø)
service/frontend/api/handler.go 38.28% <60.00%> (-0.18%) ⬇️

... and 5 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b258e62...2cc4d24. Read the comment docs.

@Groxx
Copy link
Member

Groxx commented Apr 12, 2024

Ah. Yes, we should not be relying on json except for explicit opt-in types for that exact purpose :| It's WILDLY error-prone for reasons like this.

That said, why not use thrift or proto? We already have these types, and they have well-understood behavior, known versioning/etc behavior, and are far more compact. Gob's sorta terrible for many purposes.

@taylanisikdemir taylanisikdemir force-pushed the taylan/request_serialization branch from 5d3c740 to 6da0680 Compare April 12, 2024 17:52
@taylanisikdemir taylanisikdemir changed the title Switch async workflow request encoding from json to gob Switch async workflow request encoding from json to thrift Apr 12, 2024
@taylanisikdemir
Copy link
Member Author

Ah. Yes, we should not be relying on json except for explicit opt-in types for that exact purpose :| It's WILDLY error-prone for reasons like this.

That said, why not use thrift or proto? We already have these types, and they have well-understood behavior, known versioning/etc behavior, and are far more compact. Gob's sorta terrible for many purposes.

Switched to thrift as it the most commonly used format in the codebase for similar purposes.

@taylanisikdemir taylanisikdemir force-pushed the taylan/request_serialization branch from 6da0680 to 5eb0e66 Compare April 12, 2024 19:01
@taylanisikdemir taylanisikdemir enabled auto-merge (squash) April 12, 2024 19:01
@taylanisikdemir taylanisikdemir force-pushed the taylan/request_serialization branch from 5eb0e66 to 2cc4d24 Compare April 12, 2024 19:30
@taylanisikdemir taylanisikdemir merged commit ba39678 into cadence-workflow:master Apr 12, 2024
18 checks passed
@taylanisikdemir taylanisikdemir deleted the taylan/request_serialization branch April 12, 2024 20:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants