Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Align Postgres workflow_id Column with Cassandra #6520

Conversation

samuel-lindgren
Copy link
Contributor

@samuel-lindgren samuel-lindgren commented Nov 24, 2024

What changed?
The workflow_id column in the PostgreSQL schema has been updated from VARCHAR(255) to TEXT, removing the 255-character limit and aligning it with Cassandra’s schema, which allows longer workflow identifiers.

Why?
The 255-character limit restricted how much context we could include in workflow identifiers, which was limiting in certain cases. Updating the column to TEXT aligns PostgreSQL with Cassandra (reference) and allows higher-level enforcement of length restrictions if needed.

How did you test it?

  • POSTGRES=1 make test has been run successfully with some minor adjustments.
  • I have also tested it in my own development environment with long ids (exceeding 255) and it works fine.

Potential risks

  • Longer workflow identifiers could increase database row sizes, potentially impacting performance for workflows with large identifiers.
  • Applications relying on the 255-character limit might need adjustments to enforce length restrictions at a higher level.

Release notes

  • Schema update: workflow_id column in PostgreSQL now uses TEXT instead of VARCHAR(255).
  • Users relying on the 255-character limit should enforce it at the application level if needed.

Documentation Changes
Not sure if documentation is needed, since these changes will make the PostgreSQL schema more aligned with the Cassandra schema.

Detailed Description
These changes relax the constraints on the workflow_id columns and should not cause issues during upgrades.

Impact Analysis

  • Backward Compatibility: These changes only alter the column constraints, not the data. Downgrades should be fine as long as workflow_id values remain within the previous 255-character limit. If any exceed this limit, users would have already encountered errors in PostgreSQL.
  • Forward Compatibility: Forward compatibility is unaffected since this only relaxes constraints. Users with very long workflow_id values may need to monitor memory usage, storage, and performance.

Testing Plan

  • Unit Tests: Unit tests likely do not cover PostgreSQL schema changes.
  • Persistence Tests: No new tables are added, so current persistence tests should suffice.
  • Integration Tests: Existing tests that create and start workflows should already cover this. Additional tests for workflow_id values longer than 255 characters could be added but may not be necessary.
  • Compatibility Tests: Compatibility should be fine since the data remains unchanged. However, if users upgrade and use longer workflow_id values, downgrades will require truncating those IDs manually. This scenario is rare, and users can handle it themselves if needed.

Rollout Plan
What is the rollout plan?
Rollout as usual.
Does the order of deployment matter?
No, even if cadence_visibility still uses the old schema, it should still work since old workflow_ids should be covered by the old column type VARCHAR(255)
Is it safe to rollback?
Should be safe to rollback, unless workflow_id configuration has been changed, resulting in ids exceeding 255 character limit.
Does the order of rollback matter?
Not to my knowledge.
Is there a kill switch to mitigate the impact immediately?
Should not be necessary.

@samuel-lindgren samuel-lindgren changed the title change workflow_id from varchar 255 to text in postgresql Align Postgres workflow_id Column with Cassandra Nov 24, 2024
@samuel-lindgren samuel-lindgren force-pushed the postgres-increase-worklow-id-length branch 2 times, most recently from 62d6382 to 9fe00aa Compare November 26, 2024 09:20
@samuel-lindgren samuel-lindgren force-pushed the postgres-increase-worklow-id-length branch from 9fe00aa to 127edea Compare November 26, 2024 09:22
@@ -467,11 +467,11 @@ func TestListBatchJobs(t *testing.T) {
expectedOutput: []map[string]string{
{
"jobID": "example-workflow-1",
"startTime": "1970-01-01T00:00:01Z",
"startTime": "1970-01-01T01:00:01+01:00",
Copy link
Contributor Author

@samuel-lindgren samuel-lindgren Nov 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note, this is just a temporary patch to ensure POSTGRES=1 make test works locally for me and will likely be excluded in the final changes. I will just keep this change temporarily, only to show what worked for me. Furthermore, this has nothing to do with the other changes in this PR, as I have verified that the issue persist in the master branch.

Copy link
Member

@natemort natemort left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM once schema files are updated, thanks for your work on this!

@@ -0,0 +1,19 @@
ALTER TABLE executions ALTER workflow_id TYPE text;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also update cadence/schema.sql and visibility/schema.sql to reflect these changes? During first time setup we run only the schema files, the versions are used only when an existing cluster is upgraded. The schema files should match the outcome of running all the version upgrades.

@natemort natemort merged commit e5e553f into cadence-workflow:master Dec 5, 2024
18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants