Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixing matching:TestCheckIdleTaskList test flackiness #5494

Merged
merged 1 commit into from
Dec 19, 2023

Conversation

dkrotx
Copy link
Member

@dkrotx dkrotx commented Dec 18, 2023

For some reason it almost always fails in IDE (4/5 cases, goland),
but passes in console. At the same time, the test clearly had issues:

  • it's been 3 subtests in one
  • only the first subtest checked task-list became stopped when unused
  • the third subtest (the one which has been failing) never succeeded
    with tlm.AddTask since writer never started - instead of tlm.Start()
    we used special version which supposed to mimic it, but never started
    the writer.

The test is still problematic since it continues using time.Sleep all
around, but so do all the other tests for matching :-(

More comments and error-messages also bring more clarity to what the
test is actually for.

fix test flackiness in IDE

To be able to run tests in GoLand

make test + running all tests in GoLand

No risks, it is only the test been improved

Release notes

Documentation Changes

// mimic tlm.Start() but avoid calling notifyEvent
tlm.liveness.Start()
tlm.startWG.Done()
go tlm.taskReader.dispatchBufferedTasks(defaultTaskBufferIsolationGroup)
Copy link
Member

@davidporter-id-au davidporter-id-au Dec 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, small brain following along here, are we ensuring these daemons are started in the new test?

I 100% agree these tests are flakey and need some refactoring, but I'd also hate for us to lose testing of functionality, so it might just take me a minute to ensure they're equivalent

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked at the code - these daemons are starting among with tlm.taskReader.Start():

for g := range tr.taskBuffers {
	go tr.dispatchBufferedTasks(g)
}
go tr.getTasksPump()

I think the previous approach where we started selected things bypassing the tlm methods actually a bad approach. We were trying to avoid notifyEvent (which is, I believe, no longer happening in .Start), but also missed some important changes like starting the writer.

I think it is really important for us to cover the happy-paths as well. Otherwise tests like TestAddTaskStandby pass unintentionally: it assumed we had to stop the writer though it's never been started!

For some reason it almost always fails in IDE (4/5 cases, goland),
but passes in console. At the same time, the test clearly had issues:
 - it's been 3 subtests in one
 - only the first subtest checked task-list became stopped when unused
 - the third subtest (the one which has been failing) never succeeded
   with `tlm.AddTask` since writer never started - instead of tlm.Start()
   we used special version which supposed to mimic it, but never started
   the writer.

The test is still problematic since it continues using time.Sleep all
around, but so do all the other tests for matching :-(

More comments and error-messages also bring more clarity to what the
test is actually for.
@dkrotx dkrotx force-pushed the fix-matching-test-flakiness branch from c7d4ed1 to efa4696 Compare December 19, 2023 11:29
@dkrotx dkrotx merged commit 2829cbb into cadence-workflow:master Dec 19, 2023
16 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants