It's expensive to re-run integration tests #2243

Closed
Tracked by #2774
squaremo opened this issue Nov 22, 2022 · 2 comments · Fixed by #2831
Labels: area/tests, impact/quality, kind/engineering (work that is not visible to an external user), resolution/fixed (this issue was fixed)

Comments

@squaremo
Contributor

Because of how the integration tests are run (by the build workflow, among others), a failed set of tests for one language cannot be rerun by itself. This wastes resources but, perhaps more importantly, lengthens turnaround time and (in the presence of flaky tests) greatly increases the chance of repeated failures.

@EronWright
Contributor

EronWright commented Jan 11, 2024

Possible improvements may include:

  • use a local cluster rather than a cloud cluster (though some tests may require the latter)
  • CI refactoring, e.g. finer-grained workflow steps and/or the ability to re-run a specific suite

@rquitales added this to the 0.100 milestone Jan 30, 2024
rquitales added a commit that referenced this issue Mar 12, 2024
### Proposed changes

This PR switches the underlying test infrastructure to provision KinD
clusters for PR checks. This will drastically speed up the feedback loop
from the current ~45 mins to 25 mins. Furthermore, this also pushes the
cluster creation step into the test job to enable re-running only the
failed job should a test flake occur.

## Changes made:

- Updated the GHA workflow to run PR checks in KinD clusters to shorten
  the feedback loop
- Note: post-submit checks still run against a GKE cluster, as some test
  cases don't pass in KinD
- Updated existing tests to either work with KinD if they are simple, or
  to be skipped if the test needs a full cloud k8s cluster
- Broke apart the Golang tests to run concurrently, as GH Action workers
  now do not OOM
- Fixed some hard-coded test logic that breaks on newer k8s versions
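
The skip-or-run decision described in the list above could be sketched roughly as follows. This is only an illustration, not the logic from the PR: the environment variable name `TEST_CLUSTER_TYPE`, its values, and the helpers `should_skip` and `plan` are all hypothetical, and the real tests are written in Go rather than Python.

```python
import os

# Hypothetical variable name, chosen for illustration only.
CLUSTER_TYPE_ENV = "TEST_CLUSTER_TYPE"


def should_skip(requires_cloud_cluster, env=None):
    """Return True when a test that needs a full cloud cluster would run on KinD."""
    env = os.environ if env is None else env
    # Assumption: an unset variable means a cloud (e.g. GKE) cluster.
    cluster_type = env.get(CLUSTER_TYPE_ENV, "cloud").strip().lower()
    return requires_cloud_cluster and cluster_type == "kind"


def plan(tests, env=None):
    """Split a {test name: requires_cloud_cluster} mapping into (run, skipped) lists."""
    run, skipped = [], []
    for name, requires_cloud in tests.items():
        (skipped if should_skip(requires_cloud, env) else run).append(name)
    return run, skipped
```

Under these assumptions, PR checks would set the variable to `kind` and skip only the tests flagged as needing a cloud cluster, while post-submit runs would leave it unset and run everything.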

### Related issues (optional)

Fixes: #2243
@pulumi-bot added the resolution/fixed label Mar 12, 2024
rquitales added a commit that referenced this issue Mar 13, 2024
(cherry picked from commit 101fcfb)
@EronWright
Contributor

Addressed by switching to KinD: #2831

5 participants