Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add create_ignore_already_exists to workload_identity #2142

Conversation

flozzone
Copy link
Contributor

Adds create_ignore_already_exists to workload-identity module to ignore an already existing google-service-account when hitting the Provider produced inconsistent result after apply issue.

Fixes #2141

@flozzone flozzone requested review from ericyz, gtsorbo and a team as code owners October 17, 2024 06:44
Copy link

google-cla bot commented Oct 17, 2024

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@flozzone flozzone force-pushed the bugfix/2141-create_ignore_already_exists branch 2 times, most recently from c018f10 to 04efb9c Compare October 17, 2024 06:51
@apeabody
Copy link
Collaborator

/gcbrun

Copy link
Collaborator

@apeabody apeabody left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution @flozzone!

From the linter:

Running terraform fmt
modules/workload-identity/main.tf
--- old/modules/workload-identity/main.tf
+++ new/modules/workload-identity/main.tf
@@ -42,11 +42,11 @@
 resource "google_service_account" "cluster_service_account" {
   count = var.use_existing_gcp_sa ? 0 : 1
 
-  account_id                    = local.gcp_given_name
-  display_name                  = coalesce(var.gcp_sa_display_name, substr("GCP SA bound to K8S SA ${local.k8s_sa_project_id}[${local.k8s_given_name}]", 0, 100))
-  description                   = var.gcp_sa_description
-  project                       = var.project_id
-  create_ignore_already_exists  = var.create_ignore_already_exists
+  account_id                   = local.gcp_given_name
+  display_name                 = coalesce(var.gcp_sa_display_name, substr("GCP SA bound to K8S SA ${local.k8s_sa_project_id}[${local.k8s_given_name}]", 0, 100))
+  description                  = var.gcp_sa_description
+  project                      = var.project_id
+  create_ignore_already_exists = var.create_ignore_already_exists
 }
 
 resource "kubernetes_service_account" "main" {
Error: terraform fmt failed with exit code 3
Check the output for diffs and correct using terraform fmt <dir>

@flozzone flozzone force-pushed the bugfix/2141-create_ignore_already_exists branch 2 times, most recently from 2cde06f to 791a97c Compare October 18, 2024 04:47
@flozzone flozzone requested a review from apeabody October 21, 2024 09:05
@flozzone flozzone force-pushed the bugfix/2141-create_ignore_already_exists branch from 791a97c to c8b2903 Compare October 22, 2024 10:30
@apeabody
Copy link
Collaborator

/gcbrun

@flozzone flozzone force-pushed the bugfix/2141-create_ignore_already_exists branch from c8b2903 to 203a0f6 Compare October 23, 2024 06:48
@flozzone
Copy link
Contributor Author

Hi @apeabody, I've corrected linting errors but now the build is failing and I cannot see any logs at https://github.com/terraform-google-modules/terraform-google-kubernetes-engine/pull/2142/checks?check_run_id=31930708825 and I rebased now to see if that caused the issue.

@apeabody
Copy link
Collaborator

/gcbrun

@apeabody
Copy link
Collaborator

Hi @apeabody, I've corrected linting errors but now the build is failing and I cannot see any logs at https://github.com/terraform-google-modules/terraform-google-kubernetes-engine/pull/2142/checks?check_run_id=31930708825 and I rebased now to see if that caused the issue.

I've triggered the test, but we've been having unrelated intermittent test failures (timeout) that we are also working to resolve.

@apeabody
Copy link
Collaborator

/gcbrun

@apeabody apeabody self-assigned this Oct 23, 2024
@apeabody
Copy link
Collaborator

/gcbrun

@apeabody
Copy link
Collaborator

/gcbrun

Copy link
Collaborator

@apeabody apeabody left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution @flozzone!

A few small notes before we merge.

modules/workload-identity/variables.tf Outdated Show resolved Hide resolved
modules/workload-identity/variables.tf Outdated Show resolved Hide resolved
@flozzone flozzone requested a review from apeabody October 25, 2024 16:12
@apeabody
Copy link
Collaborator

/gcbrun

@apeabody apeabody changed the title fix: Add create_ignore_already_exists to workload_identity (#2141) feat: Add create_ignore_already_exists to workload_identity (#2141) Oct 25, 2024
@apeabody apeabody changed the title feat: Add create_ignore_already_exists to workload_identity (#2141) feat: Add create_ignore_already_exists to workload_identity Oct 25, 2024
@apeabody apeabody merged commit 76d779c into terraform-google-modules:master Oct 25, 2024
4 checks passed
@flozzone flozzone deleted the bugfix/2141-create_ignore_already_exists branch October 28, 2024 09:47
@inverse-panda
Copy link

I thought the workflow is to use use_existing_gcp_sa to basically use an exising google service account.
Now I got the this error in terraform cloud
image

Is there something I need to do as a user of this module? Below you can see my terraform object

module "dataray_mlflow_stage_indentity" {
  source       = "terraform-google-modules/kubernetes-engine/google//modules/workload-identity"
  name         = "dataray-mlflow-stage"
  namespace    = "stage"
  cluster_name = var.cluster_name
  location     = var.zone
  project_id   = var.project
  roles        = ["roles/storage.admin", "roles/storage.objectCreator", "roles/cloudsql.client", "roles/bigquery.dataOwner", "roles/bigquery.user"]
}

@apeabody
Copy link
Collaborator

I thought the workflow is to use use_existing_gcp_sa to basically use an exising google service account. Now I got the this error in terraform cloud image

Is there something I need to do as a user of this module? Below you can see my terraform object

module "dataray_mlflow_stage_indentity" {
  source       = "terraform-google-modules/kubernetes-engine/google//modules/workload-identity"
  name         = "dataray-mlflow-stage"
  namespace    = "stage"
  cluster_name = var.cluster_name
  location     = var.zone
  project_id   = var.project
  roles        = ["roles/storage.admin", "roles/storage.objectCreator", "roles/cloudsql.client", "roles/bigquery.dataOwner", "roles/bigquery.user"]
}

Hi @inverse-panda - Can you share the version of the provider (terraform version) and module you are using? The output of terraform plan mightal also be useful.

I would also strongly recommend specifying the major version of the module such as:

module "workload_identity_existing_ksa" {
  source  = "terraform-google-modules/kubernetes-engine/google//modules/workload-identity"
  version = "~> 34.0"

@flozzone
Copy link
Contributor Author

flozzone commented Oct 31, 2024

@inverse-panda I get the point, it might be a bit misleading now. But gcp_sa_create_ignore_already_exists is used in case Terraform fails to create the SA due to GCP IAM eventual consistency principle. See hashicorp/terraform-provider-google#18087

And today I found out that recent versions of the Google Provider have fixed that problem on provider end https://github.com/hashicorp/terraform-provider-google/releases/tag/v6.7.0 with hashicorp/terraform-provider-google#19727

@flozzone
Copy link
Contributor Author

@apeabody It seems I should have checked provider version compatibility and bump the required version

@apeabody
Copy link
Collaborator

@apeabody It seems I should have checked provider version compatibility and bump the required version

Hi @flozzone v34 was specifically released to still support TPGv5. The upcoming V35 will be TPGv6 only and likely require at least TPG v6.7+. Do you think we should revert this addition in v35?

@flozzone
Copy link
Contributor Author

flozzone commented Nov 1, 2024

Hi @apeabody, I see create_ignore_already_exists has been introduced in TPGv5.12 with #16927, so TPGv5 already supports it. So I think @inverse-panda issue is caused by TPG minimum version constraint of >= 3.39.0 in required_providers. If we would bump it to >= v5.12.0, it would still be TPGv5 compatible.

So people having the Provider produced inconsistent result after apply when creating service accounts would either have to use TPGv6.7 for a proper fix or use create_ignore_already_exists workaround from TPGv5.12.

@apeabody
Copy link
Collaborator

apeabody commented Nov 1, 2024

Hi @apeabody, I see create_ignore_already_exists has been introduced in TPGv5.12 with #16927, so TPGv5 already supports it. So I think @inverse-panda issue is caused by TPG minimum version constraint of >= 3.39.0 in required_providers. If we would bump it to >= v5.12.0, it would still be TPGv5 compatible.

So people having the Provider produced inconsistent result after apply when creating service accounts would either have to use TPGv6.7 for a proper fix or use create_ignore_already_exists workaround from TPGv5.12.

Thanks @flozzone! Would you be able to open a PR bumping the version to >= v5.12.0?

flozzone added a commit to flozzone/terraform-google-kubernetes-engine that referenced this pull request Nov 1, 2024
@flozzone
Copy link
Contributor Author

flozzone commented Nov 1, 2024

Here we go #2170 @apeabody

@gustysap
Copy link

gustysap commented Nov 2, 2024

Hi, I use TPG ~> 4.0 and got this

│   on .terraform/modules/wi-name/modules/workload-identity/main.tf line 49, in resource "google_service_account" "cluster_service_account":
│   49:   create_ignore_already_exists = var.gcp_sa_create_ignore_already_exists

I need to upgrade the TPG or what ? currently I manually remove the

create_ignore_already_exists = var.gcp_sa_create_ignore_already_exists
in the main.tf of module

@apeabody
Copy link
Collaborator

apeabody commented Nov 4, 2024

Hi, I use TPG ~> 4.0 and got this

│   on .terraform/modules/wi-name/modules/workload-identity/main.tf line 49, in resource "google_service_account" "cluster_service_account":
│   49:   create_ignore_already_exists = var.gcp_sa_create_ignore_already_exists

I need to upgrade the TPG or what ? currently I manually remove the

create_ignore_already_exists = var.gcp_sa_create_ignore_already_exists in the main.tf of module

Hi @gustysap In v34 the workload-identity sub-module actually requires TPG >-=5.12.0. You may be able to use terraform init -upgrade.

@inverse-panda
Copy link

Thanks a lot @flozzone, updating the version to TGP = "~> 5.12.0" solved my issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Service account: Provider produced inconsistent result after apply
4 participants