
node_pool.# count is incorrect "2" => "1" (forces new resource) #3107

Closed
baboune opened this issue Feb 22, 2019 · 8 comments


baboune commented Feb 22, 2019

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment
  • If an issue is assigned to the "modular-magician" user, it is either in the process of being autogenerated, or is planned to be autogenerated soon. If an issue is assigned to a user, that user is claiming responsibility for the issue. If an issue is assigned to "hashibot", a community member has claimed the issue already.

Terraform Version

terraform v0.11.10

  • provider.google v1.20.0
  • provider.google-beta v2.0.0

Affected Resource(s)

  • google_container_node_pool
  • google_container_cluster

Terraform Configuration Files

locals {
  gke_version = "1.11.3-gke.18"
}

resource "google_container_cluster" "gke_cluster" {
  name    = "test-gke-cluster-1"
  zone    = "europe-north1-a"
  
  additional_zones = [
    "europe-north1-b",
  ]

  node_pool {
    name                = "default-pool"
    version             = "${local.gke_version}"

    initial_node_count  = "1"
    node_config {
      machine_type      = "n1-standard-1"

      oauth_scopes = [
        "https://www.googleapis.com/auth/compute",
        "https://www.googleapis.com/auth/cloud-platform",
        "https://www.googleapis.com/auth/datastore",
        "https://www.googleapis.com/auth/devstorage.full_control",
        "https://www.googleapis.com/auth/logging.write",
        "https://www.googleapis.com/auth/monitoring",
      ]

      labels {
        env  = "dev"
        k8s = "testing"
        terraform = "infra"
      }

      tags = ["dev", "infra", "testing", "k8s"]
    }
    
  }
}

resource "google_container_node_pool" "monitoring_nodes" {
  # Use beta
  provider   = "google-beta"
  
  name       = "gke-monitoring-pool"
  zone       = "${var.default_zone}"
  cluster    = "${google_container_cluster.gke_cluster.name}"
  version    = "${local.gke_version}"

  node_count = 1

  node_config {
    
    machine_type = "n1-standard-1"

    oauth_scopes = [
      "https://www.googleapis.com/auth/compute",
      "https://www.googleapis.com/auth/devstorage.read_only",
      "https://www.googleapis.com/auth/logging.write",
      "https://www.googleapis.com/auth/monitoring",
    ]

    labels {
      category  = "monitoring"
      env       = "dev"
      k8s       = "testing"
      terraform = "infra"
    }

    tags = ["dev", "monitoring", "infra", "testing"]

    taint {
      key    = "reserved"
      value  = "monitoring"
      effect = "NO_SCHEDULE"
    }
  }
}

Debug Output

output for plan:
https://gist.github.com/baboune/be8d446b1656fc98f598ccadc41a6c9d

Panic Output

No panic

Expected Behavior

Once applied successfully, if I run terraform plan again, no changes should be needed.

Actual Behavior

terraform apply should create a stable deployment.

Instead, the first terraform apply creates the resources, but on any subsequent apply or plan Terraform wants to destroy and re-create the node pool.

Steps to Reproduce

  1. terraform apply
  2. terraform apply or terraform plan


References

Seems similar to #2115, i.e. the problem might be caused by the node_config section. However, in this case the default node pool is used, so it is not clear how to handle the node_config section in the google_container_cluster part.

@ghost ghost added the bug label Feb 22, 2019
rileykarson (Collaborator) commented:

Hey @baboune!

The cause of this is that Terraform isn't able to tell that you've defined a separate "fine-grained" node pool using the google_container_node_pool resource. Seeing that you've defined a single node pool inline (in google_container_cluster), Terraform wants to "correct" what it sees in the API, leading to this diff.

If possible, our recommendation is to use exclusively fine-grained node pools such as in this example, removing the default pool with remove_default_node_pool = true.
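
For reference, a minimal sketch of that all-fine-grained layout (names and values here are illustrative, reusing the 0.11-style syntax from this issue rather than the exact config from the docs example):

resource "google_container_cluster" "gke_cluster" {
  name = "test-gke-cluster-1"
  zone = "europe-north1-a"

  # Create the default pool only so it can be removed immediately;
  # every real pool is a separate google_container_node_pool resource.
  remove_default_node_pool = true
  initial_node_count       = 1
}

resource "google_container_node_pool" "monitoring_nodes" {
  name       = "gke-monitoring-pool"
  zone       = "europe-north1-a"
  cluster    = "${google_container_cluster.gke_cluster.name}"
  version    = "${local.gke_version}"
  node_count = 1

  # node_config (machine type, oauth_scopes, labels, taints, ...) lives
  # only on the fine-grained resource, never inline on the cluster.
}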

I believe it's possible to use the default node pool as a fine-grained node pool by importing it and omitting any node_pool blocks from google_container_cluster; alternatively, you can ignore_changes the node_pool, and Terraform won't perform diffs on that field. Note that using ignore_changes can have unintended effects if/when Terraform recreates or updates the cluster resource.
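
For what it's worth, the import ID for a node pool takes the form zone/cluster/pool-name, matching the node pool ID visible in the refresh output further down this thread; the resource address below is illustrative:

terraform import google_container_node_pool.default_pool europe-north1-a/test-gke-cluster-1/default-pool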


baboune commented Feb 22, 2019

@rileykarson It is confusing. The example https://www.terraform.io/docs/providers/google/r/container_node_pool.html#example-usage-2-node-pools-1-separately-managed-the-default-node-pool is similar to what I did, and that one works.

There should be more info in the doc about how this node_config causes problems.


rileykarson commented Feb 22, 2019

Ah yep, the difference is just that if node_pool is omitted in google_container_cluster, Terraform behaves as we expect.

I'll reopen this to add specific warnings to the top of one or both resources about using that subfield and the fine-grained resource in tandem, similar to what we do for the IAM resources.


baboune commented Feb 25, 2019

OK, so when I add:

lifecycle {
  ignore_changes = ["node_pool"]
}

Then terraform wants to update the network:

$ terraform plan
Refreshing Terraform state in-memory prior to plan...
The refreshed state will be used to calculate this plan, but will not be
persisted to local or remote state storage.

google_container_cluster.gke_cluster: Refreshing state... (ID: tf-test-gke-cluster-1)
data.google_compute_default_service_account.default: Refreshing state...
data.google_compute_network.default: Refreshing state...
google_container_node_pool.monitoring_nodes: Refreshing state... (ID: europe-north1-a/tf-test-gke-cluster-1/gke-monitoring-pool)

------------------------------------------------------------------------

An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
  ~ update in-place
Terraform will perform the following actions:
  ~ module.flames-service-cluster.google_container_cluster.gke_cluster
      network: "projects/flames-dev/global/networks/default" => "default"
Plan: 0 to add, 1 to change, 0 to destroy.

Over and over again.

Once applied successfully, if I run terraform plan again, no changes should be needed.

rileykarson (Collaborator) commented:

Ah yeah, we normally suppress that diff, but using ignore_changes on a resource makes Terraform skip diff suppression for that resource; see hashicorp/terraform#18209. This will be fixed in Terraform 0.12.

If you set network to projects/flames-dev/global/networks/default by hand, you'll no longer see a diff.
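
In other words, something like this in the cluster resource (the project-qualified network path is the one from the plan output above):

resource "google_container_cluster" "gke_cluster" {
  # ...
  network = "projects/flames-dev/global/networks/default"
}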


baboune commented Feb 26, 2019

OK, will try.
Looking forward to 0.12 😄

Thanks!

rileykarson (Collaborator) commented:

Looks like there's a warning already; closing this out. https://www.terraform.io/docs/providers/google/r/container_cluster.html#node_pool


ghost commented Mar 29, 2019

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 [email protected]. Thanks!

@ghost ghost locked and limited conversation to collaborators Mar 29, 2019