-
Notifications
You must be signed in to change notification settings - Fork 9.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Send the JSON state representation to Cloud backend (when available) #31698
Conversation
internal/command/state_mv.go
Outdated
|
||
schemas, diags := getSchemas(&c.Meta, stateTo, config) | ||
if diags.HasErrors() { | ||
c.Ui.Warn(fmt.Sprintf(failedToLoadSchemasMessage, err)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might want to adjust the control flow here; loadConfig(path)
shouldn't be called if os.Getwd
errors, and we don't want to print the same warning three times. Same for other instances of this pattern.
…hemas along to PersistState in the remote state
go.mod
Outdated
@@ -40,7 +40,7 @@ require ( | |||
github.com/hashicorp/go-multierror v1.1.1 | |||
github.com/hashicorp/go-plugin v1.4.3 | |||
github.com/hashicorp/go-retryablehttp v0.7.1 | |||
github.com/hashicorp/go-tfe v1.7.0 | |||
github.com/hashicorp/go-tfe v1.8.1-0.20220826155809-b28335218b42 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pre-merge TODO: release go-tfe and bump this ref to the released version
internal/cloud/state.go
Outdated
|
||
delegate remote.State | ||
// Client Client |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✂️ This can be deleted
internal/cloud/state.go
Outdated
return nil | ||
} | ||
|
||
// statemgr.Persister impl. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should godoc each public method, making a comment like "PersistState uploads a snapshot the latest state as a StateVersion to Terraform Cloud"
"terraform_version": "1.3.0", | ||
"serial": 1, | ||
"lineage": "backend-change", | ||
"outputs": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This part of the payload should be only the outputs object.
internal/cloud/testing.go
Outdated
@@ -443,3 +444,54 @@ func testVariables(s terraform.ValueSourceType, vs ...string) map[string]backend | |||
} | |||
return vars | |||
} | |||
|
|||
func TestCloudLocks(t *testing.T, a, b statemgr.Full) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be moved to state_test.go and probably not called as a helper but exist as a top level test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure this test is called at all right now
internal/command/helper.go
Outdated
|
||
const failedToLoadSchemasMessage = ` | ||
Terraform failed to load schemas, which will in turn affect its ability to generate the | ||
external JSON state file. This will not have any adverse effects on Terraforms ability |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
external JSON state file. This will not have any adverse effects on Terraforms ability | |
external JSON variant of the state file. This will not have any adverse effects on Terraform's ability |
internal/command/helper.go
Outdated
external JSON state file. This will not have any adverse effects on Terraforms ability | ||
to maintain state information, but may have adverse effects on any external integrations | ||
relying on this format. The file should be created on the next successful "terraform apply" | ||
however, historic state information may be missing if the affected integration relies on that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO we should end this warning at on the next successful "terraform apply".
because the rest sounds unnecessarily scary. I would appreciate a second opinion!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm indeed this message is tricky because it's introducing what are probably a bunch of entirely new concepts from the user's perspective, and doing so at an "anxiety point" where we're telling the user about some potential negative consequences they need to make a decision about. 😖
I don't think there's a perfect way to do this because it's never ideal to introduce entirely new concepts in an error/warning message, but I think I'd suggest reordering the information here a bit and splitting it over multiple paragraphs so it's hopefully a little easier to skim:
Warning: Failed to update data for external integrations
Terraform was unable to generate a description of the updated
state for use with external integrations in Terraform Cloud.
Any integrations configured for this workspace which depend on
information from the state may not work correctly when using the
result of this action.
This problem occurs when Terraform cannot read the schema for
one or more of the providers used in the state. The next successful
apply will correct the problem by re-generating the JSON description
of the state:
terraform apply
(It's hard to get word-wrapping write in the GitHub comment box; I probably didn't wrap the lines above correctly for presentation in a monospace font.)
My goal with this edit was to put the "what happend and why it matters" part of the message first, and then the "why it happened and what to do about it" part second.
As a secondary goal I tried to omit the implementation detail about there being two different state serializations and instead focus on the fact that this missing information is intended for external integrations. I don't expect the typical reader of this message to know that external integrations use our JSON state format or even that there is a JSON state format for integrations, and so mentioning it seems likely to leave the reader unsure of what the consequences of this situation are.
internal/command/import.go
Outdated
if isCloudMode(b) { | ||
schemas, diags = c.GetSchemas(newState, nil) | ||
if diags.HasErrors() { | ||
c.Ui.Warn(fmt.Sprintf(failedToLoadSchemasMessage, err)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think these are going to need to error out. The json marshaling code will panic if a nil schema is pass in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I take that back, digging deeper it looks like there were some nil checks added because that's happened before, and it should error out instead (though I don't think purposely sending nil
is something that was well tested).
return schemas, diags | ||
|
||
} | ||
return nil, diags |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like it can return nil, nil
, which is a bit unusual. Is that going to be a problem anywhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From reading the various callsites of this function, I think I share the concern of a "nil with no errors" result being unexpected per typical Go idiom, which may cause confusion for future maintainers.
As a "quick fix" without doing any more significant immediate refactoring, I would suggest renaming this method to MaybeGetSchemas
and mention in its documentation comment that it can potentially return nil
without errors to indicate that there isn't enough information to return the schemas and that the caller must handle the lack of schema information itself accordingly. My hope is that an unusual method name with Maybe
at the start of it will stand out enough to a future reader to cause them to investigate what the "maybe" implies, rather than just assuming this is a normal-ish function which will return errors if it isn't able to do its work.
if err := destinationState.PersistState(); err != nil { | ||
// Fetching schemas during init might be more of a hassle than we want to attempt | ||
// in the case that we're migrating to TFC backend, the initial JSON state won't | ||
// be generated and stored. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In case it matters for future ideas, the backend is currently handled before providers are installed during init
, so requiring schemas here could lead to a catch-22 where it requires some manual intervention to proceed far enough for provider installation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps the "hassle" wording you used could evolve into this observation. Useful insight
// EnableForcePush to allow the remote client to overwrite state | ||
// by implementing remote.ClientForcePusher | ||
func (s *State) EnableForcePush() { | ||
s.forcePush = true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there some condition that disables this again later, or is forcing just on forever once you enable it?
Specifically, I'm worried about what happens if you use an instance of this type first to perform a "write for migration" with forcing enabled and then later use the same instance for normal writes. Is there a risk of skipping the safety check for the second write in that case?
One possible way to address this would be to make WriteState
always set s.forcePush
back to false
, and make WriteStateForMigration
just always set s.forcePush
to whatever its force
argument is, even if the caller set it to false
. That would then allow the state manager to remember whether the most recent write was forced, rather than remembering whether any previous write was forced.
Does that seem reasonable? Am I missing this getting handled some other way? 🤔
internal/cloud/state_test.go
Outdated
"github.com/hashicorp/terraform/internal/states/statefile" | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not really sure why our goimports
PR check isn't catching this 🤷♂️ but the typical convention would be for this import to be grouped in with "github.com/hashicorp/terraform/internal/states/statemgr"
below, so that all of the github.com/hashicorp/terraform
imports are grouped together.
(Same seems to be true for some of the other files here; I won't comment on every one because that'll be too noisy. 😀 )
addrs.NewLegacyProvider("null"): providers.FactoryFixed(p), | ||
addrs.NewLegacyProvider("azurerm"): providers.FactoryFixed(p), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I must confess I'm not really sure how this is fitting in with the other stuff here but I'm surprised to see use of addrs.NewLegacyProvider
here because this idea of "legacy providers" was there to support migrating from Terraform v0.12 to v0.13 (when we first introduced the idea of provider namespaces) and so new code shouldn't ever encounter legacy providers.
Should these perhaps be NewDefaultProvider
instead? That would make them match the addresses two real providers hashicorp/null
and hashicorp/azurerm
.
(We don't typically use real provider addresses in unit tests though, because it can make it seem like we're testing against a real provider rather than a fake one as seems to be the case here. If possible I would suggest either using one of the fake providers already present or using providers in an intentionally-fake namespace like the pre-existing one above, and the one added below.)
internal/command/helper.go
Outdated
external JSON state file. This will not have any adverse effects on Terraforms ability | ||
to maintain state information, but may have adverse effects on any external integrations | ||
relying on this format. The file should be created on the next successful "terraform apply" | ||
however, historic state information may be missing if the affected integration relies on that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm indeed this message is tricky because it's introducing what are probably a bunch of entirely new concepts from the user's perspective, and doing so at an "anxiety point" where we're telling the user about some potential negative consequences they need to make a decision about. 😖
I don't think there's a perfect way to do this because it's never ideal to introduce entirely new concepts in an error/warning message, but I think I'd suggest reordering the information here a bit and splitting it over multiple paragraphs so it's hopefully a little easier to skim:
Warning: Failed to update data for external integrations
Terraform was unable to generate a description of the updated
state for use with external integrations in Terraform Cloud.
Any integrations configured for this workspace which depend on
information from the state may not work correctly when using the
result of this action.
This problem occurs when Terraform cannot read the schema for
one or more of the providers used in the state. The next successful
apply will correct the problem by re-generating the JSON description
of the state:
terraform apply
(It's hard to get word-wrapping write in the GitHub comment box; I probably didn't wrap the lines above correctly for presentation in a monospace font.)
My goal with this edit was to put the "what happend and why it matters" part of the message first, and then the "why it happened and what to do about it" part second.
As a secondary goal I tried to omit the implementation detail about there being two different state serializations and instead focus on the fact that this missing information is intended for external integrations. I don't expect the typical reader of this message to know that external integrations use our JSON state format or even that there is a JSON state format for integrations, and so mentioning it seems likely to leave the reader unsure of what the consequences of this situation are.
return schemas, diags | ||
|
||
} | ||
return nil, diags |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From reading the various callsites of this function, I think I share the concern of a "nil with no errors" result being unexpected per typical Go idiom, which may cause confusion for future maintainers.
As a "quick fix" without doing any more significant immediate refactoring, I would suggest renaming this method to MaybeGetSchemas
and mention in its documentation comment that it can potentially return nil
without errors to indicate that there isn't enough information to return the schemas and that the caller must handle the lack of schema information itself accordingly. My hope is that an unusual method name with Maybe
at the start of it will stand out enough to a future reader to cause them to investigate what the "maybe" implies, rather than just assuming this is a normal-ish function which will return errors if it isn't able to do its work.
|
||
// Get schemas, if possible, before writing state | ||
var schemas *terraform.Schemas | ||
if isCloudMode(b) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It feels kinda unfortunate that this idea of "cloud mode" is infecting so many codepaths now. I don't really have a better idea to suggest, though. 😖 Hopefully we can think about a more elaborate refactoring of this later, but this surgical approach is probably best for now.
internal/command/untaint.go
Outdated
if isCloudMode(b) { | ||
schemas, diags = c.GetSchemas(state, nil) | ||
if diags.HasErrors() { | ||
c.Ui.Warn(fmt.Sprintf(failedToLoadSchemasMessage, err)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't really typically use c.Ui.Warn
for new code because it generates yellow text that can be hard to read in some terminals.
In another comment elsewhere I suggested renaming c.GetSchemas
to c.MaybeGetSchemas
to try to be explicit that there's "something odd" about this function -- that it can potentially fail but still return successfully in some sense. In conjunction with that, I wonder about making that function itself return this warning in the tfdiags.Diagnostics
it returns, so that we could then render it with our usual diagnostic formatting codepath. (c.showDiagnostics(diags)
, similar to some of the examples above)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Due to the timing and broad scope of this change, @megan07 and I have agreed in a backchannel to move forward with this change in its current state and then potentially address some of the feedback items in subsequent separate PRs that we expect to be more localized, once some background discussions have played out and it seems clearer which path to take.
Reminder for the merging maintainer: if this is a user-visible change, please update the changelog on the appropriate release branch. |
I'm going to lock this pull request because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active contributions. |
No description provided.