Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use go-getter only for archive fetches #3297

Closed

Conversation

thatsmydoing
Copy link
Contributor

This is another attempt to address #2538. This uses go-getter only to fetch archives via http. Otherwise, it uses the git cloner to clone repositories. Having 2 different ways of cloning git repositories was a big cause of confusion with url formats and we address this by just removing go-getter as an option entirely.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Dec 1, 2020
@k8s-ci-robot
Copy link
Contributor

Hi @thatsmydoing. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Dec 1, 2020
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: thatsmydoing
To complete the pull request process, please assign monopole after the PR has been reviewed.
You can assign the PR to them by writing /assign @monopole in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Copy link
Contributor Author

@thatsmydoing thatsmydoing left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm reasonably confident this does the right thing and that go-getter will fallback to the git cloner when provided a git repo but succeed when provided with an archive url.

go-getter and git cloning has a number of issues regarding url formats #2538 (comment) and performance #3244 (comment)

@@ -71,17 +71,21 @@ func getRemoteTarget(rs *remoteTargetSpec) error {
log.Fatalf("Error getting wd: %s", err)
}

httpGetter := &getter.HttpGetter{
Netrc: true,
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

new(getter.GitHubDetector),
new(getter.GitDetector),
new(getter.BitBucketDetector),
Mode: getter.ClientModeDir,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remote bases must be directories containing a kustomization.yaml file so there's no sense in fetching a single file.

new(getter.GitDetector),
new(getter.BitBucketDetector),
Mode: getter.ClientModeDir,
Detectors: []getter.Detector{},
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The detectors were only used to transform git style urls but we don't want to handle those here.

@monopole monopole removed the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Dec 7, 2020
@monopole
Copy link
Contributor

monopole commented Dec 7, 2020

please rebase

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 7, 2020
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 8, 2020
@monopole
Copy link
Contributor

The whole getter / cloner flow is getting hard to follow.
I just had to add timeout because it was hanging, and we needed better errors.
Can you clarify the change to add ClientModeDir? Does that mean it never worked?
The tests are hard to follow.

@thatsmydoing
Copy link
Contributor Author

I think the flow is mostly described in

// NewLoader returns a Loader pointed at the given target.
// If the target is remote, the loader will be restricted
// to the root and below only. If the target is local, the
// loader will have the restrictions passed in. Regardless,
// if a local target attempts to transitively load remote bases,
// the remote bases will all be root-only restricted.
func NewLoader(
lr LoadRestrictorFunc,
target string, fSys filesys.FileSystem) (ifc.Loader, error) {
ldr, errGet := newLoaderAtGetter(
target, fSys, nil, git.ClonerUsingGitExec, getRemoteTarget)
if errGet == nil {
return ldr, nil
}
repoSpec, errGit := git.NewRepoSpecFromUrl(target)
if errGit == nil {
// The target qualifies as a remote git target.
return newLoaderAtGitClone(
repoSpec, fSys, nil, git.ClonerUsingGitExec, getRemoteTarget)
}

and
// New returns a new Loader, rooted relative to current loader,
// or rooted in a temp directory holding a git repo clone.
func (fl *fileLoader) New(path string) (ifc.Loader, error) {
if path == "" {
return nil, fmt.Errorf("new root cannot be empty")
}
ldr, errGet := newLoaderAtGetter(path, fl.fSys, nil, fl.cloner, fl.getter)
if errGet == nil {
return ldr, nil
}
repoSpec, errGit := git.NewRepoSpecFromUrl(path)
if errGit == nil {
// Treat this as git repo clone request.
if errGit := fl.errIfRepoCycle(repoSpec); errGit != nil {
return nil, errGit
}
return newLoaderAtGitClone(
repoSpec, fl.fSys, fl, fl.cloner, fl.getter)
}

which is, given a url, try to download it with go-getter first and if it fails, try to use the git cloner.

This PR does not change the flow. All it does is make go-getter only fetch archives and fail on git repositories so that it falls back to the git cloner. While go-getter can handle git repositories, the way it handles urls and cloning behavior is different enough from the cloner that it causes confusion as to what the url format should be and also causes performance issues.

The flow could definitely be improved though. It might make more sense to use the cloner first and then fall back to go-getter for archive urls.

I just had to add timeout because it was hanging

Could you point me to a url where it hangs?

Can you clarify the change to add ClientModeDir? Does that mean it never worked?

The meanings are described here https://godoc.org/github.com/hashicorp/go-getter#ClientMode. It was previously set to ClientModeAny which allows it to download either a file or a directory/archive. Downloading a single file is rarely what we want as remote bases must be directories. ClientModeDir ensures that we only fetch archives (directories only make sense for things like S3 which we don't support). It wasn't broken before but this just tightens the spec.

@yujunz
Copy link
Member

yujunz commented Dec 10, 2020

Downloading a single file is rarely what we want as remote bases must be directories.

I think bases are already merged to resources and it is common use case to refer a single file as resources, e.g. https://github.com/jetstack/cert-manager/releases/download/v1.1.0/cert-manager.yaml .

I don't remember exactly if resources fetching is based on getter or not. Please double check this does not break with the ClientMode change. @thatsmydoing

@thatsmydoing
Copy link
Contributor Author

I think bases are already merged to resources and it is common use case to refer a single file as resources

While bases and resources are merged, I don't think the behavior changed. Only "bases" can be remote in much the same way that only "bases" can be referenced via parent directories.

Neither of

kustomize build https://github.com/jetstack/cert-manager/releases/download/v1.1.0/cert-manager.yaml

cat > kustomization.yaml <<EOF
resources:
 - https://github.com/jetstack/cert-manager/releases/download/v1.1.0/cert-manager.yaml
EOF
kustomize build .

work with kustomize 3.8.7

@yujunz
Copy link
Member

yujunz commented Dec 11, 2020

It seems I gave a bad example because the getter doesn't follow redirection at the moment .

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
 - https://raw.githubusercontent.com/argoproj/argo-cd/master/manifests/install.yaml

The above should work though.

@thatsmydoing
Copy link
Contributor Author

Ah, this feature is described in https://github.com/kubernetes-sigs/kustomize/blob/master/examples/loadHttp.md and the related issue #970

This will most likely break that. I'll look into it.

@thatsmydoing
Copy link
Contributor Author

On further testing, 3.8.7 does support redirects. I don't know why it failed the first time I tried. I might have accidentally used a different version. I also realized that the http fetching is not done by go-getter but by kustomize itself so this PR should not affect fetching remote resources.

The reason that ClientModeDir is necessary is that with ClientModeAny, kustomize build "https://github.com/Liujingfang1/mysql" which go-getter will fetch as an HTML file and then not fallback to the cloner. Requiring a directory makes it properly fail here and fallback to the cloner.

@monopole
Copy link
Contributor

/test kustomize-presubmit-master

@k8s-ci-robot
Copy link
Contributor

@thatsmydoing: The following test failed, say /retest to rerun all failed tests:

Test name Commit Details Rerun command
kustomize-presubmit-master 2188fe9 link /test kustomize-presubmit-master

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@thatsmydoing
Copy link
Contributor Author

/retest

@k8s-ci-robot
Copy link
Contributor

@thatsmydoing: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@thatsmydoing
Copy link
Contributor Author

Ah. There were linting errors so I just fixed them and rebased.

@k8s-ci-robot
Copy link
Contributor

@thatsmydoing: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 6, 2021
@thatsmydoing
Copy link
Contributor Author

This is no longer necessary as of #3586

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants