Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Moving the fetch logic into 'pkg/module' #291

Merged
merged 68 commits into from
Jul 26, 2018
Merged

Conversation

arschles
Copy link
Member

@arschles arschles commented Jul 18, 2018

This copies the stuff in ./pkg/repo into ./pkg/module, but leaves ./pkg/repo (see #314 and #316 for removing ./pkg/repo).

It also refactors the interface for how we download code from an upstream (VCS), access it, and prepare it for upload to storage/CDN/wherever. During the copy from ./pkg/repo, I only retained the go get implementation in the new package, and not the Github API implementation (see #288 for why)

I am doing this refactor to prepare for #290, where I'll use the new functionality to synchronously fetch packages. I'll do that in a separate PR 😄

TODO:

  • Make the (./pkg/module).diskRef look in the global cache for the .info, .mod, and .zip files (right now I believe it looks in the wrong place)
  • Remove ./pkg/repo (see Removing ./pkg/repo #316)

This is part of #290
There are a bunch of PRs opened against this branch. Check out #290 (comment) for what they are

@arschles arschles requested review from a user and marwan-at-work July 19, 2018 01:37
@arschles arschles added the refactor work to do to refactor code label Jul 19, 2018
.gitignore Outdated
@@ -24,3 +24,6 @@ cmd/proxy/proxy
cmd/olympus/olympus
test-keys.json
tmp
cmd/olympus/bin/olympus
cmd/proxy/bin/proxy
.vs
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as in #303 this appears to be .vscode, at least by default on linux running non-insiders edition.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@robjloranger good catch on these lines. I am going to back these out and just wait for #303 to go in, so we can discuss there

@ghost
Copy link

ghost commented Jul 19, 2018

This one is meaty, I'll take a proper look tomorrow morning sometime. There is a failing test though.

@arschles
Copy link
Member Author

@robjloranger thanks! I'll fix that test today and hopefully finish this up as well.

Copy link

@ghost ghost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just questions/comments besides the one for the Ref Read method comment in ./pkg/module/ref.go


// read is the Ref interface implementation.
func (d *diskRef) Read() (*storage.Version, error) {
ver := &storage.Version{}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer to declare variables of a type, and later on return pass the address of that variable. It becomes more clear in larger functions what is being returned. If you just glance at the end of the function and read the return, what is being returned?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@robjloranger so would you prefer that I do var ver *storage.Version at the beginning of the function?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No because that would still be a pointer type, I find it's more clear to read the return of a variable address in place of returning a variable that is not clearly a pointer. i.e. var ver storage.Version then on return return &ver

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, ok got it. I'll make that change

)

const (
tmpRepoDir = "%s-%s" // owner-repo-ref
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is probably fine as is, but at first I wasn't sure it the comment was meant to reference three parts of the string. If you agree it's a bit confusing, maybe something like owner-ref, owner-version etc

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@robjloranger I'll abstract this into a function so that it's much easier to read and more type-safe

// this function may fail, regardless of whether this function returns nil or not
Clear() error
// Read reads the module into memory and returns it. The caller should call
// the returned function.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which function? Something in the storage.Version?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, that's outdated documentation. Removing


packagePath := filepath.Join(gopath, "src", "mod", "cache", "download", repoURI, "@v")

o, err := cmd.CombinedOutput()
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if instead of the error check and eventual return, we can just have the switch.

Since the default case is the same as the return, and if the cmd output contains the limit error we return our custom error.
The only other check is on the files existing, in which case the only time the default case will be executed is when there is some actual error from the vgo get command.

Copy link
Member

@marpio marpio Jul 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you don't want to do these checks (isLimit and checkFiles) unnecessarily @robjloranger .
The alternative would be to do something like
case err != nil && isLimit
case err != nil && checkFiles
but I'm not sure if it's that much better.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've simplified this to not do the switch inside the if. hopefully it's a bit more clear what's going on, even if it's more nested

case isLimitHit(o):
// github quota exceeded
return packagePath, ErrLimitExceeded
case checkFiles(fs, packagePath, version) == nil:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be case checkFiles(fs, packagePath, version) != nil:

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch, change incoming

@ghost ghost requested review from marpio and marwan-at-work July 25, 2018 22:22
@ghost
Copy link

ghost commented Jul 25, 2018

Just want to double check with you two @marpio and @marwan-at-work that your concerns were addressed.

@marpio
Copy link
Member

marpio commented Jul 26, 2018

Lgtm @robjloranger

Copy link
Contributor

@marwan-at-work marwan-at-work left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great. Mostly questions and nits.

I only put a couple of comments about using the new errors package because I'm eager to have this PR merged in. Once it's merged in, I'll go and change most of these files to fully use the errors package.

Plus, I'd like to use this PR to implement go list -m -versions in my List PR.

@arschles I do have one concern though: what happens when you use NewGoGetFetcher to concurrently fetch the same module? Just error out for now?

.travis.yml Outdated
@@ -15,6 +15,7 @@ env:
- MINIO_SECRET_KEY=minio123
- ATHENS_MONGO_STORAGE_URL=mongodb://127.0.0.1:27017
- CODE_COV=1
- GO_BINARY_NAME=vgo
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: would GO_BINARY_PATH make more sense? Just because you can put put a whole path to binary and not necessary a name of binary in your PATH environment variable.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would probably be more clear

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

// should always call ref.Clear() on the returned ref
func (g *goGetFetcher) Fetch(mod, ver string) (Ref, error) {
var ref Ref
ref = noopRef{}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you wanna make it one line: var ref Ref = noopRef{} or ref := Ref(noopRef{})

}

sourcePath := filepath.Join(repoRoot, "mod.go")
sourceContent := []byte(`package mod // import "mod"`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need for // import "mod" since you have a go.mod file saying that

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch

//
// If an error was returned, the returned ref's Read method will return an error, but you
// should always call ref.Clear() on the returned ref
func (g *goGetFetcher) Fetch(mod, ver string) (Ref, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so why do we always wanna return a Ref? why can't we just return nil? If it's a matter of calling ref.Clear() we can always make sure we clear things inside this function when an error occurs, no?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, why does Fetch takes a mod/ver when NewGoGetFetcher also takes a mod and ver?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, why does Fetch takes a mod/ver when NewGoGetFetcher also takes a mod and ver?

good catch, only Fetch should take the mod/ver

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@marwan-at-work I guess it can return nil and manually call ref.Clear, yea. Good idea - change incoming

// the location of the module cache so you can delete it if necessary
func getSources(goBinaryName string, fs afero.Fs, gopath, repoRoot, module, version string) (string, error) {
version = strings.TrimPrefix(version, "@")
if !strings.HasPrefix(version, "v") {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from my previous review:

you shouldn't need this whole block in the first place. version is always going to include v and therefore you can remove the whole if !strings.HasPrefix(version, "v") block. If you really don't trust the caller, then calling semver.IsValid is probably more wholesome. But imo, just going out and saying go get module@version is good enough because Go will validate the version anyway

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Since the version is ultimately going to come from a REST API call, I want to be defensive against version, so I'll use semver.IsValid

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(instead of relying on go get to tell us that the version is invalid when we call)


var (
// ErrLimitExceeded signals that github.com refused to serve the request due to exceeded quota
ErrLimitExceeded = errors.New("github limit exceeded")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be not used anywhere, furthermore it seems that goget.go and goget_test.go` are still around. Can we delete them in this pr?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was used on ~129 after the isLimitHit check

if err != nil {
// github quota exceeded
if isLimitHit(o) {
return packagePath, pkgerrors.WithMessage(err, "github API limit hit")
Copy link
Contributor

@marwan-at-work marwan-at-work Jul 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's make sure we propagate the rate limit status all the way back to vgo so it knows to show the right message. You can do return packagePath, errors.E(op, err, http.StatusTooManyRequests) -- we should also add that as a known Kind in the errors package. So in the future we can do something like
if errors.Kind(err) == errors.KindRateLimit or wrap that in a function like if errors.ErrRateLimit(err)

the function in the errors package can look like this:

const KindRateLimit = http.StatusTooManyRequests

func ErrRateLimit(err error) {
  return Kind(err) == KindRateLimit
}


// NewErrModuleAlreadyFetched returns a new ErrModuleAlreadyFetched
func NewErrModuleAlreadyFetched(mod, ver string) error {
return &ErrModuleAlreadyFetched{module: mod, version: ver}
Copy link
Contributor

@marwan-at-work marwan-at-work Jul 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return errors.E(op, errors.M(mod), errors.V(ver), errors.KindAlreadyExists) (passing the op into this function constructor) and pick whichever logrus Level you deem is appropriate. I think error level makes sense here, because why ever fetch something if it's already in storage right? Not passing a level default the error to be logrus.ErrorLevel.

In the errors package, create a new kind enum KindAlreadyExists = http.StatusConflict


// ErrModuleAlreadyFetched is an error returned when you try to fetch the same module@version
// more than once
type ErrModuleAlreadyFetched struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can remove this struct and its methods :)

package module

// Fetcher fetches module from an upstream source
type Fetcher interface {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good to me 👍

@ghost ghost removed the request for review from marpio July 26, 2018 17:12
Copy link
Contributor

@marwan-at-work marwan-at-work left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 💯 🥇 🔨
Can you remove the pkg/module/error ErrModuleAlreadyFetched struct before merging? No need for a second look on my side!

@arschles
Copy link
Member Author

@marwan-at-work after I fix CI and remove that struct, will do - thanks!

@arschles arschles merged commit ab9a4c0 into gomods:master Jul 26, 2018
@arschles arschles deleted the sync-fill branch July 26, 2018 20:08
@arschles arschles mentioned this pull request Jul 26, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
refactor work to do to refactor code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants