Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: spec: make imported symbols predictable #29036

Open
rogpeppe opened this issue Nov 30, 2018 · 51 comments
Open

proposal: spec: make imported symbols predictable #29036

rogpeppe opened this issue Nov 30, 2018 · 51 comments
Labels
LanguageChange Suggested changes to the Go language LanguageChangeReview Discussed by language change review committee Proposal
Milestone

Comments

@rogpeppe
Copy link
Contributor

Currently when a package is imported, it's not possible to tell what package name it uses without going to the package itself to see what name it uses.

This can make it slow to analyse code (for example to jump to a symbol definition). For example, say we want to find the definition of the symbol X in the expression m.X. If m isn't defined in the local package, it's necessary to obtain a copy of every imported package's code to see if it's defined by any of them. For large packages, that can take a long time (many seconds). Even with caching proxies, it can still take a long time if the set of dependencies has recently changed.

There's also the issue of cognitive overhead: just seeing an import statement is not sufficient to know what symbols that import statement brings into scope. For large programs, that overhead can be significant, and is worse when one or more of the dependencies are no longer available - the reader of the code is forced to guess what symbols are imported.

This issue is even worse when "." imports are used to bring all a package's exported symbols into local scope. Even though dot-imports are frowned upon, there is still no universal consensus that they should be avoided.

The goimports tool already implicitly acknowledges this issue by adding an explicit package name when the symbol isn't clear from the import path. Although this helps, this doesn't help in the general case, because tools cannot assume that goimports has been run.

I propose the following:

  • dot imports should be disallowed
  • the imported symbol for a given import should be made predictable

This would mean that we can always look at any symbol in a piece of Go source code with local context only and know definitively whether it is defined in the local package or not, and if not, exactly which import path would need to be investigated to find the symbol.

Dot imports

The official guidelines suggest that a dot import should only be used "to let the file pretend to be part of package foo even though it is not". This is a stylistic choice and strictly unnecessary. The tests can still use package-qualified names with only minor inconvenience (the same inconvenience that any external user will see).

Other than that, I believe the most common use is to make the imported package feel "close to a DSL". This seems to be actively opposed to the aims of Go, as it makes the programs much harder to read.

Predictable imported symbols for imports

The simplest possibility here would be to require an import symbol for every import path. As many (most?) people use goimports to manage their import statements, this might not be too onerous a requirement.

Another approach would be to state that when lacking an explicit import package name, the package name must match a name derived from the import path. Possible rules might be:

  • split the import path on /; the expected name is the last element.
  • choose the longest valid Go identifier from the end of the import path.
  • whatever rule goimports currently uses to determine whether it should add an explicit package name.

This means that we could carry on importing "fmt" without redundantly specifying fmt "fmt", but has the disadvantage that nothing else in the language specification talks about what's inside an import path.

@gopherbot gopherbot added this to the Proposal milestone Nov 30, 2018
@rogpeppe rogpeppe added the v2 An incompatible library change label Nov 30, 2018
@mvdan mvdan added the LanguageChange Suggested changes to the Go language label Nov 30, 2018
@mvdan
Copy link
Member

mvdan commented Nov 30, 2018

I think the cognitive overhead is a very important point. Code will be read many more times than written, and Go is already optimised to be easy to read before being easy to write.

For example, most of the points I've seen in favor of dot imports is that they simplify writing quick programs, or swapping imports without needing to rewrite much of the program. I think the solution to that should be tooling, not adding complexity to the language.

@Merovius
Copy link
Contributor

I like this, as it codifies existing best practices. In general, I'd prefer to derive the identifier from the import-path (i.e. I don't like the "require an import symbol for every import" solution). There is one complication, in that I still think that the identifier should - by default - be the package name. But the package name is specified in the importee, and there is no way to tell from a package alone what name it's imported as.

So maybe this change should also talk about canonical import paths - maybe making them required instead of optional? Maybe getting rid of the "package name" and instead have the package-clause refer to the canonical import path? Just spit-balling. But I do think this proposal only talks about half of the equation and should say something about the role of package names and the importee too.

@Merovius
Copy link
Contributor

Oh, as a corollary, BTW: This could also be a tooling-enforced solution (to address the "the spec doesn't care about import-paths" part) in that the Go tool could refuse to work if a package name doesn't follow the requirements set out - exactly like we do with canonical import paths, where it's the go tool that will refuse to work if you choose the wrong import path.

@rogpeppe
Copy link
Contributor Author

There is one complication, in that I still think that the identifier should - by default - be the package name. But the package name is specified in the importee, and there is no way to tell from a package alone what name it's imported as.

That's why the proposal says "when lacking an explicit import package name, the package name must match a name derived from the import path". If the package name declared in the package is different (and there's no local name specified in the import), it would be an error.

So it's entirely possible to have a package with name x importable as import path example.com/y, but all importers would be forced to add an explicit alias for it.

@Merovius
Copy link
Contributor

Ah, sorry, read over that.

So it's entirely possible to have a package with name x importable as import path example.com/y, but all importers would be forced to add an explicit alias for it.

Hm, IMO that's not great. It means goimports at least would still need to walk/parse (at least in part) the entire workspace to find an import. Which AIUI is one of the main reasons goimports (and by extension, saving, if you run it as a hook) is slow. If we're touching the rules relating import paths to package names, it seems like a good idea to just go ahead and codify the best practice of "last component of import path == package name" once and for all.

But either way, I think this is a good idea :)

@rogpeppe
Copy link
Contributor Author

Hm, IMO that's not great. It means goimports at least would still need to walk/parse (at least in part) the entire workspace to find an import. Which AIUI is one of the main reasons goimports (and by extension, saving, if you run it as a hook) is slow. If we're touching the rules relating import paths to package names, it seems like a good idea to just go ahead and codify the best practice of "last component of import path == package name" once and for all.

I think that's going a step too far. It would mean I can't copy a package verbatim to another location and use it without changing the package name it declares. Also, what about versioned import paths? The rule as proposed would allow you to import "foo/bar/v2" as bar, but it would not force you to declare the package name as v2. I don't think we should codify knowledge of versioning syntax into the language standard.

@ianlancetaylor ianlancetaylor changed the title proposal: make imported symbols predictable proposal: Go 2: make imported symbols predictable Nov 30, 2018
@deanveloper
Copy link

I've found dot-imports to be useful when writing tests. For instance:

package mypack_test

import . "github.com/deanveloper/mypack"

// ...

I wouldn't really mind the dot-import being removed. I was just adding my particular use case for them.

@bcmills
Copy link
Contributor

bcmills commented Nov 30, 2018

FWIW, the algorithm that I'm using in the current draft of goforward is: if the import path ends in /vN for some integer N, use the second-to-last component; otherwise, use the last component.

(Code here.)

CC @heschik (see #28428)

@deanveloper
Copy link

What about things like gopkg URLs (gopkg.in/yaml.v1)?

I really like this proposal, just making sure everything is covered. Would yaml be chosen because it's the longest valid Go identifier?

@heschi
Copy link
Contributor

heschi commented Nov 30, 2018

I don't have strong feelings on the proposal itself but I can add some commentary about goimports, since I've been working on it a lot lately and plan to make changes in this area soon.

Per #28428, goimports is going to change to add an explicit name to any import of a package whose doesn't "obviously" match its import path. The algorithm I'm using is the same one Bryan proposed above. And yes, this means that gopkg.in packages have the name "yaml" added to their imports.

With that done, goimports won't need to scan the workspace for a file whose imports are satisfied. It's not perfect, because without this proposal, nothing prevents someone from doing:

import "example.com/foo" // package bar
import "example.com/bar" // package foo

which will confuse goimports into thinking everything's fine. (The only way to avoid the confusion would be to load everything even if the file looks okay, which is exactly what we don't want.) But that's very contrived.

@ohir
Copy link

ohir commented Dec 3, 2018

@deanveloper > I've found dot-imports to be useful when writing tests.

Dot import might still be allowed in *_test.go files and down the /test/ path.

@dave
Copy link
Contributor

dave commented Dec 3, 2018

This has been a source of complexity in almost every project I've worked on. In fact my most recent project needs a significant amount of documentation to explain, and alternative implementations for different environments: https://github.com/dave/dst#resolvers

It would be trivial for go fix to add a package alias for each import that requires it.

I'm perhaps guilty of over-using dot-imports, but I certainly wouldn't mind if they were removed.

@bradfitz
Copy link
Contributor

One thing that might be done is put this restriction in vet (which is now run by go test) instead of the language.

@ianlancetaylor
Copy link
Member

Note that the exact rule proposed here doesn't seem to work with modules, in which programs can be expected to have imports like

import "github.com/my/pkg/v3"

@mvdan
Copy link
Member

mvdan commented Dec 12, 2018

One thing that might be done is put this restriction in vet (which is now run by go test) instead of the language.

While that's possible, we would miss an important advantage. Adding the restrictions to the language, tools and libraries like go/packages and goimports can be greatly simplified, since they can always resolve names by looking at a single source file.

If we only enforce these restrictions in vet, then the tools can't assume that the user has run test or vet beforehand, so they'd still need to implement all the current logic to load all imported packages as a fallback.

I realise that changing the language can be tricky, though. Perhaps enforcement in vet could be a first step. But in my mind it would be great if it were impossible to write Go programs using confusing imports in the future.

@rogpeppe
Copy link
Contributor Author

Note that the exact rule proposed here doesn't seem to work with modules, in which programs can be expected to have imports like

   import "github.com/my/pkg/v3"

Yes, that's true. But the alternative is probably to bake some kind of understanding of major versions into the language spec, which doesn't seem quite right to me. Note that goimports will already rewrite the above as:

import pkg "github.com/my/pkg/v3"

so ISTM that it's not necessarily a problem to require that in this case.

@ianlancetaylor
Copy link
Member

I think that if modules catches on we should expect to see more and more "/vN" package paths over time, so we should plan for that up front. It doesn't seem desirable to wind up requiring a local import alias for the majority of imports.

@rogpeppe
Copy link
Contributor Author

I think that if modules catches on we should expect to see more and more "/vN" package paths over time, so we should plan for that up front. It doesn't seem desirable to wind up requiring a local import alias for the majority of imports.

There's a problem with that, unfortunately. It is legitimate to name a package "vN" (for some number N) so if you don't require an explicit identifier in that case, you can't tell the difference between a major version import (which should use the previous path element) and a package that is in fact named vN.

I guess the assumption could then be the other way - if you really do have a package with a name that matches v[0-9]+, then you could require that a local identifier be used. That would cover the majority case reasonably, I think.

@heschi
Copy link
Contributor

heschi commented Dec 12, 2018

@rogpeppe Unless I made a mistake, you do have the behavior of goimports backward: it expects a path ending in foo/v3 to correspond to a package named foo, and will add an explicit name "v3" to the import line if it is actually named v3. See #29041.

@rogpeppe
Copy link
Contributor Author

@heschik You're right. It's behaving as we'd want it to if this proposal was accepted.

@natefinch
Copy link
Contributor

natefinch commented Dec 14, 2018

I really like this idea, though I think I would tweak the wording slightly.

choose the longest valid Go identifier from the end of the import path.

I don't think that heuristic is workable. then gopkg.in/ab.v23 would expect v23 as the package name (if I understand the heuristic correctly).

I would instead say that it should use the prefix from the last part of the path including all characters that are valid in go identifiers. I don't think length should matter.

thus github.com/example/eg.v23 would expect package name "eg". Yes, if you have "github.com/example/v2.bar` it would expect the package name v2. But I bet that's way less common than the former. Higher version numbers will become more common as the language ages. And I think most people are used to ignoring the ends of words, more than looking for the longest part of a word.

@dave
Copy link
Contributor

dave commented Dec 14, 2018

I would worry we're trying to be too clever here. I agree we should be aware of major versions, but I would think if the next part of the path isn't a valid identifier, we should just require an alias.

Remember this is probably just for historical packages... I find it rather unlikely that after this change many people will be creating packages that have names that require an alias...

@ianlancetaylor
Copy link
Member

This proposal has two separate issues--banning dot imports and using predictable import names--and I think they are separable. I moved banning dot imports into #29326. We can keep this issue for discussing predictable import names.

@ianlancetaylor
Copy link
Member

Note that if we adopted this proposal, the package clause would be nearly meaningless. The main thing the package clause does today is set the name of the package, but with this proposal in effect every import would have a local import alias. See #21479.

@josharian
Copy link
Contributor

The tool you describe is complex, to put it mildly. Part of the reason that the Go tooling ecosystem is rich and interesting is that the barrier to entry is pretty low.

@ianthehat
Copy link

I think there is some misunderstanding here (and it may be mine)
If my understanding is correct I fully support this proposal, and my understanding is this:

The only added restriction would be that the current behavior of goimports (adding a local alias that matches the package clause if it is not derivable from the import path) is required behavior, and can be relied on.
This means that files that are fully correct can be understood without parsing any of the dependencies, whereas right now you cannot know which import provided which symbols reliably.
This makes it much easier to write tools, and makes existing tools much faster, but adds no extra restrictions on import paths or package names at all.
Fixing all code to obey this restriction is as simple as running an up to date goimports on the source.

The extra work would be codifying the set of rules that goimports uses to decide if the package name is derivable, making them available as a library for tools to use, and adding them to a vet check in the compiler as well, to reject code that does not match that standard.

@logrusorgru
Copy link

The tool you describe is complex, to put it mildly.

Not really. The "simple" tools like the guru are complex, because they are monolithic. And this issue is the product of this complexity.

@rogpeppe
Copy link
Contributor Author

rogpeppe commented Jan 10, 2019

@ianthehat's remarks seem on point to me.

For the record, I scanned through Russ's Go corpus and checked which package names didn't match their package paths (excluding main packages) using this algorithm as used by goimports:

func importPathToName(importPath string) (packageName string) {
	base := path.Base(importPath)
	if strings.HasPrefix(base, "v") {
		if _, err := strconv.Atoi(base[1:]); err == nil {
			dir := path.Dir(importPath)
			if dir != "." {
				return path.Base(dir)
			}
		}
	}
	return base
}

Just under 10% (462/5092) of the packages resulted in a mismatch.
The full list of mismatches is here.

About 2.5% (1560/62791) files used such an import without using an explicit identifier; this includes about 14% (830/5875) of all packages over 25% (138/553) of all repositories.
The full list of files that would need to be changed if this rule were in place is here.

This does seem like more than I was expecting, but as the change could be entirely automatic, with backward compatibility maintained for packages that declare earlier Go versions, I don't think the requirement would be that onerous. I think the net change would be to make the code base considerably more readable.

@rogpeppe
Copy link
Contributor Author

To respond to a couple of @josharian's points:

People like to write "drop-in" replacements. If I operate github.com/g0l3ng/amazingjson, I will no longer be able to say "just replace encoding/json with github.com/g0l3ng/amazingjson and everything will work but better". Instead I have to create github.com/g0l3ng/amazingjson/json, which is not nearly as cool. :)

You can use whatever import path you like, but if there's a mismatch with the package name, people will be required to use an explicit name when importing the package. This seems good to me. If I see an import of github.com/g0l3ng/amazingjson, how am I supposed to know that it's declared the identifier json?

Even if we do enforce this at the language level, it'll be a while before tools can assume it, because of the need to support existing/legacy code. I lost an hour today trying to find a version of Java that I could use to compile bazel. I spent the entire time wishing for an ironclad compatibility guarantee.

Tools would be able to assume it if the correct Go version is declared in the go.mod file.

Every time we have added restrictions on import paths there have been frustrating knock-on effects and unexpected breakages, particularly in tools. (Examples include the vendor directory halving the speed of guru queries, Dmitry's exasperation with build tooling around go-fuzz, my chagrin at re-realizing again today that package importability isn't transitive because of internal packages.)

I don't think this should have similar implications, because old tools can understand the new code just fine. That is, an new program read by an old tool will have exactly the same semantics as it does currently. The set of programs accepted by the new version of the compiler would just be somewhat smaller than it is today.

@mvdan
Copy link
Member

mvdan commented Jan 11, 2019

A bit late to reply to @josharian's comment, and some of what I wanted to say has already been said by others. Just adding a few bits.

It makes me nervous that we have a bunch of ideas floating around for this but that none of them that seem clearly correct.

Agreed. This is where I expect people with much more experience in the language and its tooling to step in :)

Tools would be able to assume it if the correct Go version is declared in the go.mod file.

Or even simpler - after a few Go releases, tools could simply error if a program doesn't follow the new import rules. They could just ask the user to run goimports on the packages and try again.

There is already significant pain associated with violating the goimports package path/name heuristics

Right, the point is to increase the amount of pain for those who violate the rules :) Everything should still work, modulo asking the developer to re-run goimports if they want to use newer tools, much like @ianthehat mentioned earlier. But most importantly, we'd drastically reduce the amount of pain that tool developers have to put up with.

@josharian
Copy link
Contributor

I think there is some misunderstanding here (and it may be mine)

It wouldn't surprise me in the slightest if the misunderstanding is mine. :) Thanks all for bearing with me.

Fixing all code to obey this restriction is as simple as running an up to date goimports on the source.

Just running goimports will fix existing code to work. However, this proposal increases pressure on package authors to rename their packages to be in conformance with preferred naming, which will be breaking changes, and ones that aren't as easy as "run goimports" to deal with. That will be less painful with dependency management, but it will still be painful.

@rogpeppe thanks for providing actual code! Note that that code can generate invalid package names. For example, given import path "commaok.xyz", it returns import name "commaok.xyz" ( https://play.golang.com/p/LLiPkqqsTXZ) but import names can't have dots in them. Maybe it's enough of a corner case that it's ok, but it is a bit awkward.

As I said before, I think trying to nail down exactly what the path -> name rule is is the next important step for this proposal. And see whether there are other heuristics we could use to get that 10% number down. And nail down corner cases like v8 and commaok.xyz.

codifying the set of rules that goimports uses to decide if the package name is derivable

Just to make sure we're 100% on the same page, when you say "derivable", you mean that given a package path there is a single, unique, correct package name (not a set of acceptable names)?

@myitcv
Copy link
Member

myitcv commented Jan 11, 2019

Just to be clear on the case of v8: #28435 (comment) and in the worst case #28435 (comment).

@logrusorgru
Copy link

logrusorgru commented Jan 13, 2019

Ok, let's say go.mod must have real package name. That's all. I don't really see any useful and necessary in other restrictions. Reasons for them are definitely imagined.

module json example.com/pkg/amazingjson

require (
    example.com/foo/bar v1.2.3
    example.com/foo/baz v4.0.0
)
module feature example.com/pkg/feature/v12
module feature example.com/pkg/feature.v12
module feature example.com/pkg/feature.git
module example.com/pkg/feature

Any tool or developer can look the file to get required information.


And, seems the require should follow this approach. Otherwise a developer can't get the information for imported packages.

require (
    bar example.com/foo/foobar v1.2.3
    baz example.com/foo/foobaz v4.0.0
)

@kaedys
Copy link

kaedys commented Jan 23, 2019

Have to throw in another "strongly opposed" on this one, due to the proposed prohibition on dot imports. While they can be overused, there are definite situations where they make code substantially cleaner, usually when using "expressive" type packages.

A good example is the popular BDD framework pair Ginkgo + Gomega. This:

Expect(retValue).To(Equal(tc.expected)

is substantially more readable and expressive than:

gomega.Expect(retValue).To(gomega.Equal(tc.expected)

And that's a fairly basic construction of Gomega matchers and assertions. I see no reason to ban this functionality just because some people might misuse dot imports in other circumstances.

@rogpeppe
Copy link
Contributor Author

gomega.Expect(retValue).gomega.To(gomega.Equal(tc.expected)

As someone that's used a not-dissimilar package (gopkg.in/check.v1) for a long time, I'd have to respectfully disagree. Originally, we imported that package to dot for similar reasons, but we moved to using a short alias (gc) instead. This turns out just fine. The extra three characters don't make it particularly less readable, and the code is more approachable for people that aren't familiar with the specific framework.

For example, I am not at all familiar with the gomega package. If I am trying to understand some test code that imports gomega to dot, I have to be aware of all the 64 exported symbols that the gomega package exports.

By the way, your code fragment above isn't quite right. You don't need to package-qualify the .To. I think you mean:

gomega.Expect(retValue).To(gomega.Equal(tc.expected)

That's actually not an insignificant thing - use of methods and fields is often much more common than use of top-level identifiers.

I'd suggest that using a short import alias could work OK for you; for example:

gm.Expect(retValue).To(gm.Equal(tc.expected)

or even:

M.Expect(retValue).To(M.Equal(tc.expected))

If I see that code, it is immediately clear to me that Expect and Equal come from somewhere external and where. I believe that property is worth a little more weight on the page.

@peterbourgon
Copy link

peterbourgon commented Feb 2, 2021

I strongly support this proposal, especially regarding removing dot imports. Whatever readability the feature may provide in specific circumstances is insignificant compared to the ambiguity it creates in the whole. My experience in training and support has also shown that it's one of the most frequently mis-used (i.e. over-used) features by new Go programmers.

@DavidGamba
Copy link

DavidGamba commented Feb 2, 2021

As a tool or sdk creator I create the same tool for multiple languages and differentiate them in different repos using a prefix (or maybe suffix). For example:

go-companysdk
python-companysdk
ruby-companysdk

I would expect my import to remain import "github.com/company/go-companysdk" for package companysdk.

@mitar
Copy link
Contributor

mitar commented Dec 2, 2021

Isn't this something a linter could solve? Is there really a need for language change here?

But yea, just recently I have made code with the following imports:

import (
	"github.com/deckarep/golang-set"
	"github.com/go-git/go-git/v5"
	"github.com/whilp/git-urls"
	"github.com/xanzy/go-gitlab"
	"github.com/xmidt-org/gokeepachangelog"
)

Good luck guessing how they are imported. But I think a linter which would require one to specify an alias in such cases, matching the package name, would solve most of the problems:

import (
	mapset "github.com/deckarep/golang-set"
	git "github.com/go-git/go-git/v5"
	giturls "github.com/whilp/git-urls"
	gitlab "github.com/xanzy/go-gitlab"
	changelog "github.com/xmidt-org/gokeepachangelog"
)

@mvdan
Copy link
Member

mvdan commented Dec 2, 2021

@mitar optional tools such as goimports have enforced this for some time. The problem is that, as long as the toolchain and language don't enforce predictable imports, then all the tools and libraries must support both forms. Hence the proposal to restrict this in the language itself.

@logrusorgru
Copy link

I strongly support this proposal, especially regarding removing dot imports. Whatever readability the feature may provide in specific circumstances is insignificant compared to the ambiguity it creates in the whole. My experience in training and support has also shown that it's one of the most frequently mis-used (i.e. over-used) features by new Go programmers.

Dot imports is about code style, not about the language.

@mitar
Copy link
Contributor

mitar commented Dec 14, 2021

optional tools such as goimports have enforced this for some time

I do not find any option to enforce this with goimports. Maybe some other tool?

@mvdan
Copy link
Member

mvdan commented Dec 14, 2021

@mitar it should; see #28428. If you're running the latest goimports from its master branch, and it doesn't work the way it should, I'd suggest filing a bug. The way gopls fixes imports should be the same, as it uses the same internal APIs from x/tools.

@mitar
Copy link
Contributor

mitar commented Dec 14, 2021

Oh, true. I didn't notice it working because it has some exceptions, so it didn't do anything for github.com/go-git/go-git/v5 which is what I was testing it on.

@ianlancetaylor ianlancetaylor changed the title proposal: Go 2: make imported symbols predictable proposal: spec: make imported symbols predictable Aug 6, 2024
@ianlancetaylor ianlancetaylor added LanguageChangeReview Discussed by language change review committee and removed v2 An incompatible library change NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Aug 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
LanguageChange Suggested changes to the Go language LanguageChangeReview Discussed by language change review committee Proposal
Projects
None yet
Development

No branches or pull requests