Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Intra Rustdoc Links #1946

Merged
merged 16 commits into from
Jul 25, 2017
Merged

Intra Rustdoc Links #1946

merged 16 commits into from
Jul 25, 2017

Conversation

killercup
Copy link
Member

@killercup killercup commented Mar 7, 2017

Add a notation how to create relative links in documentation comments (based on Rust item paths) and extend Rustdoc to automatically turn this into working links.

The important bits:

Additions to the documentation syntax

  1. [Iterator](std::iter::Iterator)
  2. [Iterator][iter], and somewhere else in the document: [iter]: std::iter::Iterator
  3. [Iterator] or [`Iterator`] with automatically generated link reference definition

All additions are valid CommonMark syntax.

Rendered

Revisions

March 9: Dropped <Iterator>-style links, added implied reference links (#1946 (comment))

May 28: Add more possible extensions, add more syntax for dealing with path ambiguities (#1946 (comment))

Add a notation how to create relative links in documentation comments
(based on Rust item paths) and extend Rustdoc to automatically turn this
into working links.
@GuillaumeGomez
Copy link
Member

Awesome! \o/

@fbstj
Copy link
Contributor

fbstj commented Mar 7, 2017

Bikeshed: would it be useful to have an additional <std::iter::Iterator> style syntax/marker that desugars to [Iterator](std::iter::Iterator) for convenience?

@killercup
Copy link
Member Author

killercup commented Mar 7, 2017

an additional <std::iter::Iterator> style syntax/marker that desugars to [Iterator](std::iter::Iterator)

Hm, I'm wary of adding more cases, and even more so to add magical new syntax. I'd prefer to go with explicit links if you want to set the link text to something other than the full path.

(But: If Iterator is in scope, <Iterator> will be a valid link!)

@sgrif
Copy link
Contributor

sgrif commented Mar 7, 2017

This will need a way to disambiguate between functions, traits, etc. It's completely valid to have a struct and function with the same name. Perhaps we could use URIs for this? function://path::to::fn, struct://path::to::struct, etc.

Also should this specify whether super works or not? Are the paths always relative to the crate root? Is there ever a case where it would matter whether the link is based on where the struct appears in the documentation vs where it was defined?

@kennytm
Copy link
Member

kennytm commented Mar 7, 2017

Previous discussions 🚲

@frewsxcv
Copy link
Member

frewsxcv commented Mar 7, 2017

Relevant rust-lang/rust issues: rust-lang/rust#36417, rust-lang/rust#38400

@kennytm
Copy link
Member

kennytm commented Mar 7, 2017

@sgrif It only needs to distinguish between values (functions, constants) and types (structs, enums, traits, modules, type-aliases, primitives). Maybe:

  • <syntax::ptr::P> — First search for type-like items. If not found, search for value-like items
  • <syntax::ptr::P()>Only search for functions.
  • <std::stringify!>Only search for macros.

@killercup
Copy link
Member Author

killercup commented Mar 7, 2017

This will need a way to disambiguate between functions, traits, etc.

Oh, I had completely missed that this is valid (damn you, Rust!). Is there are an existing way to disambiguate this in paths? Or is it always resolved by the position it appears in?

Playing around on the playpen a bit, it seems that (assuming all have the same name):

  • unit and tuple structs conflict with functions
  • unit and tuple structs conflict with enums
  • regular structs, enums, and traits conflict with each other (but not functions)

I think @kennytm idea with a () suffix to disambiguate functions (#1946 (comment)) works well. I propose it is only needed when the non-suffixed link is actually ambiguous. (I assume most code bases follow Rust's style lints and use snake_case functions names and PascalCase for type/trait names.)
Macros should alway use ! suffix.

Are the paths always relative to the crate root?

Good point, that's totally missing from the RFC. Path are relative to the position of item in whose documentation they appear. (See complex example for relative paths.)

Previous discussions

Thanks, @kennytm! Reading #1661 (comment), it only feels right to cc @JohnHeitmann.

@frewsxcv
Copy link
Member

frewsxcv commented Mar 7, 2017

There is talk about extending CommonMark to support syntactic extensions, but it looks like progress has stalled a bit.

If we're going to design our own hyperlink destinations (as proposed in this RFC and the comments above), I'd strongly recommend staying within the bounds of what is acceptable for a "link destination" as defined by CommonMark.

@frewsxcv frewsxcv added T-doc Relevant to the documentation team, which will review and decide on the RFC. T-dev-tools Relevant to the development tools team, which will review and decide on the RFC. labels Mar 7, 2017
@kennytm
Copy link
Member

kennytm commented Mar 7, 2017

How will cross-crate links be handled? Like,

  • linking from num-traits to <std::u32> (linking to standard library)
  • linking from num-complex to <num_traits::One> (crates in the same workspace)
  • linking from num-bigint to <serde::Serialize> (third-party crates, and only linked when a feature is defined)

BTW there is also this thread on talk.commonmark.org discussing the issue from the Markdown side, but it is pretty inconclusive (one suggestion to api:std.cmp.Ord and one to {std::cmp::Ord}).

@killercup
Copy link
Member Author

killercup commented Mar 7, 2017

How will cross-crate links be handled?

@kennytm, rustdoc is already able to generate links to other crates' types, and I would assume we can use this mechanism. For std (and core), it links to an official URL, e.g. https://doc.rust-lang.org/nightly/core/fmt/trait.Debug.html. For other external crates, it links to the docs it generated for the crate itself.

Update: Sorry, I totally missed

linking from num-bigint to serde::Serialize (third-party crates, and only linked when a feature is defined)

I'm not sure how to approach this best. I'd say we don't generate the link and issue a warning. Rendered docs should probably always contain all possible features. It will be difficult to do this when features are exclusive, i.e. you can't use both serde_json and json. This seems to be an unusual practice, though.

- Resolving paths
- Path ambiguities
- Linking to external crates
@killercup
Copy link
Member Author

Updated the RFC with sections on

  • Resolving paths
  • Path ambiguities
  • Linking to external crates

I don't think we need to use a potential Markdown extension for this.

If we're going to design our own hyperlink destinations (as proposed in this RFC and the comments above), I'd strongly recommend staying within the bounds of what is acceptable for a "link destination" as defined by CommonMark.
#1946 (comment)

I think Rust paths already comply with that. Even the () suffix should not be problem, as the fall into part (b) of:

[…] includes parentheses only if (a) they are backslash-escaped or (b) they are part of a balanced pair of unescaped parentheses […]
http://spec.commonmark.org/0.27/#link-destination

- Mention this is valid Markdown
- Add more alternatives
@killercup killercup force-pushed the intra-rustdoc-links branch from cacaa79 to 2dd1ed2 Compare March 7, 2017 14:29
@steveklabnik
Copy link
Member

/cc @jonathandturner who has been working on a rough rustdoc re-write

@durka
Copy link
Contributor

durka commented Mar 7, 2017

Good point, that's totally missing from the RFC. Path are relative to the position of item in whose documentation they appear. (See complex example for relative paths.)

This is a problem for docs that get cross-included by Deref.

@killercup
Copy link
Member Author

This is a problem for docs that get cross-included by Deref. (#1946 (comment))

Can you give an example? I found serde::bytes::ByteBuf which has a Deref impl, and the docs from trait Deref itself are rendered.

Including docs from traits should lead to the docs being evaluated relative to the trait item they are defined on, i.e. relative to core::ops::Deref, not serde::bytes::ByteBuf.

@nrc
Copy link
Member

nrc commented Mar 7, 2017

@killercup, @sgrif, @kennytm

This will need a way to disambiguate between functions, traits, etc. It's completely valid to have a struct and function with the same name. Perhaps we could use URIs for this? function://path::to::fn, struct://path::to::struct, etc.

It only needs to distinguish between values (functions, constants) and types (structs, enums, traits, modules, type-aliases, primitives)

Rust has three namespaces - values, types, and macros. Everything is in one of those three and you don't need any more namespaces than that (it is a bit more complicated because some things can be in more than one namespace, tuple structs, iirc can be used both as type names and as function names). There are already alternate URLs in Rustdoc that use this scheme, unfortunately via redirects (I would love to deprecate the crazy ones with trait etc in the name, but there were back compat worries), these are easy to synthesise, e.g., https://doc.rust-lang.org/nightly/std/path/Path.t.html for https://doc.rust-lang.org/nightly/std/path/struct.Path.html, so you don't need to know the random rustdoc categorisation of a thing, only its namespace.

@steveklabnik
Copy link
Member

Thanks @killercup ! Here's some thoughts....

  1. For me, the priority here is to make sure that we're within regular Markdown rules. It appears that we are. 👍
  2. A significant proposal from before that doesn't seem to be addressed here is a rust:// URI. Putting this in alternatives at least should happen; it's not clear to me that we shouldn't do this.
  3. In general, an RFC like this is largely split between two things: how do I define the links, and how does rustdoc actually figure them out. The former is definitely RFC-worthy for me, but I wonder if the latter can't be avoided as some sort of "you use the canonical path for the item linking to", and then free rustdoc to figure that out itself.
  4. In contradiction to point 4, "how does rustdoc figure this out" is a pretty significant pain point, and the hardest part of the RFC, IMHO.
  5. @jonathandturner and I (among others at various times) have wondered if rustdoc in the future shouldn't be built on top of RLS. If so, that split would mean "rustodc proper asks RLS what the URL is", I think.
  6. thoughts about migrating to this new style might be helpful, I wonder how hard it would be to create a tool to convert existing links like this.

That's all (lol, I feel like it's a lot) I've got at the moment, I'm sure I'll have more eventually...

@durka
Copy link
Contributor

durka commented Mar 7, 2017

@killercup

For example, there is a link from str::len to char, but this link is also present (and broken) on the String page because it is a relative link. There are open issues about this.

It's definitely solvable, but it could require "absolute" (i.e. crate-relative) paths in some places.

@nrc
Copy link
Member

nrc commented Mar 8, 2017

If so, that split would mean "rustodc proper asks RLS what the URL is", I think.

FWIW, the RLS already creates concrete URLs from paths like this so that when you have a reference to something from std, you can go to the rustdoc page. It might not work exactly how you need it to for this application though.

@kennytm
Copy link
Member

kennytm commented Mar 8, 2017

How should Rustdoc know that <P> is an HTML tag or a link to a P item?

@sophiajt
Copy link
Contributor

sophiajt commented Mar 8, 2017

That's great!

Thanks for the ping. I think for a "rustdoc 2.0" idea, we're still a little ways off, yet. Yeah, it will probably consume metadata rather than trying to work against the compiler directly, but there's still a fair bit of research we need to do to explore what we can do there. That said, I'm currently doing the rls stuff, and what we're doing there will help make a "rustdoc 2.0" possible.

I say go for it and keep improving this rustdoc. It just sets the bar that much higher for the next one (which is a good thing!)

@eddyb
Copy link
Member

eddyb commented Mar 8, 2017

@jonathandturner IMO, instead of trying to serialize piecewise, we could have sources for everything, which combined with warm incremental recompilation caches would allow introspecting entire crates.

Even if we don't get incremental reparsing, we could still skip parsing and expansion for, say, a libstd that was compiled once and distributed, so you get almost instant access to the HIR and type information.
To keep it slim, we could drop anything that's not exported (i.e. function bodies' MIR) from the caches.

@solson
Copy link
Member

solson commented Mar 8, 2017

  1. [Iterator][iter], and somewhere else in the document: [iter]: std::iter::Iterator

@killercup Markdown has another trick that would make this even sweeter at the use sites:

[Iterator], and somewhere else in the document: [Iterator]: std::iter::Iterator

@llogiq
Copy link
Contributor

llogiq commented Mar 8, 2017

The 'easy' alternative is to auto-link all items in scope, as in [String]. This is trivially supported by pulldown-cmark with extending the links map.

@rfcbot
Copy link
Collaborator

rfcbot commented Jul 13, 2017

🔔 This is now entering its final comment period, as per the review above. 🔔

@rfcbot rfcbot added final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. and removed proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. labels Jul 13, 2017
@rfcbot
Copy link
Collaborator

rfcbot commented Jul 23, 2017

The final comment period is now complete.

@nrc nrc merged commit 6af994d into rust-lang:master Jul 25, 2017
@killercup
Copy link
Member Author

Just came up:

`[Iterator]`

should probably not be rendered as a link inside a code block.

@Manishearth
Copy link
Member

We're proposing some amendments in #2285

bors added a commit to rust-lang/rust that referenced this pull request Jan 22, 2018
…Gomez,QuietMisdreavus,Manishearth

Implement RFC 1946 - intra-rustdoc links

rust-lang/rfcs#1946 #43466

Note for reviewers: The plain line counts are a little inflated because of how the markdown link parsing was done. [Read the file diff with "whitespace only" changes removed](https://github.com/rust-lang/rust/pull/47046/files?w=1) to get a better view of what actually changed there.

This pulls the name/path resolution mechanisms out of the compiler and runs it on the markdown in a crate's docs, so that links can be made to `SomeStruct` directly rather than finding the folder path to `struct.SomeStruct.html`. Check the `src/test/rustdoc/intra-paths.rs` test in this PR for a demo. The change was... a little invasive, but unlocks a really powerful mechanism for writing documentation that doesn't care about where an item was written to on the hard disk.

Items included:

 - [x] Make work with the hoedown renderer
 - [x] Handle relative paths
 - [x] Parse out the "path ambiguities" qualifiers (`[crate foo]`, `[struct Foo]`, `[foo()]`, `[static FOO]`, `[foo!]`, etc)
 - [x] Resolve foreign macros
 - [x] Resolve local macros
 - [x] Handle the use of inner/outer attributes giving different resolution scopes (handling for non-modules pushed to different PR)

Items not included:

 - [ ] Make sure cross-crate inlining works (blocked on refactor described in #47046 (comment))
 - [ ] Implied Shortcut Reference Links (where just doing `[::std::iter::Iterator][]` without a reference anchor will resolve using the reference name rather than the link target) (requires modifying the markdown parser - blocked on Hoedown/Pulldown switch and pulldown-cmark/pulldown-cmark#121)
 - [ ] Handle enum variants and UFCS methods (Enum variants link to the enum page, associated methods don't link at all)
 - [ ] Emit more warnings/errors when things fail to resolve (linking to a value-namespaced item without a qualifier will emit an error, otherwise the link is just treated as a url, not a rust path)
 - [ ] Give better spans for resolution errors (currently the span for the first doc comment is used)
 - [ ] Check for inner doc comments on things that aren't modules

I'm making the PR, but it should be noted that most of the work was done by Misdreavus 😄

(Editor's note: This has become a lie, check that commit log, Manish did a ton of work after this PR was opened `>_>`)
bors added a commit to rust-lang/rust that referenced this pull request Jan 23, 2018
…Gomez,QuietMisdreavus,Manishearth

Implement RFC 1946 - intra-rustdoc links

rust-lang/rfcs#1946 #43466

Note for reviewers: The plain line counts are a little inflated because of how the markdown link parsing was done. [Read the file diff with "whitespace only" changes removed](https://github.com/rust-lang/rust/pull/47046/files?w=1) to get a better view of what actually changed there.

This pulls the name/path resolution mechanisms out of the compiler and runs it on the markdown in a crate's docs, so that links can be made to `SomeStruct` directly rather than finding the folder path to `struct.SomeStruct.html`. Check the `src/test/rustdoc/intra-paths.rs` test in this PR for a demo. The change was... a little invasive, but unlocks a really powerful mechanism for writing documentation that doesn't care about where an item was written to on the hard disk.

Items included:

 - [x] Make work with the hoedown renderer
 - [x] Handle relative paths
 - [x] Parse out the "path ambiguities" qualifiers (`[crate foo]`, `[struct Foo]`, `[foo()]`, `[static FOO]`, `[foo!]`, etc)
 - [x] Resolve foreign macros
 - [x] Resolve local macros
 - [x] Handle the use of inner/outer attributes giving different resolution scopes (handling for non-modules pushed to different PR)

Items not included:

 - [ ] Make sure cross-crate inlining works (blocked on refactor described in #47046 (comment))
 - [ ] Implied Shortcut Reference Links (where just doing `[::std::iter::Iterator][]` without a reference anchor will resolve using the reference name rather than the link target) (requires modifying the markdown parser - blocked on Hoedown/Pulldown switch and pulldown-cmark/pulldown-cmark#121)
 - [ ] Handle enum variants and UFCS methods (Enum variants link to the enum page, associated methods don't link at all)
 - [ ] Emit more warnings/errors when things fail to resolve (linking to a value-namespaced item without a qualifier will emit an error, otherwise the link is just treated as a url, not a rust path)
 - [ ] Give better spans for resolution errors (currently the span for the first doc comment is used)
 - [ ] Check for inner doc comments on things that aren't modules

I'm making the PR, but it should be noted that most of the work was done by Misdreavus 😄

(Editor's note: This has become a lie, check that commit log, Manish did a ton of work after this PR was opened `>_>`)
@petrochenkov petrochenkov added the T-dev-tools Relevant to the development tools team, which will review and decide on the RFC. label Jan 30, 2018
@Centril Centril added the A-intra-links Proposals relating to intra rustdoc links. label Nov 23, 2018
bors bot added a commit to crossbeam-rs/crossbeam that referenced this pull request Sep 6, 2020
559: Use intra-doc links r=stjepang a=taiki-e

Switch to use [intra-doc links](rust-lang/rfcs#1946) in all crates. Previously there was a big bug on cross crate re-exports (rust-lang/rust#65983), but it has been fixed, so we can use this in crossbeam.

This also adds checks of the docs to CI.

560: Use collect::<Box<[_]>>() directly in ShardedLock::new() r=stjepang a=taiki-e



Co-authored-by: Taiki Endo <[email protected]>
khvzak added a commit to mlua-rs/mlua that referenced this pull request Oct 11, 2021
khvzak added a commit to mlua-rs/mlua that referenced this pull request Oct 11, 2021
khvzak added a commit to mlua-rs/mlua that referenced this pull request Oct 11, 2021
sdleffler pushed a commit to sdleffler/hv-lua that referenced this pull request Nov 13, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-intra-links Proposals relating to intra rustdoc links. final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. T-dev-tools Relevant to the development tools team, which will review and decide on the RFC. T-doc Relevant to the documentation team, which will review and decide on the RFC. T-rustdoc Relevant to rustdoc team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.