Improve man page formatting #1226

Open: wants to merge 28 commits into base: master

@g-branden-robinson commented Sep 14, 2024

This lengthy series of commits makes Crispy Doom's already good effort at partly machine-generated man(7) output even better (in my opinion).

The extensive quotations justifying most of the changes are not intended to hector anyone, even though many of them appear in multiple commits. I included them in each applicable case so that the reasons for the changes remain intelligible in the Git history if not all of these commits are accepted. (And even if they all are, git blame plus git show should be enough to illuminate matters for future contributors.)

I'm happy to try to address questions or concerns anyone may have.

Blank lines in *roff input files put excess space in the formatted
document.  Eliminate them where adjacent macro calls imply vertical
spacing, or replace them with appropriate calls (`PP` for an ordinary
paragraph, `IP` for an indented one) when separating paragraphs from
each other.

groff_man_style(7):
  ... Do not put blank (empty) lines in a man page source document.
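
As a minimal sketch of the replacement described above (the section text is invented for illustration and is not taken from the Crispy Doom templates), paragraphs are separated with macro calls rather than empty lines:

```roff
.SH DESCRIPTION
.\" No blank line after .SH; the macro already supplies vertical space.
This is the first paragraph.
.\" .PP starts an ordinary paragraph; it replaces a blank input line.
.PP
This is the second paragraph.
.\" .IP with no arguments starts an indented paragraph.
.IP
This paragraph is indented relative to the ones above.
```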
...to man pages when interpolating templated material.  Newlines already
exist at the end of each `t`.
groff_man_style(7):
     .TH identifier section [footer‐middle [footer‐inside [header‐
     middle]]]
...
            By convention, footer‐middle is the date of the most recent
            modification to the man page source document, ...
...with the project name.

groff_man_style(7):
     .TH identifier section [footer‐middle [footer‐inside [header‐
     middle]]]
...
            By convention, footer‐middle is the date of the most recent
            modification to the man page source document, and footer‐
            inside is the name ... of the project providing it.
...similarly to existing `-s` and `-z` options.

The `@VERSION@` replacement is not yet handled.
@fabiangreffrath
Owner

Thank you for your effort! However, I'm sorry to tell you that this is the wrong address for manpage improvements. The manpages are inherited mostly unmodified from Chocolate Doom, so in order for both projects to benefit from these changes, I'll have to ask you to propose them over there.

@g-branden-robinson
Author

> Thank you for your effort! However, I'm sorry to tell you that this is the wrong address for manpage improvements. The manpages are inherited mostly unmodified from Chocolate Doom, so in order for both projects to benefit from these changes, I'll have to ask you to propose them over there.

Oh, bummer. I've watched this project for quite a while but not Chocolate. Any tips or advice for contributing there?

@fabiangreffrath
Owner

Same as here, just open a pull request there. You have done everything right, just targeted at the "wrong" project.

...to embed the package version in man page templates that use
`@VERSION@`.  (None do, yet.)
...with the package version number.

groff_man_style(7):
     .TH identifier section [footer‐middle [footer‐inside [header‐
     middle]]]
...
            By convention, footer‐middle is the date of the most recent
            modification to the man page source document, and footer‐
            inside is the name and version or release of the project
            providing it.
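
Taken together, the `TH` changes in this series might yield a call like the sketch below. The identifier, section number, and date are placeholders assumed for illustration; only the `@VERSION@` placeholder comes from the commits themselves.

```roff
.\" Hypothetical values.  An argument containing spaces must be quoted.
.\" Argument order: identifier section footer-middle footer-inside
.TH doom 6 2024-09-14 "Crispy Doom @VERSION@"
```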
Paragraph macros already break any pending output line.  A `br` request,
absent anything else before the paragraphing macro call, is superfluous.
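
A small sketch of the simplification, using invented text:

```roff
Last line of the preceding text.
.\" No .br is needed here: .PP breaks the pending output line itself.
.PP
First line of the next paragraph.
```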
It looks better, and is more conventional, to have a paragraph break
between a list of copyright notices and license texts.

See, for example:

https://spdx.org/licenses/BSD-3-Clause.html
https://spdx.org/licenses/MIT.html
https://spdx.org/licenses/Artistic-2.0.html

Longer licenses, like the GNU GPL and Apache 2, tend to separate the
full texts of their licenses into separate files or sections of a
document, but where the copyright notices are placed, they too are
followed by a paragraph break.

See, for example, the "Standard License Header", the final sections of
both those licenses:

https://spdx.org/licenses/GPL-3.0-or-later.html
https://spdx.org/licenses/Apache-2.0.html

Finally, I observe that having the copyright and license notice in the
rendered text of a man page, while sometimes seen, is not common
practice.  Putting this information in *roff comments in the man(7)
source document suffices.  It is installed on
systems (if sometimes in compressed form), so there is no risk of the
notices getting separated from the text to which they apply--not by mere
neglect or accident.  Only deliberate action can efface the notices.
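
Both arrangements can be sketched as follows. The names, year, and license wording are hypothetical and stand in for whatever the real templates carry.

```roff
.\" Kept in comments, the notices travel with the source document
.\" but do not appear in the rendered page.
.\"
.\" Copyright (C) 2024 Example Author
.\"
.\" This program is free software; see the file COPYING for details.
.\"
.\" If the notices are rendered instead, a paragraph break separates
.\" the copyright notice from the license text.
.SH COPYRIGHT
Copyright \(co 2024 Example Author
.PP
This program is free software;
see the file COPYING for details.
```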
Use `EX` and `EE` macros to turn off filling, obviating the need for
repeated `br` requests.

I must admit at the outset that these macros are extensions to the
original man(7) dialect of Seventh Edition Unix in 1979.

groff_man_style(7):
     .EX
     .EE    Begin and end example.  After .EX, filling is disabled and a
            constant‐width (monospaced) font is selected.  Calling .EE
            enables filling and restores the previous font.

            Example regions are useful for formatting code, shell
            sessions, and text file contents.  An example region is not
            a “literal mode” of any sort: special character escape
            sequences must still be used to produce correct glyphs for
            ', -, \, ^, `, and ~ (see subsection “Portability” below).
            Sentence endings are still detected and additional inter‐
            sentence space applied.  If the amount of additional inter‐
            sentence spacing is altered, the rendering of, for instance,
            regular expressions using . or ? followed by multiple spaces
            can change.  Use the dummy character escape sequence \&
            before the spaces.

            .EX and .EE are extensions introduced in Ninth Edition Unix.
            Documenter’s Workbench, Heirloom Doctools, and Plan 9
            troffs, and mandoc (since 1.12.2) also support them.
            Solaris troff does not.

"Solaris troff" means the AT&T System V-descended troff program that
Solaris shipped up through version 10 of that operating system.  Solaris
11 adopted groff as its troff, and therefore handles these extensions
fine.  (Its groff is pretty old, but not older than these macros, which
date to 2009 [or 1986 in Bell Labs Research Unix].)
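
A minimal sketch of the change, with invented command lines:

```roff
.\" Before: each line of the example was forced with a break.
.\"   example\-command \-first\-option
.\"   .br
.\"   example\-command \-second\-option
.\" After: filling is off between .EX and .EE, so no .br is needed.
.EX
example\-command \-first\-option
example\-command \-second\-option
.EE
```

As the quoted text notes, an example region is not a literal mode, so the option dashes are still written as `\-`.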
Use man(7) font alternation macros instead of *roff font selection
escape sequences.  Stop shouting the "OPTIONS" metasyntactic variable
name; this is sometimes the custom of usage messages (where formatting
in multiple typefaces, even bold, is impractical), but not of man pages.

groff_man_style(7):
     Unlike the above font style macros, the font style alternation
     macros below set no input traps; they must be given arguments to
     have effect.  They apply italic corrections as appropriate.
...
     .RI roman‐text italic‐text ...
            Set each argument in roman and italics, alternately.

                   .RI ( tpic
                   was a fork of AT&T
                   .I pic
                   by Tim Morgan of the University of California at Irvine
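
Applied to a synopsis line, the change might look like this sketch (the command name is hypothetical):

```roff
.\" Before (font selection escape sequences, shouted variable name):
.\"   \fBexample\-command\fR [\fIOPTIONS\fR]
.\" After (font macros, lowercase variable name):
.B example\-command
.RI [ options ]
```

Here `.RI` sets the brackets in roman and `options` in italics, with no space between them.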
1.  Escape hyphens used as option dashes, for ease of copy and paste.[1]
2.  Use man(7) font alternation macros instead of *roff font selection
    escape sequences.  In my opinion, these look cleaner.
3.  Set file names in italics, not bold, as suggested in
    groff_man_style(7) and for consistency with "default.cfg.template"
    and "extra.cfg.template".

There are a few ways to force spacing between font alternation macro
arguments.  I selected `\~` because it is generally useful.  It is,
however, an extension.

     \~        Adjustable non‐breaking space.  Use this escape sequence
               to prevent a break inside a short phrase or between a
               numerical quantity and its corresponding unit(s).

```
                      Before starting the motor,
                      set the output speed to\~1.
                      There are 1,024\~bytes in 1\~KiB.
                      CSTR\~#8 documents the B\~language.
```

               \~ is a GNU extension also supported by Heirloom Doctools
               troff 050915 (September 2005), mandoc 1.9.14
               (2009‐11‐16), neatroff (commit 1c6ab0f6e, 2016‐09‐13),
               and Plan 9 from User Space troff (commit 93f8143600,
               2022‐08‐12), but not by Solaris or Documenter’s Workbench
               troffs.

[1] brouhaha alert: https://lwn.net/Articles/947941/
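
A sketch of items 1 and 2 combined, using a hypothetical option and parameter name:

```roff
.\" \- produces an option dash that survives copy and paste.
.\" \~ keeps a non-breaking space between the bold option and its
.\" italic argument; alternation macros otherwise join their
.\" arguments with no space at all.
.BI \-example\-option\~ file
.\" An alternative is to quote the argument with a trailing space:
.\"   .BI "\-example\-option " file
```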
...when introducing them in the lead paragraph of the "DESCRIPTION"
section.  MLA style is to italicize such a thing, because it is "a work
that is complete in and of itself, like a book or movie or painting".

https://universitypark-lonestar.libanswers.com/faq/278770

This is the only instance where such styling seems important.  The game
engines and artistic assets are severable (and severed in practice with
strongly differentiated licensing) and dealt with independently by
source port project developers and users.
1.  Set file names in italics, not bold, as suggested in
    groff_man_style(7) and for consistency with "default.cfg.template"
    and "extra.cfg.template".
2.  Break input lines at sentence endings.

groff_man_style(7):
            Use italics for file and path names, for environment
            variables, for C data types, for enumeration or preprocessor
            constants in C, for variant (user‐replaceable) portions of
            syntax synopses, for the first occurrence (only) of a
            technical concept being introduced, for names of journals
            and of literary works longer than an article, and anywhere a
            parameter requiring replacement by the user is encountered.
            ...

roff(7):
     A roff formatter attempts to detect boundaries between sentences,
     and supplies additional inter‐sentence space between them.  It
     flags certain characters (normally “!”, “?”, and “.”) as
     potentially ending a sentence.  When the formatter encounters one
     of these end‐of‐sentence characters at the end of an input line, or
     one of them is followed by two (unescaped) spaces on the same input
     line, it appends an inter‐word space followed by an inter‐sentence
     space in the output. ...

Input conventions
     Since troff fills text automatically, it is common practice in the
     roff language to avoid visual composition of text in input files:
     the esthetic appeal of the formatted output is what matters.
     Therefore, roff input should be arranged such that it is easy for
     authors and maintainers to compose and develop the document,
     understand the syntax of roff requests, macro calls, and
     preprocessor languages used, and predict the behavior of the
     formatter.  Several traditions have accrued in service of these
     goals.

     •  Follow sentence endings in the input with newlines to ease their
        recognition.  It is frequently convenient to end text lines
        after colons and semicolons as well, as these typically precede
        independent clauses.  Consider doing so after commas; they often
        occur in lists that become easy to scan when itemized by line,
        or constitute supplements to the sentence that are added,
        deleted, or updated to clarify it.  Parenthetical and quoted
        phrases are also good candidates for placement on text lines by
        themselves.
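
Both conventions can be sketched together. The wording is invented, and the file names are used only as illustrations:

```roff
.\" Each sentence starts on its own input line,
.\" and each file name is set in italics with .IR.
The main configuration is stored in
.IR default.cfg .
Additional settings are stored in
.IR extra.cfg .
```

The trailing period is passed as a second argument to `.IR` so that it is set in roman, directly after the italic file name.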
1.  Favor man(7) font selection macros over roff(7) font selection
    escape sequences.
2.  Use the groff man(7) extension `TQ` to stack multiple paragraph tags
    rather than comma-separating them (which can get tedious if style
    changes are required within tags).
3.  Use quotation instead of boldface to cite other sections of the same
    man page.
4.  Use quotation _and_ boldface when presenting value literals for
    environment variables.  The use of both is for clarity when font
    styling is lost (as when quoting man pages in emails--or Git commit
    messages), and to indicate unambiguously to the reader portions of
    the page that they might wish to copy and paste to a command line or
    shell script.

groff_man_style(7):
               As long as at most two styles are needed in a word, style
               macros like .B and .BI usually result in more readable
               roff source than \f escape sequences do.
...
     .TQ    Set an additional tag for a paragraph tagged with .TP,
            planting a one‐line input trap as with .TP.

            .TQ is a GNU extension supported by Heirloom Doctools troff
            and mandoc (since 1.14.5) but not by Documenter’s Workbench,
            Plan 9, or Solaris troffs.  ...
...
     Be frugal with italics for emphasis, and particularly with bold.
     Article titles and brief runs of literal text, such as references
     to individual characters or short strings, including section and
     subsection headings of man pages, are suitable objects for
     quotation; see the \(lq, \(rq, \(oq, and \(cq escape sequences in
     subsection “Portability” below.
...
     \(lq
     \(rq   Left and right double quotation marks.  Use these for paired
            directional double quotes, “like this”.
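
The following sketch illustrates items 2 through 4 with hypothetical tags and values; none of these option names are taken from the actual pages.

```roff
.\" .TQ stacks a second tag on the paragraph instead of writing
.\" "\fB\-a\fR, \fB\-\-alpha\fR" on one line.
.TP
.B \-a
.TQ
.B \-\-alpha
Enable the example feature;
see \(lqENVIRONMENT\(rq below.
.\" A value literal gets both quotation marks and boldface.
Set the variable to
.RB \(lq 1 \(rq
to turn the feature on.
```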
Break input lines at sentence endings.

roff(7):
     A roff formatter attempts to detect boundaries between sentences,
     and supplies additional inter‐sentence space between them.  It
     flags certain characters (normally “!”, “?”, and “.”) as
     potentially ending a sentence.  When the formatter encounters one
     of these end‐of‐sentence characters at the end of an input line, or
     one of them is followed by two (unescaped) spaces on the same input
     line, it appends an inter‐word space followed by an inter‐sentence
     space in the output. ...

Input conventions
     Since troff fills text automatically, it is common practice in the
     roff language to avoid visual composition of text in input files:
     the esthetic appeal of the formatted output is what matters.
     Therefore, roff input should be arranged such that it is easy for
     authors and maintainers to compose and develop the document,
     understand the syntax of roff requests, macro calls, and
     preprocessor languages used, and predict the behavior of the
     formatter.  Several traditions have accrued in service of these
     goals.

     •  Follow sentence endings in the input with newlines to ease their
        recognition.  It is frequently convenient to end text lines
        after colons and semicolons as well, as these typically precede
        independent clauses.  Consider doing so after commas; they often
        occur in lists that become easy to scan when itemized by line,
        or constitute supplements to the sentence that are added,
        deleted, or updated to clarify it.  Parenthetical and quoted
        phrases are also good candidates for placement on text lines by
        themselves.
1.  Break input lines at sentence endings.
2.  Set file names in italics, not bold, as suggested in
    groff_man_style(7) and for consistency with "default.cfg.template"
    and "extra.cfg.template".
3.  Use the `\~\c` pair of escape sequences to continue a paragraph tag
    with a word space over multiple macro calls.
4.  Set "Current working directory" paragraph tag in roman since it is
    neither a literal (so would be bold) nor a parameter (so would be
    italic).
5.  Set environment variable names in italics, not bold.
6.  Use `\%` escape sequence to protect lengthy literals from
    hyphenation.
7.  Use typographer's quotation marks (`\(lq` and `\(rq` escape
    sequences) to quote multi-word literals, in addition to setting them
    in bold.  These special characters degrade gracefully to the `"`
    character on output devices (like terminals limited to the US-ASCII
    or ISO Latin-1 character encodings) that don't support them.

roff(7):
     A roff formatter attempts to detect boundaries between sentences,
     and supplies additional inter‐sentence space between them.  It
     flags certain characters (normally “!”, “?”, and “.”) as
     potentially ending a sentence.  When the formatter encounters one
     of these end‐of‐sentence characters at the end of an input line, or
     one of them is followed by two (unescaped) spaces on the same input
     line, it appends an inter‐word space followed by an inter‐sentence
     space in the output. ...

Input conventions
     Since troff fills text automatically, it is common practice in the
     roff language to avoid visual composition of text in input files:
     the esthetic appeal of the formatted output is what matters.
     Therefore, roff input should be arranged such that it is easy for
     authors and maintainers to compose and develop the document,
     understand the syntax of roff requests, macro calls, and
     preprocessor languages used, and predict the behavior of the
     formatter.  Several traditions have accrued in service of these
     goals.

     •  Follow sentence endings in the input with newlines to ease their
        recognition.  It is frequently convenient to end text lines
        after colons and semicolons as well, as these typically precede
        independent clauses.  Consider doing so after commas; they often
        occur in lists that become easy to scan when itemized by line,
        or constitute supplements to the sentence that are added,
        deleted, or updated to clarify it.  Parenthetical and quoted
        phrases are also good candidates for placement on text lines by
        themselves.

groff_man_style(7):
            Use italics for file and path names, for environment
            variables, for C data types, for enumeration or preprocessor
            constants in C, for variant (user‐replaceable) portions of
            syntax synopses, for the first occurrence (only) of a
            technical concept being introduced, for names of journals
            and of literary works longer than an article, and anywhere a
            parameter requiring replacement by the user is encountered.
            ...
...
     \%        Control hyphenation.  The location of this escape
               sequence within a word marks a hyphenation point,
               supplementing groff’s automatic hyphenation patterns.  At
               the beginning of a word, it suppresses any hyphenation
               breaks within except those specified with \%.
...
     \(lq
     \(rq   Left and right double quotation marks.  Use these for paired
            directional double quotes, “like this”.
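
Items 3 and 6 can be sketched as below. The tag, parameter, and path are hypothetical, and it is worth checking how a given formatter treats `\c` inside a `.TP` tag before relying on it.

```roff
.\" \~ supplies the word space and \c continues the tag onto the
.\" next macro call, so the tag can mix bold and italic.
.TP
.B \-example\~\c
.I value
Description of the tagged item.
.\" A leading \% keeps a long literal from being hyphenated.
The default location is
.IR \%/usr/local/share/games/doom .
```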
1.  Set program executable names in italics, not bold.
2.  Break input lines at sentence endings.
3.  Protect a lengthy literal from hyphenation.

groff_man_style(7):
            Use italics for file and path names, for environment
            variables, for C data types, for enumeration or preprocessor
            constants in C, for variant (user‐replaceable) portions of
            syntax synopses, for the first occurrence (only) of a
            technical concept being introduced, for names of journals
            and of literary works longer than an article, and anywhere a
            parameter requiring replacement by the user is encountered.
            ...

roff(7):
     A roff formatter attempts to detect boundaries between sentences,
     and supplies additional inter‐sentence space between them.  It
     flags certain characters (normally “!”, “?”, and “.”) as
     potentially ending a sentence.  When the formatter encounters one
     of these end‐of‐sentence characters at the end of an input line, or
     one of them is followed by two (unescaped) spaces on the same input
     line, it appends an inter‐word space followed by an inter‐sentence
     space in the output. ...

Input conventions
     Since troff fills text automatically, it is common practice in the
     roff language to avoid visual composition of text in input files:
     the esthetic appeal of the formatted output is what matters.
     Therefore, roff input should be arranged such that it is easy for
     authors and maintainers to compose and develop the document,
     understand the syntax of roff requests, macro calls, and
     preprocessor languages used, and predict the behavior of the
     formatter.  Several traditions have accrued in service of these
     goals.

     •  Follow sentence endings in the input with newlines to ease their
        recognition.  It is frequently convenient to end text lines
        after colons and semicolons as well, as these typically precede
        independent clauses.  Consider doing so after commas; they often
        occur in lists that become easy to scan when itemized by line,
        or constitute supplements to the sentence that are added,
        deleted, or updated to clarify it.  Parenthetical and quoted
        phrases are also good candidates for placement on text lines by
        themselves.

groff_man_style(7):
     \%        Control hyphenation.  The location of this escape
               sequence within a word marks a hyphenation point,
               supplementing groff’s automatic hyphenation patterns.  At
               the beginning of a word, it suppresses any hyphenation
               breaks within except those specified with \%.
Use the adverbial, not adjectival, form of a word where appropriate.
1.  Favor man(7) font selection macros over roff(7) font selection
    escape sequences.
2.  Break input lines at sentence endings.

groff_man_style(7):
               As long as at most two styles are needed in a word, style
               macros like .B and .BI usually result in more readable
               roff source than \f escape sequences do.

roff(7):
     A roff formatter attempts to detect boundaries between sentences,
     and supplies additional inter‐sentence space between them.  It
     flags certain characters (normally “!”, “?”, and “.”) as
     potentially ending a sentence.  When the formatter encounters one
     of these end‐of‐sentence characters at the end of an input line, or
     one of them is followed by two (unescaped) spaces on the same input
     line, it appends an inter‐word space followed by an inter‐sentence
     space in the output. ...

Input conventions
     Since troff fills text automatically, it is common practice in the
     roff language to avoid visual composition of text in input files:
     the esthetic appeal of the formatted output is what matters.
     Therefore, roff input should be arranged such that it is easy for
     authors and maintainers to compose and develop the document,
     understand the syntax of roff requests, macro calls, and
     preprocessor languages used, and predict the behavior of the
     formatter.  Several traditions have accrued in service of these
     goals.

     •  Follow sentence endings in the input with newlines to ease their
        recognition.  It is frequently convenient to end text lines
        after colons and semicolons as well, as these typically precede
        independent clauses.  Consider doing so after commas; they often
        occur in lists that become easy to scan when itemized by line,
        or constitute supplements to the sentence that are added,
        deleted, or updated to clarify it.  Parenthetical and quoted
        phrases are also good candidates for placement on text lines by
        themselves.
Break input lines at sentence endings.

roff(7):
     A roff formatter attempts to detect boundaries between sentences,
     and supplies additional inter‐sentence space between them.  It
     flags certain characters (normally “!”, “?”, and “.”) as
     potentially ending a sentence.  When the formatter encounters one
     of these end‐of‐sentence characters at the end of an input line, or
     one of them is followed by two (unescaped) spaces on the same input
     line, it appends an inter‐word space followed by an inter‐sentence
     space in the output. ...

Input conventions
     Since troff fills text automatically, it is common practice in the
     roff language to avoid visual composition of text in input files:
     the esthetic appeal of the formatted output is what matters.
     Therefore, roff input should be arranged such that it is easy for
     authors and maintainers to compose and develop the document,
     understand the syntax of roff requests, macro calls, and
     preprocessor languages used, and predict the behavior of the
     formatter.  Several traditions have accrued in service of these
     goals.

     •  Follow sentence endings in the input with newlines to ease their
        recognition.  It is frequently convenient to end text lines
        after colons and semicolons as well, as these typically precede
        independent clauses.  Consider doing so after commas; they often
        occur in lists that become easy to scan when itemized by line,
        or constitute supplements to the sentence that are added,
        deleted, or updated to clarify it.  Parenthetical and quoted
        phrases are also good candidates for placement on text lines by
        themselves.
1.  Use quotation to cite other sections of the same man page.
2.  Break input lines at sentence endings.
3.  Set "Doom" in titlecase, not full capitals.  On the Crispy master
    branch, before my changes, the former is preponderant in
    documentation.
        $ git grep -wc DOOM man
        man/INSTALL.template:9 (all of which are `#if` or `#ifdef`)
        man/strife.template:2
        $ git grep -wc Doom man
        man/INSTALL.template:16
        man/Makefile.am:2
        man/bash-completion/doom.template.in:1
        man/docgen:4
        man/doom.template:5
        man/heretic.template:1
        man/hexen.template:1
        man/iwad_paths.man:5
        man/server.template:1
        man/setup.template:1
        man/strife.template:5
        man/wikipages:1
4.  Use man(7) font style macros, not asterisks, for typographical
    emphasis.  Man pages and Markdown employ different conventions.
5.  Set the C standard library symbol "NULL" in italics, not roman.

groff_man_style(7):
     Article titles and brief runs of literal text, such as references
     to individual characters or short strings, including section and
     subsection headings of man pages, are suitable objects for
     quotation; see the \(lq, \(rq, \(oq, and \(cq escape sequences in
     subsection “Portability” below.
...
     \(lq
     \(rq   Left and right double quotation marks.  Use these for paired
            directional double quotes, “like this”.

roff(7):
     A roff formatter attempts to detect boundaries between sentences,
     and supplies additional inter‐sentence space between them.  It
     flags certain characters (normally “!”, “?”, and “.”) as
     potentially ending a sentence.  When the formatter encounters one
     of these end‐of‐sentence characters at the end of an input line, or
     one of them is followed by two (unescaped) spaces on the same input
     line, it appends an inter‐word space followed by an inter‐sentence
     space in the output. ...

Input conventions
     Since troff fills text automatically, it is common practice in the
     roff language to avoid visual composition of text in input files:
     the esthetic appeal of the formatted output is what matters.
     Therefore, roff input should be arranged such that it is easy for
     authors and maintainers to compose and develop the document,
     understand the syntax of roff requests, macro calls, and
     preprocessor languages used, and predict the behavior of the
     formatter.  Several traditions have accrued in service of these
     goals.

     •  Follow sentence endings in the input with newlines to ease their
        recognition.  It is frequently convenient to end text lines
        after colons and semicolons as well, as these typically precede
        independent clauses.  Consider doing so after commas; they often
        occur in lists that become easy to scan when itemized by line,
        or constitute supplements to the sentence that are added,
        deleted, or updated to clarify it.  Parenthetical and quoted
        phrases are also good candidates for placement on text lines by
        themselves.
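
Items 4 and 5 might look like this in the source; the sentences are invented for illustration.

```roff
.\" Markdown-style "*not*" would print the asterisks literally.
This setting does
.I not
affect saved games.
.\" NULL is set in italics, like other C constants.
The pointer may be
.IR NULL .
```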
Use two spaces between sentences in C comments destined for inlining
into generated man(7) documents.  This way the formatter can tell where
the sentence endings are, which affects how much space it puts between
them.  (This amount of space is configurable at formatting time by the
readers of man pages, if man(7) authors accommodate them.  Some people
have strong preferences here.)

/usr/share/groff/site-tmac/man.local
       Put site‐local changes and customizations into this file.

              .\" Put only one space after the end of a sentence.
              .ss 12 0 \" See groff(7).
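
As a sketch of why the spacing matters once a comment has been inlined into the page (the text is hypothetical), compare the two input lines below. Per the roff(7) rule on end-of-sentence characters, only the first is detected as containing a sentence boundary mid-line.

```roff
.\" Two spaces after the period mark a sentence ending mid-line.
Play back the named demo.  The file must already exist.
.\" With one space, the formatter sees an ordinary word space.
Play back the named demo. The file must already exist.
```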
Add periods to ends of sentences.
1.  Use hyphenated phrases for metasyntactic variable names; these are
    not C identifiers and do not have to follow the same rules.
2.  Favor "file" over "filename"; the former is already used with
    greater consistency.
3.  Use "W" and "H" to refer to width and height measurements rather
    than "x" and "y", for clarity and distinction from other geometric
    parameters.
...to handle command-line options with complex argument structures
(where "complex" means "anything but the most trivial case").  This
causes them to format idiomatically, with parameters in italics,
literals in bold, and "synopsis language", like brackets denoting
optional parameters, in roman.

This approach is necessary for idiomatic man page rendering because in
the language interpreted by this script, there is no way to mark up such
distinctions in code comments.  Such a mini-language would have to be
designed, and that way lies perlpod(1) and similar efforts.
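
A sketch of the kind of output this aims for follows. The option names are hypothetical and the exact markup the script emits may differ.

```roff
.\" Literal "x" in bold; parameters W and H in italics.
.TP
.BI \-example\-size\~ W x H
Set the width and height.
.\" Optional parameter: option in bold, brackets in roman,
.\" parameter name in italics.
.TP
.B \-example\-load\~\c
.RI [ file ]
Load the given file, if any.
```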
2 participants