Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[[nodiscard]] advice is largely absent #2109

Open
seppeon opened this issue Jul 4, 2023 · 12 comments
Open

[[nodiscard]] advice is largely absent #2109

seppeon opened this issue Jul 4, 2023 · 12 comments

Comments

@seppeon
Copy link

seppeon commented Jul 4, 2023

I noticed today that [[nodiscard]] isn't present in this set of guidelines (apart from not ignoring results of functions declared nodiscard and not using void casts to ignore). I think there are some clear cases where guidance would help people:

  1. When a stateless function has a non-void return value and no output parameters.
  2. With functions that return error codes, especially the new std::expected and almost always std::optional's.

Some broad guidance may also be useful, such as:

  • "When it doesn't make sense for the caller of a function to discard the result, declare it [[nodiscard]]"
@Eisenwave
Copy link
Contributor

The problem is that even though it is "more correct" to make such functions [[nodiscard]], it's also very verbose. It's the same reason why recommending const on function parameters is also not a great idea.

The C++ standard only uses [[nodiscard]] when it resolves some ambiguity, e.g.:

  • std::vector::empty could empty the vector
  • std::vector::empty could check whether the vector is empty

I feel like the guideline should use this much weaker requirement:

When adding [[nodiscard]] disambiguates what a function does, or discarding the result is a safety issue, add [[nodiscard]].

@jwakely
Copy link
Contributor

jwakely commented Jul 7, 2023

The C++ standard only uses [[nodiscard]] when it resolves some ambiguity

That's not because the standard thinks it shouldn't be used elsewhere. There have been several proposals to add it to loads of functions in the standard library, but I (and others) argued strongly against that, because:

  • Specifying it in the standard has no normative effect, implementations are allowed to ignore the attribute, and even if they don't ignore it, they're not required to issue a diagnostic.
  • Specifying it in a few places in the standard could be interpreted as saying it shouldn't be used elsewhere. That could result in users reporting bogus bugs about nodiscard warnings from other functions.
  • Specifying it in the standard would cause churn for implementation vendors who do want to add it where the standard "requires", to avoid complaints about being non-conforming.
  • Reviewing and approving those proposals would have eaten up hundreds of person hours in the committee, which should be spent on far more productive work.
  • It can (and IMHO should) be left to implementation vendors to just Do The Right Thing.

And the Right Thing is to use it widely, everywhere that it makes sense.

IMHO compilers should warn about discarded values for all comparison operators, all begin and end accessors, nearly all const member functions, etc. etc. because the chances that a == b; as an entire statement is a typo for = is much higher than the changes that the developer really wanted to compare two things just for the thrill of it. Compilers already warn about this for fundamental types, so it makes sense to do so for strings, containers etc. too. But unfortunately they're not smart enough to do it automatically, so it is useful to add [[nodiscard]] to them.* And std::lib implementations do that, without needing the standard to tell them to.

tl;dr do not assume from the absence of [[nodiscard]] attributes in the standard specification that the committee thinks it should be used sparingly. The opposite is true.

* I wish compilers were smart enough to warn about it without being told to, e.g. equality comparisons for string_view objects are obviously pure functions with no side effects, so the compiler should be able to warn if the result is ignored. A sufficiently smart compiler would warn, and so explicit [[nodiscard]] attributes on the function would be redundant, which is another reason not to require them in the standard. Implementations should be encouraged to warn about useless statements with no side effects, by whatever means necessary. It doesn't have to be done using [[nodiscard]].

@Eisenwave
Copy link
Contributor

And the Right Thing is to use it widely, everywhere that it makes sense.

The question is whether this makes sense as a guideline though. Afaik, the guidelines also don't recommend putting top-level const on function parameters, and it creates extremely verbose code when you do that. In my opinion, it creates too much clutter to spam a million [[nodiscard]]s into a code base. If someone wants to do it anyway because they have a safety-critical program, where an accidental discard bug can cause significant damage, then they can still make this part of their style guide. This extends beyond the core guidelines though.

Do we really want the core guidelines to recommend code which looks like this:

template <integral Int>
[[nodiscard]] // let's use nodiscard; after all, it's more correct
// don't forget the constexpr; after all, it can be constexpr
// let's make all the parameters const too; after all, it's more correct
constexpr Int max(const Int paramA, const Int paramB, const Int paramC)
// while we're at it, we also need a conditional noexcept specification
// this code is just oozing with correctness now!
    noexcept(noexcept(max(paramA, max(paramB, paramC))))
{
    return max(paramA, max(paramB, paramC));
}

I know this is a bit of slippery slope fallacy, but you get the point: we pay a price in readability when we add all of these features that make our code "more correct".

The responsibility should really be on compilers to detect when a function contains no side effects, and mark those functions as implicitly [[nodiscard]], as long as a definition is visible. Forcing users to spray this all over their codebase is not a good solution imo.

@jwakely
Copy link
Contributor

jwakely commented Jul 7, 2023

Yes, I agree. What's important is to get warnings about things that are probably typos or logic errors. Decorating functions with [[nodiscard]] is just the best way we currently have to do that in the general case. If compilers were smarter, we would rarely need to use [[nodiscard]].

@quicknir
Copy link

quicknir commented Jul 7, 2023

I agree with @Eisenwave , fwiw.

To respond to @jwakely 's example, yes, of course this can catch real errors. But:

  1. The number of real bugs that this would have caught for me is very small (just offering my personal experience as a data point).
  2. Most errors it would catch (not all) are very "obvious" kinds of things that would be caught pretty quickly anyway. So, both the frequency is low, and the time saved per catch is pretty low.

The == example is a good one mostly because it's just one character away from =, the same I suppose for things like += where you accidentally write +. I still don't think I've made this mistake frequently or close to it, but at least it's more possible. But I really doubt I've ever written begin() and thrown away the result. Taking all this together, I think this is adding a lot of verbosity to save maybe 5 minutes per developer per year, and isn't a good trade.

At the very least hopefully we can agree that putting nodiscard on a huge percentage of function is at least not a consensus best practice or close to it; I think the core guidelines should be conservative here to start and only recommend it in a narrower set of circumstances where it is widely agreed. And in particular, the best usage of nodiscard in my view is actually on types. You can put nodiscard on a type that carries an error, and this gives you significant real value, and avoids verbosity (it's a real pity std::expected isn't, or seemingly cannot, be mandated nodiscard).

Beyond that, I'd mostly put nodiscard on functions that return error types, or values that need to be checked/compared against (e.g. C style functions returning integers for number of bytes written), and functions that are misleadingly named (like empty).

Edit: looking back at this; I think I'd support the guideline also suggesting it for == and any arithmetic operator with a compound version, for the reasons discussed. But I still wouldn't suggest going extremely broad with it.

@jwakely
Copy link
Contributor

jwakely commented Jul 7, 2023

I see you edited your comment to remove:

  1. Compilers already do catch many of the most obvious cases. For example, the x == y; typo instead of x = y;. This is already warned for (where x, y could be either cosnt string&, or string_view, in both gcc and clang), and without even turning on any extra warnings.

Do you know why GCC warns about it?

Because I added [[nodiscard]] to the relevant operator== overloads!

And I had to do that because otherwise GCC doesn't warn for user-defined comparison operators (Clang does though). Which is precisely the point I've made above. Not all compilers are smart enough to do it without the attribute.

I'm not arguing that the guidelines should say to use nodiscard, I was originally just commenting on the claim that "The C++ standard only uses [[nodiscard]] when it resolves some ambiguity". The few places that use it were added before the current policy of not bothering to specify it was decided on. The Core Guidelines are for writing C++, not writing C++ specifications, so the standard is not a good example to follow here. Real C++ stdlib implementations do use nodiscard widely.

@jwakely
Copy link
Contributor

jwakely commented Jul 7, 2023

Taking all this together, I think this is adding a lot of verbosity to save maybe 5 minutes per developer per year, and isn't a good trade.

That depends. A library with a million users might save 5 million minutes per year, which seems more useful.

@quicknir
Copy link

quicknir commented Jul 7, 2023

Yes, I went back and double checked the reason why it did that, and saw that was in the source. I was just surprised because tbh this seemed like a good warning to have and one that's reasonable to implement. This is a fair point of course.

The Core Guidelines are for writing C++, not writing C++ specifications, so the standard is not a good example to follow here. Real C++ stdlib implementations do use nodiscard widely.

I agree that the specification isn't a good example. I also think the stdlib is barely a better example; 99% of C++ developers aren't writing code that will have millions of users, and whose source code is read so rarely because you have absolutely pristine documentation on cppreference.

That depends. A library with a million users might save 5 million minutes per year, which seems more useful.

FWIW when I'm thinking of the verbosity costs, I'm largely thinking about reading rather than writing, e.g. like in Eisenwave's example of what happens when you "max out" correctness. And that costs scales with readers of course too, so the number of users doesn't really affect the calculus very much.

I think the comparison between nodiscrard, and const-ing values in the definition is pretty apt. In both cases it's strictly more correct when applicable, yes, but it's also a very small benefit, so small that I think there's definitely room to say that it doesn't outweigh the verbosity cost (which is low as well, of course).

Perhaps we could try to break down a few specific "cases" where nodiscard could be applied, label them (1-N or A, B, etc) so that we're all talking about the same thing, and see how folks feel about each one.

@MikeGitb
Copy link

MikeGitb commented Jul 7, 2023

I think the comparison between nodiscrard, and const-ing values in the definition is pretty apt.

Not really. Const parameters are only relevant/helpful for the function implementation. [[nodiscard]] Is relevant for every single call site of that function.

Also, just saying: #of readers != #of users, so I'd challenge the claim that the # of users doesn't make a lot of difference.

Personally I started to put nodiscard on every sideeffect-free function that belongs to the API surface of a library and sometimes even on internal functions. That said "I like/dislike", "it helps/doesn't help me" are imho usually very poor arguments for why something should or should not be put into a core guideline. Just because something is useful/not useful for me/my team doesn't mean the same is true for the majority of projects out there.

@quicknir
Copy link

quicknir commented Jul 7, 2023

Yes, you make good points. Without getting further into the details though, I agree with your last comment, which I think is aligned with what I said earlier about it not being a clear cut, consensus best practice. I'd be against nodiscard on every single side-effect free function in the core guidelines; so far nobody other than OP has said they're actually in favor.

To make things a bit more concrete; categories of functions that we could consider suggesting nodiscard on:

  1. Functions returning errors (preferably, nodiscard on the error type, but that may not be possible). Otherwise, you can easily forget to check for errors which is rarely correct.
  2. Functions where the output is required to correctly interpret side effects of the function. For example, for read, even aside from the possibility of an error, you must know how many bytes were read. It's very easy to just assume the maximum number of bytes were read otherwise. Another example is std::partition; the mutations performed can't be easily (cheaply) interpreted without the return.
  3. == and operators that have a compound form; these are just a tiny typo away from a mutating operation, and is plausible to write by accident when a mutating function was desired.
  4. Functions that are especially prone to being mistaken as mutators when they are not, and having their result dropped. std::vector::empty is the canonical example here.
  5. All functions that are side-effect free.

I would say I'm presently in favor of 1-2, perhaps 3 and 4 also. For 4, in most cases just renaming your function is better, but if you have a poorly named function and can't tolerate an API break perhaps that's your only option.

@seppeon
Copy link
Author

seppeon commented Jul 8, 2023

  1. Since similar discussions have happened before here is a relevant paper:

  2. Since some tooling does relevant checks, this page shows the rules applied to them:

  3. Just like const, [[nodiscard]] is an expression of intent. It expresses when it does and does not make sense to ignore some return value.

  4. While I don't like calling out edge cases one by one and I think that should be avoided (I'd rather a simpler rule). Here are some, perhaps to help consider the commonality between them:

    • a += b; is also one character away from a + b; etc (as per quicknirs 3);
    • the whole series of pop functions (pop_front, pop_back, pop);
    • string_view functions (as per jwakely).
    • span functions.
    • substring functions.
    • subview functions.
    • subrange functions.
    • make_* functions (make_unique etc...) and all methods returning ownership of something.
    • .empty() (as per Eisenwave): Empty being problematic because it is an adjective + verb and I'm sure there are a lot of those.
  5. "the guidelines also don't recommend putting top-level const on function parameters"

    An exception exists because of arguments passed by value, not because of all the other semantics afaik.

    P.10: Prefer immutable data to mutable data

    And

    Con: Constants and immutability

    The exception being:

    Function parameters passed by value are rarely mutated, but also rarely declared const. To avoid confusion and lots of false positives, don’t enforce this rule for function parameters.

  6. CppCoreGuidelines already deals with how to ignore [[nodiscard]] return values, so it makes sense to say when to use [[nodiscard]] return values Use std::ignore = to ignore [[nodiscard]] values..

@Eisenwave
Copy link
Contributor

Eisenwave commented Jul 8, 2023

It's difficult to encapsulate all of these scenarios in a single sentence, and that's ultimately what we need for a guideline. I would approach this in reverse; i.e. starting at the wording and seeing what implications it has:

Wording A

Add [[nodiscard]] when it disambiguates what a function does, or discarding the result is a safety issue.

This policy is a trade-off which is similar to how it is decided whether a function in the standard library is [[nodiscard]]. It includes the greatest hazards that [[nodiscard]] avoids, but excludes minor issues like discarding the result of ==.

Wording B

Add [[nodiscard]] when it is likely to avoid avoid misuse of a function.

This policy is most aggressive in promoting [[nodiscard]].

Wording C

Add [[nodiscard]] when a function is impossible to use when the result is discarded.

This policy is more in line with B, but constrained to cases like std::launder, where discarding the result breaks code. It does not include functions like std::malloc, which still do something with the result discarded. For example, if our goal is to intentionally leak 1000 bytes, then std::malloc(1000) is the way to go.

Wording D

Add [[nodiscard]] when discarding the result of a function with side effects is not intended.

It's exclusively aimed at misuses of functions like read().

When is [[nodiscard]] applied with these wordings?

Category A B C D
1. Returning errors maybe maybe maybe maybe
2. Returning information about side effects (e.g. read) maybe yes maybe yes
3. Typo hazards like == vs = no yes maybe no
4. Mutation ambiguities like empty() yes yes maybe no
5. All pure functions no yes yes no

"Returning errors" is maybe in all cases, because discarding errors is common and often intentional, like for std::printf.
It's also possible that even if we discard an error, we might get error information from elsewhere, like errno.

Overall, I don't think that the guideline should extend to 5. and 3., but it's very difficult to come up with wording that doesn't. I still prefer my original wording A.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants