Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] move from last use break code where variable is passed to function with inout argument passing #231

Open
filipsajdak opened this issue Jan 17, 2023 · 70 comments
Labels
bug Something isn't working

Comments

@filipsajdak
Copy link
Contributor

In the current implementation of cppfront, the following code:

f2: (inout x) -> _ = {
    x *= 2;
    return x;
}

main: () -> int = {
    x := 21;
    std::cout << f2(x) << std::endl;
}

Passes cppfront:

tests/bug_inout_argument.cpp2... ok (all Cpp2, passes safety checks)

But failed to compile on the cpp1 side with an error:

tests/bug_inout_argument.cpp2:7:18: error: no matching function for call to 'f2'
    std::cout << f2(std::move(x)) << std::endl;
                 ^~
tests/bug_inout_argument.cpp2:1:20: note: candidate function [with x:auto = int] not viable: expects an lvalue for 1st argument
[[nodiscard]] auto f2(auto& x) -> auto{
                   ^
1 error generated.

When cppfront moves x on its last use it breaks the requirements of the f2 function that requires an lvalue reference but gets an rvalue reference.

Expectations

cppfront should take into consideration the context in which the variable is used to avoid breaking the code on the cpp1 side.

@hsutter
Copy link
Owner

hsutter commented Jan 17, 2023

Thanks!

One way to suppress this could be to require a mutating argument to be qualified with inout, which I've thought of and @jbatez suggested in #198.

I see there's a related issue in #230 which might also be solved with an inout argument (call-site) requirement...

I'll consider #230 and #231 together...

@hsutter
Copy link
Owner

hsutter commented Mar 26, 2023

Thanks for picking this up again in #294.

tldr

After reconsidering the examples, I think the status quo is a feature, not a bug, in Cpp2. I think the combination of parameter passing + move from definite last use is (elegantly? certainly naturally) exposing real user code bugs that were silent in Cpp1. This is very pleasing.

That said, I agree that an argument qualifier is the right answer. But understanding why the status quo is actually a feature is important because it will:

  • help us name the qualifier,
  • show where else the qualifier should be allowed,
  • show why the qualifier will be needed rarely, and
  • show why when the qualifier is needed it adds value (and a "programmer doing something odd here" flag to focus on during code review).

Why a feature: Diagnosing an unused output, like [[nodiscard]]

There are two features interacting here:

(1) Declaring parameter passing intent: This states the direction of data flow (in, inout, etc.).

(2) Move from definite last use: When we know the variable won't be used again, of course it's safe to move from so it seems this should be automatic and default.

Both features let the programmer declare their intent in a way that helps expose program bugs. Specifically:

(1) An inout or out (or Cpp1 non-const &) parameter is declaring that one of the function's outputs is via that argument, just as declaring a non-void return type is declaring that one of the function's outputs is via the return value. Those are the outputs, and ignoring an output is usually bad (but of course not always, see bottom). Just as Cpp2 makes [[nodiscard]] the default for return values, what you're encountering here is that it is naturally doing the same thing for inout arguments too, treating both declared output data flows similarly.

(2) A last use argument is diagnosing that the variable will no longer be used. If the last use is to an inout or out parameter, then not looking at it afterward is just the same as calling a function with a non-void return and never looking at the returned value (which is diagnosed in Cpp2 because of the enforced [[nodiscard]]).

So we are doing the user a favor by diagnosing this, just the same as if the user were ignoring a [[nodiscard]] return value.

And that's why I think that we should consider naming the opt-out for "unused out result" and "unused return value" with the same name, if there's a good name. They are the same case. (Sure, you sometimes want an opt-out, but only in rarer cases where you're relying on other side effects being performed by the function and really don't need the value, in which case the code should say so by writing discard or something.)

Example 1: Just return

Let's consider the two versions of the code... First, consider the version of the code you used in #294:

f2: (inout x) -> _ = {
    return x * 2;
}

main: () = {
    x := 21;
    std::cout << f2(x) << std::endl;
}

Compiling this with cppfront and then a Cpp1 compiler calls out f2(x) as invalid. But why? The compilers tell us it's because x is an rvalue, and the argument must be an lvalue. This is great, because it's true. There's something fishy.

What's fishy? It's f2... it declares its parameter as inout, but never writes to it. As you know, I aim eventually (not now) to emit a diagnostic for failure to have a non-const use of an inout parameter on at least one path somewhere in the function... when I implement that, the error will be flagged even sooner within the callee. Right now, the error is being flagged at the call site, which I expect to be usually still caught early at f2 unit test time because it will be common for even f2's initial toy test cases to do this... pass a last use, which exposes the bug in f2.

What's the solution? In this case, f2 should change its parameter to be in, and then everything compiles and runs.

Example 2: Also modify parameter

But what if f2 actually modifies its parameter? That brings us to the other version of your code, above...

f2: (inout x) -> _ = {
    x *= 2;
    return x;
}

main: () = {
    x := 21;
    std::cout << f2(x) << std::endl;
}

Again we get the error flagged, but this time the problem is at the call site, f2 is okay.

Consider why f2 is okay: Even though f2 is a little odd for redundantly emitting the same output value in two different output return paths (the inout argument and the return value), that's not wrong per se, and might be useful for chaining or whatever. So f2 is fine this time, in the sense that it's doing what it declared it would do... it's writing to its input argument, and it's returning a value.

But now the call site is definitely suspicious because it's making a call that is declared to modify its argument, but then never looks at the argument again. Ignoring an output is usually bad, at least by default.

Naming the opt-out

So I view this as a great feature of Cpp2... by:

  • declaring the parameter passing direction, and
  • moving from definite last use,

we naturally and automatically diagnosed failure to use an output. I like that a lot.

Furthermore, this is just like [[nodiscard]]. In both cases, we want an opt-out. But what's the right name? Given:

inout_func: ( inout x ) = { /*...*/ }
returning_func: () -> _ = { /*...*/ } 

Then consider this call site, where we want an explicit opt-out, and ideally the same word of power in both places since they're opting out of conceptually the same thing:

{
    x := 42;
    inout_func( SOMETHING x );
    (SOMETHING returning_func());
}

I want to think about the naming some more, but as a start I'm not sure inout works well for both:

//  What if "SOMETHING" were "inout"? Doesn't feel quite right...
{
    x := 42;
    inout_func( inout x );     // inout works pretty well here
    (inout returning_func());  // but not so well here
}

On the other hand, "discard" gives a nice first impression, and is symmetric with [[nodiscard]] and could connote "don't do anything special with, including don't move its guts along" as well as "discard this thing's value, I'm not going to use it from here onward":

//  What if "SOMETHING" were "discard"? I think I like it... "discard this value, I'm not going to use it after here"
{
    x := 42;
    inout_func( discard x );     // that word is a big red code review flag (good)
    (discard returning_func());  // and here with a clear meaning
}

It seems right to use the same opt-out word for unused inout/out arguments and unused return values. Getting the name right is important, though. This is something I want to sleep on further, but there's my brain dump for today. Thanks again.

@filipsajdak
Copy link
Contributor Author

filipsajdak commented Mar 26, 2023

@hsutter Thank you for this summary - I think you synthesize it very well.

I agree that this is a similar thing as [[nodiscard]], and not using the inout or out argument is suspicious at the minimum.

I like the

discard returning_func();

But using it next to the function argument looks suspicious:

inout_func( discard x );

My first impression is that we want to discard the x variable - unfortunately, it is on the call side before it gets to the function. It could be misinterpreted as something will happen to x before a call to inout_func... or maybe it is just me.

Maybe we can add a passing style to clarify:

inout_func( discard inout x ); // maybe `discard out x` to emphasize that we discard output of the x

Another keyword to consider is unused:

x := 42;
inout_func( unused x );
(unused returning_func());

But still I would prefer to add a passing style:

x := 42;
inout_func( unused inout x ); // or unused out x
(unused returning_func());

@SebastianTroy
Copy link

SebastianTroy commented Mar 26, 2023 via email

@AbhinavK00
Copy link

AbhinavK00 commented Mar 26, 2023

Agreed with Herb's analysis on why this is actually great. Small things I want to point out from @SebastianTroy's comment:

Why not use [[discard]] instead of adding a new keyword?

I have the same question, an attribute seems like the right fit for this job instead of "some new keyword popping out of nowhere".

Would it be reasonable to decorate a parameter with multiple passing intentions, i.e.
in_or_inout_func: ( in|inout x ) = { /*..

I have a question related to this, will we able to mark certain parameters in the function body as discard, this could be another way of doing the same thing. So, our example would become

f2: ([[discard]] inout x) -> _ = {
    x *= 2;
    return x;
}

main: () = {
    x := 21;
    std::cout << f2(x) << std::endl;
}

This would be a way to signify that the mutations it makes to x could be discarded, so cppfront would simply do a static cast of any rvalue before passing to such functions or not emit std::move at all.

@gregmarr
Copy link
Contributor

I like the idea but also don't think discard makes sense as a parameter decoration, as you aren't discarding the entire thing, just the returned information, so maybe something like discard_return or discard_result. I think that would apply just as well to the return value of the function.

@hsutter
Copy link
Owner

hsutter commented Mar 27, 2023

Another keyword to consider is unused:

x := 42;
inout_func( unused x );
(unused returning_func());

That's a good candidate. Thoughts:

  • Primarily, it has a connection with "definite last 'use'". The idea we want to convey is that there would be more "use" of x later in the scope, but the annotation on this use is that that later last use explicitly being omitted... "unused" might connote that, but maybe less strongly than "discard" or "discard_result"?
  • Also, it has a symmetry with Cpp1 [[maybe_unused]].

Why not use [[discard]] instead of adding a new keyword? (I'm not against using just discard, merely curious, in fact is cpp2 avoiding the [[ xyz ]] syntax altogether?)

Cpp2 is currently using [[ ]] only for contracts, and I might change that too. In Cpp1 we spell some things as attributes, in part for syntax compatibility constraints which don't apply in Cpp2.

Would it be reasonable to decorate a parameter with multiple passing intentions, i.e. in_or_inout_func: ( in|inout x ) = { /*..
Suggesting that the parameter's side effect is not mandatory and therefore not worth warning when the user doesn't use it?

If we want to express that an output (parameter out data flow, or non-void return value) is discardable, we should have a consistent way to say that and again I would like to use the same word in both places.

For example:

inout_func_with_ignorable_result: ( ~SOMETHING inout x ) = { /*...*/ }
returning_func_with_ignorable_result: () -> ~SOMETHING _ = { /*...*/ }

I tag this as ~SOMETHING because it probably wants to be the inverse of the above SOMETHING.

Putting it together:

inout_func: ( inout x ) = { /*...*/ }
returning_func: () -> _ = { /*...*/ } 
inout_func_with_ignorable_result: ( ~SOMETHING inout x ) = { /*...*/ }
returning_func_with_ignorable_result: () -> ~SOMETHING _ = { /*...*/ }

{  // call site
    x := 42;
    inout_func( SOMETHING x );
    (SOMETHING returning_func());
}

Trying out @gregmarr's discard_result suggestion:

inout_func: ( inout x ) = { /*...*/ }
returning_func: () -> _ = { /*...*/ } 
inout_func_with_ignorable_result: ( discardable_result inout x ) = { /*...*/ }
returning_func_with_ignorable_result: () -> discardable_result _ = { /*...*/ }

{  // call site
    x := 42;
    inout_func( discard_result x );
    (discard_result returning_func());
}

That looks fairly decent at first blush. Clear, and a little verbose which is a good thing for an explicitly lossy escape hatch that we want to stand out. (Syntax colorizer writers, feel free to make it red... :) )

Trying out @filipsajdak's unused suggestion:

inout_func: ( inout x ) = { /*...*/ }
returning_func: () -> _ = { /*...*/ } 
inout_func_with_ignorable_result: ( maybe_unused inout x ) = { /*...*/ }
returning_func_with_ignorable_result: () -> maybe_unused _ = { /*...*/ }

{  // call site
    x := 42;
    inout_func( unused x );
    (unused returning_func());
}

This looks nice on the call site, but I worry that on the parameter it could imply that the name is not used in the callee body, which is what Cpp1 [[maybe_unused]] does.

Trying out a merger of the two, even more verbose on the declarations but again this is a case where verbosity can be a plus:

inout_func: ( inout x ) = { /*...*/ }
returning_func: () -> _ = { /*...*/ } 
inout_func_with_ignorable_result: ( maybe_unused_result inout x ) = { /*...*/ }
returning_func_with_ignorable_result: () -> maybe_unused_result _ = { /*...*/ }

{  // call site
    x := 42;
    inout_func( unused_result x );
    (unused_result returning_func());
}

Will think some more...

@filipsajdak
Copy link
Contributor Author

filipsajdak commented Mar 27, 2023

@hsutter I like the way you make a synthesis of the proposed ideas.

Looking at the last one:

inout_func: ( inout x ) = { /*...*/ }
returning_func: () -> _ = { /*...*/ } 
inout_func_with_ignorable_result: ( maybe_unused_result inout x ) = { /*...*/ }
returning_func_with_ignorable_result: () -> maybe_unused_result _ = { /*...*/ }

{  // call site
    x := 42;
    inout_func( unused_result x );
    (unused_result returning_func());
}

How will it interact with move-of-last-use?

    x := 42;
    inout_func( unused_result x ); // will it just suppress the move?

And when we would define a function with maybe_unused_result inout:

    x := 42;
    inout_func_with_ignorable_result( x ); // will it just suppress the move?

Will it change the function's signature or add unused_result on the call side by default? Is this good or bad?

I like the focus on the intention and would like to know if we shall support defining functions in that way. I feel comfortable with -> maybe_unused_result _ on the return side, but having that on the inout argument feels like trying to fix some wrong design decision. Is there a use case where we use such an approach in the current cpp1 code?

But now the call site is definitely suspicious because it's making a call that is declared to modify its argument but then never looks at the argument again. Ignoring an output is usually bad, at least by default.

I like the above way of thinking, and for sure, I need to fix some cpp2 code just because cppfront complains about ignoring the return value from a function. Please note that I use the term FIX as, after second thought, my code was just somehow broken. I don't know if providing an easy way to opt out of this rule on the definition side is a good thing.

I like the idea of being explicit when something odd is going on. Ignoring output from a function is an odd thing that you might want to do, which is why you should have the possibility to add unused_result on the call side. That will focus the attention of the code reader.

Having the same thing on the definition side and not requiring anything on the call side will make things (from that perspective) worse, as when you read code, you don't check function definitions all the time - that might mislead the reader.

@JohelEGP
Copy link
Contributor

Would it be reasonable to decorate a parameter with multiple passing intentions, i.e. in_or_inout_func: ( in|inout x ) = { /*..
Suggesting that the parameter's side effect is not mandatory and therefore not worth warning when the user doesn't use it?

If we want to express that an output (parameter out data flow, or non-void return value) is discardable, we should have a consistent way to say that and again I would like to use the same word in both places.

inout_func_with_ignorable_result: ( maybe_unused inout x ) = { /*...*/ }

This looks nice on the call site, but I worry that on the parameter it could imply that the name is not used in the callee body, which is what Cpp1 [[maybe_unused]] does.

f: (in out? x) = { /*.. could be an alternative spelling. Although that fails to meet this:

If we want to express that an output (parameter out data flow, or non-void return value) is discardable, we should have a consistent way to say that and again I would like to use the same word in both places.

Alternatively, consider using the most appropriate spelling for a given context. This might be useful if consistency isn't convincing enough, and to help find a middle ground.

inout_func: ( inout x ) = { /*...*/ }
returning_func: () -> _ = { /*...*/ } 
inout_func_with_ignorable_result: ( in out? x ) = { /*...*/ }
inout_func_with_ignorable_result: ( in maybe_out x ) = { /*...*/ }
returning_func_with_ignorable_result: () -> out? _ = { /*...*/ }
returning_func_with_ignorable_result: () -> maybe_unused _ = { /*...*/ }

{  // call site
    x := 42;
    inout_func( not out x );
    inout_func( in x ); // "Force the `in`, ignore the `out`".
    (unused returning_func());
    (void returning_func());
}

@JohelEGP
Copy link
Contributor

When I thought of the alternative above, it occurred to me that

inout_func_with_ignorable_result: ( maybe_unused_result inout x ) = { /*...*/ }

is to

inout_func_with_ignorable_result: ( in maybe_out x ) = { /*...*/ }

what if (not irreversible) is to if (reversible). We want to say the latter. But there's no direct way to say it. So we have to add to what was said to make it what is actually wanted.

@hsutter
Copy link
Owner

hsutter commented Mar 28, 2023

I think out? or maybe_out would imply more that the callee body might or might not produce an output value. To some extent inout already accounts for that side of things, with the intended semantics of "write to this on at least one code path."

I think what we're looking at here is the complement of that -- not whether the callee will change the argument's value to emit a new output value, but whether the caller should view the output as important vs. can safely ignore it.

Trying out "ignore"...

inout_func: ( inout x ) = { /*...*/ }
returning_func: () -> _ = { /*...*/ } 
inout_func_with_ignorable_result: ( ignorable_result inout x ) = { /*...*/ }
returning_func_with_ignorable_result: () -> ignorable_result _ = { /*...*/ }

{  // call site
    x := 42;
    inout_func( ignore_result x );
    (ignore_result returning_func());
}

Or with "output", and using a "can" prefix to avoid dealing with English verb-to-adjective conventions (e.g., wherever possible I'd like to avoid non-English speakers having to learn conventions like "ignore" -> "ignorable" to program in Cpp2)...

inout_func: ( inout x ) = { /*...*/ }
returning_func: () -> _ = { /*...*/ } 
inout_func_with_ignorable_result: ( can_ignore_output inout x ) = { /*...*/ }
returning_func_with_ignorable_result: () -> can_ignore_output _ = { /*...*/ }

{  // call site
    x := 42;
    inout_func( ignore_output x );
    (ignore_output returning_func());
}

@AbhinavK00
Copy link

AbhinavK00 commented Mar 28, 2023

Is there a use case where we use such an approach in the current cpp1 code?

I have the same question, is there even a use-case for this?
We can try to not implement this feature now and maybe implement it later if actual use cases are encountered.
Even if we do encounter an use case, I would argue that annotations are only needed at the call site and not in function definitions.

@gregmarr
Copy link
Contributor

To make sure I understand, currently all function returns are converted to Cpp1 as [[nodiscard]], and the can_ignore_output decoration is to suppress that?

Is the intent on the call site that the SOMETHING has to be used like this: (SOMETHING returning_func()) as opposed to just SOMETHING returning_func()? Is there a parsing issue that requires the parens, or is it intended as clarification for the user?

@hsutter
Copy link
Owner

hsutter commented Mar 28, 2023

I feel comfortable with -> maybe_unused_result _ on the return side, but having that on the inout argument feels like trying to fix some wrong design decision. Is there a use case where we use such an approach in the current cpp1 code?

It's the same use/bug case. A caller ignoring a return value output is a well known source of a family of security vulnerabilities: CWE-252 is a general category, and then there are more specific categories under it. It's the same bug if the caller ignores an argument output, if the function happens to choose to produce an output via a modified argument instead of (or in addition to) the return value.

For example:

  • Allocation functions: malloc returns a pointer that should be checked before use, whereas COM object allocation returns the allocated pointer via an Object** parameter which is today's spelling for a Cpp2 out unique_ptr<Object> parameter.
  • std::error_code-using functions: Many functions return an error_code by value. Others return them via an inout or out parameter, such as a lot of filesystem functions like bool is_character_file( const std::filesystem::path& p, std::error_code& ec ) noexcept;. A returned error_code should be checked regardless of which way the function happened to return it.

is_character_file is an example of the above inout_func whose "out" parameter result should not be ignored. But whereas we're getting better at diagnosing failure to look at the return value because of linter tools and [[nodiscard]], we're not yet as good at diagnosing failure to look at the output via "out" parameter.

Today we have a patchwork of narrow solutions:

  • [[nodiscard]] for return values (but it can't be made the default in Cpp1, so we're adding it on the vast majority of std:: value-returning functions because it should be the default)
  • std::ignore for a subset of cases
  • proposals like P1881 proposing [[discardable]] (if we could make [[nodiscard]] the default)
  • (void) as a de facto convention for spelling "ignore this value"

Cpp2 already has the right consistent automatic defaults so that we never need to write anything for the majority of cases: [[nodiscard]] is already the automatic default for function return values, and detecting failure to use the result of an inout or out parameter is already the automatic default (that's what spawned this thread). So most of the time we don't need to write anything.

Now we're discussing the right consistent opt-out, aiming for a single consistent answer to avoid piecemeal patches like a [[discardable]] here and a std::ignore there.

To make sure I understand, currently all function returns are converted to Cpp1 as [[nodiscard]], and the can_ignore_output decoration is to suppress that?

Yes.

Is the intent on the call site that the SOMETHING has to be used like this: (SOMETHING returning_func()) as opposed to just SOMETHING returning_func()? Is there a parsing issue that requires the parens, or is it intended as clarification for the user?

Those parens are currently required because I happen to only allow argument modifiers in expression lists. Having to write ( ) around them hasn't bothered me enough yet to parse them also as prefix operators, but I could do that and then the parens would not be required around single expressions.

@gregmarr
Copy link
Contributor

Those parens are currently required because I happen to only allow argument modifiers in expression lists. Having to write ( ) around them hasn't bothered me enough yet to parse them also as prefix operators, but I could do that and then the parens would not be required around single expressions.

Sounds good.

@AbhinavK00
Copy link

AbhinavK00 commented Mar 28, 2023

Val has a feature to discard return values of functions by assigning them to a placeholder underscore like this:

_ = returning_func();

This effectively discards the return value, but I can't think of a way to extend it to inout arguments.

@gregmarr
Copy link
Contributor

Go also does that, and I thought of mentioning that, but it also has the same issue of not being extendable to inout. I think we discussed that for returns somewhere at some point.

@JohelEGP
Copy link
Contributor

JohelEGP commented Mar 28, 2023

It's the same use/bug case.

I think what #231 (comment) asked, I upvoted, and #231 (comment) agreed with, was the opposite. Whether there's value in giving power to the callee to determine that an out parameter is ignorable. And that there's value in the caller always having to opt-out. Your answer suggests that in your examples an out argument shouldn't be ignored by default. There does not seem to be an example of an out parameter that the caller doesn't inspect and doesn't need to opt-into ignoring it.

Granted, there's still value in the discussion, to determine a consistent opt-out for parameters and arguments. In case it's ever needed.

@hsutter
Copy link
Owner

hsutter commented Mar 28, 2023

Ah, got it -- I see the question was just about the parameter being able to declare its output is ignorable, and whether there are use cases for such parameters. Thanks.

I suspect the pattern of the answer will be the same: Declaring 'this output is ignorable' is uncommon for return values, but in the cases where you want to declare an ignorable return value you would also want to declare an ignorable output value if the function author chose that as the path to deliver the output. But I don't have a concrete example in hand, and having one would be helpful.

@gregmarr
Copy link
Contributor

gregmarr commented Mar 28, 2023

I've seen many APIs with Foo ** parameters where if the argument is null, then it's ignored and not populated, and otherwise, it sets the Foo* to an output value. There are many of those in the Win32 API. That would to me correspond to an ignorable out parameter.

https://learn.microsoft.com/en-us/windows/win32/api/winreg/nf-winreg-regqueryvalueexw

LSTATUS RegQueryValueExW(
  [in]                HKEY    hKey,
  [in, optional]      LPCWSTR lpValueName,
                      LPDWORD lpReserved,
  [out, optional]     LPDWORD lpType,
  [out, optional]     LPBYTE  lpData,
  [in, out, optional] LPDWORD lpcbData
);

@JohelEGP
Copy link
Contributor

I wonder if it's possible to come up with an example of an ignorable inout/out parameter using pointers.

An inout maps to a reference. Supposing inout x: Foo** works, is x* = &foo a write to x? I'd expect the answer to be no.

Foo** parameters and the like are the target of C++23 "Smart pointer adaptors".

@AbhinavK00
Copy link

AbhinavK00 commented Mar 28, 2023

I've seen many APIs with Foo ** parameters where if the argument is null, then it's ignored and not populated, and otherwise, it sets the Foo* to an output value. There are many of those in the Win32 API. That would to me correspond to an ignorable out parameter.

To me, that sounds like a normal inout argument, keep in mind that function has to write to the reference only atleast in one control path.

The only use case I could think is of a function which produces output via both parameters and return value but you just call it for one of those output (for whatever reason) and therefore you'd have to ignore the other one. For example:

func : (inout x : std::string ) -> std::string = {
  x = "done";
  return x;
}

main : () ={
  a : std::string = "test";
  std::cout << func(a);
}

Here, you're calling func just for it's return value and passing a to it just because you happened to have a variable you won't use again.
Same can be said the other way around, ignoring return value just because you wanted mutation on your passed argument.

In both cases, it's the callee which decides to use the outputs so I'd say it's not at all needed for the function to say that its output can be ignored, it should be upto the callee only.

@gregmarr
Copy link
Contributor

gregmarr commented Mar 28, 2023

I've seen many APIs with Foo ** parameters where if the argument is null, then it's ignored and not populated, and otherwise, it sets the Foo* to an output value. There are many of those in the Win32 API. That would to me correspond to an ignorable out parameter.

To me, that sounds like a normal inout argument, keep in mind that function has to write to the reference only atleast in one control path.

In a normal out or inout, as you must provide a valid variable. For an optional parameter, it's allowed to be null. This is more complicated, but it's an example of a large set of APIs. I don't know if that's something that we should say "you can't write this in Cpp2 because it's not safe" or if it's something that we should figure out how to support.

@gregmarr
Copy link
Contributor

So the implementation knows.

My understanding is that the implementation knows so that it can warn you about it, either at compile time or runtime, but I don't think that there's any concept of the user code itself being able to access that information. Written another way, the cpp1 generated code knows that information, but the cpp2 code doesn't, because that would change the type of the parameter, and you'd have to deal with the wrapped type, which could block things that you want to do with the original type.

After you initialize it, there's no difference if you pass it to an out parameter before or after reading from it.

There is a difference in that you initialized it unnecessarily because the function you passed it to is going to initialize it. This warning would be about the unnecessary initialization prior to calling the function. This might not actually happen, and might not be on by default, but it's definitely something that COULD be done. I'd have to watch 2022 again to see.

@JohelEGP
Copy link
Contributor

My understanding is that the implementation knows so that it can warn you about it,

It knows to determine whether to construct or assign.

but I don't think that there's any concept of the user code itself being able to access that information.

That's right.

Written another way, the cpp1 generated code knows that information, but the cpp2 code doesn't, because that would change the type of the parameter, and you'd have to deal with the wrapped type

The generated code is such that the meaning doesn't change. You still deal with the type you declared. The generated code takes care of the unwrapping.

@AbhinavK00
Copy link

It knows to determine whether to construct or assign.

In cpp2, construction and assignment are unified, so there's no distinction. I don't think the compiler knows how to determine whether to contruct or assign.

out:
object should not be initialized before the call

Objects CAN be unitialised before the call.

I'm doubting the usefulness of a writeonly parameter. There are some examples yes but not enough I'd say

@JohelEGP
Copy link
Contributor

JohelEGP commented Mar 31, 2023

In cpp2, construction and assignment are unified, so there's no distinction. I don't think the compiler knows how to determine whether to contruct or assign.

out T parameters outside operator= use cpp2::out<T> in C++1, which looks like an std::optional<T>, so it knows. What is generated from operator= doesn't use that, as constructors always construct, and assignments assign.

@filipsajdak
Copy link
Contributor Author

filipsajdak commented Mar 31, 2023

I need to explain what I have presented above in more detail.

In cpp1, we have const methods - whenever we pass a const object, we can only call const methods that cannot change the object. We also have a mutable keyword that can somehow overcome that.

Based on that, I thought, what if we can mark what in things we can do in out context (this is precisely the reverse thing to the mutable where you mark what out actions you can make in in context - of course in cpp1 not in cpp2).

Let's consider the following type:

safe_store: type = {
  data: int;
  m: std::mutex;

  operator=:(out this) = {
    data = ();
    m = ();
  }

  write: (inout this, x : int) = {
    lock := std::lock_guard(m);
    data = x;
  }

  read: (inout this) -> int = {
    lock := std::lock_guard(m);
    return data;
  }
}

[For the sake of my following example, I have introduced the init passing style to distinguish write-only out from initialize-first out; the out will mean write-only].

I want to express something like the following:

safe_store: type = {
  data: int;
  SOMETHING m: std::mutex; // the SOMETHING marks m to be safe to read in `out` context and safe to modify in `in` context

  operator=:(init this) = { // note: currently there is no init passing style
    data = ();
    m = ();
  }

  write: (out this, x : int) = { // note: currently out this means something different
    lock := std::lock_guard(m); // currently fail as it needs to read and modify m
    data = x;
  }

  read: (in this) -> int = {
    lock := std::lock_guard(m); // currently fail as it needs to read and modify m
    return data;
  }
}

Using out like the above conflicts with the possibility that an object can be uninitialized - that is why I introduce the init passing style to distinguish uninitialized passing from the initialized write-only passing.

I wonder if there are more examples like that. I like the consistency, symmetry, and as few exceptions as possible. From my perspective, out is overloaded and needs special care to explain it and use it. It is similar to the cpp1 non-const lvalue reference - currently might mean inout or out.

I might need to be corrected. I was reading through 708, and I did not find why out is more about initialization than write-only. I have found that the current out is a symmetry to the current move - see paragraph "1.3.4 move parameters". And maybe this is my error as I assume that out (write) is the opposite of in (read), but it is more the opposite of move (uninitialize)... maybe it is all about names used.

If the above is true, I was confused by quote:

“Finally, the [internal January 1981] memo introduces writeonly: ‘There is the type operator writeonly, which is used like readonly, but prevents reading rather than writing...’ “ —B. Stroustrup (D&E, p. 90)

Where I assume that out is all about writeonly.

@SebastianTroy
Copy link

SebastianTroy commented Mar 31, 2023 via email

@JohelEGP
Copy link
Contributor

JohelEGP commented Mar 31, 2023

std::mutex has no const accessors, so writeonly doesn't help.

safe_store::read modifies the mutex and reads the data, so this should be inout.

safe_store::write modifies the mutex and modifies the data. But we don't know if the modification of the mutex would be writeonly.
It doesn't exist in C++1, so it'd be a C++2 invention. That entails a new issue following the [SUGGESTION] template.
I do remember having read about writeonly-like suggestions. But no further memories or emotions surge, which suggests me it wasn't that good.
The fact that it doesn't scale when mixed with C++1, that we haven't agreed on its usefulness, and that it'd require cppfront to be more than just a transpiler, doesn't bode well for the suggestion.

@filipsajdak
Copy link
Contributor Author

@SebastianTroy I have rework the example from cpp1:

class ThreadsafeCounter
{
    mutable std::mutex m; // The "M&M rule": mutable and mutex go together
    int data = 0;
public:
    int get() const
    {
        std::lock_guard<std::mutex> lk(m);
        return data;
    }
 
    void inc()
    {
        std::lock_guard<std::mutex> lk(m);
        ++data;
    }
};

See here: https://en.cppreference.com/w/cpp/language/cv

@JohelEGP
Copy link
Contributor

It'd better to separate the suggestions of wanting mutable in C++2 and wanting to change the meaning of the parameter passing keywords.

@AbhinavK00
Copy link

So if I understand correctly, Filip's suggestion is about a keyword that'd denote that the data member is immune to the parameter qualification, i.e. you can mutate it in in context and read it in out context. Not sure what will be the use cases of this (I fail to see how your second example is any better than the first one).

As for the differentiating between init and out, that'd just be increasing the concept count and will probably lead to confusion. The best would be probably to make out as writeonly because in most cases, the function would itself have the access to assigned value (except in cases where it initializes the parameter by moving some object into it).

But as @JohelEGP pointed out, this all should be in a different issue.

@JohelEGP
Copy link
Contributor

Alternatively, consider using the most appropriate spelling for a given context. This might be useful if consistency isn't convincing enough, and to help find a middle ground.
-- #231 (comment)

Apparently, foo as const is valid JavaScript, so here's an alternative.

    inout_func( x as in );
    (returning_func() as discarded);
    (returning_func() as unused);

@JohelEGP

This comment was marked as resolved.

JohelEGP referenced this issue Apr 1, 2023
Add order-independent type definitions, including order-independent dependencies between type definitions - see new test case code, pasted below in this commit message for convenience

Note: This looks in pretty good shape, but it's a significant restructuring in the way the code is lowered to Cpp1, so there will probably be some corner cases to tweak in follow-on commits

There are now three (not two) Cpp1 lowering passes:
1. type forward declarations (new)
2. type definitions (new) and function forward declarations (existing)
3. function definitions, including type-scope/member function out-of-line definitions (new) and non-type-scope functions (existing)

When compiling `.h2` files, phases 1 and 2 go into the generated `.h` and phase 3 goes into the generated `.hpp`

Comments written in Cpp2 all go on an entity's definition, because there are no forward declarations. So when lowering to Cpp1, comments are now split into two groups to put them where a reader of the Cpp1 code would naturally expect to see them:
   - comments not within function bodies are emitted in phase 2, where they belong on the types and function declarations (new)
   - comments within function bodies are emitted in phase 3, likewise where they belong

Also:

- Added support for `out` parameters on constructors. For example, this is useful when safely creating pointer cycles among types that refer to each other, respecting Cpp2's guaranteed program-meaningful initialization safety rules (demonstrated in the test case in this commit, pasted below)

- Added support for `_` as the name of an anonymous object. This supports uses of RAII guards such as `_ := std::lock_guard(my_mutex);` and `_ := finally( :()= print("done"); ); );` - see example in test case below

- Removed unused `offset` variable, closes #309

### Regression test to demonstrate code that now works... convenience copy of `regression-tests\pure2-types-order-independence-and-nesting.cpp2`

```
N: namespace = {

//  A type X, that uses Y (defined later)
//
X: type = {
    py: *Y;

    //  Note: A constructor with an 'out' parameter
    operator=: (out this, out y: Y) = {
        //  === The following comments will stay close to, but not exactly at,
        //      the corresponding lines that get moved to the Cpp1 mem-init-list

        //  At this point, 'this' is now in scope and can safely be used.
        //  The guaranteed initialization rules ensure we cannot use an
        //  unconstructed object, and if a constructor that has begun chooses to
        //  hand out 'this&' (as in the next line below) then it is explicitly
        //  aware it's doing so - this constructor knows that if y's constructor
        //  uses the pointer it will see 'this' object's state as of this line

        //  'out' parameters are constructed first
        y = this&;  // ok, construct 'y' to refer to 'this'
        //  at this point, 'y' is now initialized and can safely be used

        //  then members are constructed next
        py = y&;    // ok, set 'this' to refer to 'y'

        //  === from here onward, the comments stick with their code

        //  then do anything else the constructor wants to do
        std::cout << "made a safely initialized cycle\n";
    }

    //  X::exx member function description here
    exx: (this, count: int) = {
        //  Exercise '_' anonymous objects too while we're at it
        _ := cpp2::finally( :()= std::cout << "leaving call to 'why((count)$)'\n"; );
        if count < 5 {
            py*.why( count+1 );     // use Y object from X
        }
    }
}

//  Another type Y, that uses X
//
Y: type = {
    px: *X;

    operator=: (out this, x: *X) = px = x;

    why: (this, count: int) =
        px*.exx( count+1 );         // use X object from Y
}

M: namespace = {

//  Testing nested templated types
A: <T, U> type = {
    B: <I: int> type = {
        f: <V, J: int, W> (w: W) = std::cout << "hallo (w)$\n";
    }
}

}

}

//  Mainline - gratuitous comment just to check that this comment
//  stays on the function declaration when lowering
main: () =
{
    y: N::Y;            // declare an uninitialized Y object
    x: N::X = (out y);  // construct y and x, and point them at each other

    // now have the two objects call each other back and forth a few times
    x.exx(1);

    // and test a nested template out-of-line definition
    N::M::A<int, int>::B<42>::f<int, 43>("welt");
}
```
@hsutter
Copy link
Owner

hsutter commented Apr 8, 2023

I've skimmed the new comments on the thread, so please let me know if I inadvertently am not answering a point added above since last time:

Re "writeonly": There's nothing wrong with reading from an out parameter, as long as it's after you've set its value. So out isn't about "writeonly". I should make that clearer in the d0708 summary table that it's not only about 'give me an x I can write to'... the point is that I'll write its value first before doing anything else, and I am guaranteed to actually write to it. But after that it's okay to read the value (that this function just set!). For similar reasons, it's okay for a function to read the value of a forward parameter before it's forwarded.

Re "mutable": Yes, I probably should add a way to get the same effect as a Cpp1 "mutable" class member. Those are occasionally useful for instrumentation data, or the canonical example shown above of having a synchronization object (e.g., mutex, concurrent message queue) as a data member, which is fine because those are internally synchronized. (That's the short answer; I gave a talk on the longer answer, but that video is no longer hosted on Channel 9 and I can't find it on YouTube.)

Perhaps something like using "inout" here too, signifying the data member is always treated as though "inout":

ThreadsafeCounter: type = {
    inout m: std::mutex = ();  // hypothetical: "inout"(?) for Cpp1 "mutable"
    // ... other data

    get: (this) -> data = {
        _ := std::lock_guard(m);
        return data;
    }

    inc: (inout this) = {
        _ := std::lock_guard(m);
        modify(data);
    }
};

Re the original topics: Still thinking ...

@filipsajdak
Copy link
Contributor Author

@hsutter Regarding writeonly, I think I need to collect all my thought together and share my doubts - as I think I did not communicate well on that.

@filipsajdak
Copy link
Contributor Author

I will not continue writeonly topic here to leave a space for the original topic.

@AbhinavK00
Copy link

Small chime in regarding mutable, maybe we can skip it for now and see how far cpp2 can go without it. If we get writeonly some day, then the current solution proposed by Herb would need to be tweaked to also allow reading from writeonly members.

@JohelEGP
Copy link
Contributor

The need for this is resolved, see #305 (comment).

@JohelEGP
Copy link
Contributor

JohelEGP commented Jul 30, 2023

I might have an use case for marking an ignorable output on a function declaration.

Consider https://compiler-explorer.com/z/r6sq4nGM1,
generated from https://cpp2.godbolt.org/z/K3v7xWeav.

The iterator::operator++ returns this.
MSVC and GCC warn on the unused result of the compiler rewrite of the range for statement.

iterator: @struct @ordered type = {
  operator*: (this) -> i32 = 0;
  operator++: (this) -> forward iterator = this;
}
range: @struct type = {
  begin: (this) -> iterator = ();
  end: (this) -> iterator = ();
}
main: () = {
  for range() do (x) _ = x;
}
main.cpp2: In function 'int main()':
main.cpp2:10:31: error: ignoring return value of 'const iterator& iterator::operator++() const', declared with attribute 'nodiscard' [-Werror=unused-result]
   10 |   for range() do (x) _ = x;
      |                               ^
main.cpp2:3:22: note: declared here
    3 |   operator++: (this) -> forward iterator = this;
      |                      ^~~~~~~~
cc1plus: some warnings being treated as errors

JohelEGP added a commit to JohelEGP/cppfront that referenced this issue Dec 21, 2023
According to <hsutter#231 (comment)>:
> to avoid dealing with English verb-to-adjective conventions
hsutter added a commit that referenced this issue Feb 4, 2024
* fix(sema): recognize last use immediately after declaration

* fix(sema): give function expression its own scope level

* fix(sema): consider use as function in UFCS call

* refactor(sema): remove stale comment

* test(sema): also add unit test for `move` parameter

* refactor(sema): avoid possible decrement beyond `begin`

* fix(sema): move `this` on last use

* test: update GCC 13 results

* test: add Clang 18 results

* test: add overloads with `that`

* test: add another case that doesn't work yet

* test: rename members to avoid shadowing

* fix(sema): move implicit `this` of member on last use

* test: remove commented cases that aren't last uses

* fix(sema): guard check for `move this` parameter once

* test: add case for destructor

* test: add missing GCC 13 result

* test: regenerate tests

* fix(sema): consider use as function in UFCS call

* fix(to_cpp1): type-scope variable initialization is not deferred

* refactor(util): add UFCS macros to move and forward

* fix(to_cpp1): emit last use of function in UFCS as part of macro name

* test: regenerate GCC 13 result

* fix(to_cpp1): exercise last use in range of for loop

* refactor(util): remove UFCS macros of invalid combinations

* refactor(to_cpp1): handle range of for loop specifically

* refactor(to_cpp1): consider the expression list range

* fix(sema): apply last use to call of member function

* fix(sema): do not move a member function

* fix(to_cpp1): do not move any member function

* test: remove non-problematic comment about double move

It just works, see <https://cpp1.godbolt.org/z/6fhjxrz65>.

* refactor: regenerate `reflect.h`

* refactor(sema): avoid decrementing before begin

* fix(sema): extend implicit else branch to containing statement

* fix(sema): increase scope to match an explicit else

* refactor(sema): move heuristic to not change ending depths

* fix(sema): pop consecutive implicit else branches

* refactor(parse): put implicit else statements in the actual else node

* test: add test cases that combine implicit and explicit access

* test: add unit test for all test cases as selections

* fix(reflect): fix initialization and regenerate

* fix(sema): apply sema phase to generated code

* test: regenerate results

* test: add test case for another limitation

* test: add another test case that fails

* fix(sema): improve heuristic to only move from last (implicit) `this`

* refactor(reflect): restore implicit `this` access

* fix(sema): do not stack an implicit else branch

* refactor: use "implicit scope" throughout

* fix(sema): use local `move this` to determine last use

* fix(sema): adjust condition for looking up to type

* fix(sema): use last move on called type-scope variable

* test: remove bogus comment

* fix(sema): correct unset `this.` case

* test: reference open issue

* test: add test cases not affected by the open issue

* test: clarify FIXME comment

* test: review the new tests

* perf: replace `std::map` with `mutable` data member

* refactor: add `symbol::get_final_position`

* refactor(to_cpp1): do not move single named return

* test: add new test cases to commented out test case

* test: add similar test case to callable case

* refactor(sema): rework some outdated whitespace

* test: regenerate results after "do not move single named return"

* feat(sema): diagnose unused variables in generate code

* fix(sema): consider variables in type scope declared later

* test: refactor new test to use `f_copy`

* refactor(sema): use member token

* refactor(sema): update comments

* refactor(sema): use the `this` parameter exclusively

* refactor(sema): update the comments

* refactor(sema): finish focusing on the implicit 'this'

* fix(to_cpp1): move returned uninitialized local

* fix(to_cpp1): move single named uninitialized return

* test: add case with member declared last

* refactor(sema): set condition for "at `this.`" correctly

* fix(sema): use the better lookup algorithm for this case

* fix(to_cpp1): stack current names only in namespaces

* fix(to_cpp1): stack current names of bases

* test: exercise forward in generated code

* test: add stand-in for `std::move_only_function`

* test: remove bogus test case

* refactor(to_cpp1): rename to `is_class_member_access`

* test: add test case showing limitations with base members

* test: actually exercise forward in generated code

* refactor(to_cpp1): reorder branch

* refactor(to_cpp1): remove outdated condition

* refactor(to_cpp1): rename to `is_class_member_access`

* fix: revert using the empty implicit else branch

Refs: e9cc033, 7f4a60f

* fix(sema): change algorithm to find last uses

* test: add test case for recursion logic

* refactor(sema): simplify condition for UFCS on member

* test: use `f_inout` and `f_copy` in earlier test cases

* test: enable commented out tests

* test: extend limitation unit test with alternative

* test: remove redundant explicit discards

* test: add more test cases for the new last use algorithm

* test: add missing `new<int>()`

* fix: regenerate `reflect.h`

* test: add test cases of discovered problems

* fix(sema): pop sibling branch conditions on found last use

* refactor(sema): localize variables

* fix(sema): recognize uses in iteration statements

* fix(sema): start from the last symbol, not past-the-end

* refactor(sema): add local type `pos_range`

* fix(sema): handle loops and (non) captures

* test: add similar case but without declaration

* test: regenerate results

* fix(reflect): update and regenerate `reflect.h`

* fix(sema): start for loop at its body

* refactor(sema): use `std::span`

* refactor(sema): register name deactivation at visit

* fix(sema): do not apply use to a hiding name

* fix(sema): skip hiding loop parameter

* test: revert `gcc-version.output`

* fix(sema): recognize use in _next-clause_

* test: add corner case

* fix(sema): recognize Cpp1 `using` declaration

* refactor(sema): avoid adding duplicate symbols

* refactor(sema): clarify similar members with comments

* refactor(sema): turn comment into code

* refactor(sema): modify local convenience variable

* refactor(sema): remove expression scope

* refactor(sema): use the right predicate

* refactor(sema): remove inactive, stale assertions

* refactor(sema): keep using a sentinel

* fix(sema): handle a nested true branch

* refactor(sema): revert whitespace change

* refactor(sema): fix `started_postfix_expression` simpler

* refactor(sema): revert stale fix of `scope_depth`

* refactor(sema): comment the need of `final_position` at hand-out

* refactor(to_cpp1): drop periods from comment

* refactor(to_cpp1): reorder arguments for better formatting

* refactor(to_cpp1): remove stale non-rvalue context

* refactor(to_cpp1): remove useless non-rvalue context

* refactor(to_cpp1): clarifiy comment with example

* test: regenerate gcc-13 results

Commit 4eef0da
actually worked around #746.

* test: regenerate results

* refactor(sema): resolve CI issues

* test: revert changes to gcc-13 result

* refactor: generalize to `identifier_sym::safe_to_move`

* test: fix implementation of `identity`

* refactor(sema): add scoped indices of uses

* refactor(sema): simplify names of activation entities

* fix(sema): do not mark non-captures as captures

* refactor(sema): rename to avoid verb prefix

According to <#231 (comment)>:
> to avoid dealing with English verb-to-adjective conventions

* fix(sema): disable implicit move unsequenced with another use

* fix(sema): consider _is-as-expression_ too

* fix(sema): check all parameters for use

* refactor(sema): remove wrong fix for UFCS issue

* Minor updates to compile cppfront and the new test cleanly using MSVC

And re-remove a few stray `;` that were removed as part of PR #911 and now cause errors because of the pedantic builds

I regenerated `reflect.h` and found two changes that weren't already in the PR, so committing those too

Also including the new test file's run against MSVC showing the five messages mentioned in the PR review, see PR review for more...

* refactor(to_cpp1): pass `0` to `int` parameter instead of `false`

* test: discard unused results

* refactor(to_cpp1): avoid the UFCS macro to workaround compiler bugs

* test: update GCC 13 results

* test: remove Clang 18 results

* refactor(sema): avoid pointer arithmetic

* test: regenerate results

* refactor(sema): avoid pointer arithmetic

* Add regression test result diffs from my machine

MSVC error is resolved

New Clang 12 error in `pure2-last-use.cpp2:938`

* test: comment Clang 12 limitation

* test: apply CI patches

* test: apply CI patches

* test: apply CI patches

* Tweak: Change "final position" to "global token order"

* Minor tweaks

While I'm at it, clean up redundant headers that now we get from cpp2util.h
And it's updating those regression test line-ends...

---------

Signed-off-by: Johel Ernesto Guerrero Peña <[email protected]>
Signed-off-by: Herb Sutter <[email protected]>
Co-authored-by: Herb Sutter <[email protected]>
bluetarpmedia pushed a commit to bluetarpmedia/cppfront that referenced this issue Feb 7, 2024
* fix(sema): recognize last use immediately after declaration

* fix(sema): give function expression its own scope level

* fix(sema): consider use as function in UFCS call

* refactor(sema): remove stale comment

* test(sema): also add unit test for `move` parameter

* refactor(sema): avoid possible decrement beyond `begin`

* fix(sema): move `this` on last use

* test: update GCC 13 results

* test: add Clang 18 results

* test: add overloads with `that`

* test: add another case that doesn't work yet

* test: rename members to avoid shadowing

* fix(sema): move implicit `this` of member on last use

* test: remove commented cases that aren't last uses

* fix(sema): guard check for `move this` parameter once

* test: add case for destructor

* test: add missing GCC 13 result

* test: regenerate tests

* fix(sema): consider use as function in UFCS call

* fix(to_cpp1): type-scope variable initialization is not deferred

* refactor(util): add UFCS macros to move and forward

* fix(to_cpp1): emit last use of function in UFCS as part of macro name

* test: regenerate GCC 13 result

* fix(to_cpp1): exercise last use in range of for loop

* refactor(util): remove UFCS macros of invalid combinations

* refactor(to_cpp1): handle range of for loop specifically

* refactor(to_cpp1): consider the expression list range

* fix(sema): apply last use to call of member function

* fix(sema): do not move a member function

* fix(to_cpp1): do not move any member function

* test: remove non-problematic comment about double move

It just works, see <https://cpp1.godbolt.org/z/6fhjxrz65>.

* refactor: regenerate `reflect.h`

* refactor(sema): avoid decrementing before begin

* fix(sema): extend implicit else branch to containing statement

* fix(sema): increase scope to match an explicit else

* refactor(sema): move heuristic to not change ending depths

* fix(sema): pop consecutive implicit else branches

* refactor(parse): put implicit else statements in the actual else node

* test: add test cases that combine implicit and explicit access

* test: add unit test for all test cases as selections

* fix(reflect): fix initialization and regenerate

* fix(sema): apply sema phase to generated code

* test: regenerate results

* test: add test case for another limitation

* test: add another test case that fails

* fix(sema): improve heuristic to only move from last (implicit) `this`

* refactor(reflect): restore implicit `this` access

* fix(sema): do not stack an implicit else branch

* refactor: use "implicit scope" throughout

* fix(sema): use local `move this` to determine last use

* fix(sema): adjust condition for looking up to type

* fix(sema): use last move on called type-scope variable

* test: remove bogus comment

* fix(sema): correct unset `this.` case

* test: reference open issue

* test: add test cases not affected by the open issue

* test: clarify FIXME comment

* test: review the new tests

* perf: replace `std::map` with `mutable` data member

* refactor: add `symbol::get_final_position`

* refactor(to_cpp1): do not move single named return

* test: add new test cases to commented out test case

* test: add similar test case to callable case

* refactor(sema): rework some outdated whitespace

* test: regenerate results after "do not move single named return"

* feat(sema): diagnose unused variables in generate code

* fix(sema): consider variables in type scope declared later

* test: refactor new test to use `f_copy`

* refactor(sema): use member token

* refactor(sema): update comments

* refactor(sema): use the `this` parameter exclusively

* refactor(sema): update the comments

* refactor(sema): finish focusing on the implicit 'this'

* fix(to_cpp1): move returned uninitialized local

* fix(to_cpp1): move single named uninitialized return

* test: add case with member declared last

* refactor(sema): set condition for "at `this.`" correctly

* fix(sema): use the better lookup algorithm for this case

* fix(to_cpp1): stack current names only in namespaces

* fix(to_cpp1): stack current names of bases

* test: exercise forward in generated code

* test: add stand-in for `std::move_only_function`

* test: remove bogus test case

* refactor(to_cpp1): rename to `is_class_member_access`

* test: add test case showing limitations with base members

* test: actually exercise forward in generated code

* refactor(to_cpp1): reorder branch

* refactor(to_cpp1): remove outdated condition

* refactor(to_cpp1): rename to `is_class_member_access`

* fix: revert using the empty implicit else branch

Refs: e9cc033, 7f4a60f

* fix(sema): change algorithm to find last uses

* test: add test case for recursion logic

* refactor(sema): simplify condition for UFCS on member

* test: use `f_inout` and `f_copy` in earlier test cases

* test: enable commented out tests

* test: extend limitation unit test with alternative

* test: remove redundant explicit discards

* test: add more test cases for the new last use algorithm

* test: add missing `new<int>()`

* fix: regenerate `reflect.h`

* test: add test cases of discovered problems

* fix(sema): pop sibling branch conditions on found last use

* refactor(sema): localize variables

* fix(sema): recognize uses in iteration statements

* fix(sema): start from the last symbol, not past-the-end

* refactor(sema): add local type `pos_range`

* fix(sema): handle loops and (non) captures

* test: add similar case but without declaration

* test: regenerate results

* fix(reflect): update and regenerate `reflect.h`

* fix(sema): start for loop at its body

* refactor(sema): use `std::span`

* refactor(sema): register name deactivation at visit

* fix(sema): do not apply use to a hiding name

* fix(sema): skip hiding loop parameter

* test: revert `gcc-version.output`

* fix(sema): recognize use in _next-clause_

* test: add corner case

* fix(sema): recognize Cpp1 `using` declaration

* refactor(sema): avoid adding duplicate symbols

* refactor(sema): clarify similar members with comments

* refactor(sema): turn comment into code

* refactor(sema): modify local convenience variable

* refactor(sema): remove expression scope

* refactor(sema): use the right predicate

* refactor(sema): remove inactive, stale assertions

* refactor(sema): keep using a sentinel

* fix(sema): handle a nested true branch

* refactor(sema): revert whitespace change

* refactor(sema): fix `started_postfix_expression` simpler

* refactor(sema): revert stale fix of `scope_depth`

* refactor(sema): comment the need of `final_position` at hand-out

* refactor(to_cpp1): drop periods from comment

* refactor(to_cpp1): reorder arguments for better formatting

* refactor(to_cpp1): remove stale non-rvalue context

* refactor(to_cpp1): remove useless non-rvalue context

* refactor(to_cpp1): clarifiy comment with example

* test: regenerate gcc-13 results

Commit 4eef0da
actually worked around hsutter#746.

* test: regenerate results

* refactor(sema): resolve CI issues

* test: revert changes to gcc-13 result

* refactor: generalize to `identifier_sym::safe_to_move`

* test: fix implementation of `identity`

* refactor(sema): add scoped indices of uses

* refactor(sema): simplify names of activation entities

* fix(sema): do not mark non-captures as captures

* refactor(sema): rename to avoid verb prefix

According to <hsutter#231 (comment)>:
> to avoid dealing with English verb-to-adjective conventions

* fix(sema): disable implicit move unsequenced with another use

* fix(sema): consider _is-as-expression_ too

* fix(sema): check all parameters for use

* refactor(sema): remove wrong fix for UFCS issue

* Minor updates to compile cppfront and the new test cleanly using MSVC

And re-remove a few stray `;` that were removed as part of PR hsutter#911 and now cause errors because of the pedantic builds

I regenerated `reflect.h` and found two changes that weren't already in the PR, so committing those too

Also including the new test file's run against MSVC showing the five messages mentioned in the PR review, see PR review for more...

* refactor(to_cpp1): pass `0` to `int` parameter instead of `false`

* test: discard unused results

* refactor(to_cpp1): avoid the UFCS macro to workaround compiler bugs

* test: update GCC 13 results

* test: remove Clang 18 results

* refactor(sema): avoid pointer arithmetic

* test: regenerate results

* refactor(sema): avoid pointer arithmetic

* Add regression test result diffs from my machine

MSVC error is resolved

New Clang 12 error in `pure2-last-use.cpp2:938`

* test: comment Clang 12 limitation

* test: apply CI patches

* test: apply CI patches

* test: apply CI patches

* Tweak: Change "final position" to "global token order"

* Minor tweaks

While I'm at it, clean up redundant headers that now we get from cpp2util.h
And it's updating those regression test line-ends...

---------

Signed-off-by: Johel Ernesto Guerrero Peña <[email protected]>
Signed-off-by: Herb Sutter <[email protected]>
Co-authored-by: Herb Sutter <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants