Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

emitting multi-line messages from build scripts #13233

Open
ratmice opened this issue Jan 1, 2024 · 27 comments
Open

emitting multi-line messages from build scripts #13233

ratmice opened this issue Jan 1, 2024 · 27 comments
Labels
A-build-scripts Area: build.rs scripts C-feature-request Category: proposal for a feature. Before PR, ping rust-lang/cargo if this is not `Feature accepted` S-needs-design Status: Needs someone to work further on the design for the feature or fix. NOT YET accepted.

Comments

@ratmice
Copy link

ratmice commented Jan 1, 2024

Problem

#10159 discusses and #11312 has an implementation of cargo::error= and cargo::error+= for build scripts,
This breaks out the discussion of cargo::error+= from these as this aspect needs more design discussion.

Questions

  • For syntax, how do we handle escaping?
  • For syntax, how do we handled of unmatched (+= without a = or directives between = and +=, begin without end or double begins, etc)
  • When rendering, the existing warning (and presumable error) has a built-in prefix. If we don't include that here, then it would be inconsistent and users might forget. If we do include it, users might have problems aligning the first line with the second if they want.
  • Should we specify allowing ANSI escape codes for styling, regardless of platform support?
  • Should presence of an error automatically fail the build?

Proposed Solution

No response

Notes

No response

@ratmice ratmice added C-feature-request Category: proposal for a feature. Before PR, ping rust-lang/cargo if this is not `Feature accepted` S-triage Status: This issue is waiting on initial triage. labels Jan 1, 2024
@ratmice
Copy link
Author

ratmice commented Jan 1, 2024

I also have concerns with cargo::error+= and cargo::warning+=, and would prefer to see something harkening back to rust raw strings such as cargo::error#"=, cargo::error"#= and ditto for warnings or even cargo::error_start=, cargo::error_end= which avoids questions about whether they should behave like raw strings. This will avoid dropping errors if the first one is a cargo::error+= with no previous cargo error, or perhaps worse, appending to the one that happened to occur previous to it.

It at least looks like this could be possible with a little let mut mode_raw_diagnostic: Option<Diagnostic> outside the exec_with_streaming call. With that it would be possible to tell when a diagnostic fails to start or end a raw diagnostic, and append to the mode_raw_diagnostic rather than whatever happens to be the last element in the output.diagnostics in vector.

I should probably note that my first inclination when seeing cargo:error+= was whether I could abuse it to avoid printing Error: at the beginning of an error, because the errors I'm working with are already formatted by an error formatter which prefixes it too. As such it wouldn't surprise me if people tried to sneakily emit them without an initial cargo:error= and be confused when it emits one of the above spurious behaviors.

I previously posted this in the other thread, for which I've moved it here.

@epage epage changed the title emitting multi-line errors from build scripts emitting multi-line messages from build scripts Jan 2, 2024
@epage
Copy link
Contributor

epage commented Jan 2, 2024

I should probably note that my first inclination when seeing cargo:error+= was whether I could abuse it to avoid printing Error: at the beginning of an error, because the errors I'm working with are already formatted by an error formatter which prefixes it too. As such it wouldn't surprise me if people tried to sneakily emit them without an initial cargo:error= and be confused when it emits one of the above spurious behaviors.

This is one of the topics that needs to be worked out is how multi-line messages are rendered. For example, if cargo emits "error", etc, should there be an expectation that users be able to align later lines with the first, requiring Cargo to guarantee the column count used.

That along with all the semantics for how to handle edge cases. Id have to look up our meeting notes to see if there was anything else.

@epage
Copy link
Contributor

epage commented Jan 2, 2024

I also have concerns with cargo::error+= and cargo::warning+=, and would prefer to see something harkening back to rust raw strings such as cargo::error#"=, cargo::error"#= and ditto for warnings or even cargo::error_start=, cargo::error_end= which avoids questions about whether they should behave like raw strings. This will avoid dropping errors if the first one is a cargo::error+= with no previous cargo error, or perhaps worse, appending to the one that happened to occur previous to it.

imo that isn't raw string literal syntax because its on the left hand of the assignment and being too close but not quite the same will likely lead to errors.

We also likely don't want to support it after the assignment because that would change the interpretation of existing values.

@ratmice
Copy link
Author

ratmice commented Jan 2, 2024

Yeah, I definitely prefer something like cargo::error_start= cargo::error_end to the raw literal syntax, also these are probably not intended to be nested and so we shouldn't need to do # counting. But that was the initial inspiration, and error_start/error_end were later edited in.

the basic idea is these would look like:

Edit: The following snippet is a bad idea.

cargo::error_start=
foo
bar
cargo::error_end=

So it is enough to just take a string and println!("cargo::error_start=\n{}\ncargo::error_end=\n", s)
Then when it reads: cargo::error_start=\n, it calls sets some local let mut mode_raw_diagnostic = Some(String::new);
When it sees cargo::error_end=\n it diagnostics.push(if let Some(x) = mode_raw_diagnostic.take() { x} else { unbalanced }).

With this we know if it's None and we see cargo::error_end= they are unbalanced, and if it ends the loop with Some(_) it is also unbalanced. That is at least vaguely the idea, I skipped some details like the remainder of the start and end lines for simplicity.

I'm happy to drop the comparison to raw string literals.

@epage
Copy link
Contributor

epage commented Jan 2, 2024

Personally, I always like to have a form of escaping for that "just in case" situation where a delimiter needs to be included. The approach of raw string literals and markdown code fences of "do something more" to escape works well.

@epage epage added A-build-scripts Area: build.rs scripts S-needs-design Status: Needs someone to work further on the design for the feature or fix. NOT YET accepted. and removed S-triage Status: This issue is waiting on initial triage. labels Jan 2, 2024
@ratmice
Copy link
Author

ratmice commented Jan 2, 2024

An alternative that would allow us to avoid escaping cargo::multi_line_error=num_lines or even error.len() instead of num_lines which allows you to avoid counting the number of lines...

Because, to be honest I don't really see the need for escaping though. The only thing that could possibly need escaping is cargo::error_end at the start of a line, and you could just wrap quotes around it to keep it off the start or a zero width space.

Edit:
I think perhaps I now understand the need for escaping (are other cargo:whatever handled elsewhere outside that loop?).
Perhaps a more robust way worth considering is just combining the cargo::error+= with the start and end approach leading to something such as.

cargo::multiline_error_start=
cargo::multiline_error+=...
cargo::multiline_error_end=

@ratmice

This comment was marked as resolved.

@epage
Copy link
Contributor

epage commented Jan 3, 2024

That concern is not to clear to me. The build will fail, whether we use error or not. That alone would drive people away. I'd be more concerned about nags that are independent of --cap-lints but that is independent of this (can't remember if the warning messages are subjected to --cap-lints)

@ratmice
Copy link
Author

ratmice commented Jan 3, 2024

Yes, I totally misread that statement I quoted, as "Treat the script as having errored if it returns an error code." which was the behavior I believe was originally implemented for compatibility.

@kornelski
Copy link
Contributor

kornelski commented Jan 8, 2024

I'm not a fan of begin/end directives that allow raw unprefixed content in between, because that means there could be a line staring with cargo:something= and not be a Cargo directive. And creating an escaping syntax for this just to be able to include "\ncargo::multiline_error_end=" in the text seems like overkill, but OTOH not having escaping and having an impossible-to-print edge case seems inelegant.

I've worked on design and multiple implementations of the HTML5 text/event-stream protocol. It escapes multi-line strings with data: prefix, and that has proven to be be easy to implement: "data: " + string.replace("\n", "\ndata: "). This could work for Cargo too:

println!("cargo:error={}", err.to_string().replace("\n", "\ncargo:<some continuation syntax>="));

This design makes escaping mandatory and easy to get right.

An end directive isn't necessary. If every line of the content needs a prefix, then the first non-prefixed line or unrelated directive can imply an end. In this sense cargo:<some continuation syntax> won't mean "append to the last error wherever it was", but rather "previous line continued".

HTTP/1 headers have a similar continuation syntax (a line starting with a tab appends to previous line).

An orphaned cargo:<some continuation syntax> would be ignored. I don't think that's a big deal. That's just one case out of countless ways users could theoretically get cargo: directives syntax wrong, and in this case Cargo can reliably warn about the problem.

@ratmice
Copy link
Author

ratmice commented Jan 8, 2024

An orphaned cargo:<some continuation syntax> would be ignored. I don't think that's a big deal. That's just one case out of countless ways users could theoretically get cargo: directives syntax wrong, and in this case Cargo can reliably warn about the problem.

My problem with this is that an orphaned cargo::<some continuation syntax> will end up being appended some of the time (when there was a previous cargo::error), and being ignored when there wasn't one. I don't really consider that to be a very reliable warning.
It may not be obvious on terminal output, but more visible in things like cargo build --message-format json. It may not mean "append to the last error whatever it was", but unless cargo can enforce that, that is how it behaves.

I feel less strongly about the need for an end marker, but it makes enforcing balance easier,
and I don't feel like the following is adding unreasonable complexity to allow for that checking.

    println!(
        "cargo::<begin continuation syntax>={}\ncargo::<end continuation syntax>",
        err.to_string()
            .replace("\n", "\ncargo::<some continuation syntax>=")
    );

@kornelski
Copy link
Contributor

kornelski commented Jan 8, 2024

My problem with this is that an orphaned cargo:: will end up being appended some of the time (when there was a previous cargo::error)

This seems to be a very rare edge case, and I don't understand why this should be an issue worth designing against.

  • For this to happen, the build must already be failing, twice, so this won't be breaking anyone's working builds.
  • If it happens, and there's no other text/warning/directive/junk after the first error message, the code that incorrectly uses continuation will just append lines to the previous error message. That may be formatted in a clumsy way (in non-terminal clients), but there isn't even any data loss. Both error messages will reach the user.
  • If there is some junk after the previous error, and the second error is started incorrectly with continuation syntax, Cargo can identify the issue, display all the errors, and warn that the build script has this small bug.

So overall the problem you're trying to address seems completely benign, very unlikely to happen, and is just a small UI formatting issue at worst.

And remember the current status quo is a hard-to-read dump of all stdout and stderr without any structure. Even incorrectly run-on error line continuations are an improvement over the current unformatted state of the output. Please don't let perfect be the enemy of the good.

@ratmice
Copy link
Author

ratmice commented Jan 9, 2024

My feeling is that when faced with a pile of errors, the last thing one wants to be doing is debugging the error handling code in an external project... so it is a good idea to sus these issues out beforehand so if we can find a tolerable way to avoid these quirks before they land we should. Having a staring contest over this feature hasn't appeared to actually be getting us actually closer to landing anything, and I would like to see this feature happen.

@kornelski
Copy link
Contributor

In practice build scripts don't emit piles of errors. They typically panic, once.

@ratmice
Copy link
Author

ratmice commented Jan 9, 2024

Not necessarily when your build.rs is calling into a library which are parsing many files.
In some of my build scripts this does exactly as you say it chains about 3 lists of errors
together converting them into a single error panics there. I think cargo::error and this
appears to me as an alternative to the practice of panic'ing, since from what I understand
the intention is for it to error regardless of the return code of the build script.

So in practice typically that is the case, but not always and there are plenty of examples where it is not.

@kornelski
Copy link
Contributor

kornelski commented Jan 9, 2024

I don't think there's a risk of cargo:error replacing Result or panic in build-time libraries, because printing is an out-of-band asynchronous side effect that the caller can't act upon, and it makes a very sloppy library API.

FWIW, I originally proposed, and still prefer, that the status code of the build script should be the only way to fail the build. Cargo should flag printing of errors and returning a success code from the build as a programmer mistake (and downgrade these faux-errors to warnings or hide them and warn that the script is buggy).

Anyway, if a good scenario looked like this:

error: failed to parse file1
   because the file could not be found
error: failed to parse file2
   because the file could not be found
error: failed to parse file3
   because the file could not be found
error: failed to parse file4
   because the file could not be found
error:
   failed to compile the set of files due to parse errors

error: the build script failed because of 5 previous errors.

then the absolutely worst case of everyone always using cargo:error+=, would look like this:

error: failed to parse file1
   because the file could not be found
   failed to parse file2
   because the file could not be found
   failed to parse file3
   because the file could not be found
   failed to parse file4
   because the file could not be found
   failed to compile the set of files due to parse errors

error: the build script failed because of 1 previous error.

So I don't understand why this one particular minor issue deserves the focus. This is such a small detail, that even a contrived worst-case scenario is quite good!

There are infinitely many ways to print a directive wrong, and this is a fundamental flaw of Cargo's print-based API. A sloppy programmer could print Cargo:error= or cargo:errrrorr= or cargo:::error= or cargo=error:. All of these would all be way worse, because they would lose the error message. And this problem is not limited to the proposed error directive — it's a weakness of the API between build scripts and Cargo.

If this is really showstopping concern, then syntax variations can't do much about it. The whole interface of build scripts would be better replaced with a proper Rust one, e.g.:

extern crate cargo_build;

fn main() {
    cargo_build::abort(r"many
lines
and
checked
syntax!
");
}

@ratmice
Copy link
Author

ratmice commented Jan 9, 2024

The lists of errors in my case are coming from 3 different libraries. So there should be 3 error codes. What I would like to focus on is whether authors of crates should have any say in whether others can append to a crates error message.
the end of continuation marker allows a crate to say that the error is finished, and should no longer allowed to be appended to.

I agree, the output you list is good, because my errors though already contain an [Error] prefix, I looked at just using bare
cargo::error+=. But from the perspective of when do errors start and end, and ensuring that cargo outputs what the crate author intends so I think the person emitting cargo::errors should have the ability to format errors how they intend.

I think the actual worse case scenario is probably more something like:

error: failed to parse file1
   because the file could not be found
   --- Abraham Lincoln

As fun as that sounds, I'd rather not see us go that way.
Edit: removed some self deprecating humor to highlight that I see this as essentially a quoting problem.
Anyhow I think we've probably beaten this horse done and dead by now, it is a minor issue but one I would rather avoid.

@kornelski
Copy link
Contributor

start/end markers have worse failure modes, especially if user prints start, but no valid end.

If user prints `start`, but no `end` (e.g. due to last line not ending with a newline, or due to a panic when `unwrap()`ing some `println!` arguments, or a syntax error in the end directive), then the parser will be stuck in the error-text-capturing state, eating arbitrary other output. That may be a ton of unhelpful output from spawned `Command`s. If another error is printed after the unclosed one, then it will capture the first error and everything in between, potentially other `cargo:` directives too. A continuation syntax can't break so catastrophically, because every line opts in to being included.

A basic start/end markers can't be nested. If a library "helpfully" inserted them into their error message, then user trying to print it along with their own context will have their context truncated and lost. It's possible to support nesting or escaping to fix that, but would add even more syntax to get wrong, and the concern here is about users who can't even print += correctly.

cargo:error= is consistent with pre-existing cargo:warning=, and such a natural extension of it that people may just guess it exists. There's no precedence for a being/end syntax in Cargo.

Currently cargo directives can be parsed with a simple stateless 1-liner like str.lines().filter_map(|x| x.strip_prefix("cargo:")). It works with grep too, and may be handy when integrating multiple build tools. A stateful parser is more fiddly to implement, especially if it supports nesting or escaping.

Regarding presentation of cargo:error=, the content should not contain its own [Error]: prefix or similar. Cargo (or an IDE) would decorate these messages according to its own rules. Even in terminal output this matters for getting (ANSI) coloring right.

@ratmice
Copy link
Author

ratmice commented Jan 10, 2024

If there is an unterminated end marker, then cargo can warn at the end when it sees that there is an unterminated because the current continuation is Some, that should catch this error during development when the developer outputs it. similarly if current continuation is None when cargo sees a cargo::error+= cargo knows there is is a missing start marker.

I don't think it should handle nesting, so it should if the current continuation is Some and it sees a start print a warning that they are unbalanced. Neither do I think we should try to handle escaping.

match (current_error_continuation, error_cont_prefix) {
    (None, "cargo::<error continuation begin syntax>") =>  current_error_continuation = Some(...),
    (Some(Error) "cargo::<error continuation end syntax>) => current_error_continuation.take(),
    (Some(s), "cargo::<error continuation append syntax>") =>  diagnostics.push(s.push_str("\n{}", ...))
    _ => warn about an unbalanced continuation. 
}

// when all done
if let Some(_) = current_error_continuation {
  warn about an unbalanced continuation.
}

I don't see it being super fiddly, just a few cases and a catchall, and one check at the end to see if there was an error left unconsumed...
But indeed it does add state where there current is no need for it (which I agree is unfortunate, thanks for pointing this out). We need not unhelpfully gobble up other cargo: directives though, if that is the behavior wanted. I think it would be, because of the unterminated marker issue you highlight.

If cargo warns about these cases they can be caught by developers. While in the start, += case they cannot.
and with the addition of cargo::<error continuation append> syntax, raw unprefixed strings don't get appended to the current error.

I think we have probably been talking past one another this whole time, because I think the += syntax you recommend is good, I just want to add an end marker to it. In the beginning I had said we didn't need +=, for all the reasons you mention it is a bad idea not to have such an append directive.

If for whatever reason this becomes untenable, cargo can just completely ignore cargo::<error continuation end syntax> and implement like as your branch as though cargo::<continuation start syntax> was an alias for cargo::error. So I don't think we have anything to lose by trying to restrict what we accept here.

@kornelski
Copy link
Contributor

An empty line can act as a terminator for the continuation syntax. That follows naturally from the fact that continuation lines have to follow a line they can continue, so any other line will work, and an empty line is the simplest case.

@ratmice
Copy link
Author

ratmice commented Jan 10, 2024

I would tend towards preferring a directive, but it isn't a big deal to me whatever is settled upon.
Edit: however it seems to me that a newline is one of the most likely things to be present in whatever is output after a unterminated error continuation, likely defeating the purpose of checking...

@ratmice
Copy link
Author

ratmice commented Jan 10, 2024

I'm not confident I understand exactly what you mean anymore, could you give an example?

@kornelski
Copy link
Contributor

kornelski commented Jan 10, 2024

Your worry is that printing wild cargo:error+= will append to some previous error, like so:

cargo:error= first error line1
cargo:error+= first error line2
cargo:error+= second error

and it would be prevented by requiring a terminator:

cargo:error-start=
cargo:error= first error line1
cargo:error= first error line2
cargo:error-end=
cargo:error+= second error # or another start

I suggest that any non-continuation line, including an empty line, can be the terminator, like so:

cargo:error= first error line1
cargo:error+= first error line2

cargo:error+= second error

This is enough to detect that the second error is an invalid syntax (it's not immediately after an error directive), without introducing any special syntax for the terminator itself. It also looks fine when reading the raw log (when debugging the build script or getting verbose dump). Any line will work, this too:

cargo:error= first error line1
cargo:error+= first error line2
cargo:rerun-if-changed=x # this is not an error directive more, so the first error is complete and done now
cargo:error+= second error

or this

cargo:error= first error
cargo:error+= first error
thread 'now processing the second error' panicked at src/main.rs:1232:15:
assertion `left == right` failed
  left: "Hello"
 right: "world!"
cargo:error+= second error

and because any irrelevant line works as a terminator, then it's literally impossible to get it wrong and end up with an unterminated state, e.g.

cargo:error-start=
cargo:error= first error line1
cargo:error= first error line2
cagro:errorr-end= 

thread 'now processing the second error' panicked at src/main.rs:1232:15:
assertion `left == right` failed
  left: "Hello"
 right: "world!"
cargo:error-start=
cargo:error= second error
cargo:error-end=

In this case the typo in the terminator generated a message like:

first error line1
first error line2
cagro:errorr-end= 

thread 'now processing the second error' panicked at src/main.rs:1232:15:
assertion `left == right` failed
  left: "Hello"
 right: "world!"

@ratmice
Copy link
Author

ratmice commented Jan 10, 2024

I guess my only concern with bare newlines as terminator is just that it is likely to be output by something random, and completely unrelated to the error potentially allowing an unterminated error to be accepted. However it does make the raw log look nice.

I.e. in the final check

// when all done
if let Some(_) = current_error_continuation {
  warn about an unbalanced continuation.
}

It seems very likely that this check could become irrelevant because of any random newline. So I would tend towards explicit personally, but I don't want to be difficult so if that is the thing so be it?

@kornelski
Copy link
Contributor

kornelski commented Jul 18, 2024

I've re-read the discussion, and think the points still stand, so this issue needs an executive decision to pick one way or another.

(I'm not impartial in this discussion, so I don't think I can make a fair summary)

@epage
Copy link
Contributor

epage commented Jul 18, 2024

First, it would be good to provide a summary of each option for easy comparison, rather than trying to piece together the "final states" from the thread.

Second, syntax is only one problem to be solved. I've not gone back to the meeting notes but I've added to the Issue several design points that are of concern.

@kornelski
Copy link
Contributor

Some solution is really needed, includingcargo:warning too.

I've tried improving output of openssl-sys build script, but the current formatting of cargo:warning makes it noisy. Building openssl-sys is a frequent point of failure, so the crate wants to give verbose help to users.

Having warning prefix message with the crate name is great for a single warning, but for multi-line output it becomes repetitive and takes quite a bit of line length:

warning: [email protected]: Could not find directory of OpenSSL installation, and this `-sys` crate cannot
warning: [email protected]: proceed without this knowledge. If OpenSSL is installed and this crate had
warning: [email protected]: trouble finding it,  you can set the `OPENSSL_DIR` environment variable for the
warning: [email protected]: compilation process.
warning: [email protected]: 
warning: [email protected]: Make sure you also have the development packages of openssl installed.
warning: [email protected]: For example, `libssl-dev` on Ubuntu or `openssl-devel` on Fedora.
warning: [email protected]: 
warning: [email protected]: If you're in a situation where you think the directory *should* be found
warning: [email protected]: automatically, please open a bug at https://github.com/sfackler/rust-openssl
warning: [email protected]: and include information about your system as well as this message.
warning: [email protected]: 
warning: [email protected]: $HOST = aarch64-apple-darwin
warning: [email protected]: $TARGET = aarch64-apple-darwin
warning: [email protected]: openssl-sys = 0.9.103

openssl-sys has these things to output:

  • turns a pkg_config error into a warning (not included in the above). This is a multi-line output, with a long error message + help text.
  • an error that openssl-sys failed. This is a long-ish error + multi-line help and troubleshooting info.

If it was something more like rustc errors, I presume it'd be formatted like:

warning: [email protected]: Could not run `PKG_CONFIG_PATH=1 PKG_CONFIG_ALLOW_SYSTEM_CFLAGS=1 pkg-config --libs --cflags openssl`
  The pkg-config command could not be found.

help: [email protected]: Most likely, you need to install a pkg-config package for your OS.
  Try `brew install pkg-config` if you have Homebrew.

  If you've already installed it, ensure the pkg-config command is one of the
  directories in the PATH environment variable.

  If you did not expect this build to link to a pre-installed system library,
  then check documentation of the openssl-sys crate for an option to
  build the library from source, or disable features or dependencies
  that require pkg-config.

error: [email protected]: Could not find directory of OpenSSL installation, and this `-sys` crate cannot
  proceed without this knowledge. If OpenSSL is installed and this crate had
  trouble finding it,  you can set the `OPENSSL_DIR` environment variable for the
  compilation process. 

help: [email protected]: Make sure you also have the development packages of openssl installed.
  For example, `libssl-dev` on Ubuntu or `openssl-devel` on Fedora.
  
  If you're in a situation where you think the directory *should* be found
  automatically, please open a bug at https://github.com/sfackler/rust-openssl
  and include information about your system as well as this message.
  
  $HOST = aarch64-apple-darwin
  $TARGET = aarch64-apple-darwin
  openssl-sys = 0.9.103

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-build-scripts Area: build.rs scripts C-feature-request Category: proposal for a feature. Before PR, ping rust-lang/cargo if this is not `Feature accepted` S-needs-design Status: Needs someone to work further on the design for the feature or fix. NOT YET accepted.
Projects
None yet
Development

No branches or pull requests

3 participants