Change raw string literal syntax: [#]*" represents single-line string and [#]*''' represents block string #1360

Merged

Conversation

SlaterLatiao
Contributor

@SlaterLatiao SlaterLatiao commented Jun 30, 2022

Requiring """ to be used only for block strings avoids confusion.
Update: Use " for simple string literals and ''' for block string literals.

@SlaterLatiao SlaterLatiao added the proposal A proposal label Jun 30, 2022
@SlaterLatiao SlaterLatiao requested a review from a team June 30, 2022 22:54
@SlaterLatiao SlaterLatiao force-pushed the raw_string_literal_proposal branch from b741148 to bb43d88 Compare June 30, 2022 22:54
@SlaterLatiao SlaterLatiao changed the title Change raw string literal syntax Change raw string literal syntax: enforcing """ to be block string Jun 30, 2022
@SlaterLatiao SlaterLatiao changed the title Change raw string literal syntax: enforcing """ to be block string Change raw string literal syntax: requiring """ to only be used for block strings Jul 1, 2022
@SlaterLatiao SlaterLatiao added the proposal rfc Proposal with request-for-comment sent out label Jul 1, 2022
@SlaterLatiao SlaterLatiao marked this pull request as ready for review July 1, 2022 00:31
proposals/p1360.md (resolved)
proposals/p1360.md (outdated, resolved)
proposals/p1360.md (resolved)
@jonmeow
Contributor

jonmeow commented Jul 1, 2022

I want to note that another thing we sometimes do is edit the design in the same PR as the proposal, i.e., edit string_literals.md and make corresponding adjustments for this proposal. That can be considered a substitution for "details", as the design is what people are typically referring to. It may make sense to do here when you respond to my prior comments.

@SlaterLatiao SlaterLatiao removed the proposal rfc Proposal with request-for-comment sent out label Jul 6, 2022
@SlaterLatiao SlaterLatiao changed the title Change raw string literal syntax: requiring """ to only be used for block strings Change raw string literal syntax: block strings start with [#]+""" and [#]+""" only used for block strings Jul 7, 2022
@SlaterLatiao SlaterLatiao requested a review from a team as a code owner July 7, 2022 22:31
jonmeow
jonmeow previously requested changes Jul 9, 2022
Contributor

@jonmeow jonmeow left a comment

See comment -- there is still a miscommunication on how to handle simple block string literals.

@SlaterLatiao SlaterLatiao changed the title Change raw string literal syntax: block strings start with [#]+""" and [#]+""" only used for block strings Change raw string literal syntax: [#]*""" always starts a block string Jul 11, 2022
@jonmeow jonmeow self-requested a review July 19, 2022 10:19
Contributor

@jonmeow jonmeow left a comment

Basically LGTM on my end, just a little more example tuning.

docs/design/lexical_conventions/string_literals.md (outdated, resolved)
docs/design/lexical_conventions/string_literals.md (outdated, resolved)
@SlaterLatiao SlaterLatiao requested a review from jonmeow July 21, 2022 05:09
@jonmeow jonmeow dismissed their stale review July 21, 2022 11:48

LGTM, but leaving approval for lead

@lexi-nadia

I'm not sure what the scope of this PR is (maybe it's narrower than I'm getting at), but could the alternatives section be updated to describe the current C++ behavior? If we're discarding that -- which doesn't have this problem -- it seems worth stating why.

@SlaterLatiao
Contributor Author

Discussed C++ and Swift behaviors in alternatives section.

docs/design/lexical_conventions/string_literals.md (outdated, resolved)
proposals/p1360.md (outdated, resolved)
[easy to read, understand, and write](/docs/project/goals.md#code-that-is-easy-to-read-understand-and-write),
because it avoids confusion on the type of certain string literals.

## Alternatives considered
Contributor

@chandlerc chandlerc Aug 4, 2022

I think there are other alternatives we should at least document here...

We could allow a non-escape based disambiguation that focuses on making the opening marker not ambiguous between single-line and block. I can imagine two ways of doing this.

  • a) We could allow some non-quote marker after the open quote similar to C++. Either #"(")"# or #"#"#"# come to mind.

  • b) We could use different quotes in various ways that allow: #'"'#.

    • b1) /#+'/ could be an additional way to start a single-line string literal, without removing anything
    • b2) /#+'/ could always start a single-line string literal, and /#+"/ could always start a block string literal

These all seem to have problems though that should be covered in the alternatives considered.

(a) IMO hurts readability and makes it harder to address a complaint of a collision by adding a(nother) set of #s around the contents. The first time, you'd have to modify things more than that. It also loses the visual connection of the "s surrounding the quoted text.

(b) Adds a lot of complexity no matter how we slice it, and doesn't fully remove a lurking surprise in the lexical structure.

Fundamentally, (a), (b), (c) this proposal, and (d) the status quo are moving around where there is a surprise, but not eliminating it.

The source of the surprise is that the introducing syntax for single-line string literals is a prefix of the introducing syntax for multiline string literals, including in the raw case. The result is that we don't know which we're seeing until it's too late to avoid surprises.
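A minimal sketch of that prefix problem, written in Python purely for illustration. It fixes one level of raw-ness (a single #), and the names `candidate_kinds`, `STATUS_QUO`, and `OPTION_E` are hypothetical helpers, not part of any Carbon tooling. It asks which literal kinds are still possible after a lexer has consumed a given run of characters:

```python
# Hypothetical tables of opening delimiters at one level of raw-ness (one '#').
# STATUS_QUO models the syntax before this PR; OPTION_E models alternative (e).
STATUS_QUO = {"single-line": '#"', "block": '#"""'}
OPTION_E = {"single-line": '#"', "block": "#'''"}


def candidate_kinds(seen, openers):
    """Return the literal kinds still consistent with the characters seen so far."""
    return {
        kind
        for kind, opener in openers.items()
        # A kind stays possible while `seen` is a prefix of its opener,
        # or its opener has already been fully consumed.
        if opener.startswith(seen) or seen.startswith(opener)
    }


print(sorted(candidate_kinds('#"', STATUS_QUO)))
print(sorted(candidate_kinds('#"', OPTION_E)))
```

With the status-quo table, both kinds remain possible after `#"`, and they still do after `#"""`. With the option (e) table, the first quote character already decides the kind.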

But there is yet another alternative, suggested by @zygoloid after I gave a long series of bad suggestions; let's call it (e).

In (e), we change block string literals to always use ''', never """. Everything else stays the same. If/when we have character literals of any form, we reject ''; when we see '' we know the next character must be ' and it is a block string literal. Raw block string literals are #+'''. Etc. Python already allows ''' as well, although it is rarely used to my knowledge.

With (e), #"""# is the same as "\"" and doesn't require anything special to recognize. You can even write #""""A python string here!""""#.
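The equivalence can be checked with a rough Python sketch of how a single-line literal might be scanned under (e). This is not Carbon's lexer. `lex_simple_string` is a hypothetical name, the only escape handled is an escaped double quote, and the sketch assumes that escapes in a raw literal are introduced by a backslash followed by the same number of #s as the opening delimiter:

```python
def lex_simple_string(source):
    """Return the contents of one single-line literal under option (e).

    Illustrative sketch: `source` must be exactly one literal, and the only
    escape sequence handled is an escaped double quote.
    """
    hashes = len(source) - len(source.lstrip("#"))
    if source[hashes:hashes + 1] != '"':
        raise ValueError("not a single-line string literal")
    closer = '"' + "#" * hashes        # e.g. "# for one level of raw-ness
    escape = "\\" + "#" * hashes       # assumed escape introducer, e.g. \#
    out = []
    i = hashes + 1
    while i < len(source):
        if source.startswith(escape + '"', i):
            out.append('"')            # escaped quote contributes one quote
            i += len(escape) + 1
        elif source.startswith(closer, i):
            if i + len(closer) != len(source):
                raise ValueError("unexpected characters after the literal")
            return "".join(out)
        else:
            out.append(source[i])      # ordinary character, including a bare quote
            i += 1
    raise ValueError("unterminated string literal")


raw = lex_simple_string('#"""#')
escaped = lex_simple_string('"\\""')
print(raw == escaped)
```

Because the closing delimiter of a raw literal is a quote followed by the matching #s, a bare quote inside the literal is ordinary content, so no lookahead is needed to decide between single-line and block.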

For teaching, this even seems nice -- it's a "triply quoted" string instead of a "triple-double quoted" string. ;]

WDYT? Both @zygoloid and I quite like this, but we should check with @KateGregory as well.

Contributor

Just FYI, it seems all leads are happy here, and I haven't heard huge concerns. It is a bit inventive, but I think folks are roughly comfortable, so let's consider the path forward.

Contributor

@jonmeow jonmeow Aug 4, 2022

Per discussion, I believe the intent is to also disallow """ (triple double quote), rather than interpreting it as two string literals. This would help reduce the confusion with Python (where ''' is supported, but """ is recommended under PEP 257).
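For context on the Python comparison: Python accepts both triple-quote forms and treats them as the same kind of literal, so a reader coming from Python may expect either spelling to work. A small Python check (the variable names are only for illustration):

```python
# Both spellings produce the same multi-line string in Python.
single_quoted = '''line one
line two'''
double_quoted = """line one
line two"""

print(single_quoted == double_quoted)
```

Under the rule discussed here, only the ''' spelling would introduce a block string, which is where the difference from Python would show up.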

Of course, #""" is still a confusability issue -- but I think it sounds like the leads want to support that use-case?

Contributor

Just to note, I think it'd be good to go through the discussed options and some advantages and disadvantages of each in the alternatives. If you're trying to find a good example of how to do this, something like the alternatives from one of these:

Explicitly calling out advantages and disadvantages:
https://github.com/carbon-language/carbon-lang/blob/trunk/proposals/p0257.md#alternatives-considered

Using prose to talk about options, but a separate section for each significant option:
https://github.com/carbon-language/carbon-lang/blob/trunk/proposals/p0722.md#alternatives-considered

Contributor Author

Added discussed options to the alternatives.

@SlaterLatiao SlaterLatiao changed the title Change raw string literal syntax: [#]*""" always starts a block string Change raw string literal syntax: [#]\*" represents single-line string and [#]\*''' represents block string Aug 8, 2022
@SlaterLatiao SlaterLatiao changed the title Change raw string literal syntax: [#]\*" represents single-line string and [#]\*''' represents block string Change raw string literal syntax: [#]*" represents single-line string and [#]*''' represents block string Aug 8, 2022
Contributor

@zygoloid zygoloid left a comment

Looks good to me.

docs/design/lexical_conventions/string_literals.md (outdated, resolved)
Contributor

@zygoloid zygoloid left a comment

Leads are happy with this.

Labels
documentation An issue or proposed change to our documentation proposal rfc Proposal with request-for-comment sent out proposal A proposal
5 participants