Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Newlines as whitespace text #19

Open
RReverser opened this issue Sep 28, 2014 · 27 comments
Open

Newlines as whitespace text #19

RReverser opened this issue Sep 28, 2014 · 27 comments

Comments

@RReverser
Copy link
Contributor

Currently contents like

<b>{this.title}</b> <span>123</span>

are converted to

React.DOM.b(null, this.title), " ", React.DOM.span(null, "123")

which is logical and expected since whitespace might change the layout (i.e. inside element with white-space: pre and in other cases).

However:

<pre>
    <b>{this.title}</b> <span>123</span>
</pre>

is converted to

React.DOM.pre(null, 
    React.DOM.b(null, this.title), " ", React.DOM.span(null, "123")
)

so whitespaces across newlines are inserted only in code, but not treated as regular text, which means we loose them completely in generated layout.

Shouldn't we treat them as text too?

@syranide
Copy link
Contributor

This is entirely intentional, facebook/react#480.

Summary: Indenting/beautifying code should never affect the outcome, especially as it means all indentation has to carry over into the final code and cannot be minified away (yuck!). While there are a handful of cases where you might want the newlines to affect outcome, those are far better handled explicitly by {' '} or {'\n'}.

PS. This belongs in React as sebmarkbage believes whitespace rules are best left outside of the spec.

@RReverser
Copy link
Contributor Author

PS. This belongs in React as @sebmarkbage believes whitespace rules are best left outside of the spec.

Ah, if we leave this outside the spec, then fine. Because for implementations like @vjeux / JSXDOM, preserving all the whitespaces is critical IMO as it's expected to replicate HTML behavior (and this is not currently supported by esprima-fb even as an additional option).

Though, even for React things like {'\n'} in JSX look ugly personally to me, and even worse that easily-gzipped whitespace overhead in generated JS. Reminds dark middle-age times of React when we really had to use {' '} for every single space that needs to be preserved.

@syranide
Copy link
Contributor

I can count the number of times I've had to use {'\n'} on one hand (and I still prefer it, it's explicit; explicit > ugly) and the thousands of times it would have otherwise inserted unwanted white-space. If you're trying to insert tons of white-space: pre-text use a string or template string even, problem solved.

I think the fact that it is easily gzipped is irrelevant, it shouldn't be there, and it has to be reproduced in the DOM which has a very real performance cost. On the order of one TextNode for every element, that's not acceptable. Not being able to structure code sensibly but having to worry about affecting layout/performance is rediculous, regardless of whether that's how HTML behaves or not.

HTML sucks in many ways (and sometimes because it has to), there's no reason to keep the bad things when we have all of JavaScript at our disposal to solve practical problems in better ways.

@RReverser
Copy link
Contributor Author

I think the fact that it is easily gzipped is irrelevant, it shouldn't be there

I thought it was the reasoning why you didn't want to see it there. Can't see any other meaningful reasons.

it has to be reproduced in the DOM which has a very real performance cost

Not really - text nodes are lightweight and fast to insert/render + this part will be usually static and not changed in time.

HTML sucks in many ways (and sometimes because it has to), there's no reason to keep the bad things when we have all of JavaScript at our disposal to solve practical problems in better ways.

IMO we are building tools that allow to work with HTML DOM in really simple and fast way. But we shouldn't change semantics of HTML on our own only because we don't like some of it's parts.

One of strong parts of React that many teams see is that layout guy (the one that knows only HTML/CSS and saw some templating solutions) can easily open any React component and work with JSX block as with regular layout, just being told to use className instead of class, htmlFor instead of for and {...} for templating.

Just that simple.

He shouldn't rack his brain on why layout is different from what he wrote in that block and definitely shouldn't google how to insert extra whitespace in the layout (it's not just about white-space: pre - we don't even have single whitespace inserted between elements that are on different lines).

@syranide
Copy link
Contributor

He shouldn't rack his brain on why layout is different from what he wrote in that block and definitely shouldn't google how to insert extra whitespace in the layout (it's not just about white-space: pre - we don't even have single whitespace inserted between elements that are on different lines).

Pretty much every single project I've ever worked on has used a post-process for stripping white-space out, it's has worked fantastically to get consistently pixel-perfect layouts, across all browsers. Pretty much every out-sourced HTML-file I've received that hasn't had it applied has had magic numbers and offsets tweaked by designers to compensate for implicit white-space. Worse than that, the offsets introduced by unwanted implicit white-space are font-size/line-height dependent, there is no excuse for that.

What if he didn't want white-space inserted between these?

<span className="count">10</span>+
<span className="extra">15</span>/
<span className="total">40</span>=
<span className="total">25</span>

Oh right:

<span className="count">10</span>+<span className="extra">15</span>/<span className="total">20</span>=<span className="total">20</span>

But if he really does:

<span className="count">10</span>{' + '}
<span className="extra">15</span>{' / '}
<span className="total">40</span>{' = '}
<span className="total">25</span>

This is much more flexible and precise, you can have as many spaces (or tabs, newlines) before and after as you want. You can realistically even use white-space: pre-wrap as a default style for your entire app, there is no unwanted white-space. HTML offers none of this unless you write everything on a single line.

IMO we are building tools that allow to work with HTML DOM in really simple and fast way. But we shouldn't change semantics of HTML on our own only because we don't like some of it's parts.

We've added valueLink, changed the meaning of value, onChange, etc. <textarea> uses value instead of children. style is an object with camelCase-keys, not a CSS-string (and they're not equivalent), float should be used instead of cssFloat and classSet will probably make an appearance in the future. All attributes are camelCased. JSX requires <input /> instead of <input>. The list goes on.

The ReactDOM-implementation is tailored to everyone's best interests and I'm certain it will continue to be. AFAIK there is no written goal of implementing HTML DOM 1:1 and React diverted away from that day 1. If that's what someone really wants, then just fork JSX or add your own JSX preprocessor (the current JSX white-space rules are completely compatible). To my knowledge no one has and no one has complained.

@RReverser
Copy link
Contributor Author

Pretty much every single project I've ever worked on has used a post-process for stripping white-space out, it's has worked fantastically to get consistently pixel-perfect layouts, across all browsers. Pretty much every out-sourced HTML-file I've received that hasn't had it applied has had magic numbers and offsets tweaked by designers to compensate for implicit white-space.

Looks like me and you had really different experience and projects.

What if he didn't want white-space inserted between these?

No need to put everything on one line - it's easy to:

<span className="count">10</span>+{
}<span className="extra">15</span>/{
}<span className="total">40</span>={
}<span className="total">25</span>

Very explicit, just like you like, but turned into different direction :) And no need to put any special escaped characters inside.

The list goes on.

Most of these are really React-specific extensions and yes, some of them have same names as HTML attrs so need to be learned specifically. Things like valueLink are more about JS part and have nothing to do with HTML itself, and classSet is also separate convenient addition that doesn't break anything.

Still, let's get abstracted from React - I've created issue in JSX because I'm currently more interesting in spec for other implementations and parsers, and don't really want to spend time discussing React here :)

If @sebmarkbage says that we just keep whitespace handling outside the spec, then I'm fine with closing this issue and continuing own work with preserved whitespaces. But I want to follow same spec as close as possible, so if whitespace handling used in React is meant to be standard for any JSX implementations, then would be nice to add this explicitly to spec and remove it in own project, too.

@syranide
Copy link
Contributor

I know that at least all of these suffer from it:

<div style="font-size: 100px">
  <span style="font-size: 10px; background: #ff0000">Button #1</span>
  <span style="font-size: 10px; background: #00ff00">Button #2</span>
</div>

These two doesn't even render the same:

<div style="font-size: 100px">
  <span style="font-size: 10px; background: #ff0000">
    Button #1
  </span>
  <span style="font-size: 10px; background: #00ff00">
    Button #2
  </span>
</div>
<div style="font-size: 100px">
  <div style="font-size: 10px; background: #ff0000; display: inline-block">Button #1</div>
  <div style="font-size: 10px; background: #00ff00; display: inline-block">Button #2</div>
</div>
<div style="font-size: 100px">
  <img src="error.jpg" width="10" height="10" style="background: #ff0000">
  <img src="error.jpg" width="10" height="10" style="background: #00ff00">
</div>

Note that lack of explicit padding in the code, also tweak the outer font-size and watch.

PS. They are not made up fantasy edge-cases, you find them everywhere.

@sebmarkbage
Copy link
Contributor

See #6 for why white-space behavior is unspecified. Basically, the fact that we have this controversy here is the primary reason to leave it out of the spec for now.

It's kind of inconvenient that this is underspecified because this tends to be built into the parser rather than the transformation step. However, for many use cases this doesn't matter. E.g. syntax highlighters doesn't care.

Perhaps we need to document WHY white-space isn't specified?

I'd recommend that we open up an issue on React where we can discuss this in the context of React specifically (which doesn't involve JSXDOM and other use cases). There are more interesting aspects of this when a composite component has to reason about the extra white-space.

@RReverser
Copy link
Contributor Author

See #6 for why white-space behavior is unspecified.

Oh, didn't find it on my own. Sorry for that.

It's kind of inconvenient that this is underspecified because this tends to be built into the parser rather than the transformation step.

Exactly. That's why I proposed preserving whitespaces on parser level - this would be suitable as for transformers that do care about preserving them (yay JSXDOM!) as well as for transformers that intentionally don't (React's jstransform can easily strip parsed whitespaced that contain \n).

Perhaps we need to document WHY white-space isn't specified?

Yeah, even this would be helpful. At least others could find information on same question on the main page instead of opening another issue.

I'd recommend that we open up an issue on React where we can discuss this in the context of React specifically (which doesn't involve JSXDOM and other use cases).

Might make sense. As I said previously, at this moment I'm interested in this question on spec+parser level, abstracted from React, but if you create one, please /cc me, would be interesting to track and/or participate.

@syranide
Copy link
Contributor

Exactly. That's why I proposed preserving whitespaces on parser level

AFAIK it is preserved by the parser, it's the JSX transform that strips it. Although we discussed (and actually agreed on) moving it to the parser instead in facebook/react#1514 facebookarchive/esprima#19, as HTML-entities should be decoded after stripping white-space and it actually makes most sense to do both in the parser.

@RReverser
Copy link
Contributor Author

@syranide Sorry, I might not remember details of topic being discussed ~2 months ago but didn't we discuss trimming whitespaces on ends of content and not whitespaces in between?

@garthk
Copy link

garthk commented May 18, 2015

I'd like to see whitespace inserted on all newlines that aren't between adjacent tag boundaries, before the first non-whitespace text in a tag, or after the last non-whitespace text in a tag. It's least surprising, and (unless I'm missing something) can be done without harm to the list of edge cases above.

Consider:

<p>
a
b
c
<p>

yields:

React.createElement("p", null, 
"a" + ' ' +
"b" + ' ' +
"c"
)

Given that, I'm shocked when the spaces disappear when I replace one line in the text with a tag:

<p>
a
<a>b</a>
c
</p>

yields:

React.createElement("p", null, 
"a", 
React.createElement("a", null, "b"), 
"c"
)

What harm am I missing, apart from possibly lending moral support to an HTML feature @syranide calls "horrible" in facebook/react#480, that would come if we preserved whitespace for the following newlines?

  • between text and begin-tags, e.g. text\n<div>
  • between end-tags and text, e.g. </div>\ntext

EDIT: I dimly recall getting annoyed about whitespace being preserved before some tag inside an anchor, once. I'm not sure it's more annoying than having to scatter {' '} everywhere. More examples may persuade me otherwise.

@syranide
Copy link
Contributor

@garthk I'm not a fan of whitespace rules for text at all, to lend some more context #28 (and #8 #35).

... would come if we preserved whitespace for the following newlines?

If you were to have that there would no longer be a way not to have whitespace between text and tags (except for making the entire string an expression), whereas the current way allows you to easily add whitespace you think it should be there and you'll also be explicit for others benefit. The verbosity and side-effects of {' '}is undesirable though (hence why I'd like to see #28 or whatever) but the alternative is far worse in my opinion.

Inserting whitespace between text lines makes quite a lot of practical sense, it's almost certainly what you wanted as you break on space when lines get too long and if you don't like it you can just put {} at the end and it's gone. Whereas foo\n<tag>bar</tag> having a whitespace but not <b>foo</b>\n<tag>bar</tag> feels ambigious and means that using newlines to structure your code affects the outcome of it (which is part of what I think is horrible).

I think it's important to remember, especially now with React-Native, that HTML whitespace rules makes absolutely no sense for frontends other than HTML, JSX is not a markup language.

tl;dr I don't like the whitespace rules, but IMHO it's the best compromise. I think an alternative (not strictly HTML) approach is necessary.

@ghost
Copy link

ghost commented Aug 4, 2015

Thank you for reporting this issue and appreciate your patience. We've notified the core team for an update on this issue. We're looking for a response within the next 30 days or the issue may be closed.

@LucaColonnello
Copy link

Hello guys,
almost 3 years from the last comment on this issue, we have the same problem and we are still wondering what to do on the transformation step.

The problem though is real babel/babel#7360 and affects other parsers as well microsoft/TypeScript#22186.

We really need to have a proper solution for this. It doesn't have to address all the issues above (I appreciate the downsides of a solution over another) but we need a direction so that at least all the parsers are following the spec.
As @sebmarkbage said, this is unspecified, but it would be really helpful to have a spec for it.

A suggestion I would give, is to implement the same behavior as the HTML standard does.

TL;DR

Don't strip out whitespaces and tab if they are following text nodes.

Text before newline and tabs

screen shot 2018-02-28 at 10 50 46

There's some text before the newline, so the browser carry over the newline and tabs.

Tag before newline and tabs

screen shot 2018-02-28 at 10 51 25

There's no text before the newline, only the strong tag, so the browser treat it as indentation and strips eventual newlines, whitespaces and tabs out.


In both cases the browser renders the whitespace, which means that we only need to transform to what is standard on the HTML spec in order to reach the expected behavior on the renderer layer.

I'm aware that JSX is not only for HTML, but I found sensible that if there's a text before the newline, the following characters (newlines, whitespaces and tabs) are considered as part of the above text.

@LucaColonnello
Copy link

LucaColonnello commented Feb 28, 2018

Also, note the HTML spec mentioning the same here https://www.w3.org/TR/html4/appendix/notes.html#notes-line-breaks

@Strate
Copy link

Strate commented Jul 25, 2018

Just ran into this issue trying to use native bootstrap 4 and react :(

@cancerberoSgx
Copy link

@syranide Thanks! the trick with {' '} did work! I however don't agree. Most of you are assuming that everybody targets HTML DOM or a visual medium with stylesheets.

In my case I'm targeting the command line and there we don't have much more than spaces for positioning.

Even with some layouts implementations, I want to implement a text area like or widget without my users to be forced to manually write those {' '} . Is not a problem for the en user but it is for document authors.

BTW HTML makes exceptions for these cases in text area and pre, which seems to be reasonable / useful. Any idea idea of how to support that ? My only idea now is to preprocess the JSX to add the {' '} since I don't have any control on that part of the code generation. Or there is some way of hooking in there ?

Thank you very much.

@syranide
Copy link
Contributor

syranide commented May 14, 2019

@cancerberoSgx JSX inline strings IMHO are only really well-suited to small snippets of text/labels. If you want something more specific or extensive then I recommend putting it all into a string, and maybe not even as JSX-children but as a prop. I.e. <FooBar>{" fooo " + index + "baaar "}</FooBar> or <FooBar text={" fooo " + index + "baaar "} />, or you can use ES6 template-strings if you want nicer interpolation or multi-line strings (and don't mind the indentation).

PS. If it was up to me I would probably not have made inline JSX-strings a thing, but required some kind of JSX-element for those too. They will always be opinionated, it's the only way they make sense.

@cancerberoSgx
Copy link

@syranide Aha, thanks for your view, My relationship with jsx lately was trying to implement renderers targeting different mediums (like a string, latex, CLI, etc and in general those mediums are just buffers / strings.

As a language, that has a duality with JavaScript / TypeScript I think is important not to limit its design to accommodate to most used use cases. For me is not just another template language, is an Object representation alternative syntax! specially useful for visual ouptuts.

And in that matter, I disagree with you: JSX.TextNode is the most practical and unique kind of Node that JSX introduces. It support cases that cannot be supported in that natural way. (even string templates where you need to wrap and escape but worst , you cannot represent structure in a string literal).

Imagine the user is authoring a book, or article or news paper and is forced to use strings or attribtues for that !!!. JSX.TextNode is the natural way of accomodate such content and no other kind of node in JS/TS/JSON is better for that. The other kind of nodes, like JSXElement, TagName, JSXAttributes, for me, have more straight forward equivalence with JavaScript node kinds, but JSXTextNode, from the point of view of authors has unique capability syntax.

Just wanted to give this opinion since I'm surprised how lightly the argument is to justify (defacto) messing with JSX.TextNode and not providing any means of configuration. JSX.TextNode also is important for parent-children semantics declarations. WOULD BE sad that the language ends up being just elements with attributes because of HTML developers peacefulness (and I one of them). Then I prefer to represent that pattern with just JS objects... those were designed for that and not for text.

@syranide
Copy link
Contributor

Imagine the user is authoring a book, or article or news paper and is forced to use strings or attribtues for that !!!. JSX.TextNode is the natural way of accomodate such content and no other kind of node in JS/TS/JSON is better for that.

@cancerberoSgx It could also be a matter of <Foobar> Hello </Foobar> vs something like <Foobar>{' Hello '}</Foobar>. As the second relies on JS-strings with boundary markers it's very precise (or rather, it's just an expression), and you would interpolate strings like you normally do.

But the first thing to realize I would say is that JSX is not a syntax for writing blogs or articles, it's for structuring data, static button labels, etc. It's just an extension to make it less cumbersome and more readable than what you can otherwise achieve with the existing JS syntax. Content data should be stored in a database or such and loaded into React at run-time. Also, as you develop you generally tend to want to support different languages, so then even button labels become function calls and you pretty much stop using JSX inline strings entirely.

IMHO, JSX is not HTML, JSX is an extension of JS, which is about code, not arbitrary text representation. It's about providing a more readable syntax for specifying a type + data. If you want to encode markdown in JS today you need to put it in an ES6 template-string and process it at run-time, or use some markdown-file transform, this is no different.

you cannot represent structure in a string literal).

ES6 template-strings support that though and will encode all white-space contained within it, for better or worse. It's the right tool for the job if you want more exact run-time control.

Just wanted to give this opinion since I'm surprised how lightly the argument is to justify (defacto) messing with JSX.TextNode and not providing any means of configuration.

It's not lightly, it's actually me who designed the white-space rules way back. The fact is that when you evaluate the possibilities it essentially boils down to two sensible paths.

  1. Either you keep all white-space, indentation, newlines, etc. From beginning to end. However, this is problematic, because not only is indentation then encoded into the generated code. Trimming white-space then becomes a run-time concern (which would actually be quite complicated and expensive to get right) and you will end up with tons and tons of white-space-only strings everywhere in the final code, and worse run-time behavior would be affected by code structure (i.e. indentation and newlines). Also, this would have a universally detrimental effect on not only code-size, but also performance.

  2. Trim all white-space and treat inline strings more like tokens. Then you get a minimal representation which is well-suited to basic text representation, i.e. labels and such. I've advocated for removing the HTML-entity parsing rules and using JS escape sequences instead, but there's been little interest or movement on that it seems. That's one way to "solve" the white-space issues by way of just writing \ or such for forcing leading/trailing white-space and you could then explicitly insert newlines with e.g. \n. Not perfect, but all use-cases would be reasonably covered, but there doesn't exist a perfect catch-all solution, there's always a trade-off.

As JSX is a compile-time transform there is really no sensible means of configuration without resorting to directives, but directives are messy. But if you want to have a babel-jsx plugin with different white-space rules then by all means.

@vjeux
Copy link
Contributor

vjeux commented May 15, 2019

If you’re writing a book, MDX might be more suited than straight JSX. https://mdxjs.com

@cancerberoSgx
Copy link

cancerberoSgx commented May 15, 2019

@syranide thanks, yes I was also thinking about ways of customize this behavior and came to the same conclusion that since the transformation happens at compile time it cannot be accomplished at runtime, and the only way is with a custom JSX factory. Perhaps would be better than the compile time transformation leave the text as it is and the runtime reconciler takes care of removing the spaces or whatever text processing is needed, at runtime. This way users are able to configure the behavior at runtime instead of having to implement a new compile-time transformation (which is expensive, not because the implementation but also because you need your users to use that too). thanks again.

@nik72619c
Copy link

Hey folks,
Saw one wierd catch in the jsx, but not sure if its relevant to this or not
Note:- <spacebar> means a whitespace applied using spacebar key in the keyboard

<span>
  <span>this is text1<spacebar></span>
  <span>and text2</span>
</span>

renders 👉🏻 this is text1 and text2

But,

<span>
  <span>
    this is text1<spacebar>
  </span>
  <span>
    and text2
  </span>
</span>

renders 👉🏻 this is text1and text2

Could someone please guide me on this?
@syranide @RReverser @sebmarkbage 🙂
Thanks!

@syranide
Copy link
Contributor

syranide commented Feb 7, 2021

@nik72619c That is correct, all lines are trimmed. Since there are no wrapping elements in the second example the whitespace at the end gets trimmed. Here's the explanation of the algorithm: facebook/react#480 (comment)

You can force a whitespace with {' '}, {'this is text1 '} or similar.

@nik72619c
Copy link

nik72619c commented Feb 7, 2021

ah got it 👍
Thanks @syranide !
Just to clarify - by "wrapping element" we mean the closing tag in the same line (which is absent in the 2nd example) right?

@syranide
Copy link
Contributor

syranide commented Feb 7, 2021

@nik72619c Anything really, simply put, like the explanation says, leading and trailing white-space is trimmed (from the point-of-view of the source code lines). So white-space that is surrounded by anything (text, tag, braces) will be kept, otherwise it will be removed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

10 participants