Template string syntax highlighting (release-1.4) #1477

derekcicerone-zz · 2014-12-13T20:32:36Z

I'm playing with the new support for template strings in release-1.4 and seeing potential issues with syntax highlighting.

In a single-line template string it looks like the string is classified as a keyword.

In a multi-line template string like this:

var template = `/hello/
/world/`;

It looks like the "/hello/" is classified as a keyword and the "/world/" as a regexp.

Sorry if this is just not implemented yet. I looked for an issue first but didn't see one.

mhegazy · 2014-12-14T05:15:06Z

The issue here is that there are two classifiers, one lexical and one syntactic.

The lexical classifier is meant to be fast and stateless, something you run on the UI thread as the user is typing and get a result in ten milliseconds or so. It works by only scanning and looking at the lexeme kind. This works well most of the time, but it cannot handle context-sensitive constructs that require parsing, such as "var" used as an object literal property name: without knowing you are inside an object literal, it gets classified as a keyword.
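To make the "var" case concrete, here is a minimal sketch of a token-only classifier. It is a toy model written for illustration, not the compiler's classifier; the `classifyWordsLexically` function, the `WordClassification` type, and the tiny keyword list are all hypothetical.

```typescript
// Toy model only: this is not the TypeScript compiler's classifier.
interface WordClassification {
    text: string;
    classification: "keyword" | "identifier";
}

// Hypothetical, deliberately tiny keyword list.
const KEYWORDS: string[] = ["var", "function", "return"];

// Classify each word by looking at the word alone, with no parse tree.
function classifyWordsLexically(line: string): WordClassification[] {
    const words: string[] = line.match(/[A-Za-z_$][A-Za-z0-9_$]*/g) || [];
    return words.map((text): WordClassification => ({
        text,
        classification: KEYWORDS.indexOf(text) >= 0 ? "keyword" : "identifier",
    }));
}

// The second "var" is a property name, but a token-only view cannot tell,
// so both occurrences come back as "keyword".
console.log(classifyWordsLexically("var x = { var: 1 };"));
```

A classifier with access to the parse tree would see that the second "var" sits in a property-name position and could classify it differently.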

The other classifier is syntactic. This is something you run on a background thread with a looser time limit, say under 100 ms, so you can afford to parse and inspect the parse tree. This one is the accurate one, as it has all the needed information.

We are now adding incremental parsing support that should make parsing an active file fast enough for the syntactic classifier to be the only classifier you need.

CyrusNajmabadi · 2014-12-14T06:13:57Z

Also, we intend to have a dedicated thread for doing syntactic classification. That should help make sure you get correct classification in very reasonable times.

mhegazy · 2014-12-14T06:45:04Z

Forgot to add the details:

The lexical classifier is on the ts.Classifier interface, accessible through:
createClassifier(...).getClassificationsForLine()

The syntactic classifier is on the ts.LanguageService interface, accessible through:
createLanguageService(...).getSyntacticClassifications()

For completeness, though not related to this issue, there is a third kind of classifier, the semantic classifier: ts.LanguageService.getSemanticClassifications. This one gives you more information about identifiers, e.g. whether they are classes, interfaces, modules, etc. It uses the semantic state of the program along with the parse tree to figure out what identifiers really mean.

derekcicerone-zz · 2014-12-14T09:20:55Z

Cool, thanks for the context (I have a much better sense of what each classifier does now). I'm currently calling getClassificationsForLine() (this is for the TypeScript plugin). I'm kinda curious, why are template strings treated differently from old school strings by the lexical classifier? Is there something specific about template strings that makes them different from normal strings in terms of syntax highlighting?

DanielRosenwasser · 2014-12-14T09:50:26Z

Hey @derekcicerone, I actually had some work going on with the lexical classifier for template strings (template strings were mostly my recent work), but for the reasons @mhegazy mentioned, the experience was not up to par.

Just to elaborate for the sake of documentation, there are actually 4 types of template literal tokens: TemplateHead, TemplateMiddle, TemplateTail, and NoSubstitutionTemplateLiteral. The last one is a basic literal that looks like a string. The other three are components of a TemplateExpression. As soon as we have a TemplateHead, we start parsing out these TemplateSpans, which are pairs of Expressions with either a TemplateMiddle or TemplateTail. This is entirely motivated by the ES6 spec, and I encourage you to take a look into it. Just to make sure we're on the same page:

| Syntax Kind | Example |
| --- | --- |
| `NoSubstitutionTemplateLiteral` | `` `Hello world` `` |
| `TemplateHead` | `` `abc ${ `` |
| `TemplateMiddle` | `} def ${` |
| `TemplateTail` | `` } ghi ` `` |
| `TemplateExpression` | `` `abc ${ 0 } def ${ 1 } ghi` `` |

So for all practical purposes, all our lexical classifier can really do is recognize a TemplateHead and a NoSubstitutionTemplateLiteral. The other template tokens only come out when, in certain context-sensitive positions, the lexer is asked to deliver tokens with a different "lexical goal" in mind (which we achieve with our rescan* family of functions). Those two tokens, however, can be fully accounted for without rescanning, because no other token starts with a backtick.

While you can see that the lexical classifier uses some basic heuristics to enhance the classifications it delivers per token, we try to keep it as simple as possible while still delivering a decent experience. Knowing how to differentiate between a CloseBraceToken and the beginning of a TemplateMiddle or TemplateTail is a little complex - we'd really rather not maintain a stack across lines (which is also very error-prone given the way in which the lexical classifier operates).
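To illustrate the kind of bookkeeping this would take, here is a minimal sketch of a brace stack that tells a plain closing brace apart from one that resumes a template. It is a toy written for illustration, not how the compiler does it: `classifyCloseBraces` and `BraceContext` are hypothetical names, and the sketch ignores string literals, comments, and regular expressions entirely.

```typescript
// Toy model only; names are hypothetical and strings/comments are ignored.
type BraceContext = "block" | "templateSubstitution";
type CloseBraceKind = "CloseBraceToken" | "TemplateMiddleOrTail";

// For every "}" that appears in code position, report what it really is.
function classifyCloseBraces(text: string): CloseBraceKind[] {
    const stack: BraceContext[] = [];
    const result: CloseBraceKind[] = [];
    let inTemplateText = false;
    let pos = 0;
    while (pos < text.length) {
        const ch = text.charAt(pos);
        if (inTemplateText) {
            if (ch === "`") {
                inTemplateText = false; // template ends, back to code
                pos++;
            } else if (ch === "$" && text.charAt(pos + 1) === "{") {
                stack.push("templateSubstitution"); // "${" opens a substitution
                inTemplateText = false;
                pos += 2;
            } else {
                pos++;
            }
            continue;
        }
        if (ch === "`") {
            inTemplateText = true;
        } else if (ch === "{") {
            stack.push("block");
        } else if (ch === "}") {
            if (stack.pop() === "templateSubstitution") {
                result.push("TemplateMiddleOrTail");
                inTemplateText = true; // the brace resumes the template text
            } else {
                result.push("CloseBraceToken");
            }
        }
        pos++;
    }
    return result;
}

console.log(classifyCloseBraces("`a${ { b: 1 }.b }c`"));
```

In a line-at-a-time classifier, both `stack` and `inTemplateText` would have to be carried from the end of one line to the start of the next, which is the state described above as something to avoid.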

Even so, you might ask, "Well, why can't I at least get my TemplateHeads and NoSubstitutionTemplateLiterals?" The problem is that a TemplateTail ends on a backtick. For example:

`Hello${ ' ' }world!`;

Here's the stream of tokens you'd get for that line without rescanning: TemplateHead, StringLiteral, CloseBraceToken, Identifier ("world"), ExclamationToken, NoSubstitutionTemplateLiteral.

Wait, what? NoSubstitutionTemplateLiteral? Yup! We can't contextually know that the CloseBraceToken should have been the start of the TemplateTail, so the closing backtick looks like the start of a new template string at the end of the line. Now the rest of your file is temporarily "poisoned" to be classified as a template string, much in the same way that a /* will turn your entire file into a comment - except that we'd be _wrong_ in this instance.
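The token stream above can be reproduced with a small toy scanner that never rescans. This is a sketch for illustration only, not the compiler's scanner: `scanWithoutRescan` is a hypothetical name, it handles just the characters needed for this example, and it assumes (as in the description above) that an unterminated template still comes out as a NoSubstitutionTemplateLiteral.

```typescript
// Toy scanner for illustration; it only knows the tokens used in the example.
type Kind =
    | "TemplateHead" | "NoSubstitutionTemplateLiteral" | "StringLiteral"
    | "CloseBraceToken" | "Identifier" | "ExclamationToken"
    | "SemicolonToken" | "Unknown";

function scanWithoutRescan(text: string): Kind[] {
    const kinds: Kind[] = [];
    let pos = 0;
    while (pos < text.length) {
        const ch = text.charAt(pos);
        if (ch === " ") {
            pos++;
        } else if (ch === "`") {
            // A backtick always starts a template; scan to "${" or a backtick.
            // Running off the end of the line also yields this kind.
            let kind: Kind = "NoSubstitutionTemplateLiteral";
            pos++;
            while (pos < text.length) {
                if (text.charAt(pos) === "`") { pos++; break; }
                if (text.charAt(pos) === "$" && text.charAt(pos + 1) === "{") {
                    kind = "TemplateHead";
                    pos += 2;
                    break;
                }
                pos++;
            }
            kinds.push(kind);
        } else if (ch === "'") {
            pos++;
            while (pos < text.length && text.charAt(pos) !== "'") { pos++; }
            pos++; // skip the closing quote
            kinds.push("StringLiteral");
        } else if (/[A-Za-z_$]/.test(ch)) {
            while (pos < text.length && /[A-Za-z0-9_$]/.test(text.charAt(pos))) { pos++; }
            kinds.push("Identifier");
        } else {
            // With no context, "}" is always a plain CloseBraceToken.
            kinds.push(ch === "}" ? "CloseBraceToken"
                : ch === "!" ? "ExclamationToken"
                : ch === ";" ? "SemicolonToken"
                : "Unknown");
            pos++;
        }
    }
    return kinds;
}

// Ends with NoSubstitutionTemplateLiteral: the closing backtick is taken
// as the start of a new, unterminated template.
console.log(scanWithoutRescan("`Hello${ ' ' }world!`;").join(", "));
```

Because the final token swallows the semicolon and runs off the end of the line, a line-based classifier would carry "inside a template" into the following lines.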

Given that we're aiming to improve our syntactic classifier to subsume the need for fast-acting lexical classifications, it's not worth agitating our users by making the entire file flash to a different color when they type what should be a TemplateTail.

Hope that answers your question!

derekcicerone-zz · 2014-12-15T04:34:10Z

Whoa, that's really cool - thanks for the thorough explanation!

DanielRosenwasser · 2015-01-20T02:22:06Z

@derekcicerone, you may be interested in my comment on #1698.

In short, you can get very basic support if you cherry-pick e42ce9c.

derekcicerone-zz · 2015-01-20T02:36:40Z

Ah cool, thanks for the tip!

DanielRosenwasser · 2015-02-13T22:09:55Z

A fix is now in master through #2026.

derekcicerone-zz · 2015-02-13T22:30:48Z

Awesome, thanks for the fix! I will check this out shortly.

Related question: what is the proper design for using the 3 classifiers? It seems like the lexical classifier can be used synchronously but I'm guessing the semantic/syntactic classifiers need to be used asynchronously. I can add those to some existing async logic for showing occurrences, finding errors as the user is typing etc. I was wondering though, do I call both the syntactic and semantic classifiers? Only one? I'm not really clear on when each is supposed to be used. Any advice would be greatly appreciated!

derekcicerone-zz · 2015-02-13T22:38:37Z

This looks great! Much better! Thanks!

DanielRosenwasser · 2015-02-14T01:37:59Z

@basarat might also be interested in this for TypeStrong/atom-typescript#71.

You can use the lexical classifier synchronously, and to ensure speed, you probably should, but it ultimately depends on your use case and platform - it's a matter of fine-tuning.

Ideally, you'll have the other two classifiers running concurrently as well (or you can let them take over entirely at some point). As long as you prioritize the classifications of tokens in a certain way such that some can be appropriately overwritten by others, the ordering won't matter in practice. Syntactic classifications should always overwrite lexical classifications, and semantic classifications will overwrite some syntactic classifications (but only for identifiers).
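As a rough sketch of that prioritization, the merge below keeps, for each token start, the span from the highest-priority classifier, so the order in which results arrive does not matter. Everything here is hypothetical and editor-side: `ClassifiedSpan`, `PRIORITY`, and `mergeClassifications` are not part of the TypeScript API, and the sketch assumes spans from the different classifiers line up on the same token starts.

```typescript
// Hypothetical editor-side types; these are not the compiler's own.
type Source = "lexical" | "syntactic" | "semantic";

interface ClassifiedSpan {
    start: number;
    length: number;
    classification: string;
    source: Source;
}

// Higher numbers win: syntactic beats lexical, semantic beats syntactic.
// The semantic classifier only reports identifiers, so in practice it
// only ever overrides identifier spans.
const PRIORITY: { [S in Source]: number } = { lexical: 0, syntactic: 1, semantic: 2 };

// Assumes spans from different classifiers share token start offsets.
function mergeClassifications(spans: ClassifiedSpan[]): ClassifiedSpan[] {
    const byStart: { [start: number]: ClassifiedSpan } = {};
    for (const span of spans) {
        const existing = byStart[span.start];
        if (existing === undefined || PRIORITY[span.source] >= PRIORITY[existing.source]) {
            byStart[span.start] = span;
        }
    }
    return Object.keys(byStart)
        .map(Number)
        .sort((a, b) => a - b)
        .map((start) => byStart[start]);
}

// The syntactic result arrives before the lexical one here, and still wins.
const merged = mergeClassifications([
    { start: 15, length: 9, classification: "string", source: "syntactic" },
    { start: 15, length: 9, classification: "keyword", source: "lexical" },
    { start: 4, length: 8, classification: "identifier", source: "lexical" },
]);
console.log(merged.map((s) => s.start + ":" + s.classification).join(", "));
```

Keying on the start offset keeps the sketch short; an editor with spans that do not align would need a proper interval merge instead.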

The semantic classifier can be run on the same thread as any other semantically-aware feature (e.g. completions lists), and the syntactic classifier can also be run on the same thread as any syntactically aware feature - though you may want to play around with this; syntactic classifications should also be as fast as possible, so we actually run it on a separate thread.

Glad to hear it's working well for you so far - let me know if you have any other questions/suggestions!

derekcicerone-zz · 2015-02-14T01:54:13Z

Cool, thanks for the explanation!

mhegazy added and removed the By Design label Dec 14, 2014

DanielRosenwasser closed this as completed Dec 14, 2014

This was referenced Jan 16, 2015

String templates not colorized in VS 2013 #1698

Closed

Template strings and keyword colorization #1716

Closed

derekcicerone-zz mentioned this issue Jan 26, 2015

Better coloring for template strings palantir/eclipse-typescript#222

Closed

DanielRosenwasser mentioned this issue Feb 2, 2015

Template string aren't recognized by classifier #1884

Closed

This was referenced Feb 11, 2015

Would be great if multiline strings were categorized differently #2005

Closed

Advanced Dynamic Grammar TypeStrong/atom-typescript#71

Closed

DanielRosenwasser added the Fixed and Bug labels and removed the By Design label Feb 13, 2015

derekcicerone-zz mentioned this issue Feb 14, 2015

Implement syntactic/semantic classification support palantir/eclipse-typescript#228

Open

DanielRosenwasser mentioned this issue Jun 25, 2015

Strange colourisation of template strings when "#/" present after first placeholder #3630

Closed

AndrewJakubowicz mentioned this issue Feb 11, 2017

Unable to parse files with template strings AndrewJakubowicz/ts-depDraw#4

Open

microsoft locked and limited conversation to collaborators Jun 18, 2018
