-
Notifications
You must be signed in to change notification settings - Fork 12.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Template string syntax highlighting (release-1.4) #1477
Comments
The issue here is that the there are two classifiers, one lexical and one syntactic. The lexical classifier is ment to be fast and stateless, something you run on the UI thread as the user is typing and get a result in ten milliseconds or so. The way this works is by only scanning, and looking at the lexim kind. This works well most of the time, but it can not get context sensitive constructs that require parsing, such as "var" in an object literal property name, without knowing you are inside an object literal it can be a keyword. The other classifier is syntactic. This is something that you run on a background thread with a lesser time limitation, say less than a 100 ms so you can afford to parse, and inspect the parse tree. This one is the accurate one as it had all the needed information. We are now adding incremental parsing support that should make syntactic parsing on an active file fast enough to be your only classifier you need. |
Also, we intend to have a dedicated thread for doing syntactic classification. That should help make sure you get correct classification in very reasonable times. |
Forgot to add the details: The lexical classifier on ts.Classifier interface, accessible through: The syntactic classifier is on the ts.LanguageService interface accessible through: For completeness, though not related to this issue, there is a third kind of classifiers: semantic classifier. ts.LanguageService.getSemanticClassifications, this one will give you more information about semantic classification for identifiers, e.g are they classes, interfaces, modules.. etc. this one uses the semantic state of the program along with the parse tree to figure out what identifiers really mean. |
Cool, thanks for the context (I have a much better sense of what each classifier does now). I'm currently calling getClassificationsForLine() (this is for the TypeScript plugin). I'm kinda curious, why are template strings treated differently from old school strings by the lexical classifier? Is there something specific about template strings that makes them different from normal strings in terms of syntax highlighting? |
Hey @derekcicerone, I actually had some work going on with the lexical classifier for template strings (as template strings were mostly my recent work), and for reasons @mhegazy mentioned, the experience was not very up to par. Just to elaborate for the sake of documentation, there are actually 4 types of template literal tokens:
So for all practical purposes in our lexical classifier, all we can really do is recognize a While you can see that the lexical classifier uses some basic heuristics to enhance the classifications it delivers per token, we try to keep it as simple as possible while still delivering a decent experience. Knowing how to differentiate between a So even so, you might say "Well why can't I at least get my `Hello${ ' ' }world!`; Here's the stream of tokens you'd get for that line without rescanning: Wait, what? Given that we're aiming to improve our syntactic classifier to subsume the need for fast-acting lexical classifications, it's not worth agitating our users by making the entire file flash to a different color for them when you enter what should be a Hope that answers your question! |
Whoa, that's really cool - thanks for the thorough explanation! |
@derekcicerone, you may be interested in my comment on #1698 (comment). In short, you can get very basic support if you cherry pick off e42ce9c. |
Ah cool, thanks for the tip! |
A fix is now in |
Awesome, thanks for the fix! I will check this out shortly. Related question: what is the proper design for using the 3 classifiers? It seems like the lexical classifier can be used synchronously but I'm guessing the semantic/syntactic classifiers need to be used asynchronously. I can add those to some existing async logic for showing occurrences, finding errors as the user is typing etc. I was wondering though, do I call both the syntactic and semantic classifiers? Only one? I'm not really clear on when each is supposed to be used. Any advice would be greatly appreciated! |
This looks great! Much better! Thanks! |
@basarat might also be interested in this for TypeStrong/atom-typescript#71. You can use the lexical classifier synchronously, and to ensure speed, you probably should, but it ultimately depends on your use case and platform - it's a matter of fine-tuning. Ideally, you'll have the other two classifiers running concurrently as well (or you can let them take over entirely at some point). As long as you prioritize the classifications of tokens in a certain way such that some can be appropriately overwritten by others, the ordering won't matter in practice. Syntactic classifications should always overwrite lexical classifications, and semantic classifications will overwrite some syntactic classifications (but only for identifiers). The semantic classifier can be run on the same thread as any other semantically-aware feature (e.g. completions lists), and the syntactic classifier can also be run on the same thread as any syntactically aware feature - though you may want to play around with this; syntactic classifications should also be as fast as possible, so we actually run it on a separate thread. Glad to hear it's working well for you so far - let me know if you have any other questions/suggestions! |
Cool, thanks for the explanation! |
I'm playing with the new support for template strings in release-1.4 and seeing potential issues with syntax highlighting.
In a single-line template string it looks like the string is classified as a keyword.
In a multi-line template string like this:
It looks like the "/hello/" is classified as a keyword and the "/world/" as a regexp.
Sorry if this is just not implemented yet. I looked for an issue first but didn't see one.
The text was updated successfully, but these errors were encountered: