Proposal: Ecosystem: CoffeeScript in Prettier #92

GeoffreyBooth · 2017-11-25T07:12:30Z

An oft-requested improvement to the CoffeeScript ecosystem is support for the language in Prettier. Our own @lydell is also a maintainer of that project, so I asked him what would be required to make it happen. He boiled it down to two major tasks:

Produce a detailed abstract syntax tree (AST)

Something would need to produce a JSON representation of the nodes of the abstract syntax tree (AST). An AST represents every syntactic part of a program as a tree of typed nodes, such as AssignmentExpression; the site astexplorer.net has great examples. You can see a simplified version of CoffeeScript’s AST by running coffee --nodes test.coffee. A fuller version can be seen by going to http://asaayers.github.io/clfiddle/ and clicking the AST tab, then one of the nodes in the tree.

Since the CoffeeScript compiler itself already has the --nodes option, it seems logical to me to extend it to produce this JSON-based output. Currently the Node API for the coffeescript module doesn’t support a nodes option, so we could add one, and have its output be plain JavaScript objects that could be JSON.stringify’ed.

That wouldn’t be the end of the job, however. We would also need to ensure that this AST output is complete, with the same amount of information as the original source code, such that you could reconstruct the original source using nothing but this AST. In the CoffeeScript compiler, some simplifications are made at the lexer stage, before the nodes get generated: numbers lose their original 0x, 0o or 0b prefix (if any), whitespace is lost in multiline strings, multiline regexes are turned into a RegExp() call, etc. Either these changes would need to be refactored to happen in nodes.coffee, or additional detail about the node would need to be saved as a property on the node (like we currently tack on the source map location data or comments). The goal is that this JSON representation of the source code could then be used to output new source code, formatted as Prettier deems it should be formatted. Which leads us to:
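To make the "save added detail as a property on the node" option concrete, here is a minimal sketch of a number literal that keeps its original text alongside its value. The `raw` property and both function names are hypothetical, chosen for illustration; they are not part of the CoffeeScript compiler.

```javascript
// Hypothetical sketch: a number literal node that keeps its original text.
// `raw`, `makeNumberLiteral` and `printNumberLiteral` are illustrative names,
// not compiler API.
function makeNumberLiteral(sourceText) {
  return {
    type: 'NumberLiteral',
    // Numeric value after interpreting any 0x / 0o / 0b prefix.
    value: Number(sourceText),
    // Exact text as written, so a printer can reproduce the prefix.
    raw: sourceText,
  };
}

function printNumberLiteral(node) {
  // Prefer the preserved source text; fall back to the numeric value.
  return node.raw !== undefined ? node.raw : String(node.value);
}

const node = makeNumberLiteral('0x2A');
console.log(node.value);               // 42
console.log(printNumberLiteral(node)); // 0x2A
```

Without `raw`, a printer could only emit 42 and the hexadecimal form chosen by the author would be lost.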

Write a CoffeeScript code generator

Once a JSON version of the AST is available, we’ll need some function that takes it as input and produces a string of CoffeeScript source code as output. You’ve probably seen one of these already: js2coffee takes an AST produced by a JavaScript parser and creates CoffeeScript source code from those nodes. The function that does this is called a code generator, and js2coffee’s is here. With dependencies, it’s over a thousand lines of code. There’s one other CoffeeScript code generator that I’m aware of, cscodegen produced by the CoffeeScriptRedux effort, but it hasn’t been updated since 2012.

Prettier is itself a code generator. If it were to support CoffeeScript, a new code generator would need to be written as part of Prettier itself. Within the Prettier codebase, the code generators for supported languages are in src/printer*.js. One code generator supports all of JavaScript plus TypeScript and Flow, and it’s plain printer.js. It’s 5,000 lines of code. Writing a similar generator for CoffeeScript might not be much simpler, but you would be able to use js2coffee and cscodegen’s codebases as reference (not to mention Prettier’s JavaScript code generator) so you’re not starting from scratch.

So . . .

I would be willing to tackle the first task, outputting a detailed JSON AST, if one or more volunteers were up for the second task. Does anyone desire CoffeeScript support in Prettier strongly enough to invest the time in writing a quality CoffeeScript code generator?


GeoffreyBooth · 2017-11-25T07:15:17Z

By the way, either of these tasks is also an investment in the extensibility of CoffeeScript in the future. The js2coffee project was possible in the first place because JavaScript has several excellent parsers that produce detailed ASTs. If js2coffee’s CoffeeScript code generation part could be replaced with a better code generator, js2coffee would be able to support the latest CoffeeScript features (and be more adaptable to future improvements). Coffeelint would be capable of greater things if it had a better AST to work with. And on and on.

CoffeeScriptRedux got so many things right, it’s a shame it never got completed. One of its insights was that code generation should be its own module that takes an AST as input. (It supported both cscodegen to generate CoffeeScript and escodegen to generate JavaScript.) This is also how Babel works, with babel-generator taking an AST and producing JavaScript. This modularity is one of the keys to Babel’s success, and to the growth of the ecosystem around it. If the CoffeeScript compiler produced an AST compatible with Babel, the compiler could outsource JavaScript code generation to babel-generator and therefore jettison nodes.coffee’s 4,000 lines, a quarter of CoffeeScript’s entire codebase!

lydell · 2017-11-25T10:41:15Z

Well summarized!

I forgot to mention that every src/printer*.js file in Prettier basically just defines a single function with a big switch statement in it – with one case for every AST node type. In other words, all the function is doing is saying “Here is how you print a number, here is how you print a string, here is how you print an array, here is how you print an if statement, etc.”. Some cases call this function recursively, such as the array case for every item of the array.
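The shape described above can be sketched as a single recursive function with one case per node type. This is only an illustration: the node type names are borrowed from the AST sample posted later in this thread, the `Arr` node with an `objects` list is an assumed shape, and the output rules are simplified stand-ins rather than Prettier's actual printer API.

```javascript
// Sketch of a printer: one function, one case per AST node type.
// Node shapes are assumptions based on the sample AST in this thread.
function print(node) {
  switch (node.type) {
    case 'Block':
      // Recurse into every child expression, one per line.
      return node.expressions.map(print).join('\n');
    case 'Assign':
      return `${print(node.variable)} = ${print(node.value)}`;
    case 'Arr':
      // The array case calls print recursively for every item.
      return `[${node.objects.map(print).join(', ')}]`;
    case 'IdentifierLiteral':
    case 'NumberLiteral':
      // Leaf nodes already carry their text in `value`.
      return node.value;
    default:
      throw new Error(`Unknown node type: ${node.type}`);
  }
}

const ast = {
  type: 'Block',
  expressions: [
    {
      type: 'Assign',
      variable: { type: 'IdentifierLiteral', value: 'answer' },
      value: { type: 'NumberLiteral', value: '42' },
    },
  ],
};
console.log(print(ast)); // answer = 42
```

Hitting the `default` case is the signal to go back and add support for a node type, which matches the "see where you bump into problems" approach.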

One way to go about this would be to start writing a src/printer-coffeescript.js file, and see where you bump into problems. Then, go and improve the CoffeeScript parser around all those problems.

vendethiel · 2017-12-29T02:21:14Z

CSR did get a lot right, yes. We probably need a Concrete Syntax Tree, but our lexer/rewriter code is... hairy to say the least.

GeoffreyBooth · 2017-12-29T02:45:10Z

What’s a “concrete” syntax tree?

vendethiel · 2017-12-29T11:21:39Z

https://eli.thegreenplace.net/2009/02/16/abstract-vs-concrete-syntax-trees/

A CST is very similar to an AST, but it keeps more parse information around (and doesn't remove seemingly useless nodes).

GeoffreyBooth · 2017-12-29T15:01:04Z

What I was thinking we would do is generate an AST as similar to Babel's as possible. Then we can crib code from the main JavaScript code path in Prettier to parse it. It'll also be useful for working with other tools to have as “standard” of an AST as possible. We would add extra properties to the AST nodes to preserve the info we would need to generate CoffeeScript again from the tree.

Are you interested in helping tackle this?

vendethiel · 2017-12-29T15:43:00Z

The project then becomes pretty much "rewrite the compiler" ... or "upgrade CSR to support all the things CS2 now supports", which is an insane amount of work.
Removing JS code generation from the CS compiler is a very noble goal, but a tremendously time-consuming one.

GeoffreyBooth · 2017-12-29T16:13:10Z

That's more ambitious than I had in mind. I was thinking of just doing what is proposed at the top of this thread: create a way for the compiler to output an AST as JSON, similar to how it currently outputs nodes data as text via --nodes; and create a printer file in Prettier for CoffeeScript, similar to its existing ones for JavaScript, TypeScript, Markdown and so on.

vendethiel · 2017-12-29T16:22:31Z

We're discarding too much info imho. Just for implicit objects etc.

GeoffreyBooth · 2017-12-29T16:43:18Z

Right, that's what would need to be added to the nodes as extra info. Stuff like whether a boolean was written as true versus on or yes, etc. would all need to be added to the AST.

GeoffreyBooth · 2018-01-02T06:56:25Z

Okay, I’ve taken the first few baby steps in getting CoffeeScript to produce an AST. Check out this branch, then create a test.coffee at the root of the repo with whatever CoffeeScript code you want to see an AST of, then run:

coffee -e "console.log require('util').inspect require('./lib/coffeescript').compile(require('fs').readFileSync('./test.coffee').toString(), nodes: yes), {colors: yes, depth: 10}"

You should see pretty-printed JSON like this:

{ type: 'Block',
  loc: { start: { line: 0, column: 0 }, end: { line: 0, column: 10 } },
  expressions:
   [ { type: 'Assign',
       loc: { start: { line: 0, column: 0 }, end: { line: 0, column: 10 } },
       variable:
        { type: 'IdentifierLiteral',
          loc: { start: { line: 0, column: 0 }, end: { line: 0, column: 5 } },
          value: 'answer' },
       value:
        { type: 'NumberLiteral',
          loc: { start: { line: 0, column: 9 }, end: { line: 0, column: 10 } },
          value: '42' } } ] }

This is the same output as the CLI’s --nodes, in JSON form, with:

the node class names as type, which is how the Babel AST has them
the location data structured the way Babel does it
any primitive (boolean, number, string) properties included, as these are serializable as is into JSON
children included recursively

This was all done by adding just one method on the base node class, and for many nodes this is all the data we need. For more complicated nodes, the next step is to override this method to add additional serializable properties to flesh out the objects for those nodes; and in some of those cases, we’ll have to reach back to the lexer to make sure that currently-discarded data is forwarded along into nodes.coffee. Eventually, all objects for all node types should contain complete enough data that we can recreate the original source from this AST. There’s a ways to go to get from here to there, but it’s very doable. The above took less than 50 lines of code.
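As a rough illustration of how one method on a base class can yield output of this shape, here is a sketch in JavaScript. It is not the code from the branch: the `ast` method name, the `children` getter, and the tiny node classes are assumptions made for the example, and the `first_line`/`last_column` style location fields are assumed to match the compiler's location data.

```javascript
// Hypothetical base class: one method serializes any node to a plain object.
class Base {
  // Names of properties that hold child nodes; subclasses override this.
  get children() { return []; }

  ast() {
    // The node class name becomes `type`, as in the Babel AST.
    const out = { type: this.constructor.name };
    if (this.locationData) {
      const { first_line, first_column, last_line, last_column } = this.locationData;
      // Restructure the location data into Babel's start/end form.
      out.loc = {
        start: { line: first_line, column: first_column },
        end: { line: last_line, column: last_column },
      };
    }
    for (const [key, value] of Object.entries(this)) {
      if (key === 'locationData' || value == null) continue;
      if (this.children.includes(key)) {
        // Children are serialized recursively.
        out[key] = Array.isArray(value) ? value.map((c) => c.ast()) : value.ast();
      } else if (['boolean', 'number', 'string'].includes(typeof value)) {
        // Primitives are copied as-is since they serialize to JSON directly.
        out[key] = value;
      }
    }
    return out;
  }
}

class Literal extends Base {
  constructor(value, locationData) {
    super();
    this.value = value;
    this.locationData = locationData;
  }
}
class IdentifierLiteral extends Literal {}
class NumberLiteral extends Literal {}

class Assign extends Base {
  constructor(variable, value, locationData) {
    super();
    this.variable = variable;
    this.value = value;
    this.locationData = locationData;
  }
  get children() { return ['variable', 'value']; }
}

// Build the nodes for `answer = 42` by hand and serialize them.
const loc = (firstColumn, lastColumn) =>
  ({ first_line: 0, first_column: firstColumn, last_line: 0, last_column: lastColumn });
const assign = new Assign(
  new IdentifierLiteral('answer', loc(0, 5)),
  new NumberLiteral('42', loc(9, 10)),
  loc(0, 10)
);
console.log(JSON.stringify(assign.ast(), null, 2));
```

A more complicated node would override `ast` (or extend its result) to add the extra serializable properties mentioned above.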

rattrayalex · 2018-01-06T20:59:30Z

This is exciting. I may be able to help with some of the prettier parts.

Could we write a test suite to ensure the AST generated this way is always accurate?

GeoffreyBooth · 2018-01-06T22:18:34Z

@rattrayalex I would love the help. I’ve started a branch in a local copy of the Prettier repo, I’ll clean it up and post it soon and give you access.

We can certainly write tests around this. We could add a test/nodes.coffee in the CoffeeScript repo that calls CoffeeScript.compile(someCode, nodes: yes) and compares the response to some expected object. We should probably strip out stuff like the loc keys before comparing, so the tests don’t break for reasons we don’t care about as the compiler evolves over time.

rattrayalex · 2018-01-06T22:30:42Z

Babylon has a nice ast snapshot suite, might be worth checking out. But may not be needed for this. loc information should probably be tested somewhere, but I’m not actually sure it’s needed for prettier anyway. Can’t recall for sure.

GeoffreyBooth · 2018-01-08T07:39:12Z

@rattrayalex I’ve created a repo of my Prettier fork and branch here, and added you to it. If anyone else would like to contribute, please let me know and I would be happy to add you. In my fork, the default branch is coffeescript, so you can work in other branches and submit pull requests against coffeescript.

There was some major reorganization of the Prettier codebase in the last few months since I initially started my fork, so some of my work needed to be thrown out; but I had only just barely gotten started anyway. Look in the src/language-coffeescript folder, the files in there are where we’ll want to build out CoffeeScript support. See src/language-js and src/language-vue as points of reference.

GeoffreyBooth · 2018-01-08T07:49:17Z

Once you’ve checked out the Prettier CoffeeScript branch and run yarn to install dependencies, you can see the CoffeeScript code path in action by creating a test.coffee at the root of the repo and then running:

./bin/prettier.js --parser coffeescript test.coffee

Currently I’m just printing the AST, but you have to start somewhere 😄. This is using my nodes branch, so updates to that branch should affect this; you’ll probably want to link the CoffeeScript module installed by Prettier to a local copy, so you can develop in the two in tandem.

rattrayalex · 2018-01-29T03:29:24Z

Coming back to this, I might be able to help a bit in March but probably not earlier than that 😕 any progress in the last few weeks?

GeoffreyBooth · 2018-01-29T04:30:19Z

No, but I hope to have some time before March. I’ll push commits into both branches. Feel free to work in those repos, either in the same branches or other branches we can merge into them.

coffeescriptbot · 2018-02-19T12:12:10Z

Migrated to jashkenas/coffeescript#4984

GeoffreyBooth mentioned this issue Nov 25, 2017

Proposal: Ecosystem: Babel plugin? #76

Closed

GeoffreyBooth added the help wanted label Nov 26, 2017

GeoffreyBooth mentioned this issue Dec 29, 2017

Sourcemap: Mapping of end of blocks is unexpected jashkenas/coffeescript#4833

Closed

johnjeng mentioned this issue Feb 7, 2018

CS2 Discussion: Question: Survey for devs/teams with maintained Coffescript codebases: happy with CS, or considering migrating? #32

Closed

coffeescriptbot mentioned this issue Feb 19, 2018

CS2 Discussion: Question: Survey for devs/teams with maintained Coffescript codebases: happy with CS, or considering migrating? jashkenas/coffeescript#4928

Closed

coffeescriptbot changed the title ~~CoffeeScript in Prettier~~ Proposal: Ecosystem: CoffeeScript in Prettier Feb 19, 2018

This was referenced Feb 19, 2018

Proposal: Ecosystem: Babel plugin? jashkenas/coffeescript#4969

Open

Proposal: Ecosystem: CoffeeScript in Prettier jashkenas/coffeescript#4984

Closed

coffeescriptbot closed this as completed Feb 19, 2018
