Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parser syntax sparse bl #76

Merged
merged 6 commits into from
Oct 4, 2017

Conversation

zbraniecki
Copy link
Collaborator

That's more or less how I think it'll have to work if we want to keep the spans and proper tokenizer flow.

I can't say I like it... :(

@zbraniecki zbraniecki force-pushed the parser-syntax-sparse-bl branch from 9ce7bf8 to 188d717 Compare October 2, 2017 15:04
@zbraniecki zbraniecki requested a review from stasm October 2, 2017 15:38
@zbraniecki
Copy link
Collaborator Author

I think this is ready to be reviewed.

I analyzed the change in span ranges and the cause is that we now start the span for message value after we skip the inline whitespace. For example:

key = Value
     ^
     |---- start before
key = Value
      ^
      |---- start after

and:

key =
     ^
     |---- start before
    Value
key =
    Value
    ^
    |---- start after

@zbraniecki
Copy link
Collaborator Author

I'm still testing performance but first indications are that it doesn't impact runtime perf, just the fluent-syntax parsing one, which is not significant for us. I tested on node, need to recompile spidermonkey to test against it.

@zbraniecki
Copy link
Collaborator Author

Ok, got some more performance tests.

This patch does actually regress on jsshell:

parseFTL: 
  mean:   21812.23 (+24%)
  stdev:  2660.13
  sample: 30
parseFTLEntries: 
  mean:   3006.2 (+8%)
  stdev:  223.84
  sample: 30
format: 
  mean:   2752.9 (+76%)
  stdev:  452.41
  sample: 30

I have no idea why the format test gets such a penalty, since the AST produced with the patch is identical to the one produced on master. @stasm - any ideas?

@zbraniecki
Copy link
Collaborator Author

Here's the same data for nodejs:

parseFTL: 
  mean:   8712.83 (+16%)
  stdev:  620.47
  sample: 30
parseFTLEntries: 
  mean:   1983.53 (+13%)
  stdev:  352.69
  sample: 30
format: 
  mean:   744.17 (+7.000000000000001%)
  stdev:  88.91
  sample: 30

So, still a regression but not as significant as on jsshell. I have no idea what's up with jsshell... maybe with the patch we cross some boundary due to, I don't know, number of variables? Lines of code?

Copy link
Contributor

@stasm stasm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this. I'll review this in detail tomorrow. Leaving two quick comments for now.

@@ -31,6 +31,21 @@ export class FTLParserStream extends ParserStream {
}
}

peekSkipBlankLines() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This name is confusing. Does it peek or does it skip?

@@ -242,6 +243,7 @@ export default class FluentParser {

while (true) {
ps.expectChar('\n');
ps.skipBlankLines();
ps.skipInlineWS();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a common pattern. Perhaps it's worth having a function called expectIndent?

Copy link
Contributor

@stasm stasm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was intrigued by the fact that neither getTags nor getAttributes didn't need skipBlankLines. It turns out that they currently use skipWS which is too tolerant about indentation. I filed https://bugzilla.mozilla.org/show_bug.cgi?id=1405645.

resetPeek() {
this.peekIndex = this.index;
this.peekEnd = this.iterEnd;
resetPeek(pos = false) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: pos doesn't require a default: it will be undefined if nothing is passed.

@@ -344,6 +367,16 @@ class RuntimeParser {
// by new line and `|` character at the beginning of the next one.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you mind fixing this comment while you're at it, please?

@zbraniecki zbraniecki requested a review from stasm October 4, 2017 12:48
Copy link
Contributor

@stasm stasm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, thanks! I'm happy about the change to the AST spans, too.

@stasm
Copy link
Contributor

stasm commented Oct 9, 2017

I filed bug 1406880 because it looks like the spans of multiline Patterns are slightly different from the ones visualized in your comment above. I'm not yet sure which approach is better, let's decide in the bug and fix this if needed.

@zbraniecki zbraniecki deleted the parser-syntax-sparse-bl branch December 12, 2017 04:43
stasm added a commit that referenced this pull request Jan 31, 2018
  - Implement Fluent Syntax 0.5.

    - Add support for terms.
    - Add support for `#`, `##` and `###` comments.
    - Remove support for tags.
    - Add support for `=` after the identifier in message and term
      defintions.
    - Forbid newlines in string expressions.
    - Allow trailing comma in call expression argument lists.

    In fluent-syntax 0.6.x the new Syntax 0.5 is supported alongside the old
    Syntax 0.4. This should make migrations easier.

    `FluentParser` will correctly parse Syntax 0.4 comments (prefixed with
    `//`), sections and message definitions without the `=` after the
    identifier. The one exception are tags which are no longer supported.
    Please use attributed defined on terms instead.

    `FluentSerializer` always serializes using the new Syntax 0.5.

  - Add `AST.Placeable` (#64)

    Added in Syntax Spec 0.4, `AST.Placeable` provides exact span data about
    the opening and closing brace of placeables.

  - Expose `FluentSerializer.serializeExpression`. (#134)

  - Serialize standalone comments with surrounding white-space.

  - Allow blank lines inside of messages. (#76)

  - Trim trailing newline from Comments. (#77)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants