Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rework RTF/CRE code #1364

Merged
merged 9 commits into from
Oct 9, 2021
Merged

Conversation

benoit-pierre
Copy link
Member

@benoit-pierre benoit-pierre commented Jun 22, 2021

Summary of changes

New parser:

  • don't rely on regular expressions for parsing (only for tokenizing).
  • correctly handle multi-lines mappings
  • detect syntax errors (with recovery)
  • use \n\n for new paragraphs (instead of non-undoable {#Return}{#Returns})
  • similarly, use \t and \n for \tab and \line
  • faster than previous code (~1.8 faster at loading a 46.5 MB dictionary, ~2.4 when combined with use plover_stroke #1362)

New save code:

  • process each formatting atom separately
  • correctly escape {}\
  • use custom ignored groups for Plover macros and metas
  • use groups to improve round-tripping affixes (so there's no ambiguity when parsing back, e.g. {^in^}fix -> {\cxds in \cxds}fix instead of \cxds in\cxds fix)

Closes #209, #589, #773, #1128.

Pull Request Checklist

  • Changes have tests
  • News fragment added in news.d. See documentation for details

lambda: ('{^\n\n^}{-|}', r'{\cxds \par \cxds}\cxfc '),
# Plover custom formatting:
# - meta: command
lambda: ('{PLOVER:TOGGLE}', r'{\*\cxplovermeta PLOVER:TOGGLE}'),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

\cxplovercommand maybe? and the PLOVER: is redundant here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, the goal is to avoid the need for a specific control word for each feature supported by metas, so only one general \cxplovermeta control word, with the content.

# - meta: command
lambda: ('{PLOVER:TOGGLE}', r'{\*\cxplovermeta PLOVER:TOGGLE}'),
# - meta: key combo
lambda: ('{#Return}', r'{\*\cxplovermeta #Return}'),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

\cxploverkeycombo

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto.

plover/dictionary/rtfcre_dict.py Outdated Show resolved Hide resolved
@benoit-pierre benoit-pierre force-pushed the rtf_rework branch 2 times, most recently from d63a958 to 12d190e Compare June 23, 2021 02:10
@benoit-pierre
Copy link
Member Author

I forgot to mention there's one xfailed test left:

    # Why not 'eclipse command'?
    lambda: rtf_load_test('{eclipse command}', '{eclipse command}', xfail=True),

Which has the comment mention does not make sense: eclipse command it the logical output.

@user202729
Copy link
Member

user202729 commented Jun 23, 2021

I don't see any issue. (although I don't really use RTF)

Notes:

  • If you don't need to support Python <=3.3, use fullmatch instead of match and add $ may be clearer.
  • (possible improvement) Add some code to make RTF->Plover round-trip better for unsupported group (for example {plover:rtf_unknown:\cxsvatdictentrydate\yr2006\mo5\dy10})
  • It isn't yet possible to do rtf -> Plover -> rtf (for {^-^} -> \_.), but if/when it's possible to do Plover -> rtf -> Plover then it may make sense to also test round-trip in the test_format_translation.

@benoit-pierre
Copy link
Member Author

* If you don't need to support Python <=3.3, use `fullmatch` instead of `match` and add `$` may be clearer.

Good point, done.

* (possible improvement) Add some code to make RTF->Plover round-trip better for unsupported group (for example `{plover:rtf_unknown:\cxsvatdictentrydate\yr2006\mo5\dy10}`)

Yeah, no, saving in Plover is always going to be a destructive operation: between the unsupported control groups/words, the fact that they can stack, the mostly ignored header part of the RTF (stylesheet...).

* It isn't yet possible to do rtf -> Plover -> rtf (for  `{^-^}` -> `\_`.), but if/when it's possible to do Plover -> rtf -> Plover then it may make sense to also test round-trip in the `test_format_translation`.

I'll see if I can merge those tests with the save ones (which include a round-trip check).

- correctly escape special characters
- properly handle multiple formatting atoms
Use a custom ignored group to handle Plover' specific metas.
By using groups so they can be parsed back correctly.
@benoit-pierre
Copy link
Member Author

  • It isn't yet possible to do rtf -> Plover -> rtf (for {^-^} -> \_.), but if/when it's possible to do Plover -> rtf -> Plover then it may make sense to also test round-trip in the test_format_translation.

I'll see if I can merge those tests with the save ones (which include a round-trip check).

Perfect roundtrip is not always possible, e.g.: {*!} -> {\*\cxplovermacro retrospective_delete_space} -> =retrospective_delete_space.

And there are corresponding load tests for checking the conversion in the other direction, so I'm gonna leave it as is, except I added support for {^ ^} -> \~ and {^-^} -> \_.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

spacing bug when loading from RTF dictionary
3 participants