support for Melani system #987

Closed
2 tasks done
benoit-pierre opened this issue Aug 1, 2018 · 11 comments

@benoit-pierre
Member

benoit-pierre commented Aug 1, 2018

Two changes are needed to fully support a Melani system plugin:

  • support for prefix strokes (only match at the beginning of a new word)
  • formatting look ahead support (for conditional output)
@SeaLiteral

What does "support for prefix strokes" mean? Does it mean tucked prefixes as described in #974, or does it mean applying orthography rules when adding prefixes? orthography.py seems to have an add_suffix function but no add_prefix function. I don't know if Italian would need that, but Spanish does have a rule that if a word starts with an r and you add a prefix that ends with a vowel, you have to double that r.

@benoit-pierre
Member Author

Prefix strokes only match at the beginning of a word. This works with Melani because words are explicitly terminated (by using one of the terminating vowels). So all output from the orthographic Python dictionary is basically prefixes, unless it ends with one of those terminating vowels.

Example blackbox test:

def test_prefix_strokes(self):
    r'''
    "/S": "{prefix^}",
    "S": "{^suffix}",
    "O": "{O'^}{$}",

    S/S/O/S/S  " prefixsuffix O'prefixsuffix"
    '''

Note: {$} is new formatting syntax to explicitly mark the end of a word
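
The rule exercised by this test can be sketched as a toy translator. This is not Plover's implementation: the function name `translate`, the `at_word_start` flag, and the handling of `{$}` are assumptions made for illustration, and the sketch only understands translations made entirely of `{...}` atoms, as in the test above.

```python
import re

# Dictionary from the blackbox test: a key starting with "/" is a prefix
# stroke, i.e. it may only match at the beginning of a new word.
DICTIONARY = {
    "/S": "{prefix^}",
    "S": "{^suffix}",
    "O": "{O'^}{$}",
}

META_RE = re.compile(r"\{([^{}]*)\}")


def translate(strokes, dictionary):
    """Toy translator (assumption: every translation is a run of {...} atoms)."""
    output = ""
    at_word_start = True   # nothing typed yet, so we are at a word start
    attach_next = False    # set when the previous atom ended with "^"
    for stroke in strokes:
        key = "/" + stroke
        if not (at_word_start and key in dictionary):
            key = stroke   # fall back to the regular entry
        at_word_start = False
        for meta in META_RE.findall(dictionary[key]):
            if meta == "$":
                # Explicit end of word: the next stroke may match a prefix.
                at_word_start = True
                continue
            attach_prev = meta.startswith("^")
            if not (attach_prev or attach_next):
                output += " "
            output += meta.strip("^")
            attach_next = meta.endswith("^")
    return output


print(repr(translate(["S", "S", "O", "S", "S"], DICTIONARY)))
```

With the dictionary above, the stroke sequence S/S/O/S/S produces the string expected by the test, because only the first S and the S that follows the word-ending O are looked up as "/S".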

@benoit-pierre
Member Author

benoit-pierre commented Aug 9, 2018

And some example blackbox tests for look ahead support:

def test_conditionals_1(self):
    r'''
    "*": "=undo",
    "S-": "{=(?i)t/true/false}",
    "TP-": "FALSE",
    "T-": "TRUE",

    S-   ' false'
    TP-  ' false FALSE'
    *    ' false'
    T-   ' true TRUE'
    *    ' false'
    S-   ' false false'
    TP-  ' false false FALSE'
    *    ' false false'
    T-   ' true true TRUE'
    '''
 
def test_conditionals_2(self):
    r'''
    "1": "{=(?i)([8aeiouxy]|11|dei|gn|ps|s[bcdfglmnpqrtv]|z)/agli/ai}",
    "2": "oc{^}chi",
    "3": "dei",
    "4": "sti{^}vali",

    1  ' ai'
    2  ' agli occhi'
    1  ' agli occhi ai'
    3  ' agli occhi agli dei'
    1  ' agli occhi agli dei ai'
    4  ' agli occhi agli dei agli stivali'
    '''

@SeaLiteral

SeaLiteral commented Aug 9, 2018

So the second example has 1 write ai unless there's a word after it, in which case it writes agli. I assume it will write ai if it is followed by a punctuation mark or suffix. And the first one writes false unless the next word starts with a t, and is probably not case-sensitive. Am I reading it right? So do word boundaries need to be involved?

Here's how I imagine it being used in a partial solution for #990.

"STAOÆL": "stil",
"STÆÅL": "stil{=(?i)\{\^[a-z]/l/}",
"-R": "{^er}"

r'''
STÆÅL	stil
STAOÆL	stil stil
-R	stil stiler
STÆÅL	stil stiler stil
-R	stil stiler stiller
'''

In practice, I think orthography rules with hints might be a more flexible solution for Danish. But should I implement hinting now, wait for look ahead support and use that, or wait for look ahead support and then implement hinting? (To do hinting, I'd mainly have to make changes to formatting.py.)

Edit: Changed regex to look for {^[a-z] rather than just the caret. I'm using that to mean that the next word isn't a word but a suffix, and one that uses orthography.

@benoit-pierre
Member Author

benoit-pierre commented Aug 9, 2018

It's: {=REGEXP/TRANSLATION_IF_FOLLOWING_TEXT_MATCH/TRANSLATION_IF_NOT}. And it's horrible... Honestly, I'd rather not add support. The way formatting works, actions can depend on previous actions. With that code, you now have actions depending on future actions... It's complex, it's buggy, and it's limited...
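
To make the {=REGEXP/IF/ELSE} semantics concrete, here is a minimal sketch that reproduces the final outputs of test_conditionals_2. It is an assumption-laden toy, not the Plover code: the function name `resolve` is hypothetical, the regexp is assumed to be matched at the start of the following word (without its leading space), and the whole sequence is resolved in one right-to-left pass, which sidesteps the undo and incremental-output problems described above.

```python
import re

AI = "{=(?i)([8aeiouxy]|11|dei|gn|ps|s[bcdfglmnpqrtv]|z)/agli/ai}"


def resolve(translations):
    """Resolve conditionals right to left so each one can see what follows it."""
    words = []
    following = ""  # text of the already resolved next word, "" at the end
    for translation in reversed(translations):
        if translation.startswith("{=") and translation.endswith("}"):
            # Assumption: the two translations contain no "/".
            pattern, if_match, if_not = translation[2:-1].rsplit("/", 2)
            word = if_match if re.match(pattern, following) else if_not
        else:
            # Toy handling of the attach operator inside a translation.
            word = translation.replace("{^}", "")
        words.append(word)
        following = word
    return "".join(" " + word for word in reversed(words))


print(repr(resolve([AI])))
print(repr(resolve([AI, "oc{^}chi", AI, "dei", AI, "sti{^}vali"])))
```

Resolving from the end means every conditional only ever depends on text that is already fixed. A real implementation has to do this incrementally, rewriting earlier output each time a new stroke arrives, which is where the complexity comes from.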

@SeaLiteral

What I thought of doing for Danish was to have a do-nothing command that takes a parameter and stores it in the stroke class, in the same way capitalise next stores the case for the next stroke, and then let orthography rules access that parameter along with the word (maybe just by concatenating them, since Danish doesn't change anything before the last vowel in the root). It's also ugly, and it only works because of what I'm using it for. Here's what it looks like:

ORTHOGRAPHY_RULES = [
(r'^(.*)(\{\+:\|)(.*)(\})(.*) \^ er$', r'\3\5er'),
(r'^(.*)(\{\+:)(.*)(\})(.*) \^ er$', r'\1\3\5er'),
]

"STAOÆL": "stil",
"STÆÅL": "stil{+:l}",
"STAORT": "studér{+:|studer}",
"-R": "{^er}"

r'''
STÆÅL	stil
STAOÆL	stil stil
-R	stil stiler
STÆÅL	stil stiler stil
-R	stil stiler stiller
STAORT	stil stiler stiller studér
-R	stil stiler stiller studerer
'''
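
As a rough check of the two rules above, the following sketch applies them in order to a "word ^ suffix" string and returns the first match. The function name `add_suffix_with_hint` and the fallback that strips an unused hint are assumptions for illustration. The order of the rules matters: the `{+:|...}` rule has to come first, because the more general `{+:...}` pattern would also match it.

```python
import re

ORTHOGRAPHY_RULES = [
    # "{+:|root}": replace the whole word with the hinted root.
    (r'^(.*)(\{\+:\|)(.*)(\})(.*) \^ er$', r'\3\5er'),
    # "{+:letters}": insert the hinted letters before the suffix.
    (r'^(.*)(\{\+:)(.*)(\})(.*) \^ er$', r'\1\3\5er'),
]


def add_suffix_with_hint(word, suffix):
    """Join word and suffix, letting the first matching rule decide the spelling."""
    candidate = word + " ^ " + suffix
    for pattern, replacement in ORTHOGRAPHY_RULES:
        match = re.match(pattern, candidate)
        if match:
            return match.expand(replacement)
    # No rule matched: drop any unused hint and simply concatenate.
    return re.sub(r'\{\+:[^}]*\}', '', word) + suffix


for entry in ("stil", "stil{+:l}", "studér{+:|studer}"):
    print(add_suffix_with_hint(entry, "er"))
```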

@SeaLiteral

Note: #974 also deals with prefix folding.

Question: I'm thinking of changing the way orthography rules get applied (this would apply to language systems that set certain attributes, it shouldn't cause the current English system to stop working) and I'm wondering if I should finish that now or wait for the look ahead to be in Plover first.

One of the features I want to add is like look ahead but within words (Danish orthography rules need pronunciation metadata, because the spelling is too ambiguous to use on its own). So if you want, I can try to apply orthography rules across word boundaries and see if I can turn that into something usable for word look ahead (e.g. replace the English word "a" with "an" if the next word starts with a vowel letter and its dictionary entry doesn't start with something like "{+:consonant}"). Obviously, this would place the regexes in the system rather than in the dictionary, making them harder to edit. But if the regexes are just checking the gender and number of the next word, or preventing one word from ending with a sound similar to the first sound in the next word, then I think having the regexes in the system would be less error prone, because identical regexes would not be repeated in a bunch of dictionary entries. What do you think? I'm interested in trying this approach, but I obviously don't want to make the changes required for Danish at the same time as someone else is changing formatting.py, as I guess combining my changes with yours could require additional changes.
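
The "a"/"an" case can be sketched as a single system-level check. Everything here is hypothetical: the function name `article_for`, the exact hint spelling `{+:consonant}`, and the choice of vowel letters are assumptions, and the sketch only looks at the next dictionary entry, not at how the result would be wired into formatting.py.

```python
import re

# Hypothetical hint marking a word that is spelled with a leading vowel
# letter but pronounced with a leading consonant.
CONSONANT_HINT = "{+:consonant}"


def article_for(next_entry):
    """Pick "a" or "an" from the dictionary entry of the following word."""
    if next_entry.startswith(CONSONANT_HINT):
        return "a"
    return "an" if re.match(r"(?i)[aeiou]", next_entry) else "a"


for entry in ("apple", "dog", "{+:consonant}university"):
    print(article_for(entry), entry.replace(CONSONANT_HINT, ""))
```

Because the regex lives in one function, a correction to the vowel test applies to every entry at once, which is the maintenance argument made above.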

@morinted modified the milestones: 4.0.0, < 4.0.0 Jan 20, 2019
@morinted
Member

Are we still thinking of including Melani lookahead support in 4.0 or should we move ahead without it?

@nvdaes
Contributor

nvdaes commented Sep 30, 2020

Hello, I started using the Melani system with Plover just a few days ago. We use it with our personal Spanish database, which has different fields for translations, like Completa, semplice (for joining different translations included in a single stroke) and another field to make prefixes. For us it's very hard to produce an equivalent dictionary. We'd be grateful if you could add support, or at least document how to proceed.
I have downloaded a preconfigured copy of Plover for the Melani system, but we're trying to add our translations without using the Italian rules, to have more control.
I downloaded Plover from
https://www.stenolab.it/plover/
Thanks

@nvdaes
Contributor

nvdaes commented Oct 2, 2020

I discussed this at work, and my employers are willing to pay for this feature to be added, since it may be difficult to buy Melani machines and we need to use our dictionaries.
I would like to use tables in CSV format if possible, but otherwise a different format with just two columns would also work for us.
The company where I work is at
http://www.mqd.es

We work on transcriptions and captions in real time.
Thanks
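
For the two-column case, one possible route is converting the table into a JSON object that maps strokes to translations, which is the shape of Plover's JSON dictionaries. The sketch below is only an illustration: the function name `csv_to_plover_json`, the column order (stroke first, translation second), and the sample rows are all assumptions, not the real database.

```python
import csv
import io
import json

# Hypothetical sample rows: stroke in the first column, translation in the second.
SAMPLE = "S,hypothetical one\nTP,hypothetical two\n"


def csv_to_plover_json(csv_text):
    """Turn two-column CSV text into a JSON stroke-to-translation mapping."""
    entries = {}
    for row in csv.reader(io.StringIO(csv_text)):
        # Skip blank or incomplete rows instead of failing on them.
        if len(row) < 2 or not row[0].strip():
            continue
        entries[row[0].strip()] = row[1].strip()
    # ensure_ascii=False keeps accented Spanish characters readable.
    return json.dumps(entries, ensure_ascii=False, indent=0, sort_keys=True)


print(csv_to_plover_json(SAMPLE))
```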

@nvdaes
Contributor

nvdaes commented Oct 11, 2020

I have created a plugin for this, based on my personal usage for Spanish, at

https://github.com/nvdaes/plover_spanish_mqd

We still have to add our personal database and test it in real work. It seems to work with a keyboard, though more testing is needed.
Thanks for this wonderful program.
