RFC: more robust E notation parser in BigDecimal #9580

stevegeek · 2020-07-06T15:38:54Z

Related to #9547

Maybe a more robust parser for scientific (E notation) input in BigDecimal is worth the effort? For example, I started writing up the EBNF-ish grammar below:

<scientific> :== [<sign>] <number>
<number> :== <digits> [ <fractional-optionaldigits> ] [ <exponent> ]   |  <fractional> [ <exponent> ]
<fractional-optionaldigits> :== '.' [ <digits> ]
<fractional> :== '.' <digits>
<exponent> :== <exp> [ <sign> ] <digits>
<digits> :== <digit>+
<digit> :== '0' | '1' | ... | '9'
<exp> :== 'E' | 'e'
<sign> :== '+' | '-'
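As a quick cross-check of the grammar above, it can be written as a single regular expression. This is only a sketch in Python (not Crystal, and not the code in the PR); the names `SCIENTIFIC` and `matches_grammar` are made up for illustration.

```python
import re

# One alternative per <number> branch:
#   <digits> [ '.' [ <digits> ] ]   or   '.' <digits>
# followed by an optional <exponent>. [0-9] is used instead of \d so that
# only ASCII digits are accepted.
SCIENTIFIC = re.compile(
    r"[+-]?"                           # [<sign>]
    r"(?:[0-9]+(?:\.[0-9]*)?"          # <digits> [ <fractional-optionaldigits> ]
    r"|\.[0-9]+)"                      # <fractional>
    r"(?:[eE][+-]?[0-9]+)?"            # [ <exponent> ]
)


def matches_grammar(text: str) -> bool:
    """Return True if the whole string is derivable from <scientific>."""
    return SCIENTIFIC.fullmatch(text) is not None
```

Under this reading, inputs such as `1.`, `.5` and `-.5E-3` are accepted, while `.`, `1e` and `1.2.3` are rejected.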

I then created the productions below (this syntax works in http://hackingoff.com/compilers/ll-1-parser-generator):

Scientific -> OptionalSign Number
OptionalSign -> s |
Number -> Digits OptionalFractional OptionalExponent | Fractional OptionalExponent
OptionalFractional -> '.' OptionalDigits |
Fractional -> '.' Digits
OptionalExponent -> e OptionalSign Digits | 
OptionalDigits -> Digits |  
Digits -> Digit RepeatDigit
RepeatDigit -> Digit RepeatDigit |
Digit -> n

Here is a proposal that implements a first pass at this. It is not stack based; it is a direct translation of the parse table into logic.

The PR is still a work in progress and would benefit from your input should we wish to progress with this RFC: #9581
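To illustrate what "a translation of the parse table into logic" could look like, here is a minimal sketch of a recogniser with one character of lookahead and no explicit stack. It is written in Python for brevity and is not the code from the PR; `is_scientific` and `DIGITS` are hypothetical names, and the comments refer to the EBNF-ish grammar above.

```python
DIGITS = "0123456789"


def is_scientific(text: str) -> bool:
    """Recognise <scientific> by branching on the next character only."""
    pos, end = 0, len(text)

    def digits() -> int:
        # <digits>: consume a run of digits, return how many were consumed.
        nonlocal pos
        start = pos
        while pos < end and text[pos] in DIGITS:
            pos += 1
        return pos - start

    # [<sign>]
    if pos < end and text[pos] in "+-":
        pos += 1

    if digits() > 0:
        # <digits> [ '.' [ <digits> ] ]: digits after the dot are optional.
        if pos < end and text[pos] == ".":
            pos += 1
            digits()
    elif pos < end and text[pos] == ".":
        # <fractional>: a leading dot needs at least one digit after it.
        pos += 1
        if digits() == 0:
            return False
    else:
        return False

    # [ <exponent> ]: 'E' or 'e', optional sign, then mandatory digits.
    if pos < end and text[pos] in "eE":
        pos += 1
        if pos < end and text[pos] in "+-":
            pos += 1
        if digits() == 0:
            return False

    # Anything left over means the input is not in the language.
    return pos == end
```

Each `if` corresponds to one lookup in the parse table: the next character alone decides which alternative to take, which is why no stack or backtracking is needed for this grammar.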

I also note that overflowing exponent values raise Invalid UInt64 exceptions (e.g. BigDecimal.new("1e18446744073709551616")). Maybe this is an implementation detail that could be wrapped more neatly in an ArgumentError/InvalidBigDecimalException during parsing?
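A rough sketch of that wrapping idea, again in Python rather than Crystal: the exponent digits are range-checked up front and a single parse-level exception is raised. The `parse_exponent` helper and the Python `InvalidBigDecimalException` class are hypothetical stand-ins, and the UInt64 limit is an assumption based on the exception reported above.

```python
UINT64_MAX = 2**64 - 1  # assumed limit, taken from the "Invalid UInt64" error


class InvalidBigDecimalException(ValueError):
    """Hypothetical stand-in for the parse error the caller would see."""


def parse_exponent(digits: str) -> int:
    """Convert exponent digits to an int, reporting overflow as a parse error."""
    if not digits or any(ch not in "0123456789" for ch in digits):
        raise InvalidBigDecimalException(f"invalid exponent: {digits!r}")
    # Drop leading zeros, then reject by length before converting, so that
    # absurdly long inputs never reach the integer conversion.
    significant = digits.lstrip("0")
    if len(significant) > len(str(UINT64_MAX)):
        raise InvalidBigDecimalException("exponent out of range")
    value = int(significant or "0")
    if value > UINT64_MAX:
        raise InvalidBigDecimalException("exponent out of range")
    return value
```

With this shape, `parse_exponent("18446744073709551616")` fails with the parser's own exception type instead of leaking an integer-conversion error to the caller.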

(originally part of #9547)


asterite · 2020-07-06T15:41:37Z

@stevegeek Please send a PR with the changes. Otherwise it's impossible to review.

asterite · 2020-07-06T15:41:58Z

Oh, never mind, I thought this was it. I see there's a PR...

asterite · 2020-07-06T15:42:27Z

I was confused because you mentioned master...stevegeek:feature_e_notation_big_decimal_parser so I thought we had to look into your fork.

stevegeek · 2020-07-06T15:43:39Z

Sorry about that, it was a copy-paste from the previous issue I opened!

stevegeek mentioned this issue Jul 6, 2020

RFC: E notation BigDecimal parser #9581

Draft

HertzDevil added status:discussion topic:stdlib:numeric labels Jul 31, 2021

stevegeek mentioned this issue Mar 11, 2022

Fix E notation parsing in BigDecimal #9577

Merged
