
Performance issue #36

Open
manuel-rubio opened this issue Jun 25, 2015 · 3 comments
@manuel-rubio
I was trying to use this PEG to parse INI files:

ini <- (comment / section)+ `
    lists:append(Node)
`;

section <- space? header space? (comment / config)* space? `
    [_,_Header,_,ConfigLines|_] = Node,
    [ CL || CL <- ConfigLines, CL =/= ignore ]
`;

header <- '[' (!']' .)+ ']' breakline? ~;

config <- key space? '=' space? (!(breakline / comment) .)* (breakline / comment)? `
    [Key,_,_,_,Value|_] = Node,
    {Key, iolist_to_binary(Value)}
`;

comment <- space? ';' (!breakline .)* breakline? `
    ignore
`;

key <- [a-zA-Z0-9_\.]* `
    iolist_to_binary(Node)
`;

space <- [ \t\n\s\r]+ ~;

breakline <- [\n\r]+ ~;

When I try to use it against a php.ini file with more than 1000 lines (most of them comment lines), this code takes more than 5 seconds to parse the whole file. Other solutions I found (eini and zucchini on github.com; those projects use yrl and xrl files) take less than 1 second to parse the same file. What part of my code is wrong? Thanks.

The php.ini file is here: https://raw.githubusercontent.com/php/php-src/master/php.ini-production
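One thing worth trying with a grammar like the one above: patterns of the form `(!breakline .)*` force a negative lookahead on every single character, which is expensive when most of the file is comments. If this PEG dialect accepts negated character classes (worth verifying against neotoma's grammar syntax before relying on it), the comment rule could be rewritten as a sketch like:

```
comment <- space? ';' [^\n\r]* breakline? `
    ignore
`;
```

A negated class matches a run of characters in one step instead of one lookahead-plus-consume cycle per character, which can reduce backtracking considerably on comment-heavy input.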

@bookshelfdave
Contributor

What happens if you comment out your semantic actions (iolist_to_binary/1, etc.)?

@seancribbs
Owner

@manuel-rubio There are known performance issues with large files and large grammars. Can you profile the parser to see where it is taking the most time? My gut suspicion is that the negative-lookahead+repeat is what's killing it (lots of backtracking).
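A hedged sketch of how the profiling suggested above might be done with OTP's stock fprof tool. The module name `ini` and its generated `file/1` entry point are assumptions here (they depend on how the grammar file was named when compiled with neotoma):

```erlang
%% Profiling sketch: assumes the grammar was compiled with
%% neotoma:file("ini.peg"), producing a module named ini whose
%% generated file/1 entry point parses a file from disk.
%% fprof ships with OTP's tools application.
profile() ->
    fprof:apply(fun ini:file/1, ["php.ini-production"]),
    fprof:profile(),                          % turn the raw trace into data
    fprof:analyse([{dest, "fprof.analysis"},  % write per-function timings
                   {sort, own}]).
```

The resulting `fprof.analysis` file shows own-time per function, which should make it clear whether the time goes into lookahead rules, the memo table, or the semantic actions.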

@ElectronicRU

@seancribbs The performance issue can be alleviated by getting rid of the ETS table and explicitly threading a dict/map through the parser instead.

It would be quite easy to do, except that neotoma itself uses the memo table for some auxiliary information. For grammars that don't use the memo table explicitly, though, it should be a straightforward transformation.
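The idea of threading an explicit map instead of a shared ETS memo table can be illustrated with a minimal packrat-style memoizer. This is an illustrative sketch, not neotoma's actual internals; all names here are invented for the example:

```erlang
-module(memo_sketch).
-export([apply_rule/4]).

%% Packrat memoization keyed on {RuleName, Position}. The memo map is
%% threaded through every rule application and returned alongside the
%% result, instead of living in a shared ETS table, so each cache hit
%% is a plain map lookup with no ETS read/write overhead.
apply_rule(RuleName, RuleFun, Pos, Memo) ->
    Key = {RuleName, Pos},
    case Memo of
        #{Key := Result} ->
            {Result, Memo};                     % cache hit: no re-parse
        _ ->
            {Result, Memo1} = RuleFun(Pos, Memo),
            {Result, Memo1#{Key => Result}}     % cache and thread forward
    end.
```

Every rule function would take and return the memo map, so the transformation is mechanical for generated parsers; the wrinkle mentioned above is that neotoma also stores auxiliary state in the same table.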
