Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Faster JSONata evaluation by switching from generators to async/await #583

Merged
merged 14 commits into from
Sep 26, 2022

Conversation

tlakomy
Copy link
Contributor

@tlakomy tlakomy commented Jun 22, 2022

This PR is greatly inspired by and builds upon @avaidyam's work in #508

Apart from allowing non-blocking/asynchronous functions - fetch(), fs.readFile etc. to run in parallel (as noted by @avaidyam's), there are additional benefits of switching to async/await - performance benefits.

I've included two test cases in performance directory which contain two example JSONata expressions that take a non-trivial amount of time given an example array with 256 elements (see items.json):

{ 
  "items": items.{ 
    "foo": "bar", 
    "count": $count($$.items[text != "" and text != "foo"].row) 
  } 
}

Link: https://try.jsonata.org/sPNirb3df (it actually causes Expression evaluation timeout: Check for infinite loop in JSONata Exercises)

and

{
  "items": items#$i.{
    "foo": "bar",
    "label": $$.items[$i].label & " " & $$.items[text!="" and text!="test"][$i].text
  }
}

Link: https://try.jsonata.org/K4_Qr1Hof

I've used those two cases as a benchmark, all numbers were measured on my Apple M1 Max (I acknowledge this is not an average machine, so your results may vary):

master branch:

Group: performance
      ✓ case000.json: {"items": items.{"foo": "bar","count": $count($$.items[text!="" and text!="foo"].row)}} (1418ms)
      ✓ case001.json: {"items": items#$i.{"foo": "bar","label": $$.items[$i].label & " " & $$.items[text!="" and text!="test"][$i].text}} (1706ms)

async/await version:

Group: performance
      ✓ case000.json: {"items": items.{"foo": "bar","count": $count($$.items[text!="" and text!="foo"].row)}} (241ms)
      ✓ case001.json: {"items": items#$i.{"foo": "bar","label": $$.items[$i].label & " " & $$.items[text!="" and text!="test"][$i].text}} (304ms)

case000.json went down from ~1400ms to ~240ms and case001.json from ~1700ms to ~300ms

I've also measured how long it takes to run the entire test suite (again, on my personal machine so your results may vary), my results below:
master branch:
npm t 20.83s user 1.09s system 138% cpu 15.776 total

async/await version:
npm t 8.14s user 0.62s system 128% cpu 6.846 total


In addition to those changes this PR introduces nyc package in favour of istanbul following this comment: gotwarlost/istanbul#904 (comment)

@tlakomy
Copy link
Contributor Author

tlakomy commented Jun 22, 2022

Because of inclusion of eslint-plugin-promise I had to bump eslint to newer version (8.0.0):

npm ERR! Could not resolve dependency:
npm ERR! peer eslint@"^7.0.0 || ^8.0.0" from [email protected]
npm ERR! node_modules/eslint-plugin-promise
npm ERR!   dev eslint-plugin-promise@"^6.0.0" from the root project

@tlakomy tlakomy force-pushed the switch-to-async-await branch from f64888d to c49bf09 Compare June 22, 2022 12:14
@tlakomy tlakomy force-pushed the switch-to-async-await branch from c49bf09 to 9be85b9 Compare June 22, 2022 12:19
@coveralls
Copy link

Coverage Status

Coverage remained the same at 100.0% when pulling 9fba964 on Stedi:switch-to-async-await into 3cea53f on jsonata-js:master.

@avaidyam
Copy link

One issue my team has noticed in production is dealing with cancellation of processing. There's no clear way right now to cancel all promises spawned by a JSONata expression if another part of the expression encountered an error. I'm curious to know if your PR handle this somehow or you were able to mitigate this issue some other way?

@andrew-coleman
Copy link
Member

This is a great piece of work, thank you for the contribution.
Because this breaks the JS API (evaluate is now async), I will release this as a v2.0.0 package in npm.
Before I do that, I need to fix the exerciser which will be broken by this change (should be trivial to fix).

@andrew-coleman andrew-coleman merged commit e55991f into jsonata-js:master Sep 26, 2022
@koladilip
Copy link

@andrew-coleman, when can we expect to be released to npm?

@knolleary
Copy link
Contributor

Is there any documentation on what the JavaScript API changes are? The 2.0 release notes highlight this PR as a breaking change, but I can't find any docs on what changes are needed in order to migrate from 1.x to 2.x

@mattbaileyuk
Copy link
Member

The evaluate() function is now async, and while the README was updated to support this, it's a good point that there's not something explicit about the changes to make. The commit for the README update (c008688) probably shows it best at the moment.

@knolleary
Copy link
Contributor

Ah ok. Yeah, that's a pretty major change. So there's no synchronous option at all? We have a bunch of synchronous APIs that make use of JSONata... making them all async will be a major breaking change for us.

@avaidyam
Copy link

avaidyam commented Mar 6, 2023

I wonder if adding async and sync keywords to control this within the JSONata script would be beneficial? Or at least a toggle boolean to switch the entire engine to become sync vs. async?

@mattbaileyuk
Copy link
Member

@andrew-coleman Any thoughts on switching between async and sync?

This might be best taken to a new issue rather than on a closed PR...

@dakhnod
Copy link

dakhnod commented May 29, 2024

So, having multiple documents to be processed by some jsonata expression, how do I do that efficiently?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants