feat(gatsby): node persistence #31371

Merged
merged 51 commits into from
Jun 2, 2021

@vladar vladar commented May 11, 2021

Description

Currently, Gatsby stores all data in the Node.js heap and persists it between builds using v8.serialize. This causes significant memory pressure when a site has many nodes, especially during the persistence step.
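To illustrate where the pressure comes from, here is a minimal sketch of heap-based storage with a separate persistence step. It is not Gatsby's actual implementation: the `nodes` map, `persistState`, `restoreState`, and the cache file location are all hypothetical names chosen for this example. Only `v8.serialize` and `v8.deserialize` are real Node.js APIs.

```javascript
const v8 = require("v8");
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical in-memory node store: every node lives in the JS heap.
const nodes = new Map();
for (let i = 0; i < 1000; i++) {
  nodes.set(`node-${i}`, { id: `node-${i}`, internal: { type: "Example" } });
}

// Persistence step: the whole store is serialized into a single Buffer,
// so for a moment memory holds both the nodes and their serialized copy.
function persistState(file, state) {
  const buffer = v8.serialize(state);
  fs.writeFileSync(file, buffer);
  return buffer.length;
}

// Next build: read the file back and rebuild the store in the heap.
function restoreState(file) {
  return v8.deserialize(fs.readFileSync(file));
}

// Assumed cache location for this sketch (a fresh temp directory).
const cacheDir = fs.mkdtempSync(path.join(os.tmpdir(), "cache-"));
const cacheFile = path.join(cacheDir, "state.bin");

const bytesWritten = persistState(cacheFile, nodes);
const restored = restoreState(cacheFile);
console.log(`wrote ${bytesWritten} bytes, restored ${restored.size} nodes`);
```

The serialized buffer grows with the number of nodes, which is why the persistence step tends to be where memory peaks.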

This PR adds another experimental data storage option: lmdb-store. Instead of being kept in memory, nodes are saved to this persistent embedded store as soon as they are created, so no separate persistence step is needed.

It imposes some additional constraints on plugins and Gatsby sites, namely:

  1. Circular references in nodes are not allowed (will throw when trying to save)
  2. Node mutations are not allowed (e.g. assigning NODE___featuredImage after a node was created)
  3. Others - to be discovered
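The first two constraints can be sketched as follows. This is an illustration only: `hasCircularReference` is a hypothetical helper, not a Gatsby or lmdb-store API, and the JSON round-trip merely stands in for "a copy that has already been written to storage".

```javascript
// Hypothetical helper: returns true if `value` contains a reference cycle.
function hasCircularReference(value, ancestors = new Set()) {
  if (value === null || typeof value !== "object") return false;
  if (ancestors.has(value)) return true;
  ancestors.add(value);
  for (const child of Object.values(value)) {
    if (hasCircularReference(child, ancestors)) return true;
  }
  // Remove on the way back up so shared (non-circular) references are allowed.
  ancestors.delete(value);
  return false;
}

// Constraint 1: post -> author -> posts -> post is a cycle.
const post = { id: "post-1", title: "Hello" };
const author = { id: "author-1", posts: [post] };
post.author = author;

// Referencing the other node by id avoids the cycle.
const flatPost = { id: "post-2", title: "Hello", author: "author-1" };

// Constraint 2: a late mutation changes only the in-memory object;
// a copy that was already persisted does not see it.
const stored = JSON.parse(JSON.stringify(flatPost)); // stand-in for a persisted copy
flatPost.featuredImage = "file-1";

console.log(hasCircularReference(post)); // true
console.log(hasCircularReference(flatPost)); // false
console.log("featuredImage" in stored); // false
```

With heap storage the mutated object and the stored object are the same reference, so late assignments appear to work. Once nodes are written out at creation time, that is no longer the case.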

Note: this PR only moves nodes and nodesByType to LMDB. The rest of the build state remains in the Redux cache and is persisted as usual.

This is the first PR in a series, as LMDB unblocks several potential performance improvements. For now, it may actually add some memory overhead (average memory usage increases, but peak memory spikes decrease, so there are fewer OOMs) and a 10-15% slowdown during sourcing. Eventually we will cut those down, but this is the first step.

Try it

  1. Make sure you use Node 14.10+
  2. Set the environment variable GATSBY_EXPERIMENTAL_LMDB_STORE=1 and run gatsby build (or develop). Alternatively, enable the flag in gatsby-config.js:

     module.exports = {
       flags: {
         LMDB_STORE: true
       }
     }

  3. yarn add lmdb-store (since this is an experimental feature, lmdb-store is not a Gatsby dependency yet, and you must add it to your site explicitly)
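The steps above can be checked with a small preflight script before running a build. This is a sketch under assumptions: `meetsMinimumNode`, `isInstalled`, and `preflight` are hypothetical helpers, not part of Gatsby, and the script only inspects the environment variable, not the `flags` entry in gatsby-config.js.

```javascript
// Hypothetical preflight helpers; not part of Gatsby.

// True if `version` (e.g. "v14.17.0") is at least minMajor.minMinor.
function meetsMinimumNode(version, minMajor = 14, minMinor = 10) {
  const [major, minor] = version.replace(/^v/, "").split(".").map(Number);
  return major > minMajor || (major === minMajor && minor >= minMinor);
}

// True if the package can be resolved from the current project.
function isInstalled(packageName) {
  try {
    require.resolve(packageName);
    return true;
  } catch (err) {
    return false;
  }
}

function preflight(env = process.env) {
  return {
    nodeOk: meetsMinimumNode(process.version),
    // Only the env variable is checked here, not gatsby-config.js flags.
    flagSet: env.GATSBY_EXPERIMENTAL_LMDB_STORE === "1",
    lmdbStoreInstalled: isInstalled("lmdb-store"),
  };
}

console.log(preflight());
```

Run it from the site's root directory so that `require.resolve` looks in the site's own node_modules.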

@vladar vladar marked this pull request as ready for review May 27, 2021 12:07
vladar added 5 commits May 27, 2021 19:56
- replace reducers instead of returning early from the original ones
- move isLmdbStore checks to datastore index from the utility fn
@vladar vladar dismissed a stale review via 38ee163 May 31, 2021 09:30
@vladar vladar added the topic: data Relates to source-nodes, internal-data-bridge, and node creation label Jun 2, 2021
@vladar vladar merged commit 334d2bc into master Jun 2, 2021
@vladar vladar deleted the vladar/node-persistence branch June 2, 2021 10:27