
Referential Integrity issue causes perf issues in Apollo 3 #6202

Closed
3nvi opened this issue Apr 27, 2020 · 5 comments

3nvi commented Apr 27, 2020

Whenever I use fetchMore to fetch more items from the server, I couple it with updateQuery in order to provide the merge strategy for the new data. A common practice is to "append" the new items onto the existing list, like so: [...prevResult.data.items, ...fetchMoreResult.data.items].
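
Roughly, the pattern looks like this (a sketch; the query, field names, and offset-based pagination are placeholders, not my real schema):

import { gql, useQuery } from '@apollo/client';

const GET_ITEMS = gql`
  query GetItems($offset: Int!) {
    items(offset: $offset) {
      text
    }
  }
`;

function ItemList() {
  const { data, fetchMore } = useQuery(GET_ITEMS, { variables: { offset: 0 } });
  if (!data) return null;

  const loadMore = () =>
    fetchMore({
      variables: { offset: data.items.length },
      // Append the newly fetched page onto the list we already have
      // (the exact shape of prevResult depends on the query).
      updateQuery: (prevResult, { fetchMoreResult }) => ({
        ...prevResult,
        items: [...prevResult.items, ...fetchMoreResult.items],
      }),
    });

  return null; // placeholder: render data.items and a button that calls loadMore
}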

I would expect this to retain the referential equality of the existing items, since that's what happens in vanilla JS. Unfortunately, this doesn't happen when using Apollo, since the items array contains objects that are all referentially different from the previous iteration. For example, if I had an array of one item:

[{ text: 'first' }]

and then, by fetching more, I received another new item

[{ text: 'second' }]

and I merged them in updateQuery using the typical practice

return [...prevResult.data.items, ...fetchMoreResult.data.items]

then the { text: 'first' } object would not be the same between renders, which means that React would have to re-render it and any React.memo would be rendered useless. In a scenario where we have 100 or 200 expensive components that unnecessarily re-render every time something gets added to the array, this may cause real issues.
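
To illustrate the cost (a sketch; Item here is a stand-in for those expensive components):

import React from 'react';

// React.memo skips re-rendering only when the `item` prop is referentially
// equal (===) to the one from the previous render.
const Item = React.memo(function Item({ item }) {
  return <div>{item.text}</div>;
});

function List({ items }) {
  // If every object in `items` is a fresh reference after each fetchMore,
  // every memoized <Item> re-renders even though its contents didn't change.
  return items.map((item, index) => <Item key={index} item={item} />);
}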

Intended outcome:

I would expect the objects not to be changed in any shape or form, meaning that Apollo would not "clone" or in any way manipulate what I return from the updateQuery handler.

Actual outcome:

Apollo alters the data somehow, and sadly the items lose their referential integrity.

How to reproduce the issue:

In this codesandbox you can clearly see the issue if you open the console.

Clicking "fetch more" continuously appends items to a list. You would expect the first item to be the same between renders (since we're only adding to the list), but unfortunately it's referentially different on each render pass.

https://codesandbox.io/s/apollo-reference-equality-1exes?file=/src/App.js
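
A check along these lines shows the problem (a sketch of the idea, not the exact sandbox code):

import { useEffect, useRef } from 'react';

// Logs whether the first item kept its object identity since the last render.
function useFirstItemIdentityLog(items) {
  const prevFirst = useRef();
  useEffect(() => {
    if (prevFirst.current !== undefined) {
      console.log('first item kept identity:', prevFirst.current === items[0]);
    }
    prevFirst.current = items[0];
  });
}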

Versions

System:
OS: macOS 10.15.2
Binaries:
Node: 10.16.0 - /usr/local/bin/node
Yarn: 1.19.0 - /usr/local/bin/yarn
npm: 6.14.2 - /usr/local/bin/npm
Browsers:
Chrome: 81.0.4044.122
Firefox: 72.0.2
Safari: 13.0.4
npmPackages:
@apollo/client: 3.0.0-beta.41 => 3.0.0-beta.41
apollo-link-error: ^1.1.12 => 1.1.12

3nvi changed the title Referential Integrity issue during fetchMore causes perf issues in Apollo 3 → Referential Integrity issue causes perf issues in Apollo 3 Apr 27, 2020
hwillson added this to the Release 3.0 milestone May 21, 2020
hwillson assigned hwillson and benjamn and unassigned hwillson May 26, 2020

benjamn commented May 27, 2020

Quick solution: if you want referential equality here, you need to give those { text } objects a __typename and id so they will have stable object references across repeated reads.
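
For example (a sketch, assuming the server can return an id for each item):

import { gql } from '@apollo/client';

const GET_ITEMS = gql`
  query GetItems {
    items {
      id    # with an id (plus __typename, which the client adds automatically),
      text  # each item is normalized and keeps a stable object reference
    }
  }
`;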

You seem to be working under the impression that the objects you write into the cache are the very same objects that the cache gives you back as result objects, as if the cache simply holds onto those objects temporarily, and later returns them to you. This is an incorrect interpretation of how the result caching system (#3394) works.

The goal/promise of the result caching system is to give you the same (===) result objects when you read a particular selection set repeatedly from the cache without changing any of the field values that were involved in the previous read. Since field values are identified by the parent object ID plus the field name (#5617), the only result objects eligible for reuse are those that have IDs.

In your reproduction, the only object with an ID is the ROOT_QUERY object, because you have not given your { text } objects IDs of their own. After you write the new getItems list, obviously the cache cannot return exactly the same { getItems: ... } result object as before, because the field identified by (ROOT_QUERY, getItems) has indeed changed in a meaningful way (its length increased by one). This change triggers rereading starting from ROOT_QUERY and continuing down to any other normalized (ID-having) entity objects that are encountered. Since there are no other normalized objects involved in this query, the rereading produces an entirely fresh tree of objects.
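
Roughly speaking, the two cache shapes look like this (a sketch of cache.extract() output, not your exact data):

// Without item IDs: the objects are embedded under ROOT_QUERY, so rereading
// the getItems field rebuilds the whole array of result objects.
{
  ROOT_QUERY: {
    __typename: "Query",
    getItems: [
      { __typename: "Item", text: "first" },
      { __typename: "Item", text: "second" }
    ]
  }
}

// With item IDs: each item is normalized and stored by reference, so items
// that did not change can be returned by reference on the next read.
{
  ROOT_QUERY: {
    __typename: "Query",
    getItems: [{ __ref: "Item:1" }, { __ref: "Item:2" }]
  },
  "Item:1": { __typename: "Item", id: "1", text: "first" },
  "Item:2": { __typename: "Item", id: "2", text: "second" }
}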

All of which is to say: if you value referential identity preservation, you should make sure the objects in question are normalized in the cache. In situations where you don't care so much about fine-grained referential identity preservation, you can omit the necessary __typename and key fields (e.g. id, or whatever keyFields you specify), and the cache will still give you correctly-shaped (just not ===) data.

You could imagine the cache somehow canonicalizing unrelated result objects that happen to be deeply equal to each other, returning the same object reference automatically. This is one of the reasons I'm excited about the Record and Tuple proposal that TC39 is currently considering. However, please note that this canonicalization step takes roughly the same amount of time as a deep equality check, which is something you could do in application-land just as quickly. The beauty of the current result caching system is that it completely avoids any rereading work as long as the previous result remains valid, without doing any hidden/extra work to guarantee === equality when the previous result is no longer valid.
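
For instance, an application-level helper along these lines (a sketch, not something Apollo Client ships) reuses previous item references whenever the new items are deeply equal:

import isEqual from 'lodash/isEqual';

// Map each new item back to the previous item at the same index when the two
// are deeply equal, so memoized components keep receiving === props.
// Assumes an append-only list where existing items keep their positions.
function preserveItemIdentity(prevItems = [], nextItems) {
  return nextItems.map((item, index) =>
    isEqual(prevItems[index], item) ? prevItems[index] : item
  );
}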

Happy to answer questions about this system, since these are subtle issues/tradeoffs, but I believe it is working as designed.


3nvi commented May 27, 2020

Wow, thanks a lot for the detailed answer. I was indeed not aware of some of the logic involved in the cache.

if you value referential identity preservation, you should make sure the objects in question are normalized in the cache

Perfect, good to know!

which is something you could do in application-land just as quickly.

With the new trend toward hooks, components tend to get bloated with memoization logic, so I'd prefer to handle it outside the "context" of a React component.

In our app we fetch a paginated list of heavy, read-only data that I'd explicitly like not to normalize, since it has no reason to be normalized (e.g. server logs displayed in a table). Based on what you said, the only way to have referential equality is to bring normalization into the game, but I feel that this contradicts the benefits described in the related section here. Specifically, this part:

Although normalization makes sense in most cases, normalizing certain types of data can negatively affect performance. This is especially true if you’re caching large quantities of transient data (such as metrics that are identified by a timestamp and never receive updates).

I understand that an app-land solution for memoization can do the trick, but I'm just wondering if this is indeed out of the scope of Apollo or just a limitation that's tied to the implementation.

Thanks again


benjamn commented May 27, 2020

I do think some sort of final memoization/canonicalization step is in scope for future versions of Apollo Client. I'd like to see some more progress on the Record and Tuple proposal, so we can start recommending/providing a lightweight polyfill implementation. It's also not out of scope to implement similar behavior ourselves; it's just more code for us to ship (and for you to bundle).


benjamn commented May 27, 2020

As for the performance tradeoffs of normalization, I think you're right to point out that contradiction. Specifically, I think our recommendation about disabling normalization for performance may have been somewhat speculative, and could easily be trumped by the downstream performance benefits of reference equality.

Nowadays, I think of disabling normalization (with keyFields: false) as a way of expressing "this object belongs to its parent, so don't bother storing it separately." While that kind of control can be valuable in some cases, I would be surprised if it significantly affects performance.
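
For reference, disabling normalization for a type looks roughly like this (a sketch; Item is a placeholder type name):

import { InMemoryCache } from '@apollo/client';

const cache = new InMemoryCache({
  typePolicies: {
    Item: {
      // Embed Item objects inside their parent instead of storing them as
      // separate normalized entities.
      keyFields: false,
    },
  },
});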


3nvi commented May 27, 2020

Gotcha,

I'll run some tests, compare timings, and get back to you if something seems off. For the time being, since what I originally described is expected behavior on your side, I think it's fair to close this issue.
