
PoC withCache utility #600

Merged: 14 commits into 2023-01 on Mar 29, 2023
Conversation

frandiox
Copy link
Contributor

@frandiox frandiox commented Mar 1, 2023

Proof of concept for a storefront.withCache utility similar to useQuery. This comes from an internal draft doc by @blittle

Notes:

  • We already have a storefront.cache, so I've changed its name to storefront.withCache for now. It's not a standalone withCache returned by const {storefront, withCache} = createStorefrontClient(...)
  • It reuses most of the existing internal logic so it doesn't add much.

Questions:

  • Should this really be inside storefront? It's not going to be used for storefront queries so perhaps it should be returned from const {storefront, withCache} = createStorefrontClient(...) instead, or a different utility?
  • Should this be created by a different utility entirely (i.e. not in const {storefront, withCache} = createStorefrontClient(...))?
  • Just like we had useQuery and fetchSync to make cache easier with fetch, would it make sense to have both withCache and fetchWithCache exposed?
  • This uses a shouldCacheResult parameter to decide whether the result should be cached (similar to the internal shouldCacheResponse in storefront.query, useful when there are GraphQL errors in the payload). Should we allow signaling "no cache" with a null return value instead? Or, alternatively, allow passing the cache strategy as part of the result:
      // Current:
      withCache('key', () => {/*... fetch data*/; return data}, {shouldCacheResult: (data) => !data.errors})
      // vs `null`
      withCache('key', () => {/*... fetch data*/; return data.errors ? null : data})
      // vs dynamic strategy decided once the result is known, not before
      withCache('key', () => {/*... fetch data*/; return { data, strategy: data.errors ? NoCache() : CacheShort() }})

How to test:
Add the following code to root.tsx and refresh the page several times.

export async function loader({context: {storefront, withCache}}: LoaderArgs) {
  await withCache(
    'test-with-cache',
    async () => console.log('this should only show up once every 10 secs'),
    {
      strategy: storefront.CacheCustom({
        mode: 'public',
        maxAge: 10,
      }),
    },
  );

...

}

@frandiox frandiox requested a review from a team March 1, 2023 07:19

@davidhousedev
Copy link
Contributor

Just my $0.02, I think it should be separate from the storefront object. We use useQuery for external CMS queries which are outside of the Shopify domain.

It looks like withCache is using the same Cloudflare cache as Hydrogen. Should consumers be able to use this utility with any arbitrary cache, e.g. caches.open('contentful') to avoid key namespace collision?

@blittle
Copy link
Contributor

blittle commented Mar 1, 2023

@davidhousedev that's great feedback. The real goal of this utility is to provide an easy cache API for 3p requests. And in that sense, it doesn't really make sense to be on the storefront object, because they aren't storefront requests. Initially I felt that people should just use the Cache API directly, but admittedly after PoC-ing a number of implementations, it is pretty nasty to work with. The fact that the cache key is a request object just makes things awkward in a number of ways. Especially because not everything in a cache is going to be related to a request.

Perhaps we could provide a helper function that you'd call to pass something to your context:

import {createRequestHandler, createCacheHandler} from '@shopify/remix-oxygen';

export default {
  async fetch(request, env, executionContext) {
    const {storefront} = createStorefrontClient(...);

    // Open the caches here so `getLoadContext` can stay synchronous.
    const sanityCache = await caches.open('sanity');
    const dynamicYieldCache = await caches.open('dynamicYield');

    const handleRequest = createRequestHandler({
      ...,
      getLoadContext: () => ({
        ...,
        storefront,
        sanityCache: createCacheHandler(sanityCache),
        dynamicYield: createCacheHandler(dynamicYieldCache),
      }),
    });

    return handleRequest(request);
  },
};

And then from within a route loader:

export async function loader({ context: { storefront, sanityCache } }) {
  const [storefrontData, sanityData] = await Promise.all([
    storefront.query(`some query`),
    sanityCache(["some", "key"], async function () {
      return fetch("sanitiy-api");
    }), // or some other flavor like Fran mentions above
  ]);
}
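
For illustration only, a minimal sketch of what such a createCacheHandler could do internally (hypothetical, not an actual @shopify/remix-oxygen API; it assumes JSON-serializable return values and serializes the key into a synthetic Request for the Cache API):

// Hypothetical sketch, not the actual @shopify/remix-oxygen API.
type CacheKey = string | readonly unknown[];

function createCacheHandler(cache: Cache) {
  return async function cacheHandler<T>(
    key: CacheKey,
    action: () => Promise<T>,
    {maxAge = 1}: {maxAge?: number} = {},
  ): Promise<T> {
    // The Cache API only takes Request/URL keys, so serialize the key into a synthetic URL.
    const request = new Request(
      'https://cache.local/' + encodeURIComponent(JSON.stringify(key)),
    );

    const cached = await cache.match(request);
    if (cached) return (await cached.json()) as T;

    const result = await action();

    await cache.put(
      request,
      new Response(JSON.stringify(result), {
        headers: {'cache-control': `public, max-age=${maxAge}`},
      }),
    );

    return result;
  };
}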

@benjaminsehl
Copy link
Member

Yep, 100% agree — inside storefront is very confusing.

I also like your notion of a helper function for individual clients in server.ts @blittle. I'd suggest we make it relatively simple to have a somewhat generic useQuery-esque function … perhaps fetchWithCache if that's the best name we can think of … since that will cover a big set of the use cases.

Then we also expose these lower-level things so folks can have more control, and so app integration partners have a happy path to create their own clients (like your examples of Sanity and DY).

@frehner
Copy link
Contributor

frehner commented Mar 1, 2023

Thanks! Questions, and forgive me if they don't make sense:

  1. Does this have the ability to have a more complex key? For example, looking at react-query, the query key can also be an array. Is that something we would want to support? https://react-query-v3.tanstack.com/guides/query-keys
  2. Previously, our useQuery() hook could optionally take in a null parameter so that they didn't actually query anything. I believe that was done to help respect the "rules of hooks"; in this case, since this isn't a hook, it can be called conditionally instead, right? Or would there be any advantage to still allowing that pattern?
  3. Do we need to have special handling for fetch calls that don't throw by default? e.g. if someone is using fetch and doesn't check if response.ok?

Thanks!

@frandiox
Copy link
Contributor Author

frandiox commented Mar 2, 2023

@davidhousedev @blittle

It looks like withCache is using the same Cloudflare cache as Hydrogen. Should consumers be able to use this utility with any arbitrary cache, e.g. caches.open('contentful') to avoid key namespace collision?

sanityCache: createCacheHandler(await caches.open('sanity')),

I'm not sure about the implications of opening multiple caches. It's probably not a perf issue but, alternatively, you can just add a prefix to the cacheKey. Something like ['contentful', {body: {...}}] or ['sanity', {body: {...}}].

Or create a custom helper:

getLoadContext: () => ({
  storefront,
  sanityCache: (key: string, action: Function) => withCache(['sanity', key], action),
  // or even:
  fetchSanity: (body: string) =>
    withCache(['sanity', body], async () =>
      (await fetch('sanity-api/....', {method: 'POST', body})).json(),
    ),
})

However, I think this would probably be simplified if we provide fetchWithCache because that utility would automatically append the fetch URL and body to the key. I think most of the use cases of withCache will be related to fetch anyway.
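
As a rough sketch (not an existing API, just an illustration of the idea, assuming withCache is in scope from the loader context with the PoC signature at this point in the thread):

// Hypothetical fetchWithCache: derives the cache key from the request URL, method, and body.
async function fetchWithCache<T>(
  url: string,
  options: {method?: string; headers?: HeadersInit; body?: string} = {},
): Promise<T> {
  // Everything that makes the request unique goes into the key.
  const key = ['fetch', url, options.method ?? 'GET', options.body ?? ''];

  return withCache(key, async () => {
    const response = await fetch(url, options);

    if (!response.ok) {
      throw new Error(`Request failed with status ${response.status}`);
    }

    // withCache expects a serializable value, so return the parsed body.
    return (await response.json()) as T;
  });
}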

Just to clarify, I'm not against the idea of using different caches, just providing alternatives here for brainstorming.


@frehner

Does this have the ability to have a more complex key? For example, looking at react-query, the query key can also be an array. Is that something we would want to support? https://react-query-v3.tanstack.com/guides/query-keys

Yes, it supports strings for simplicity, or arrays of serializable things: export type CacheKey = string | readonly unknown[];.
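
For example (the identifiers and endpoint here are hypothetical):

// Structured keys are stringified internally, so an array works as a cache key:
const reviews = await withCache(['reviews', productId, locale.language], async () => {
  return (await fetch(`https://reviews.example.com/${productId}`)).json();
});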

Previously, our useQuery() hook could optionally take in a null parameter so that they didn't actually query anything. I believe that was done to help respect the "rules of hooks"; in this case, since this isn't a hook, it can be called conditionally instead, right? Or would there be any advantage to still allowing that pattern?

I don't see any advantage for a null parameter since it can be called conditionally, as you said.

Do we need to have special handling for fetch calls that don't throw by default? e.g. if someone is using fetch and doesn't check if response.ok?

Perhaps in fetchWithCache. But in a generic withCache I think we shouldn't be aware of anything related to fetch 🤔 . In fact, with the current code you are not supposed to return a Response from withCache, but its body or any serializable value. Can be changed though.

@blittle
Copy link
Contributor

blittle commented Mar 2, 2023

@frandiox I think I agree with that. It seems like it probably would be overkill to have a separate cache per 3p. And it's easy enough to add another element to the array key to make sure it's scoped.

@davidhousedev
Copy link
Contributor

@frandiox Good point re: cache namespaces. We can definitely adopt that pattern in server.ts

However, I think this would probably be simplified if we provide fetchWithCache because that utility would automatically append the fetch URL and body to the key.

While I see the utility of fetchWithCache, I should mention that it isn't useful to us. We make heavy use of graphql-code-generator to build type-safe SDKs, both for Storefront API and Contentful. I suspect a fetchWithCache would not be the right abstraction for our type-safe SDK. We're excited to use withCache! Great function name by the way :), reminds me of Python's with.

@frandiox
Copy link
Contributor Author

frandiox commented Mar 3, 2023

@davidhousedev

We make heavy use of graphql-code-generator to build type-safe SDKs, both for Storefront API and Contentful. I suspect a fetchWithCache would not be the right abstraction for our type-safe SDK

Ah, interesting. I guess the generated SDK eventually calls raw fetch instead, right? In any case, fetchWithCache would be additive, it wouldn't replace withCache.

Related question: would you expect to be able to return a Response from withCache? Right now you would need to return response.json(), or return an object that contains the awaited body + headers if needed. This is because it needs a serializable value, and right now withCache is not tied to fetch or its request/response.
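
Concretely, what the current code expects is something like this (sketch only; the endpoint and query are hypothetical):

// Return the parsed body (plus any headers you need) instead of the Response itself.
const result = await withCache(['cms', query], async () => {
  const response = await fetch('https://cms.example.com/graphql', {
    method: 'POST',
    headers: {'content-type': 'application/json'},
    body: JSON.stringify({query}),
  });

  return {
    body: await response.json(),
    contentType: response.headers.get('content-type'),
  };
});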

@blittle
Copy link
Contributor

blittle commented Mar 3, 2023

Related question: would you expect to be able to return a Response from withCache?

To me, withCache is just an easier API for reading/writing the cache than https://developer.mozilla.org/en-US/docs/Web/API/Cache

@blittle
Copy link
Contributor

blittle commented Mar 3, 2023

@frandiox

This uses shouldCacheResult parameter to know if the result should be cached or not (similar to the internal shouldCacheResponse in storefront.query, useful when there are GraphQL errors in the payload). Should we allow signaling "no cache" with a return value null instead? Or, alternatively, allow passing the cache strategy as part of the result:

I don't really like the null return option, because sometimes you still want the result (like you want the error), but you don't want it to be cached. I like the other idea of the strategy as part of the return signature. The one downside of that approach is you can't just have something super simple like withCache('id', () => fetch()). But maybe that's a good thing: it requires you to do something with the API response, and not just pass it through?

@blittle
Copy link
Contributor

blittle commented Mar 3, 2023

@frandiox I wonder what the best way is to separate it from storefront? Seems like we'd still need to create a new utility? Where does withCache come from in your example? How is it bound to a cache instance?

@davidhousedev
Copy link
Contributor

Ah, interesting. I guess the generated SDK eventually calls raw fetch instead, right?

Yep! Good point :). As long as withCache is available, we're good!

Related question: would you expect to be able to return a Response from withCache? Right now you would need to return response.json(), or return an object that contains the awaited body + headers if needed. This is because it needs a serializable value, and right now withCache is not tied to fetch or its request/response.

Good question. I think we can work with any JSON-serializable return value from withCache.

One thing that comes to mind about the return is the typing. What do you expect the return type to look like? We like the return type from useQuery a lot. Great use of unions!

@frandiox
Copy link
Contributor Author

frandiox commented Mar 6, 2023

@blittle

To me, I think withCache is just an easier API of reading/writing cache than https://developer.mozilla.org/en-US/docs/Web/API/Cache

We could support responses, but that would make something like shouldCacheResult harder: it would receive a Response as its parameter, and the dev would need to .json() it... which consumes the body that withCache is supposed to return:

const response = await withCache('key', () => fetch('...'), {
  shouldCacheResult: async (result) => !(await result.json()).errors,
});

await response.json(); // Error: body already consumed

We can clone responses, but that means you need to await two .json() calls, etc. (see the sketch below).
Overall it feels like it adds complexity to the feature, so I'm not totally convinced about supporting responses 🤔
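
A sketch of that clone-based version (hypothetical endpoint), which ends up parsing the body twice:

const response = await withCache('key', () => fetch('https://example.com/api'), {
  // The clone is parsed once here just to decide whether to cache...
  shouldCacheResult: async (result: Response) => !(await result.clone().json()).errors,
});

// ...and the caller still has to parse the original body a second time.
const data = await response.json();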

I don't really like the null return option, because sometimes you still want the result (like you want the error), but you don't want it to be cached.

Yeah 100% agree.

I like the other idea of the strategy apart of the return signature. The one downside of that approach is you can't just have something super simple like withCache('id', () => fetch()). But maybe that's a good thing, it requires you to do something with the API response, and not just pass it through?

Do you mean the one where you must return {data, strategy}? Yes, and it removes the problems mentioned above regarding shouldCacheResult. Therefore we could support {data: Response, strategy: Strategy} if we want, since the developer would then be responsible for checking the response if they want to change the strategy.

I wonder the best way to separate it from storefront? Seems like we'd still need to create a new utility? Where does withCache come from in your example? How is it bound to a cache instance?

I've changed it to be returned by the existing utility: const {storefront, withCache} = createStorefrontClient({cache, ...}); so that it's not inside storefront anymore. We could separate it into a different utility (might be more semantically correct) but not sure if it's worth it 🤔

--

@davidhousedev

One thing that comes to mind about the return is the typing. What do you expect the return type to look like? We like the return type from useQuery a lot. Great use of unions!

The return type is the same you are returning from the action:

(screenshot: the withCache return type inferred from the action's return value)

Or do you have something else in mind?
An option would be returning const {data, error} = withCache(...) instead of throwing, but I think I prefer to stick to the usual try/catch pattern like we do in storefront.query 🤔

@frandiox
Copy link
Contributor Author

frandiox commented Mar 6, 2023

@blittle @davidhousedev Some other possibilities for the withCache signature.

  • Similar to the dynamic option mentioned in the OP, but passing strategies as parameters to the action function (so that withCache doesn't depend on storefront.[strategy]):

    const data = await withCache('key', async ({NoCache, CacheShort}) => {
      const response = await fetch('....');
      const data = await response.json();

      return { data, strategy: data.errors ? NoCache() : CacheShort() }
    })
  • Here the strategies passed as parameters can optionally wrap the returned value. The idea with this is that we can support simple returned values (responses or serializable values) but it's also possible to specify a cache strategy dynamically (returnedValue instanceof CacheWrapper ? use CacheWrapper options : default cache).

    // Simple response using default cache options (CacheShort) without checking response
    const response = await withCache('key', () => fetch(...));
    
    // Custom cache without checking errors:
    const response = await withCache('key', async ({CacheCustom}) => {
      const response = await fetch('....');
    
      return CacheCustom(response, {maxAge: 15});
    })
    
    // Check body errors and cache accordingly:
    const data = await withCache('key', async ({NoCache, CacheLong}) => {
      const response = await fetch('....');
      const data = await response.json();
    
      return data.errors ? NoCache(data) : CacheLong(data);
    })

    Perhaps the most flexible but maybe hard to understand/explain. Also, having different signatures for the strategy builders might be confusing (e.g. withCache's CacheCustom wraps a value while storefront.CacheCustom doesn't).

@davidhousedev
Copy link
Contributor

@frandiox

Similar to the dynamic option mentioned in OP but passing strategies as parameters to the action function

I like this option more than wrapping the returned value in a Cache* function, both because it's closer in API structure to storefront.query and because it's a little less magical. I suppose the strategy attribute may also be optional and default to CacheShort.

Regarding return types, maybe it's the time I'm spending with Rust, but I see a lot of value in being required to handle an error from a function that may error. It's easy to forget a try/catch, but typescript won't let you forget to discriminate a union. I understand if you all have prior art moving towards relying on try/catch, that you wouldn't want to mix patterns.

Really appreciate the work you're doing on this! 🙇🏻‍♂️

@jplhomer
Copy link
Contributor

jplhomer commented Mar 7, 2023

👋 thanks for taking the time to iterate on this proposal!

A couple thoughts swirling in my head:

Do we really need a generic withCache utility?

Other players in the industry are skating toward a built-in fetch cache.

It's a head-scratcher at first, but I think it might be a smart pattern:

  • Most expensive calls in your application will be HTTP requests made with fetch. This is especially true in Workers environments where everything is an HTTP request.
  • No fiddling with new APIs — just declare how long you want the fetched response to last
  • Easy to support SWR, since the platform knows how to revalidate a simple request (vs having to implement it in our custom function with the mess of lock keys).

Downsides certainly would be:

  • Can't cache non-HTTP requests (but how many are there?)
  • Difficulty patching existing SDKs which might not allow you to define a custom fetcher or pass non-standard options to RequestInit
  • Can't cache multiple things in one closure.

@davidhousedev Do you often find the need to cache multiple things in the same closure? If it were limited to caching each individual fetch call, would this make things difficult?

Even if we decide to ship a withCache utility, maybe it would be valuable to explore a fetch monkey-patched cache implementation for users who simply wish to cache a single fetch call without having to interact with the Cache instance proper.
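
For reference, Cloudflare Workers already points in that direction with its non-standard cf options on fetch (a sketch only; exact behavior depends on the runtime and plan, and the cf property needs the Workers types to type-check):

// Per-request edge caching declared directly on fetch, no Cache instance involved.
const response = await fetch('https://example.com/api/products', {
  cf: {
    cacheTtl: 60,          // cache the response at the edge for 60 seconds
    cacheEverything: true, // cache even if the origin sends no cache headers
  },
});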

Nitpicks on withCache

  • withCache should only accept a single CacheStrategy.
  • If the user doesn't like what they got back (e.g. data.errors), they should throw in the closure.
  • The user should wrap withCache in a try/catch block to properly handle any exceptions they throw in the closure and deal accordingly.
try {
  const myData = await withCache('key', CacheCustom({mode: 'public', maxAge: 10}), async () => {
    const response = await fetch('....');
    const data = await response.json();
  
    if (data.errors) {
      throw new Error('We had errors!');
    }
  
    return data;
  });
} catch (e) {
  return json({ error: e.message });
}

@blittle
Copy link
Contributor

blittle commented Mar 7, 2023

@jplhomer previously in h1, we bound useQuery and family to the request. Will we also need to do that in h2? Like, would we also have to pass a fetch function within the context? I'm not sure when AsyncLocalStorage will become available (maybe it already is?).

@davidhousedev
Copy link
Contributor

davidhousedev commented Mar 8, 2023

Do you often find the need to cache multiple things in the same closure? If it were limited to caching each individual fetch call, would this make things difficult?

@jplhomer Alas, we do and it would. I've walked @blittle through this problem space before, happy to get into detail out-of-band. I'll outline it below:

We build our pages via a headless CMS. We query the data from the CMS over graphql. We structure our pages as a stack of "modules" where each module can be independently reordered on the page to enable rich customization options.

We have two options when querying for page data:

  1. Fetch the entire page, and all nested page content
  2. Fetch the page, then make follow-up fetch requests for each "module"

Option 1 is not technically feasible. If we try to fetch the entire page in one request, we hit graphql query complexity limits and our query is rejected.

Option 2 is only feasible if we're able to put all queries behind the same cache key. If each request is cached independently, we expose a race condition where one module might be revalidated before another even though both modules have been updated in the CMS. The result would be a page whose top module is marketing one promotion (Free product A) while a module further down the page is marketing a different promotion.

Again, happy to discuss further!
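
For concreteness, Option 2 behind a single cache key would look roughly like this (the CMS helpers and variables here are hypothetical):

const {page, modules} = await withCache(['cms-page', handle, locale.language], async () => {
  // First request: the page shell with its ordered list of module references.
  const page = await fetchCmsPage(handle, locale.language);

  // Follow-up requests: one per module, all behind the same cache key so they can
  // never revalidate out of sync with the page or with each other.
  const modules = await Promise.all(
    page.moduleIds.map((id: string) => fetchCmsModule(id, locale.language)),
  );

  return {page, modules};
});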

@blittle blittle force-pushed the fd-poc-3pcache-util branch from 3bc1005 to 46c2402 Compare March 16, 2023 20:48
@maxfrigge
Copy link

Are there plans to include a utility to generate a cache key from an object? Third-party APIs might want to pass user-specific headers, which need to be part of the key.

Having something like this built in could be helpful?

@frandiox
Copy link
Contributor Author

Are there plans to include a utility to generate a cache key from an object? Third-party APIs might want to pass user-specific headers, which need to be part of the key.
Having something like this built in could be helpful?

You can already pass an object or an array as the cache key; it will be stringified internally for you:

withCache(['my-3p-api', request.headers.get('x-authentication')], CacheShort(), () => {...})

@frandiox frandiox merged commit 737f83e into 2023-01 Mar 29, 2023
@frandiox frandiox deleted the fd-poc-3pcache-util branch March 29, 2023 16:47
@benjaminsehl
Copy link
Member

@blittle — thoughts on cache observability? Any path here to allow devs to look under the hood on cache hits/misses?

@richardscarrott
Copy link

richardscarrott commented Oct 26, 2023

We have a use case where we have to make two Storefront API calls in order to handle translated variant options. These need to be in sync to ensure they marry up to one another, meaning they can't live under two separate cache keys.

// URL options are always in english
https://hydrogen.shop/products/the-full-stack?Size=158cm&Color=Syntax
# Returns options in english so we can map up the always-english URL query params to the translated options by options[index]
query DefaultLangProduct($handle: String!) {
  product(handle: $handle) {
    options {
      name
      values
    }
  }
}

# Returns options in users language, along with all other product data
query Product(
  $country: CountryCode
  $language: LanguageCode
  $handle: String!
  $selectedOptions: [SelectedOptionInput!]!
) @inContext(country: $country, language: $language) {
  product(handle: $handle) {
    id
    title
    descriptionHtml
    options {
      name
      values
    }
    # etc.
  }
}
const [defaultLangProduct, product] = await Promise.all([
  storefront.query(DEFAULT_LANGUAGE_PRODUCT_QUERY, {
    variables: {
      handle,
      selectedOptions
    },
    cache: CacheNone(),
  }),
  storefront.query(PRODUCT_QUERY, {
    variables: {
      handle,
      selectedOptions,
      country: locale.country,
      language: locale.language,
    },
    cache: CacheNone(),
  }),
])

Is there a way to combine the cache key of two storefront API calls so we can pass it to withCache?

For now we're doing this, which is a bit grim because the cache entry is invalidated after every build (in case the code changed):

const [defaultLangProduct, product] = await withCache(
  // This uuid is specific to this call site. It includes the build number so we can safely change the code below without
  // having to remember to update the uuid.
  [env.BUILD_NUMBER, '876979b6-c2ad-4d05-8865-7b16b5d7fa48'],
  CacheLong(),
  () =>
    Promise.all([
      storefront.query(DEFAULT_LANGUAGE_PRODUCT_QUERY, {
        variables: {
          handle,
          selectedOptions,
        },
        cache: CacheNone(),
      }),
      storefront.query(PRODUCT_QUERY, {
        variables: {
          handle,
          selectedOptions,
          country: locale.country,
          language: locale.language,
        },
        cache: CacheNone(),
      }),
    ]),
);

@frandiox
Copy link
Contributor Author

Is there a way to combine the cache key of two storefront API calls so we can pass it to withCache?

You can check how we create a cache key for a single storefront call and then concatenate the other one. Something like:

[url, method, headers, body1 + body2]

Where:

  • url: the SFAPI url. You can get it with storefront.getApiUrl()
  • method: POST
  • headers: All the important headers. Currently you can do JSON.stringify(storefront.getPublicTokenHeaders())
  • body: each of the bodies is basically JSON.stringify({query, variables}).
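
Putting those pieces together, a combined key could be built like this (a sketch reusing the queries and variables from the snippet above):

const combinedKey = [
  storefront.getApiUrl(),
  'POST',
  JSON.stringify(storefront.getPublicTokenHeaders()),
  // Concatenate both bodies so a change to either query or its variables busts the cache.
  JSON.stringify({query: DEFAULT_LANGUAGE_PRODUCT_QUERY, variables: {handle, selectedOptions}}) +
    JSON.stringify({
      query: PRODUCT_QUERY,
      variables: {handle, selectedOptions, country: locale.country, language: locale.language},
    }),
];

const [defaultLangProduct, product] = await withCache(combinedKey, CacheLong(), () =>
  Promise.all([
    // ...the same two storefront.query calls with CacheNone() as above
  ]),
);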

@richardscarrott
Copy link

richardscarrott commented Oct 27, 2023

Thanks @frandiox. It got kinda complex so we ended up caching them both independently as before, but handled the mismatch more gracefully (redirect to the default variant) and set custom CF cache tags so that we can purge the cache on product update webhooks.

It would have been great if the Storefront API supported query batching, or if @inContext could be applied at the field level rather than to the entire query. Then it could be a single fetch call.
