
Caching #13

Closed
lea-33 opened this issue Nov 28, 2024 · 3 comments
Labels
help wanted Extra attention is needed

Comments

@lea-33
Collaborator

lea-33 commented Nov 28, 2024

Brainstorm about Caching:

  • saving PDFs locally and only retrieving images from those I have selected (e.g. via the sliceplot viewer)
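One way the local-PDF idea could be sketched: keep downloaded PDFs in a cache directory keyed by a hash of their URL, and check that directory before downloading again. This is only a sketch; the cache location and the `download_pdf` helper are assumptions, not part of the project.

```python
import hashlib
from pathlib import Path

# Hypothetical cache location; any writable directory would do.
CACHE_DIR = Path.home() / ".cache" / "pdf_cache"

def get_pdf(url, download_pdf):
    """Return the local path of a cached PDF, downloading only on a cache miss.

    `download_pdf(url)` is a placeholder for whatever function fetches the
    raw PDF bytes.
    """
    CACHE_DIR.mkdir(parents=True, exist_ok=True)
    # Hash the URL so the file name is stable and filesystem-safe.
    cached = CACHE_DIR / (hashlib.sha256(url.encode()).hexdigest() + ".pdf")
    if not cached.exists():
        cached.write_bytes(download_pdf(url))
    return cached
```

On the second request for the same URL, the function returns the stored file without calling `download_pdf` again.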
@lea-33 lea-33 added the help wanted Extra attention is needed label Nov 28, 2024
@haesleinhuepf
Member

Also it would be great if the embedding vectors of slides could be cached. In this way, we only run the (costly) embedding once.

@lea-33
Collaborator Author

lea-33 commented Dec 4, 2024

Do you have any other ideas on that?
git-bob comment


github-actions bot commented Dec 4, 2024

This message was generated by git-bob (version: 0.12.0, model: github_models:gpt-4o, log), an experimental AI-based assistant. It can make mistakes and has limitations. Check its messages carefully.

It sounds like we have two main caching ideas to explore: local PDF storage and caching embedding vectors. For local PDF caching, we could save selected PDFs to a dedicated cache directory. When PDFs are needed, we'd check this directory first before downloading new data.

For caching embedding vectors, we could implement a mechanism to store these in memory or on disk after computation. This way, we can reuse them without recomputation. We might use a dictionary with unique identifiers (like slide IDs) as keys and the embedding vectors as values.

Here's a rough proposal for embedding vector caching:

# In-memory cache mapping slide IDs to their embedding vectors.
embedding_cache = {}

def get_embedding(slide_id, compute_embedding):
    """Return the embedding for slide_id, computing it at most once."""
    if slide_id in embedding_cache:
        return embedding_cache[slide_id]
    # Cache miss: run the (costly) embedding and remember the result.
    embedding = compute_embedding(slide_id)
    embedding_cache[slide_id] = embedding
    return embedding

This function takes a slide_id and a compute_embedding function as parameters, checking if the embedding is cached before computing and storing it.
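An in-memory dictionary is lost when the process restarts. To avoid re-running the embeddings across sessions, the cache could additionally be persisted to disk, e.g. with pickle. A minimal sketch; the file name and location are assumptions:

```python
import pickle
from pathlib import Path

# Hypothetical location for the persisted cache file.
CACHE_FILE = Path("embedding_cache.pkl")

def load_cache():
    """Load a previously saved embedding cache, or start with an empty one."""
    if CACHE_FILE.exists():
        with CACHE_FILE.open("rb") as f:
            return pickle.load(f)
    return {}

def save_cache(cache):
    """Write the embedding cache to disk so it survives restarts."""
    with CACHE_FILE.open("wb") as f:
        pickle.dump(cache, f)
```

Calling `save_cache(embedding_cache)` at shutdown and `embedding_cache = load_cache()` at startup would make the cached embeddings reusable between runs.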

@haesleinhuepf haesleinhuepf mentioned this issue Dec 4, 2024
@lea-33 lea-33 closed this as completed Dec 12, 2024