I am using go-openai to call the Braintrust AI proxy, which provides access to models from OpenAI, Anthropic, Google, AWS, Mistral, and third-party inference providers through a single, unified (OpenAI-compatible) API.
When requesting Anthropic models with prompt caching, I need to make requests like this:
curl https://api.anthropic.com/v1/messages \
  -H "content-type: application/json" \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "anthropic-beta: prompt-caching-2024-07-31" \
  -d '{
    "model": "claude-3-5-sonnet-20241022",
    "max_tokens": 1024,
    "system": [
      {
        "type": "text",
        "text": "You are an AI assistant tasked with analyzing literary works. Your goal is to provide insightful commentary on themes, characters, and writing style.\n"
      },
      {
        "type": "text",
        "text": "<the entire contents of Pride and Prejudice>",
        "cache_control": {"type": "ephemeral"}
      }
    ],
    "messages": [
      {
        "role": "user",
        "content": "Analyze the major themes in Pride and Prejudice."
      }
    ]
  }'
Would you consider adding a CacheControl field to ChatCompletionMessage for this kind of use case, even though it is not part of the OpenAI API per se?