
Support generation configuration for LLM #62

Open
Tracked by #204
RyanMarten opened this issue Nov 12, 2024 · 5 comments

RyanMarten commented Nov 12, 2024

e.g.

top_p
temperature
etc.

OpenAI completion parameters
https://platform.openai.com/docs/api-reference/chat/create
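
For reference, a minimal sketch of how these parameters are passed to the OpenAI chat completions endpoint (the model name and values are just illustrative):

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Illustrative generation parameters this issue is about exposing.
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Write a haiku about data."}],
    temperature=0.7,  # sampling temperature
    top_p=0.9,        # nucleus sampling
    max_tokens=256,   # cap on generated tokens
)
print(response.choices[0].message.content)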

RyanMarten self-assigned this Nov 12, 2024
RyanMarten commented Nov 12, 2024

https://docs.litellm.ai/docs/completion/input

LiteLLM very nicely tracks the supported OpenAI params for any model + provider:

litellm.get_supported_openai_params()
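
For example (the model name is illustrative; the function is LiteLLM's):

import litellm

# Ask LiteLLM which OpenAI-style params a given model/provider accepts.
supported = litellm.get_supported_openai_params(model="gpt-4o-mini")
print(supported)  # e.g. ["temperature", "top_p", "max_tokens", ...]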

Nice system for dealing with unsupported params:

By default, LiteLLM raises an exception if the openai param being passed in isn't supported.
To drop the param instead, set litellm.drop_params = True or completion(..drop_params=True).
This ONLY DROPS UNSUPPORTED OPENAI PARAMS.
LiteLLM assumes any non-openai param is provider specific and passes it in as a kwarg in the request body
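
A minimal sketch of that behavior (the global flag and the per-call drop_params kwarg are both from the LiteLLM docs; the model and param here are just illustrative):

import litellm

# Option 1: drop unsupported OpenAI params globally instead of raising.
litellm.drop_params = True

# Option 2: drop them for a single call only.
response = litellm.completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}],
    top_p=0.9,        # silently dropped if this model/provider doesn't support it
    drop_params=True,
)

For reference, the full litellm.completion signature: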

def completion(
    model: str,
    messages: List = [],
    # Optional OpenAI params
    timeout: Optional[Union[float, int]] = None,
    temperature: Optional[float] = None,
    top_p: Optional[float] = None,
    n: Optional[int] = None,
    stream: Optional[bool] = None,
    stream_options: Optional[dict] = None,
    stop=None,
    max_completion_tokens: Optional[int] = None,
    max_tokens: Optional[int] = None,
    presence_penalty: Optional[float] = None,
    frequency_penalty: Optional[float] = None,
    logit_bias: Optional[dict] = None,
    user: Optional[str] = None,
    # openai v1.0+ new params
    response_format: Optional[dict] = None,
    seed: Optional[int] = None,
    tools: Optional[List] = None,
    tool_choice: Optional[str] = None,
    parallel_tool_calls: Optional[bool] = None,
    logprobs: Optional[bool] = None,
    top_logprobs: Optional[int] = None,
    deployment_id=None,
    # soon to be deprecated params by OpenAI
    functions: Optional[List] = None,
    function_call: Optional[str] = None,
    # set api_base, api_version, api_key
    base_url: Optional[str] = None,
    api_version: Optional[str] = None,
    api_key: Optional[str] = None,
    model_list: Optional[list] = None,  # pass in a list of api_base,keys, etc.
    # Optional liteLLM function params
    **kwargs,
) -> ModelResponse:

RyanMarten commented Nov 12, 2024

Based on litellm: https://docs.litellm.ai/docs/completion/input#input-params-1

The list is also in the code here: https://github.com/BerriAI/litellm/blob/main/litellm/main.py#L843

We will support:

    model: str,
    messages: List = [],
    stop=None,
    max_completion_tokens: Optional[int] = None,
    max_tokens: Optional[int] = None,
    presence_penalty: Optional[float] = None,
    frequency_penalty: Optional[float] = None,
    logit_bias: Optional[dict] = None,
    seed: Optional[int] = None,
    tools: Optional[List] = None,
    tool_choice: Optional[str] = None,
    parallel_tool_calls: Optional[bool] = None,
    logprobs: Optional[bool] = None,
    top_logprobs: Optional[int] = None,
    # set api_base, api_version, api_key
    base_url: Optional[str] = None,
    api_version: Optional[str] = None,
    api_key: Optional[str] = None,

Right now we default to structured output { "type": "json_schema", "json_schema": {...} } instead of json output { "type": "json_object" }. If we want to support both, we need to change the way we are doing it. See the API reference.

    response_format: Optional[dict] = None,
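
For context, the two response_format shapes referred to above (the field names follow the OpenAI API reference; the schema contents are illustrative):

# Structured output (our current default): the model must follow the given schema.
structured = {
    "type": "json_schema",
    "json_schema": {
        "name": "person",  # illustrative schema
        "schema": {
            "type": "object",
            "properties": {"name": {"type": "string"}, "age": {"type": "integer"}},
            "required": ["name", "age"],
        },
    },
}

# JSON mode: the model returns valid JSON, but no particular schema is enforced.
json_mode = {"type": "json_object"}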

We can support these later if we choose, but won't include them now:

    user: Optional[str] = None,
    deployment_id=None,

We will not support:

    timeout: Optional[Union[float, int]] = None,
    stream: Optional[bool] = None,
    stream_options: Optional[dict] = None,

Others to think about:

    model_list: Optional[list] = None,  # pass in a list of api_base,keys, etc.
    # Optional liteLLM function params
    **kwargs,

Optional LiteLLM function params that look interesting:

input_cost_per_token: float (optional) - The cost per input token for the completion call
output_cost_per_token: float (optional) - The cost per output token for the completion call
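
A quick sketch of how those could be passed (both kwargs are listed in the LiteLLM input docs; the prices and model are illustrative):

import litellm

# Override per-token pricing so LiteLLM's cost tracking matches a custom deployment.
response = litellm.completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}],
    input_cost_per_token=1.5e-7,   # illustrative price per input token
    output_cost_per_token=6.0e-7,  # illustrative price per output token
)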

RyanMarten commented Nov 12, 2024

https://docs.litellm.ai/docs/completion/batching#send-multiple-completion-calls-to-1-model

LiteLLM also supports passing messages as a list of lists, which sends multiple completions to one model.

We should benchmark this against running a thread pool over async completion calls with LiteLLM.

EDIT: added to #74
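
A minimal sketch of the list-of-lists batching call mentioned above (as I read those docs it goes through litellm.batch_completion; the model and prompts are illustrative):

import litellm

# Send several prompts to one model in a single call by passing a list of
# message lists, per the LiteLLM batching docs.
responses = litellm.batch_completion(
    model="gpt-4o-mini",
    messages=[
        [{"role": "user", "content": "What is 2 + 2?"}],
        [{"role": "user", "content": "Name a prime number."}],
    ],
)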

@RyanMarten

Lessons learned from #77

  1. It doesn't make sense to thread all of these args through all the functions. Instead, accept kwargs and have a programmatic way of adding them to the request body.

  2. Add these args to the cache fingerprint.

  3. Do we want to add all of these to GenericRequest? We can, but we don't have to.

Instead, we can just pass the kwargs to the constructor of the RequestProcessor and then use:

from litellm import get_supported_openai_params
supported_params = get_supported_openai_params(model="anthropic.claude-3", custom_llm_provider="bedrock")

Then, if a kwarg is in supported_params, we can add it to the request body.

We can also share this request-body-building code between the OpenAI batch and online processors so we don't have to duplicate it.
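
A rough sketch of that pattern (RequestProcessor, its constructor, and the filtering are hypothetical; only get_supported_openai_params is LiteLLM's):

from typing import Optional

from litellm import get_supported_openai_params


class RequestProcessor:
    """Hypothetical sketch: accept arbitrary generation kwargs and keep only
    the ones the target model/provider supports."""

    def __init__(self, model: str, custom_llm_provider: Optional[str] = None, **generation_kwargs):
        self.model = model
        supported = get_supported_openai_params(
            model=model, custom_llm_provider=custom_llm_provider
        ) or []
        # Only supported OpenAI-style params make it into the request body.
        self.generation_kwargs = {k: v for k, v in generation_kwargs.items() if k in supported}

    def build_request_body(self, messages: list) -> dict:
        # Shared between the batch and online processors so it isn't duplicated.
        return {"model": self.model, "messages": messages, **self.generation_kwargs}


# Usage: unsupported kwargs are filtered out before building the request body.
processor = RequestProcessor("gpt-4o-mini", temperature=0.7, top_p=0.9)
body = processor.build_request_body([{"role": "user", "content": "Hello"}])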

@RyanMarten

There is also the litellm.OpenAIConfig object: https://docs.litellm.ai/docs/completion/provider_specific_params
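
From that page, provider-level defaults can apparently be set once via the config object, roughly like this (the value and model are illustrative):

import litellm

# Set a provider-level default once; subsequent calls to that provider pick it up.
litellm.OpenAIConfig(max_tokens=256)

response = litellm.completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}],
)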
