XpressAI/xai-llm-server

Popular AI company compatible LLM Server

This is a very simple Flask application that provides an API compatible with a popular AI company's chat-completions endpoint for serving other large language models.

Very useful if you run tests or lots of Collaborative Agent Modules :-)

It currently supports Llama2, Mistral-7b, and RWKV, since these models run easily on local hardware, which makes them a good fit for the agent use case.

Streaming is supported as well.
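A streaming response can be consumed from Python with only the standard library. The sketch below assumes the server emits OpenAI-style server-sent-event chunks ending with a data: [DONE] sentinel (the usual convention for this kind of API); the model name and placeholder bearer token match the curl example in this README.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5000"  # Flask's default port, as used below


def parse_sse_line(line):
    """Extract the JSON payload from one 'data: ...' SSE line.

    Returns the decoded dict, or None for non-data lines and the
    final '[DONE]' sentinel.
    """
    if not line.startswith("data: "):
        return None
    payload = line[len("data: "):].strip()
    if payload == "[DONE]":
        return None
    return json.loads(payload)


def stream_chat(prompt, model="mistral-7b-instruct"):
    """Yield content deltas from a streaming chat completion."""
    req = urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=json.dumps({
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
            "stream": True,
        }).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer WE_DONT_NEED_NO_STINKING_TOKENS",
        },
    )
    with urllib.request.urlopen(req) as resp:
        for raw in resp:  # iterate the response line by line
            chunk = parse_sse_line(raw.decode("utf-8"))
            if chunk is not None:
                delta = chunk["choices"][0]["delta"]
                if "content" in delta:
                    yield delta["content"]
```

Usage: for token in stream_chat("Hello!"): print(token, end="", flush=True) — assuming the server is running locally on port 5000.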

Setup

  1. Create a venv python3 -m venv venv
  2. Activate venv source venv/bin/activate (or venv\Scripts\activate on Windows)
  3. Install dependencies pip install -r requirements.txt
  4. Create a symlink to your models, e.g. ln -s /mnt/ssd/models/rwkv models/rwkv
  5. Run the server with python app.py.

Sending Requests

curl http://localhost:5000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer WE_DONT_NEED_NO_STINKING_TOKENS" \
  -d '{
    "model": "mistral-7b-instruct",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
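The same request can be sent from Python using only the standard library. This is a minimal sketch; it assumes the response follows the usual chat-completions shape, with the reply at choices[0].message.content.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5000"  # Flask's default port


def build_request(prompt, model="mistral-7b-instruct", base_url=BASE_URL):
    """Build the same HTTP request the curl example sends."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=body,  # supplying data makes this a POST
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer WE_DONT_NEED_NO_STINKING_TOKENS",
        },
    )


def chat(prompt):
    """Send a chat completion request and return the reply text."""
    with urllib.request.urlopen(build_request(prompt)) as resp:
        data = json.loads(resp.read().decode("utf-8"))
    return data["choices"][0]["message"]["content"]
```

Usage: print(chat("Hello!")) — with the server running locally on port 5000.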
