Popular repositories
- distributed-llama (Public, forked from b4rtaz/distributed-llama)
Tensor parallelism is all you need. Run LLMs on an AI cluster at home using any device. Distribute the workload, divide RAM usage, and increase inference speed.
Language: C++
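
The description above summarizes the tensor-parallel idea: each device holds only a shard of the weights, computes its slice of the output, and the slices are combined, which is how RAM usage is divided across the cluster. The C++ sketch below illustrates this in its simplest form, row-wise sharding of a matrix-vector product. It is a minimal, hypothetical example of the general technique, not code taken from the distributed-llama repository.

```cpp
// Illustrative only: row-wise sharding of y = W * x across "workers".
// Each worker owns a contiguous block of W's rows, so no single device
// needs the full weight matrix; partial outputs are concatenated at the end.
#include <cstddef>
#include <iostream>
#include <vector>

// Compute one worker's shard of y = W * x. The worker owns `rows` rows of W,
// stored row-major in `w_shard` with `cols` entries per row.
std::vector<float> shard_matvec(const std::vector<float>& w_shard,
                                std::size_t rows, std::size_t cols,
                                const std::vector<float>& x) {
    std::vector<float> y(rows, 0.0f);
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < cols; ++c)
            y[r] += w_shard[r * cols + c] * x[c];
    return y;
}

int main() {
    const std::size_t out_dim = 4, in_dim = 3, workers = 2;
    // Full 4x3 weight matrix, split row-wise into two 2x3 shards.
    std::vector<float> W = {1, 0, 0,
                            0, 1, 0,
                            0, 0, 1,
                            1, 1, 1};
    std::vector<float> x = {2, 3, 4};

    std::vector<float> y;  // concatenated output from all workers
    const std::size_t rows_per_worker = out_dim / workers;
    for (std::size_t w = 0; w < workers; ++w) {
        // Each worker would hold only its shard; here we slice it out locally.
        auto first = W.begin() + w * rows_per_worker * in_dim;
        std::vector<float> shard(first, first + rows_per_worker * in_dim);
        auto partial = shard_matvec(shard, rows_per_worker, in_dim, x);
        y.insert(y.end(), partial.begin(), partial.end());
    }

    for (float v : y) std::cout << v << ' ';  // expected: 2 3 4 9
    std::cout << '\n';
    return 0;
}
```

In a real cluster the shards live on separate devices and the concatenation step becomes a network gather, which is where the inference-speed gain over a single memory-bound device comes from.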