This repository shows how to deploy a Burr `Application` with BentoML.
Burr and BentoML help you build the application and serving layers of your system.
Burr lets you build applications that are easy to understand and debug, with a clear path to production. It supports synchronous, asynchronous, and streaming actions, and persistence & durability, hooks, and telemetry are built in.
BentoML is a specialized tool to package, deploy, and manage AI services. It helps you get the most performance out of your system by letting you specify resource requirements (CPU, GPU, RAM, concurrency, workers, etc.), autoscaling, and adaptive request batching. It also automatically generates synchronous and asynchronous clients for your service.
web_page_qna/
is an introductory example that deploys a Burr `Application`
with BentoML, using LLMs to answer questions about a web page.
Join the BentoML developer community on Slack for more support and discussions!
Join the Burr Discord server for help, questions, and feature requests.