Note: This started as a personal toy project for asking questions about the podcasts I listen to. I've made it generic so others can use it with their own audio content. Feel free to adapt it for your needs!
An MP3 to AI Chat Assistant - A configurable AI chat assistant that can be customized for your content and use case. Transform audio content into interactive, searchable conversations.
This is a demonstration project and should not be deployed to production without implementing additional security measures:
- The application currently has no user authentication
- Lack of authentication could lead to:
  - Unauthorized access to your API endpoints
  - Potential abuse of your API rate limits
  - Excessive costs from unrestricted API usage
  - Malicious use of your services
- Consider implementing proper authentication, rate limiting, and usage monitoring before any production deployment
- Frontend Framework: SvelteKit 2.x with Svelte 5
- UI/Styling:
  - TailwindCSS for styling
  - DaisyUI for UI components
  - Typography plugin for content formatting
- AI/Machine Learning:
  - Anthropic Claude 3 for natural language processing
  - Deepgram for audio transcription
  - Voyage for text embeddings used in similarity search
- Database: Turso (LibSQL) for data storage
- Development Tools:
  - TypeScript for type safety
  - Vite for development and building
  - ESLint and Prettier for code quality
  - Vitest and Playwright for testing
- Performance:
  - LRU Cache for response caching
  - Server-sent events for real-time progress updates
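As a rough illustration of the server-sent events mentioned above, the sketch below serialises one progress update in the `text/event-stream` wire format. The `ProgressUpdate` shape, the `progress` event name, and the `format_sse_event` helper are assumptions made for this example and are not taken from the project source.

```typescript
// Hypothetical shape of a progress update; the real project may use different fields.
interface ProgressUpdate {
  stage: string;
  percent: number;
}

// Serialise one event in the text/event-stream format:
// an "event:" line, a "data:" line, then a blank line that ends the event.
// JSON.stringify never emits raw newlines, so a single data line is enough.
function format_sse_event(event_name: string, data: unknown): string {
  const payload = JSON.stringify(data);
  return `event: ${event_name}\ndata: ${payload}\n\n`;
}

const example_update: ProgressUpdate = { stage: 'transcribing', percent: 40 };
const example_frame = format_sse_event('progress', example_update);
```

A server endpoint would write strings like `example_frame` to a response with the `Content-Type: text/event-stream` header, and a browser `EventSource` would receive them as `progress` events.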
- Clone the repository
- Copy .env.example to .env:
  cp .env.example .env
- Fill in your environment variables in .env
- Install dependencies:
  npm install
- Start the development server:
  npm run dev
Required environment variables in your .env file:
# Required
ANTHROPIC_API_KEY=your_api_key_here
TURSO_URL=your_turso_url_here
TURSO_AUTH_TOKEN=your_turso_auth_token_here
DEEPGRAM_API_KEY=your_deepgram_key_here
VOYAGE_API_KEY=your_voyage_key_here
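A missing key usually only shows up when the first API call fails, so it can help to check all five variables up front. The following is a minimal sketch of such a check; `find_missing_env_vars` and `assert_env` are hypothetical helpers written for this example, not functions from the project.

```typescript
// Names copied from the required variables listed above.
const REQUIRED_ENV_VARS = [
  'ANTHROPIC_API_KEY',
  'TURSO_URL',
  'TURSO_AUTH_TOKEN',
  'DEEPGRAM_API_KEY',
  'VOYAGE_API_KEY',
] as const;

type EnvSource = Record<string, string | undefined>;

// Return the names of required variables that are unset or blank.
function find_missing_env_vars(env: EnvSource): string[] {
  return REQUIRED_ENV_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === '';
  });
}

// Throw one error that lists every missing variable at once.
function assert_env(env: EnvSource): void {
  const missing = find_missing_env_vars(env);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }
}
```

In a SvelteKit app, a check like this could run once on the server against the `env` object exported by `$env/dynamic/private`.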
The application can be customized by modifying src/lib/config/app-config.ts. This file contains non-sensitive configuration options:
{
  // Application name and description
  app_name: 'My Custom Assistant',
  app_description: 'Chat with your content using AI',
  // AI Configuration
  ai: {
    // Available models:
    // - claude-3-opus-20240229 (most capable)
    // - claude-3-sonnet-20240229 (balanced)
    // - claude-3-haiku-20240307 (fastest)
    model: 'claude-3-haiku-20240307',
    max_tokens: 1024,
    // Customize the AI's behaviour
    system_prompt: 'Your custom system prompt...',
    // Define different response styles
    style_instructions: {
      normal: {
        description: 'Balanced and clear responses',
        instruction: 'Your custom instruction...'
      },
      // Add more styles as needed
    }
  },
  // Set default response style
  default_response_style: 'normal'
}
The application supports multiple response styles that can be configured:
- normal: Balanced and clear responses
- concise: Brief and direct responses
- explanatory: Detailed explanations
- formal: Professional and structured responses
You can modify existing styles or add new ones in the configuration file.
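One way to use these styles at request time is to look up the requested style by name and fall back to `default_response_style` when the name is unknown. The sketch below assumes the `style_instructions` shape shown in the configuration; `resolve_style`, `own_style`, and the placeholder instruction strings are hypothetical and not taken from the project source.

```typescript
// Mirrors one entry of style_instructions in the configuration file.
interface StyleInstruction {
  description: string;
  instruction: string;
}

type StyleMap = Record<string, StyleInstruction>;

// Illustrative values only; the instruction strings are placeholders.
const example_styles: StyleMap = {
  normal: { description: 'Balanced and clear responses', instruction: 'Answer clearly.' },
  concise: { description: 'Brief and direct responses', instruction: 'Answer briefly.' },
};

// Look up a style by name, ignoring inherited keys such as "toString".
function own_style(styles: StyleMap, name: string): StyleInstruction | undefined {
  return Object.prototype.hasOwnProperty.call(styles, name) ? styles[name] : undefined;
}

// Return the requested style, or the default when the request is missing or unknown.
function resolve_style(styles: StyleMap, default_style: string, requested?: string): StyleInstruction {
  const match = requested !== undefined ? own_style(styles, requested) : undefined;
  const chosen = match ?? own_style(styles, default_style);
  if (chosen === undefined) {
    throw new Error(`Default response style "${default_style}" is not defined`);
  }
  return chosen;
}
```

For example, `resolve_style(example_styles, 'normal', 'concise')` returns the concise entry, while an unrecognised name returns the `normal` entry.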
- Audio file upload and processing
- Real-time transcription progress updates
- Vector-based semantic search
- Configurable AI response styles
- Interactive chat interface
- Server-side streaming responses
- Response caching for performance
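To give a sense of what vector-based semantic search involves, the sketch below ranks transcript chunks by cosine similarity between their embeddings and a query embedding. It is an in-memory illustration under assumed names (`EmbeddedChunk`, `cosine_similarity`, `top_k`); the project itself may compute similarity elsewhere, for example in the database.

```typescript
// Hypothetical record: a piece of transcript text plus its embedding vector.
interface EmbeddedChunk {
  text: string;
  embedding: number[];
}

interface ScoredChunk {
  text: string;
  score: number;
}

// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 for a zero-length vector.
function cosine_similarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length');
  }
  let dot = 0;
  let norm_a = 0;
  let norm_b = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    norm_a += x * x;
    norm_b += y * y;
  }
  const denominator = Math.sqrt(norm_a) * Math.sqrt(norm_b);
  return denominator === 0 ? 0 : dot / denominator;
}

// Score every chunk against the query and keep the k most similar, best first.
function top_k(query: number[], chunks: EmbeddedChunk[], k: number): ScoredChunk[] {
  return chunks
    .map((chunk) => ({ text: chunk.text, score: cosine_similarity(query, chunk.embedding) }))
    .sort((left, right) => right.score - left.score)
    .slice(0, k);
}
```

The top-ranked chunks would then be passed to the chat model as context for the user's question.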
- Never commit your .env file
- Keep all sensitive information (API keys, credentials) in environment variables
- The app-config.ts file should only contain non-sensitive application settings
- Implement proper authentication before deploying to production
- Consider adding rate limiting to protect your API usage
- Monitor API usage to prevent excessive costs
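As a starting point for the rate limiting suggested above, here is a minimal in-memory fixed-window limiter keyed by client identifier. `FixedWindowRateLimiter` is a hypothetical class written for this example; it keeps state in a single process only, so a multi-instance deployment would need a shared store instead.

```typescript
interface WindowState {
  window_start: number; // timestamp (ms) when the current window began
  count: number;        // requests seen in the current window
}

// Allows at most `limit` requests per key in each window of `window_ms` milliseconds.
class FixedWindowRateLimiter {
  private readonly windows = new Map<string, WindowState>();
  private readonly limit: number;
  private readonly window_ms: number;

  constructor(limit: number, window_ms: number) {
    if (limit < 1 || window_ms <= 0) {
      throw new Error('limit must be >= 1 and window_ms must be > 0');
    }
    this.limit = limit;
    this.window_ms = window_ms;
  }

  // Returns true if the request is allowed, false if the key is over its limit.
  // `now` is injectable so the behaviour can be tested without waiting.
  allow(key: string, now: number = Date.now()): boolean {
    const state = this.windows.get(key);
    if (state === undefined || now - state.window_start >= this.window_ms) {
      // Start a fresh window for this key.
      this.windows.set(key, { window_start: now, count: 1 });
      return true;
    }
    if (state.count < this.limit) {
      state.count += 1;
      return true;
    }
    return false;
  }
}
```

A server handler could call `allow` with the client's IP address or session ID and respond with HTTP 429 when it returns false.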
To start the development server:
npm run dev
Visit http://localhost:5173 to see your application.
MIT