An MP3 to AI Chat Assistant - A configurable AI chat assistant that can be customized for your content and use case. Transform audio content into interactive, searchable conversations.

spences10/audiomind

AudioMind

Note: This started as a personal toy project for asking questions about the podcasts I listen to. I've made it generic so others can use it with their own audio content. Feel free to adapt it for your needs!

⚠️ Important Security Notice

This is a demonstration project and should not be deployed to production without implementing additional security measures:

  • The application currently has no user authentication
  • Lack of authentication could lead to:
    • Unauthorized access to your API endpoints
    • Potential abuse of your API rate limits
    • Excessive costs from unrestricted API usage
    • Malicious use of your services
  • Consider implementing proper authentication, rate limiting, and usage monitoring before any production deployment
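
As one illustration of the rate limiting mentioned above, here is a minimal in-memory fixed-window limiter. It is a sketch under stated assumptions, not code from this repository: the names `create_rate_limiter` and `is_allowed` are hypothetical, the state lives in a single process, and you would still need to decide how to identify a client (IP address, session, API key) and where to call it in your request handling.

```typescript
// Hypothetical helper: none of these names come from the AudioMind codebase.
type WindowState = { window_start: number; count: number };

function create_rate_limiter(max_requests: number, window_ms: number) {
  // One counter per client, kept in memory (lost on restart, not shared across instances).
  const clients = new Map<string, WindowState>();

  // Returns true if the request is allowed, false if the client is over its limit.
  return function is_allowed(client_id: string, now: number = Date.now()): boolean {
    const state = clients.get(client_id);

    // Start a fresh window for new clients or once the previous window has expired.
    if (!state || now - state.window_start >= window_ms) {
      clients.set(client_id, { window_start: now, count: 1 });
      return true;
    }

    if (state.count >= max_requests) return false;

    state.count += 1;
    return true;
  };
}
```

A limiter like this only protects a single server process; a deployment with several instances would need shared storage for the counters.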

Technologies Used

  • Frontend Framework: SvelteKit 2.x with Svelte 5
  • UI/Styling:
    • TailwindCSS for styling
    • DaisyUI for UI components
    • Typography plugin for content formatting
  • AI/Machine Learning:
    • Anthropic Claude 3 for natural language processing
    • Deepgram for audio transcription
    • Voyage for text embeddings used in vector similarity search
  • Database: Turso (LibSQL) for data storage
  • Development Tools:
    • TypeScript for type safety
    • Vite for development and building
    • ESLint and Prettier for code quality
    • Vitest and Playwright for testing
  • Performance:
    • LRU Cache for response caching
    • Server-sent events for real-time progress updates
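
To show what least-recently-used caching means in practice, here is a small from-scratch sketch built on `Map` insertion order. It is an illustration of the technique only: the class name `LruCache` is hypothetical, and it does not reproduce the API of whichever cache package the project actually uses.

```typescript
// Hypothetical illustration of LRU eviction; not the cache used by the project.
class LruCache<K, V> {
  private entries = new Map<K, V>();
  private capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  get(key: K): V | undefined {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key) as V;
    // Re-insert the entry so it becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: K, value: V): void {
    // Delete first so an existing key moves to the most recent position.
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.capacity) {
      // Map iterates in insertion order, so the first key is the least recently used.
      const oldest = this.entries.keys().next().value as K;
      this.entries.delete(oldest);
    }
  }
}
```

Keyed by something like the question text, a cache of this shape lets repeated questions skip a second round trip to the AI provider.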

Setup

  1. Clone the repository
  2. Copy .env.example to .env:
cp .env.example .env
  3. Fill in your environment variables in .env
  4. Install dependencies:
npm install
  5. Start the development server:
npm run dev

Configuration

Environment Variables

Required environment variables in your .env file:

# Required
ANTHROPIC_API_KEY=your_api_key_here
TURSO_URL=your_turso_url_here
TURSO_AUTH_TOKEN=your_turso_auth_token_here
DEEPGRAM_API_KEY=your_deepgram_key_here
VOYAGE_API_KEY=your_voyage_key_here
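
Since all five variables are required, it can help to fail fast when one is missing. The sketch below is an assumption about how you might check this yourself, not something the project is known to ship; `find_missing_env` is a hypothetical name, and only the variable names are taken from the list above.

```typescript
// Variable names come from the .env list above; the helper itself is hypothetical.
const REQUIRED_ENV_VARS = [
  "ANTHROPIC_API_KEY",
  "TURSO_URL",
  "TURSO_AUTH_TOKEN",
  "DEEPGRAM_API_KEY",
  "VOYAGE_API_KEY",
] as const;

// Returns the names of required variables that are unset or blank.
function find_missing_env(env: Record<string, string | undefined>): string[] {
  return REQUIRED_ENV_VARS.filter((name) => (env[name] ?? "").trim() === "");
}
```

In a Node.js script you could pass `process.env` to this function at startup and throw an error listing the missing names if the result is not empty.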

Application Configuration

The application can be customized by modifying src/lib/config/app-config.ts. This file contains non-sensitive configuration options:

{
    // Application name and description
    app_name: 'My Custom Assistant',
    app_description: 'Chat with your content using AI',

    // AI Configuration
    ai: {
        // Available models:
        // - claude-3-opus-20240229 (most capable)
        // - claude-3-sonnet-20240229 (balanced)
        // - claude-3-haiku-20240307 (fastest)
        model: 'claude-3-haiku-20240307',

        max_tokens: 1024,

        // Customize the AI's behaviour
        system_prompt: 'Your custom system prompt...',

        // Define different response styles
        style_instructions: {
            normal: {
                description: 'Balanced and clear responses',
                instruction: 'Your custom instruction...'
            },
            // Add more styles as needed
        }
    },

    // Set default response style
    default_response_style: 'normal'
}

Response Styles

The application supports multiple response styles that can be configured:

  • normal: Balanced and clear responses
  • concise: Brief and direct responses
  • explanatory: Detailed explanations
  • formal: Professional and structured responses

You can modify existing styles or add new ones in the configuration file.
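
As a rough sketch of how a style lookup could work with the configuration shape shown earlier, the example below adds a concise entry and falls back to the default style when an unknown name is requested. The types, the `resolve_style_instruction` helper, and the instruction wording are all assumptions made for illustration; only the field names `style_instructions`, `description`, `instruction`, and `default_response_style` come from the configuration above.

```typescript
// Field names mirror the config shown above; everything else is hypothetical.
type ResponseStyle = { description: string; instruction: string };

type StyleConfig = {
  style_instructions: Record<string, ResponseStyle>;
  default_response_style: string;
};

const example_config: StyleConfig = {
  style_instructions: {
    normal: {
      description: "Balanced and clear responses",
      instruction: "Answer clearly with a moderate level of detail.", // placeholder wording
    },
    concise: {
      description: "Brief and direct responses",
      instruction: "Answer in one or two sentences.", // placeholder wording
    },
  },
  default_response_style: "normal",
};

// Picks the instruction for the requested style, or the default when the name is unknown.
function resolve_style_instruction(config: StyleConfig, requested?: string): string {
  const styles = config.style_instructions;
  const is_known =
    requested !== undefined && Object.prototype.hasOwnProperty.call(styles, requested);
  const name = is_known ? (requested as string) : config.default_response_style;

  const style = styles[name];
  if (!style) throw new Error(`Unknown default response style: ${name}`);
  return style.instruction;
}
```

Falling back to the default keeps the chat usable if a client sends a style name that has since been renamed or removed.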

Features

  • Audio file upload and processing
  • Real-time transcription progress updates
  • Vector-based semantic search
  • Configurable AI response styles
  • Interactive chat interface
  • Server-side streaming responses
  • Response caching for performance
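
For readers new to vector-based semantic search, the idea can be sketched as ranking transcript chunks by the cosine similarity between their embeddings and the embedding of the question. This is a conceptual illustration under assumptions: the `Chunk` type and both function names are hypothetical, and a real deployment may well delegate the ranking to the database rather than doing it in application code.

```typescript
// Hypothetical types and helpers illustrating similarity ranking.
type Chunk = { text: string; embedding: number[] };

// Cosine similarity: dot product divided by the product of the vector lengths.
function cosine_similarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Vectors must have the same length");
  let dot = 0;
  let norm_a = 0;
  let norm_b = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    norm_a += a[i] * a[i];
    norm_b += b[i] * b[i];
  }
  const denominator = Math.sqrt(norm_a) * Math.sqrt(norm_b);
  // Treat a zero vector as having no similarity to anything.
  return denominator === 0 ? 0 : dot / denominator;
}

// Returns the `limit` chunks most similar to the query embedding, best first.
function top_matches(query: number[], chunks: Chunk[], limit: number): Chunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosine_similarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, limit)
    .map((entry) => entry.chunk);
}
```

The top-ranked chunks would then be passed to the chat model as context for answering the question.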

Security Notes

  • Never commit your .env file
  • Keep all sensitive information (API keys, credentials) in environment variables
  • The app-config.ts file should only contain non-sensitive application settings
  • Implement proper authentication before deploying to production
  • Consider adding rate limiting to protect your API usage
  • Monitor API usage to prevent excessive costs

Development

To start the development server:

npm run dev

Visit http://localhost:5173 to see your application.

License

MIT
