🎨 Inspiration

VoiceCanvas was born from a mission to democratize digital art creation—making it truly accessible to everyone, regardless of artistic skill, ability, or technical experience. Inspired by Bolt.new’s rapid prototyping and the advent of powerful voice and video AI tools, we wanted to explore how these technologies could transform the creative process. The core insight was simple: your voice can be the paintbrush.

🧩 What It Does

Users speak their creative vision aloud, and VoiceCanvas:

  • Converts speech to text using ElevenLabs voice AI.
  • Presents a responsive Tavus video avatar that refines prompts (“Would you like bold colors or pastel?”).
  • Sends refined prompts to an AI image engine (e.g., Stable Diffusion) to generate artwork instantly.
  • Stores creations in a personal gallery and optionally shares them to a community feed.
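
The flow above can be sketched as a small pipeline. This is only an illustrative outline, not the project's actual code: the `Services` interface, `refinePrompt`, and `createArtwork` are hypothetical names, and each service method stands in for whichever provider call (speech-to-text, avatar follow-up, image generation, storage) is wired in behind it.

```typescript
// Hypothetical service boundaries; none of these names come from a real SDK.
interface Artwork {
  prompt: string;
  imageUrl: string;
  shared: boolean;
}

interface Services {
  transcribe(audio: Uint8Array): Promise<string>;       // speech-to-text step
  askFollowUps(draft: string): Promise<string[]>;       // answers gathered by the avatar
  generateImage(prompt: string): Promise<string>;       // resolves to an image URL
  saveToGallery(userId: string, art: Artwork): Promise<void>;
}

// Merge the spoken description with the follow-up answers,
// dropping empty answers and case-insensitive duplicates.
function refinePrompt(draft: string, answers: string[]): string {
  const seen: string[] = [];
  const parts: string[] = [];
  for (const raw of [draft, ...answers]) {
    const part = raw.trim();
    const key = part.toLowerCase();
    if (part.length === 0 || seen.indexOf(key) !== -1) continue;
    seen.push(key);
    parts.push(part);
  }
  return parts.join(", ");
}

// Run the four steps in order: transcribe, refine, generate, store.
async function createArtwork(
  services: Services,
  userId: string,
  audio: Uint8Array,
  share: boolean = false
): Promise<Artwork> {
  const draft = await services.transcribe(audio);
  const answers = await services.askFollowUps(draft);
  const prompt = refinePrompt(draft, answers);
  const imageUrl = await services.generateImage(prompt);
  const art: Artwork = { prompt, imageUrl, shared: share };
  await services.saveToGallery(userId, art);
  return art;
}
```

Keeping the providers behind one interface is a design assumption here; it would let any of the four steps be swapped or stubbed in tests without touching the flow itself.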

🛠 How We Built It

  • Platform: Entirely built in Bolt.new using its live frontend/backend environment.
  • Frontend: React-driven UI with a voice recorder and conversational avatar overlay.
  • Backend: Node.js + Supabase for authentication and art storage, deployed on Netlify.
  • APIs Integrated:

    • ElevenLabs for speech-to-text and TTS
    • Tavus for the video AI consultant
    • Stability.ai (Stable Diffusion) for image generation
    • (Optional stretch) RevenueCat for subscription handling

🚧 Challenges We Ran Into

  • Bolt’s live environment sometimes mishandled AI-generated code, which required iterative debugging and prompt adjustments.
  • The ElevenLabs integration introduced API latency and occasionally flat emotional delivery, which required prompt engineering and voice parameter tuning.
  • Art generation forced a trade-off: longer, more detailed prompts improved output quality but raised token and execution costs, so prompt length needed careful optimization.
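
One way to picture the prompt-length trade-off is a simple budget: always keep the subject, then add style modifiers in priority order while they still fit. The sketch below is an assumption for illustration, not what VoiceCanvas shipped. It uses word count as a rough stand-in for tokens, and `fitPromptToBudget` and `maxWords` are hypothetical names.

```typescript
// Count whitespace-separated words; used here as a rough proxy for tokens.
function countWords(text: string): number {
  const trimmed = text.trim();
  return trimmed.length === 0 ? 0 : trimmed.split(/\s+/).length;
}

// Always keep the subject, then add modifiers in the given (priority) order.
// A modifier that would exceed the budget is skipped, but a shorter one
// later in the list may still fit.
function fitPromptToBudget(
  subject: string,
  modifiers: string[],
  maxWords: number
): string {
  const kept: string[] = [subject.trim()];
  let used = countWords(subject);
  for (const modifier of modifiers) {
    const cost = countWords(modifier);
    if (cost === 0 || used + cost > maxWords) continue;
    kept.push(modifier.trim());
    used += cost;
  }
  return kept.join(", ");
}

const prompt = fitPromptToBudget(
  "a lighthouse at dusk",
  ["bold colors", "highly detailed oil painting", "wide angle"],
  8
);
console.log(prompt); // "a lighthouse at dusk, bold colors, wide angle"
```

Note that the subject is kept even if it alone exceeds the budget, on the assumption that a prompt without its subject is not worth sending.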

🏆 Accomplishments We’re Proud Of

  • Delivered a near-production-ready prototype in a single session, driven entirely by carefully written prompts that took Bolt from an empty project to a working app within minutes.
  • Combined free or low-cost AI tools into a coherent creative flow, showing that voice-driven art is now viable and engaging with widely accessible technology.

📚 What We Learned

  • Planning is non-negotiable: Even with AI’s help, methodical mapping of user flows and component logic was essential.
  • AI-assisted coding works, but vigilance is key: Bolt.new speeds up setup, yet developers must review and steer the AI output closely.
  • Speech and art AI tech are sufficiently mature to create real-time, voice-first creative tools, though they require careful tuning for richness and responsiveness.

🚀 What’s Next for VoiceCanvas

  1. Add multiple art engines & style presets, enabling users to choose their creative direction.
  2. Enable collaborative canvases, with multiple contributors speaking in sequence.
  3. Integrate AR/VR support for immersive voice-guided painting.
  4. Add advanced editing tools—layers, filters, brush presets.
  5. Build educational modules, guiding users through art fundamentals using the conversational avatar.

Built With

  • elevenlabs
  • framer
  • lucide
  • next.js
  • node.js
  • postcss
  • stability.ai
  • tailwind
  • tavus