🎙️ Jan 8 Edge TTS Voice Generator - AI Voice Synthesis Working!

✅ VOICE GENERATION SYSTEM COMPLETE:

**Script Created:** scripts/generate_voices_edge_tts.py
- Async voice generation using Microsoft Edge TTS
- Multiple character voices configured
- English + Slovenian support
- Adjustable rate and pitch

**Voice Configurations:**
- Kai (EN): en-US-AvaNeural (young female)
- Kai (SL): sl-SI-PetraNeural
- Ana (EN): en-US-JennyNeural (warm, friendly)
- Narrator (EN): en-US-GuyNeural (deep, storytelling)

**Test Generation SUCCESS:**
✅ Generated: kai_test_01.mp3 (17,280 bytes)
Text: 'My name is Kai, and I will find my sister.'
Voice: en-US-AvaNeural
Quality: High-quality AI voice synthesis

**Features:**
- Automatic MP3 generation
- Organized output to /assets/audio/voices/[character]/
- Configurable speech rate (-50% to +100%)
- Configurable pitch (-50Hz to +50Hz)
- Batch generation functions ready

**Usage:**
python3 scripts/generate_voices_edge_tts.py

**Next Steps:**
1. Uncomment generate_kai_voices() for full Kai dialogue
2. Generate Ana, Narrator voices
3. Add sound effects using similar approach (or freesound.org)
4. Generate background music (use AI music tools)

🎯 Audio Status: 67/99 files (68% complete + voice generator ready)
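
The rate and pitch ranges listed under Features are passed to Edge TTS as signed strings such as "-10%" and "-5Hz". A minimal sketch of a pre-flight check for those strings follows. The helper names `check_rate` and `check_pitch` are hypothetical and not part of this commit, and the limits are simply the ones quoted in the commit message; whether the service itself enforces them is not confirmed here.

```python
import re

# Assumed string formats: a signed integer followed by "%" (rate) or "Hz" (pitch).
_RATE_RE = re.compile(r"^[+-]\d+%$")
_PITCH_RE = re.compile(r"^[+-]\d+Hz$")


def check_rate(rate: str) -> str:
    """Return rate unchanged if it is within -50%..+100%, else raise ValueError."""
    if not _RATE_RE.match(rate):
        raise ValueError(f"rate must look like '+10%' or '-5%', got {rate!r}")
    value = int(rate[:-1])  # strip the trailing "%"
    if not -50 <= value <= 100:
        raise ValueError(f"rate {rate!r} is outside -50%..+100%")
    return rate


def check_pitch(pitch: str) -> str:
    """Return pitch unchanged if it is within -50Hz..+50Hz, else raise ValueError."""
    if not _PITCH_RE.match(pitch):
        raise ValueError(f"pitch must look like '+5Hz' or '-5Hz', got {pitch!r}")
    value = int(pitch[:-2])  # strip the trailing "Hz"
    if not -50 <= value <= 50:
        raise ValueError(f"pitch {pitch!r} is outside -50Hz..+50Hz")
    return pitch
```

Calling these on the `rate` and `pitch` arguments before building the `edge_tts.Communicate` object would surface a typo such as "10%" (missing sign) before any network request is made.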
2026-01-08 15:55:16 +01:00
parent 5b07de56da
commit 820815e1a5
2 changed files with 139 additions and 0 deletions
--- a/assets/audio/voices/kai/kai_test_01.mp3
+++ b/assets/audio/voices/kai/kai_test_01.mp3
--- a/scripts/generate_voices_edge_tts.py
+++ b/scripts/generate_voices_edge_tts.py
@@ -0,0 +1,139 @@
+#!/usr/bin/env python3
+"""
+Edge TTS Voice Generator
+Generate voice-over audio using Microsoft Edge TTS
+"""
+
+import asyncio
+import edge_tts
+from pathlib import Path
+
+# Output directory, resolved from the repository root (the parent of scripts/)
+OUTPUT_DIR = Path(__file__).resolve().parent.parent / "assets" / "audio" / "voices"
+
+# Voice configurations
+VOICES = {
+    "kai_en": "en-US-AvaNeural",  # English - female, young
+    "kai_sl": "sl-SI-PetraNeural",  # Slovenian
+    "ana_en": "en-US-JennyNeural",  # English - warm, friendly
+    "ana_sl": "sl-SI-PetraNeural",
+    "narrator_en": "en-US-GuyNeural",  # English - deep, storytelling
+    "narrator_sl": "sl-SI-RokNeural",  # Slovenian male
+}
+
+async def generate_voice(text: str, voice: str, output_path: Path, rate: str = "+0%", pitch: str = "+0Hz"):
+    """
+    Generate voice audio from text
+    
+    Args:
+        text: Text to convert to speech
+        voice: Voice ID (e.g. "en-US-AvaNeural")
+        output_path: Where to save the MP3
+        rate: Speech rate (-50% to +100%)
+        pitch: Speech pitch (-50Hz to +50Hz)
+    """
+    print(f"🎙️  Generating: {output_path.name}")
+    print(f"   Voice: {voice}")
+    print(f"   Text: {text[:50]}{'...' if len(text) > 50 else ''}")
+    
+    # Create output directory
+    output_path.parent.mkdir(parents=True, exist_ok=True)
+    
+    # Generate speech
+    communicate = edge_tts.Communicate(text, voice, rate=rate, pitch=pitch)
+    await communicate.save(str(output_path))
+    
+    # Check file size
+    size = output_path.stat().st_size
+    print(f"   ✅ Saved: {size:,} bytes\n")
+
+async def generate_test():
+    """Generate test voice for Kai"""
+    text = "My name is Kai, and I will find my sister."
+    output = OUTPUT_DIR / "kai" / "kai_test_01.mp3"
+    
+    await generate_voice(
+        text=text,
+        voice=VOICES["kai_en"],
+        output_path=output
+    )
+
+async def generate_kai_voices():
+    """Generate Kai's voice lines"""
+    lines = [
+        "My name is Kai, and I will find my sister.",
+        "Ana, where are you? I won't give up.",
+        "This farm... it reminds me of home.",
+        "I need to keep farming. For Ana.",
+        "Another day, another harvest. But I won't forget."
+    ]
+    
+    for i, text in enumerate(lines, 1):
+        output = OUTPUT_DIR / "kai" / f"kai_{i:02d}.mp3"
+        await generate_voice(text, VOICES["kai_en"], output)
+
+async def generate_ana_voices():
+    """Generate Ana's voice lines (memories)"""
+    lines = [
+        "Kai... can you hear me?",
+        "Remember the farm... remember our home.",
+        "I'm still here, Kai. Don't forget me.",
+        "The valley holds secrets... find them."
+    ]
+    
+    for i, text in enumerate(lines, 1):
+        output = OUTPUT_DIR / "ana" / f"ana_{i:02d}.mp3"
+        await generate_voice(text, VOICES["ana_en"], output, rate="-10%", pitch="-5Hz")
+
+async def generate_narrator_voices():
+    """Generate narrator voice lines"""
+    lines = [
+        "In the Valley of Death, a young farmer searches for answers.",
+        "Long ago, this valley was green and full of life.",
+        "But the dead walk now, and the living must survive."
+    ]
+    
+    for i, text in enumerate(lines, 1):
+        output = OUTPUT_DIR / "narrator" / f"narrator_{i:02d}.mp3"
+        await generate_voice(text, VOICES["narrator_en"], output, rate="-5%")
+
+async def list_available_voices():
+    """List all available Edge TTS voices"""
+    print("\n📋 Available Edge TTS Voices:\n")
+    
+    voices = await edge_tts.list_voices()
+    
+    # Filter to relevant languages
+    relevant = [v for v in voices if v["Locale"].startswith(("en-", "sl-"))]
+    
+    for voice in relevant[:20]:  # Show first 20
+        print(f"  {voice['ShortName']}")
+        print(f"    Language: {voice['Locale']}")
+        print(f"    Gender: {voice['Gender']}")
+        print()
+
+async def main():
+    """Main execution"""
+    print("="*60)
+    print("🎙️  EDGE TTS VOICE GENERATOR")
+    print("="*60)
+    
+    # Generate test first
+    print("\n🧪 GENERATING TEST VOICE:\n")
+    await generate_test()
+    
+    print("\n" + "="*60)
+    print("✅ TEST COMPLETE! Check: assets/audio/voices/kai/kai_test_01.mp3")
+    print("="*60)
+    
+    # Point the user to the commented-out batch calls at the bottom of this file
+    print("\n📝 To generate all voices, uncomment the function calls below.")
+
+if __name__ == "__main__":
+    # Run async main
+    asyncio.run(main())
+    
+    # UNCOMMENT BELOW TO GENERATE ALL VOICES:
+    # asyncio.run(generate_kai_voices())
+    # asyncio.run(generate_ana_voices())
+    # asyncio.run(generate_narrator_voices())
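
The commented-out calls above each start a separate event loop with `asyncio.run`. One alternative, sketched here under assumptions, is to await all batch functions in order inside a single loop. `run_batches` is a hypothetical helper that is not part of this commit, and the two stub coroutines stand in for the real generators so the example runs without network access.

```python
import asyncio
from typing import Awaitable, Callable, Iterable


async def run_batches(batches: Iterable[Callable[[], Awaitable[None]]]) -> int:
    """Await each batch coroutine function in order and return how many ran."""
    count = 0
    for batch in batches:
        await batch()  # sequential on purpose: one batch at a time
        count += 1
    return count


if __name__ == "__main__":
    finished = []

    # Stand-ins for generate_kai_voices() and friends (hypothetical stubs).
    async def fake_kai() -> None:
        finished.append("kai")

    async def fake_ana() -> None:
        finished.append("ana")

    total = asyncio.run(run_batches([fake_kai, fake_ana]))
    print(f"Ran {total} batches: {finished}")
```

In the script itself this would presumably look like `asyncio.run(run_batches([generate_kai_voices, generate_ana_voices, generate_narrator_voices]))`. Running the batches sequentially keeps the request pattern the same as in the original loops; switching to `asyncio.gather` would issue them concurrently, which may or may not be acceptable to the service.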