🎵 Audio Guide

Audio Import Guide

How to prepare TTS dialogue and sound effects (SFX) for AutoFlowCut

🚀 Quick Start: 3 Steps

Generate audio files externally, then import the audio package folder into AutoFlowCut.

1. Generate TTS Audio from Your Script

Use Typecast or ElevenLabs to generate a dialogue voice for each character. Organize the files into one folder per character.

2. Generate Sound Effects (SFX)

Use the ElevenLabs Sound Generation API to create ambient sounds, footsteps, props, and other effects, organized by category.

3. Import Audio Package in AutoFlowCut

Place all audio files in the media/ folder of your project directory. AutoFlowCut automatically detects and imports the audio package.

💡 Tip: Audio files with timecodes in their filenames are automatically matched to the corresponding SRT subtitle timestamps.

📁 Audio Package Structure

Organize your audio files in the following directory structure inside your project folder:

media/
├── voices/                    # Per-character TTS files
│   ├── narrator/
│   │   ├── narrator_001_0000.mp3
│   │   └── narrator_002_0035.mp3
│   └── scholar/
│       └── scholar_001_0120.mp3
├── sfx/                       # Sound effects by category
│   ├── 01_props/
│   ├── 02_ambience_wind/
│   ├── 03_breath/
│   ├── 04_footsteps/
│   ├── 05_metal_doors/
│   ├── 06_writing/
│   └── 07_crowd/
└── .audio_review.json         # Tracks unsuitable audio

💡 Note: The voices/ folder contains per-character subfolders. The sfx/ folder contains category-based subfolders.
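
Before importing, it can help to check that a package follows this layout. The sketch below is not part of AutoFlowCut; `summarize_package` is a hypothetical helper that assumes the `media/voices/<character>/` and `media/sfx/<category>/` folders shown above and simply counts the `.mp3` files in each subfolder.

```python
from collections import Counter
from pathlib import Path


def summarize_package(media_dir: Path) -> dict[str, Counter]:
    """Count .mp3 files per character (voices/) and per category (sfx/).

    Hypothetical helper: it only mirrors the folder layout described in
    this guide and does not reproduce AutoFlowCut's own import logic.
    """
    summary = {"voices": Counter(), "sfx": Counter()}
    for kind in summary:
        root = media_dir / kind
        if not root.is_dir():
            continue  # a package may contain only voices/ or only sfx/
        for sub in sorted(p for p in root.iterdir() if p.is_dir()):
            summary[kind][sub.name] = sum(1 for _ in sub.glob("*.mp3"))
    return summary


def print_summary(media_dir: Path) -> None:
    """Print one line per subfolder, e.g. 'voices/narrator: 2 file(s)'."""
    for kind, counts in summarize_package(media_dir).items():
        for name, n in counts.items():
            print(f"{kind}/{name}: {n} file(s)")
```

Calling `print_summary(Path("media"))` from the project directory lists every character and category folder with its file count, which makes empty or misplaced folders easy to spot.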

🎙️ TTS (Dialogue Voice)

Per-character dialogue audio generated via Text-to-Speech APIs.

Generation API: Typecast

API: https://api.typecast.ai/v1/text-to-speech

File Naming Convention

{character}_{number}_{MMSS}.mp3

Examples:
  narrator/narrator_001_0000.mp3    # narrator, line 1, at 00:00
  scholar/scholar_003_0245.mp3      # scholar, line 3, at 02:45

The timecode (MMSS) enables automatic matching with SRT subtitle timestamps.
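
To make the convention concrete, here is a minimal sketch of how a filename could be parsed and lined up with SRT cue start times. The actual matching rule AutoFlowCut uses is not documented here, so the nearest-cue-within-a-tolerance approach, the one-second default, and all function names below are assumptions for illustration only.

```python
import re
from typing import NamedTuple, Optional

# {character}_{number}_{MMSS}.mp3; the character name may itself contain underscores
VOICE_RE = re.compile(
    r"^(?P<character>.+)_(?P<number>\d+)_(?P<mm>\d{2})(?P<ss>\d{2})\.mp3$"
)


class VoiceClip(NamedTuple):
    character: str
    number: int
    start_seconds: int


def parse_voice_filename(name: str) -> Optional[VoiceClip]:
    """Split a TTS filename into character, line number, and start time."""
    m = VOICE_RE.match(name)
    if m is None or int(m["ss"]) >= 60:
        return None  # not a valid {character}_{number}_{MMSS}.mp3 name
    return VoiceClip(m["character"], int(m["number"]),
                     int(m["mm"]) * 60 + int(m["ss"]))


def srt_time_to_seconds(stamp: str) -> float:
    """Convert an SRT timestamp such as '00:02:45,120' to seconds."""
    hms, millis = stamp.split(",")
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds + int(millis) / 1000


def nearest_cue(start_seconds: float, cue_starts: list[float],
                tolerance: float = 1.0) -> Optional[int]:
    """Index of the cue starting closest to start_seconds (assumed rule).

    Returns None when no cue starts within `tolerance` seconds.
    """
    if not cue_starts:
        return None
    best = min(range(len(cue_starts)),
               key=lambda i: abs(cue_starts[i] - start_seconds))
    return best if abs(cue_starts[best] - start_seconds) <= tolerance else None
```

For example, `parse_voice_filename("scholar_003_0245.mp3")` yields the character `scholar`, line `3`, and a start time of 165 seconds, which can then be compared against cue starts read from the SRT file.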

Emotion Parameters

Emotion   Description
normal    Default tone
happy     Bright / joyful
sad       Sad / melancholic
angry     Angry / intense

🔊 SFX (Sound Effects)

Ambient sounds, foley, and effects generated via AI sound generation.

Generation API: ElevenLabs

API: https://api.elevenlabs.io/v1/sound-generation
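
As a rough illustration, a request to this endpoint could be assembled with the Python standard library as shown below. The `xi-api-key` header and the `text` and `duration_seconds` body fields reflect the ElevenLabs API as understood at the time of writing and should be checked against the official reference before use; the function names are hypothetical.

```python
import json
import urllib.request
from typing import Optional

SFX_ENDPOINT = "https://api.elevenlabs.io/v1/sound-generation"


def build_sfx_request(prompt: str, api_key: str,
                      duration_seconds: Optional[float] = None
                      ) -> urllib.request.Request:
    """Build (but do not send) a POST request describing one sound effect.

    Assumed fields: 'text' is the prompt; 'duration_seconds' is optional.
    """
    payload: dict = {"text": prompt}
    if duration_seconds is not None:
        payload["duration_seconds"] = duration_seconds
    return urllib.request.Request(
        SFX_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"xi-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def generate_sfx(prompt: str, api_key: str, out_path: str,
                 duration_seconds: Optional[float] = None) -> None:
    """Send the request and save the returned audio bytes to out_path."""
    req = build_sfx_request(prompt, api_key, duration_seconds)
    with urllib.request.urlopen(req) as resp, open(out_path, "wb") as fh:
        fh.write(resp.read())
```

A call such as `generate_sfx("slow footsteps on gravel", key, "media/sfx/04_footsteps/gravel_slow_01.mp3", 3.0)` would then write the result straight into the matching category folder.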

7 SFX Categories

🧮 Props: 01_props/ (object interactions, clicks)
🌿 Ambience: 02_ambience_*/ (wind, rain, birds)
💨 Breath: 03_breath/ (breathing, sighing)
👣 Footsteps: 04_footsteps/ (walking, running)
🚪 Metal / Doors: 05_metal_doors/ (doors, locks, impacts)
✍️ Writing: 06_writing/ (brush, pen strokes)
👥 Crowd: 07_crowd/ (murmuring, chatter)

File Naming

{category}/{descriptive_name}.mp3

Timecoded SFX (synced to a specific scene):
  abacus_beads_dark_01_0015.mp3   # for the 00:15 scene
  abacus_beads_01_0134.mp3        # for the 01:34 scene

Timecoded filenames (ending in _MMSS) are automatically matched to corresponding scenes.
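
Because the timecode is optional for SFX, a tool has to tell a trailing `_MMSS` apart from an ordinary variant number such as `_01`. The following sketch shows one way to do that; the rule that exactly four trailing digits count as a timecode is inferred from the examples above rather than confirmed behavior, and `sfx_timecode_seconds` is a hypothetical name.

```python
import re
from pathlib import PurePosixPath
from typing import Optional

# An underscore followed by exactly four digits at the end of the file stem
TIMECODE_SUFFIX = re.compile(r"_(\d{2})(\d{2})$")


def sfx_timecode_seconds(path: str) -> Optional[int]:
    """Return the scene time in seconds, or None for non-timecoded SFX."""
    stem = PurePosixPath(path).stem  # drop folders and the .mp3 extension
    m = TIMECODE_SUFFIX.search(stem)
    if m is None:
        return None
    minutes, seconds = int(m.group(1)), int(m.group(2))
    if seconds >= 60:
        return None  # four digits that cannot be read as MMSS
    return minutes * 60 + seconds
```

With this rule, `abacus_beads_01_0134.mp3` resolves to 94 seconds, while `wind_howl_01.mp3` is treated as having no timecode.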

🔍 Audio Review System

Flag unsuitable audio files through the app UI or via Claude Code MCP tools. Flagged files are tracked in .audio_review.json.

.audio_review.json Structure

{
  "media/sfx/02_ambience_wind/wind_howl_01.mp3": {
    "status": "flagged",
    "reason": "No timecode",
    "flaggedAt": "2026-03-16T05:44:55.228Z"
  }
}
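
The file can also be inspected from a script. The sketch below assumes only the structure shown above (a JSON object keyed by file path, with `status`, `reason`, and `flaggedAt` fields); the helper names are hypothetical, and whether AutoFlowCut picks up entries written outside the app or the MCP tools is an assumption to verify before relying on `flag_audio`.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def load_reviews(review_path: Path) -> dict:
    """Read .audio_review.json; a missing file means nothing is flagged."""
    if not review_path.exists():
        return {}
    return json.loads(review_path.read_text(encoding="utf-8"))


def flagged_paths(reviews: dict) -> list[str]:
    """Sorted paths of every entry whose status is 'flagged'."""
    return sorted(path for path, entry in reviews.items()
                  if entry.get("status") == "flagged")


def flag_audio(review_path: Path, audio_path: str, reason: str) -> None:
    """Add or overwrite a flag entry, using the timestamp style shown above."""
    reviews = load_reviews(review_path)
    stamp = (datetime.now(timezone.utc)
             .isoformat(timespec="milliseconds")
             .replace("+00:00", "Z"))
    reviews[audio_path] = {"status": "flagged", "reason": reason,
                           "flaggedAt": stamp}
    review_path.write_text(json.dumps(reviews, indent=2) + "\n",
                           encoding="utf-8")
```

For instance, `flagged_paths(load_reviews(Path("media/.audio_review.json")))` returns the list of files currently marked for replacement.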

MCP Tools

list_audio_reviews — View list of flagged audio files
update_audio_review — Add or remove flags

🎧 Audio Tab in App

Once imported, the Audio tab provides a full overview of your audio package.

📊 Summary View

Shows the voice count per character, the SFX count per category, and a breakdown of total duration.

⏱️ Timeline View

Chronological list with automatic scene matching via timecodes.

▶️ Playback

Preview any audio file directly in the app with the inline player.

🚩 Flag & Sort

Flag unsuitable audio for replacement. Sort by name, duration, or status.