File Import Guide
How to import CSV, SRT, and text files into Whisk2CapCut
Get Started in 5 Minutes (Recommended)
No complex setup needed. A simple text file is all you need to start!
Create a Text File
Write scene descriptions, one per line, in any text editor
A traveler walking through the forest
The hero arrives at the village
Import the File
In Whisk2CapCut, click the "Import" button, then select "Prompt Text File".
Generate Images!
Click the "Start Generation" button at the bottom, and Whisk will automatically create an image for each scene.
Next: Start with a text file. When you need subtitles or consistent characters, use the CSV formats below.
Supported File Types
- Prompt Text File: one line per scene, simple format
- Scene CSV File: scene information with prompts, subtitles, and durations
- Reference CSV File: character/background/style definitions (for consistency)
- Subtitle SRT File: standard subtitle format, auto-converted to scenes
Workflow Overview
See how the files connect at a glance
Basic (text only)
Advanced (character consistency)
Summary: When the characters, scene_tag, or style_tag values in the Scene CSV match a name in the Reference CSV, the matching reference images are applied automatically.
Import Modal Preview
This modal appears when you click the "Import" button. Select the file format you want; a help button appears below each option.
Prompt Text File (Easiest)
The simplest format: write one scene description per line. Each line becomes a separate scene with a default duration.
Example
A sunrise over mountain peaks
Two travelers walking through a valley
A campfire under the stars
The journey's end at a peaceful village
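The one-line-per-scene rule can be sketched in a few lines of Python. This is only an illustration of the format, not the extension's actual importer: the `Scene` class, the `scenes_from_text` function, the 5-second default, and the choice to skip blank lines are all assumptions, since the guide does not state them.

```python
from dataclasses import dataclass

# Hypothetical value: the guide mentions "a default duration" but not its length.
DEFAULT_DURATION_SECONDS = 5.0


@dataclass
class Scene:
    prompt: str
    duration: float


def scenes_from_text(text: str, default_duration: float = DEFAULT_DURATION_SECONDS) -> list[Scene]:
    """Turn each non-empty line of a prompt text file into one scene."""
    scenes = []
    for line in text.splitlines():
        prompt = line.strip()
        if prompt:  # assumption: blank lines do not create scenes
            scenes.append(Scene(prompt=prompt, duration=default_duration))
    return scenes


example = """A sunrise over mountain peaks
Two travelers walking through a valley
A campfire under the stars
The journey's end at a peaceful village
"""
scenes = scenes_from_text(example)
for number, scene in enumerate(scenes, start=1):
    print(number, scene.prompt, scene.duration)
```

Running a check like this on your own file before importing is a quick way to confirm that the scene count matches what you expect.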
Scene CSV File
The Scene CSV file defines each scene's content, including prompts for image generation, subtitles, and timing.
Why use CSV?
Text files are simple, but CSV gives you more control:
- Subtitles separate from prompts (different content)
- Character/background/style tags that auto-match with reference images (see below)
- Per-scene duration settings (3s, 5s, etc.)
Tip: When tags match references, a match indicator appears in the scene list.
Required Columns
| Column | Description | Example |
|---|---|---|
| prompt | Image generation prompt | A hero stands on a cliff at sunset |
| subtitle | Subtitle text for the scene | The journey begins here. |
| characters | Character names (semicolon-separated when a scene has several, as in the CSV example) | Hero;Mentor |
| scene_tag | Scene category tag | opening, action, dialogue |
| style_tag | Visual style tag | cinematic, anime, realistic |
| duration | Scene duration in seconds | 5 |
CSV Example
prompt,subtitle,characters,scene_tag,style_tag,duration
"A wise old king sits on golden throne","The wise old king sits on his golden throne",King,palace,cinematic,5
"Beautiful queen enters through doors","The beautiful queen enters through grand doors",Queen,palace,cinematic,4
"King and queen discuss matters","The king and queen discuss important matters",King;Queen,palace,cinematic,5
| prompt | subtitle | characters | scene_tag | style_tag | duration |
|---|---|---|---|---|---|
| A wise old king sits on golden throne | The wise old king sits on his golden throne | King | palace | cinematic | 5 |
| Beautiful queen enters through doors | The beautiful queen enters through grand doors | Queen | palace | cinematic | 4 |
| King and queen discuss matters | The king and queen discuss important matters | King;Queen | palace | cinematic | 5 |
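To check a Scene CSV before importing it, a short script along these lines may help. It is a sketch using Python's standard `csv` module, not the extension's own parser: the function name `read_scene_csv`, the error on missing columns, and the conversion of `duration` to a float are assumptions. The semicolon split follows the `King;Queen` convention used in the example above.

```python
import csv
import io

SCENE_COLUMNS = ["prompt", "subtitle", "characters", "scene_tag", "style_tag", "duration"]

SCENE_CSV = '''prompt,subtitle,characters,scene_tag,style_tag,duration
"A wise old king sits on golden throne","The wise old king sits on his golden throne",King,palace,cinematic,5
"Beautiful queen enters through doors","The beautiful queen enters through grand doors",Queen,palace,cinematic,4
"King and queen discuss matters","The king and queen discuss important matters",King;Queen,palace,cinematic,5
'''


def read_scene_csv(text: str) -> list[dict]:
    """Parse Scene CSV text into a list of scene dictionaries."""
    reader = csv.DictReader(io.StringIO(text))
    missing = [c for c in SCENE_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        # Assumption: all six columns are expected to be present.
        raise ValueError(f"missing columns: {missing}")
    scenes = []
    for row in reader:
        names = (row["characters"] or "").split(";")
        scenes.append({
            "prompt": (row["prompt"] or "").strip(),
            "subtitle": (row["subtitle"] or "").strip(),
            "characters": [n.strip() for n in names if n.strip()],
            "scene_tag": (row["scene_tag"] or "").strip(),
            "style_tag": (row["style_tag"] or "").strip(),
            "duration": float(row["duration"]),
        })
    return scenes


scenes = read_scene_csv(SCENE_CSV)
for scene in scenes:
    print(scene["characters"], scene["scene_tag"], scene["duration"])
```

Quoting the prompt and subtitle fields, as the example does, keeps any commas inside them from being read as column separators.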
Reference CSV File
The Reference CSV file defines characters, backgrounds, and styles that can be used across scenes for consistent visual generation.
No images required!
You can leave image_path empty in the CSV. Add images later using:
- Direct upload: click the image area in the References tab, then select a file
- AI generation: generate reference images in Whisk based on the description
- Add later: set up the structure now, add images anytime
Tag Matching System (Very Important!)
The name field in Reference CSV automatically matches with tags in Scene CSV.
When generating images, matched reference images are uploaded to Whisk to maintain consistent style/characters.
| Reference CSV entry | Matching Scene CSV column |
|---|---|
| type: character, name: King | characters |
| type: background, name: palace | scene_tag |
| type: style, name: cinematic | style_tag |
Required Columns
| Column | Description | Example |
|---|---|---|
| type | Reference type (character / background / style) | character |
| name | Name for tag matching (must match Scene CSV) | King, palace, cinematic |
| image_path | Path or URL to reference image | ./images/king.png |
| description | Detailed description (for image generation) | Wise old king with white beard |
Plugin Reference Panel Preview
This is the Reference panel displayed when you click the "Ref" button. When you import a CSV, cards are created here.
Tip: Cards that have a prompt but no image can generate an AI image with the "Generate" button.
CSV Example
type,name,image_path,description
character,King,./images/king.png,Wise old king with white beard and golden crown
character,Queen,./images/queen.png,Elegant queen in red dress
background,palace,./images/palace.png,Grand royal palace interior with ornate decorations
style,cinematic,./images/cinematic.png,Film-like dramatic lighting
| type | name | image_path | description |
|---|---|---|---|
| character | King | ./images/king.png | Wise old king with white beard and golden crown |
| character | Queen | ./images/queen.png | Elegant queen in red dress |
| background | palace | ./images/palace.png | Grand royal palace interior with ornate decorations |
| style | cinematic | ./images/cinematic.png | Film-like dramatic lighting |
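The tag matching described in this section can be pictured as a lookup keyed on reference type and name. The sketch below is an illustration only: `load_references` and `match_references` are hypothetical names, and exact, case-sensitive comparison is an assumption (the guide says names must match, but does not spell out how the extension compares them).

```python
import csv
import io

REFERENCE_CSV = """type,name,image_path,description
character,King,./images/king.png,Wise old king with white beard and golden crown
character,Queen,./images/queen.png,Elegant queen in red dress
background,palace,./images/palace.png,Grand royal palace interior with ornate decorations
style,cinematic,./images/cinematic.png,Film-like dramatic lighting
"""


def load_references(text: str) -> list[dict]:
    """Read Reference CSV rows into dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def match_references(scene: dict, references: list[dict]) -> list[dict]:
    """Return the references whose (type, name) matches the scene's tags."""
    index = {(ref["type"], ref["name"]): ref for ref in references}
    # characters -> character, scene_tag -> background, style_tag -> style
    wanted = [("character", name) for name in scene["characters"]]
    wanted.append(("background", scene["scene_tag"]))
    wanted.append(("style", scene["style_tag"]))
    # Assumption: tags without a matching reference are simply ignored.
    return [index[key] for key in wanted if key in index]


references = load_references(REFERENCE_CSV)
scene = {"characters": ["King", "Queen"], "scene_tag": "palace", "style_tag": "cinematic"}
matched = match_references(scene, references)
print([ref["name"] for ref in matched])
```

Keying on the type as well as the name means a character called "palace" would not be confused with a background of the same name.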
Style Tag List (87 styles)
Available style IDs for the style_tag column.
These can also be used as the name for style type in Reference CSV.
Subtitle SRT File
Import standard SRT subtitle files. Each subtitle block is automatically converted to a scene with the subtitle text used as both the prompt and subtitle. Duration is calculated from the timecodes.
SRT Structure
1
00:00:00,000 --> 00:00:05,000
The hero awakens in a mystical forest.

2
00:00:05,000 --> 00:00:09,500
Strange lights guide the way forward.

3
00:00:09,500 --> 00:00:14,000
A mysterious figure appears in the distance.
Each numbered block becomes one scene. The timecode determines scene duration automatically.
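As a rough illustration of how a duration falls out of the timecodes, the sketch below parses SRT text and subtracts the start time from the end time of each block. It is not the extension's converter: the function names are hypothetical, and joining multi-line captions with a space is an assumption.

```python
import re

TIMECODE = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3})\s*-->\s*(\d{2}):(\d{2}):(\d{2}),(\d{3})"
)


def to_milliseconds(hours: str, minutes: str, seconds: str, millis: str) -> int:
    """Convert the four timecode fields to a whole number of milliseconds."""
    return ((int(hours) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(millis)


def scenes_from_srt(text: str) -> list[dict]:
    """Convert each SRT block into a scene with prompt, subtitle, and duration."""
    scenes = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        if len(lines) < 3:
            continue  # need an index line, a timecode line, and caption text
        match = TIMECODE.match(lines[1])
        if not match:
            continue
        fields = match.groups()
        start = to_milliseconds(*fields[:4])
        end = to_milliseconds(*fields[4:])
        caption = " ".join(lines[2:])  # assumption: multi-line captions are joined
        scenes.append({
            "prompt": caption,    # the caption doubles as the image prompt
            "subtitle": caption,
            "duration": (end - start) / 1000,  # seconds
        })
    return scenes


SRT_TEXT = """1
00:00:00,000 --> 00:00:05,000
The hero awakens in a mystical forest.

2
00:00:05,000 --> 00:00:09,500
Strange lights guide the way forward.

3
00:00:09,500 --> 00:00:14,000
A mysterious figure appears in the distance.
"""
scenes = scenes_from_srt(SRT_TEXT)
for scene in scenes:
    print(scene["duration"], scene["subtitle"])
```

Working in integer milliseconds and dividing once at the end avoids accumulating floating-point error across the hour, minute, and second fields.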
Auto-generate SRT from TTS Services
These TTS (Text-to-Speech) services automatically generate SRT subtitle files along with the audio. Get your narration and subtitles ready to import directly into Whisk2CapCut.
Tip: After generating audio with a TTS service, download the SRT file and import it into Whisk2CapCut for perfectly timed scenes.
Auto-generate CSV with AI
Paste the prompts below into Claude, ChatGPT, Gemini, or other AI tools along with your story to automatically generate CSV files.
Scene CSV Generation Prompt
You are a scene breakdown assistant for Whisk2CapCut video production.
Given a story or script, create a Scene CSV file with these columns:
- prompt: English scene description for AI image generation (include composition, lighting, mood, camera angle)
- subtitle: English subtitle text for the scene (concise, under 50 chars)
- characters: Character names in the scene (semicolon-separated if multiple, e.g., King;Queen)
- scene_tag: Background/location tag (use consistent tags like palace, forest, village)
- style_tag: Art style (keep consistent, e.g., cinematic, ghibli, ink-wash)
- duration: Scene duration in seconds (3-5 for normal, 2 for quick cuts, 6+ for dramatic)
Rules:
1. Each row = one visual scene (single image frame)
2. Keep prompts descriptive but under 200 characters
3. Use consistent character names throughout (these match reference images)
4. Group similar locations under one scene_tag
5. Consider pacing: vary duration for dramatic effect
6. Output as CSV with header row, quote values containing commas
Example output:
prompt,subtitle,characters,scene_tag,style_tag,duration
"A wise old king sits on golden throne, dramatic lighting","The wise king sits on his throne",King,palace,cinematic,5
Now break down this story into scenes:
Reference CSV Generation Prompt
You are a reference planning assistant for Whisk2CapCut.
Given a story, create a Reference CSV with these columns:
- type: Reference type (character, background, or style)
- name: Name for tag matching (MUST match scene CSV tags exactly)
- image_path: Suggested image filename
- description: Detailed description for generating/finding the reference image
For characters: List all unique characters with their visual appearance
For backgrounds: List all unique locations/settings mentioned
For styles: Suggest 1-2 art styles that fit the story mood
IMPORTANT: The 'name' field must match exactly with:
- character type → matches 'characters' column in scene CSV
- background type → matches 'scene_tag' column in scene CSV
- style type → matches 'style_tag' column in scene CSV
Example output:
type,name,image_path,description
character,King,./images/king.png,"Wise old king with white beard and golden crown"
background,palace,./images/palace.png,"Grand palace throne room with red carpets"
style,cinematic,./images/cinematic.png,"Dramatic movie lighting with depth"
Now create the reference list for this story:
AI Tool Tips
- Claude: Great for long stories. Use the Artifacts feature to download the CSV directly.
- ChatGPT: Use Code Interpreter to create and download the CSV. Preview it as a table.
- Gemini: Works with Google Docs. Easy export to spreadsheets.
How to Import Files
Open the Whisk2CapCut extension
Click the extension icon in your Chrome toolbar
Click the Import button
Located in the sidebar or toolbar
Select your file
Choose a CSV, SRT, or TXT file from your computer
Review and generate
Check the imported scenes and start generating images
Download Sample Files
Download these sample files to get started quickly: