Generate Subtitles from Audio and Transcripts
Prompt workflow that generates time-aligned subtitles in SRT or VTT format from audio files and transcripts using the ElevenLabs forced alignment API.
Why it matters
Automate the creation of time-aligned subtitles (SRT/VTT) from audio and an existing transcript. This asset uses ElevenLabs forced alignment to match the words of the transcript to their timestamps in the audio.
Outcomes
What it gets done
Take an audio file and its transcript as input.
Use ElevenLabs forced alignment to match the transcript text to the spoken audio.
Generate SRT and VTT subtitle files based on aligned data.
Ensure precise timing for subtitles matching the audio.
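The two target formats differ mainly in how cue timestamps are written: SRT uses a comma before the milliseconds, while WebVTT uses a dot. The sketch below illustrates that difference only; the helper names `format_timestamp` and `timing_line` are hypothetical and are not part of this asset or of the ElevenLabs API.

```python
def format_timestamp(seconds: float, fmt: str = "srt") -> str:
    """Render a time in seconds as HH:MM:SS,mmm (SRT) or HH:MM:SS.mmm (VTT)."""
    if fmt not in ("srt", "vtt"):
        raise ValueError(f"unsupported subtitle format: {fmt!r}")
    if seconds < 0:
        raise ValueError("timestamps must be non-negative")
    # Work in whole milliseconds to avoid float formatting surprises.
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    separator = "," if fmt == "srt" else "."
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{separator}{ms:03d}"


def timing_line(start: float, end: float, fmt: str = "srt") -> str:
    """Build the 'start --> end' line that opens a cue in either format."""
    return f"{format_timestamp(start, fmt)} --> {format_timestamp(end, fmt)}"


if __name__ == "__main__":
    print(timing_line(1.5, 3.25, "srt"))  # comma-separated milliseconds
    print(timing_line(1.5, 3.25, "vtt"))  # dot-separated milliseconds
```

A complete VTT file additionally starts with a `WEBVTT` header line, which this sketch does not emit.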
Install
Add it to your toolbox
Run in your project directory:
curl -fsSL https://spark.entire.vc/get/pfoo-elevenlabs-alignment | bash
Capabilities
What this chain does
Converts audio or video speech to written text.
Pulls structured data fields from unstructured text.
Overview
ElevenLabs Alignment
What it does
This prompt chain automates subtitle generation by processing audio files and transcripts through the ElevenLabs forced alignment API. It outputs time-aligned subtitle files in SRT or VTT format, with each line of text synchronized to the corresponding audio segment. The workflow handles the technical alignment process that matches transcript words to precise audio timestamps.
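To make the step from aligned words to subtitle cues concrete, here is a minimal sketch. It assumes the alignment result contains a `words` list whose entries carry `text`, `start`, and `end` (in seconds); that shape, the function name `words_to_srt`, the fixed words-per-cue grouping, and the sample timings are all assumptions for illustration, not the provider's actual implementation.

```python
from typing import Dict, List, Union

Word = Dict[str, Union[str, float]]


def _srt_timestamp(seconds: float) -> str:
    """Format seconds as the HH:MM:SS,mmm timestamp used by SRT."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words: List[Word], max_words: int = 7) -> str:
    """Group aligned words into cues of at most `max_words` and render SRT."""
    if max_words < 1:
        raise ValueError("max_words must be at least 1")
    # Skip whitespace-only tokens, which some aligners emit between words.
    tokens = [w for w in words if str(w["text"]).strip()]
    cues = []
    for i in range(0, len(tokens), max_words):
        chunk = tokens[i:i + max_words]
        text = " ".join(str(w["text"]).strip() for w in chunk)
        start = _srt_timestamp(float(chunk[0]["start"]))
        end = _srt_timestamp(float(chunk[-1]["end"]))
        cues.append(f"{len(cues) + 1}\n{start} --> {end}\n{text}")
    if not cues:
        return ""
    return "\n\n".join(cues) + "\n"


if __name__ == "__main__":
    # Hypothetical timings, not real alignment output.
    sample = [
        {"text": "That's", "start": 0.0, "end": 0.4},
        {"text": "one", "start": 0.45, "end": 0.6},
        {"text": "small", "start": 0.65, "end": 1.0},
        {"text": "step", "start": 1.05, "end": 1.5},
    ]
    print(words_to_srt(sample, max_words=2))
```

Grouping by a fixed word count keeps the sketch short; a production workflow would more likely break cues at punctuation or at pauses between words.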
How it connects
Use this when you need to add subtitles or captions to video or audio content and already have a transcript. It's ideal for content creators, video editors, and accessibility teams who want to automate the time-consuming process of manually timing subtitle cues to match spoken audio.
Source README
# yaml-language-server: $schema=../../site/static/config-schema.json
description: ElevenLabs Forced Alignment - Subtitle generation

# Alignment uses audio files + transcripts - pass via vars
prompts:
  - '{{transcript}}'

providers:
  # Basic alignment (JSON output)
  - id: elevenlabs:alignment:json
    label: Alignment (JSON)

  # SRT subtitle format
  - id: elevenlabs:alignment:srt
    label: Alignment (SRT Subtitles)

# Default test configuration
defaultTest:
  # All tests will require alignment to complete
  assert:
    - type: not-contains
      value: error

tests:
  - description: Align Armstrong moon landing speech
    vars:
      audioFile: examples/elevenlabs-stt/audio/sample1.mp3
      transcript: "That's one small step for man, one giant leap for mankind."
      format: json
    assert:
      - type: javascript
        value: output.includes('words')
      - type: not-contains
        value: error
      - type: javascript
        # value missing from the source listing

  - description: Align Armstrong to SRT format
    vars:
      audioFile: examples/elevenlabs-stt/audio/sample1.mp3
      transcript: "That's one small step for man, one giant leap for mankind."
      format: srt
    assert:
      - type: javascript
        value: output.includes('-->') && output.includes('small step')
      - type: javascript
        # value missing from the source listing

  - description: Align sample2 hello message
    vars:
      audioFile: examples/elevenlabs-stt/audio/sample2.wav
      transcript: "Hello. What's today's date? Could you please let me know?"
      format: json
    assert:
      - type: javascript
        value: output.includes('words')
      - type: not-contains
        value: error
      - type: javascript
        # value missing from the source listing
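The SRT assertion in the config only checks for the substrings `-->` and `small step`. As an illustration of a stricter check, the sketch below parses every timing line and verifies that each cue ends no earlier than it starts. The function name `check_srt` is hypothetical, and this Python version is a standalone illustration rather than something the config runs.

```python
import re

# Matches an SRT timing line such as "00:00:01,500 --> 00:00:03,250".
CUE_TIMING = re.compile(
    r"^(\d{2}):(\d{2}):(\d{2}),(\d{3}) --> (\d{2}):(\d{2}):(\d{2}),(\d{3})$"
)


def _to_ms(hours: str, minutes: str, seconds: str, millis: str) -> int:
    """Convert the four captured timestamp fields to milliseconds."""
    return ((int(hours) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(millis)


def check_srt(output: str, expected_phrase: str) -> bool:
    """Return True if output has valid cue timings and contains the phrase."""
    timings = []
    for line in output.splitlines():
        match = CUE_TIMING.match(line.strip())
        if match:
            fields = match.groups()
            timings.append((_to_ms(*fields[:4]), _to_ms(*fields[4:])))
    if not timings:
        return False  # no cue timing lines at all
    if any(start > end for start, end in timings):
        return False  # a cue ends before it starts
    return expected_phrase in output


if __name__ == "__main__":
    # Hypothetical SRT text, not real provider output.
    sample = "1\n00:00:00,000 --> 00:00:02,000\nThat's one small step for man,\n"
    print(check_srt(sample, "small step"))
```

Like the original substring check, this fails if the expected phrase is split across two cues, so the phrase should be short enough to fall within one cue.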