BuildWithMatija
Tool · Active · Public

Long Audio Transcriber

A small Python tool for turning long audio files into resumable, timestamped transcripts.

  • Python
  • OpenAI transcription API
  • ffmpeg
  • ffmpeg-python
  • requests
  • python-dotenv
  • Docker Compose
Problem
Long recordings are awkward to transcribe with a simple API call because large files can exceed upload limits, fail midway, or produce output that is hard to navigate. If a transcription stops after several chunks, starting again from zero wastes time and API cost. Even after transcription succeeds, a single wall of text is not useful when you need to find the part of the audio where something was said.
Thesis
The tool keeps the workflow simple: take one audio file, split it only when needed, transcribe each chunk, save progress after each successful chunk, and merge the result. Instead of building a full web app, it focuses on a CLI/Docker workflow that is easy to run locally. Timestamped output and interval grouping make the transcript easier to review after the raw transcription is done.
Validation
The public repo contains a working Python implementation, a Docker Compose setup, and README instructions. It covers progress tracking, chunked transcription, timestamped JSON output, and interval-based transcript processing. The README documents support for large files, resume behaviour, timestamps, and multiple audio formats. There is no documented live demo, hosted product, user count, package registry release, pricing, or customer validation.
Proof points
  • Public GitHub repository: https://github.com/matija2209/long-audio-transcriber
  • README documents automatic splitting for files over 25 MB.
  • README documents resume support, timestamped transcriptions, interval grouping, and multiple audio formats.
  • Docker Compose setup builds a Python 3.12 container with ffmpeg and runs main.py.
  • main.py saves transcription.txt, transcription_timestamps.json, and transcription_progress.json.
  • process_transcription.py groups words into configurable time intervals and writes transcription_by_intervals.txt.
  • No hosted demo, package registry release, pricing, or production usage metrics are documented.
Audience
  • Developers who need a local script for long audio transcription
  • People transcribing long recordings that exceed single-upload API limits
  • Users who want resumable transcription instead of restarting after a failed chunk
  • Anyone who needs timestamped or interval-grouped transcript output

What it is

Long Audio Transcriber is a small Python utility for transcribing long audio files with OpenAI's transcription API.

The core job is simple: point it at an audio file, let it split large files into safe chunks, transcribe each chunk, save progress, then merge everything back into a complete transcript.

It produces plain text, timestamped JSON, and an optional interval-based transcript that groups the text by time ranges.

What problem it solves

The annoying part of long audio transcription is not only speech-to-text.

The real friction is everything around it:

  • large files can exceed upload limits
  • long jobs can fail midway
  • restarting from zero wastes time and API cost
  • raw transcripts are hard to navigate without timestamps
  • a long wall of text is not useful when reviewing recordings

This project solves those practical workflow problems without adding a web interface or account system.

How it works

The tool reads configuration from environment variables, including the audio path, API key, and maximum chunk size.
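
A minimal sketch of what that configuration step could look like. The variable names AUDIO_PATH, OPENAI_API_KEY, and MAX_CHUNK_SIZE_MB are assumptions for illustration, not necessarily the names the repo uses. The 25 MB default mirrors the splitting threshold the README documents.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    audio_path: str
    api_key: str
    max_chunk_bytes: int


def load_settings(env=None):
    """Build Settings from a mapping of environment variables.

    The variable names are hypothetical; adjust them to match your .env file.
    """
    env = os.environ if env is None else env
    required = ("AUDIO_PATH", "OPENAI_API_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise ValueError("Missing required settings: " + ", ".join(missing))

    # Assumed default of 25 MB, matching the documented splitting threshold.
    max_mb = float(env.get("MAX_CHUNK_SIZE_MB", "25"))
    if max_mb <= 0:
        raise ValueError("MAX_CHUNK_SIZE_MB must be positive")

    return Settings(
        audio_path=env["AUDIO_PATH"],
        api_key=env["OPENAI_API_KEY"],
        max_chunk_bytes=int(max_mb * 1024 * 1024),
    )
```

With python-dotenv, calling load_dotenv() before load_settings() would populate os.environ from a local .env file. Failing early on a missing key is cheaper than discovering it after the audio has already been split.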

If the audio file is larger than the configured limit, it uses ffmpeg to split it into smaller WAV chunks. Each chunk is sent to OpenAI's audio transcription endpoint. After every successful chunk, the result is written to transcription_progress.json.
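
One way to plan the split, sketched under assumptions: chunks are re-encoded as 16 kHz mono 16-bit PCM WAV, so the size of each chunk follows directly from its duration. The function names, the 5% headroom, and the encoding settings are illustrative and may differ from what main.py actually does. The audio duration is taken as an input here; in practice it would come from probing the file.

```python
import math


def plan_chunks(duration_s, max_chunk_bytes, sample_rate=16000, channels=1, headroom=0.95):
    """Return (start, length) pairs in seconds so each WAV chunk fits the limit."""
    bytes_per_second = sample_rate * channels * 2  # 2 bytes per 16-bit sample
    max_seconds = (max_chunk_bytes * headroom) / bytes_per_second
    count = max(1, math.ceil(duration_s / max_seconds))
    length = duration_s / count  # equal-length chunks
    return [(i * length, length) for i in range(count)]


def ffmpeg_split_command(src, dst, start_s, length_s, sample_rate=16000, channels=1):
    """Build the ffmpeg argument list that cuts one chunk and writes PCM WAV."""
    return [
        "ffmpeg", "-y",
        "-ss", f"{start_s:.3f}",   # seek to the chunk start
        "-i", src,
        "-t", f"{length_s:.3f}",   # keep only this many seconds
        "-ac", str(channels),
        "-ar", str(sample_rate),
        "-c:a", "pcm_s16le",
        dst,
    ]
```

For a one-hour recording and a 25 MB limit, plan_chunks(3600, 25 * 1024 * 1024) yields five 720-second chunks. Each command list can be passed to subprocess.run(cmd, check=True).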

That progress file is the resume mechanism. If the process is interrupted, already processed chunks can be reused instead of transcribed again.
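
The resume idea can be sketched as follows. The progress file layout (a JSON object keyed by chunk path) and the function names are assumptions for illustration; the real transcription_progress.json may be structured differently. The transcribe argument stands in for whatever function sends a chunk to the API.

```python
import json
import os
import tempfile


def load_progress(path):
    """Return previously saved results, or an empty dict on a fresh run."""
    if not os.path.exists(path):
        return {}
    with open(path, "r", encoding="utf-8") as fh:
        return json.load(fh)


def save_progress(path, progress):
    """Write to a temp file first, then swap it in, so a crash cannot corrupt the file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(progress, fh, ensure_ascii=False, indent=2)
    os.replace(tmp_path, path)


def transcribe_all(chunk_paths, transcribe, progress_path):
    """Transcribe each chunk once, skipping chunks already in the progress file."""
    progress = load_progress(progress_path)
    for chunk in chunk_paths:
        if chunk in progress:
            continue  # finished in an earlier run
        progress[chunk] = transcribe(chunk)
        save_progress(progress_path, progress)  # persist after every success
    return [progress[chunk] for chunk in chunk_paths]
```

If the run dies on chunk three, the next run reads the file, skips chunks one and two, and pays only for the chunks that never completed.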

Once all chunks are processed, the tool merges the per-chunk results and leaves three files in place:

  • transcription.txt
  • transcription_timestamps.json
  • transcription_progress.json
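
The merge step has one subtlety: timestamps in each chunk start from zero, so they need shifting by the chunk's position in the original file. A rough sketch, assuming each chunk result is a dict with a "text" string and a "words" list of {"word", "start", "end"} entries; that shape and the function name are assumptions, not confirmed details of main.py.

```python
def merge_chunks(results, offsets):
    """Join chunk texts and shift word timestamps onto the full-file timeline.

    results: per-chunk dicts in playback order (assumed shape, see above).
    offsets: start time in seconds of each chunk within the original audio.
    """
    texts = []
    words = []
    for result, offset in zip(results, offsets):
        text = result["text"].strip()
        if text:
            texts.append(text)
        for word in result.get("words", []):
            words.append({
                "word": word["word"],
                "start": round(word["start"] + offset, 3),
                "end": round(word["end"] + offset, 3),
            })
    return " ".join(texts), words
```

The joined text would go to transcription.txt and the shifted word list to transcription_timestamps.json.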

A separate processing script can then group timestamped words into time intervals and write transcription_by_intervals.txt.
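
Interval grouping can be sketched like this. The word-entry shape, the function names, and the output line format are assumptions for illustration; process_transcription.py may format its output differently.

```python
def format_clock(seconds):
    """Render a number of seconds as HH:MM:SS."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def group_by_intervals(words, interval_s=60):
    """Bucket words by start time and return one labeled line per non-empty interval."""
    buckets = {}
    for word in words:
        index = int(word["start"] // interval_s)
        buckets.setdefault(index, []).append(word["word"].strip())

    lines = []
    for index in sorted(buckets):
        start = index * interval_s
        end = start + interval_s
        label = f"[{format_clock(start)} - {format_clock(end)}]"
        lines.append(label + " " + " ".join(buckets[index]))
    return lines
```

A word belongs to the interval in which it starts, so a word that straddles a boundary stays with the earlier interval. Intervals with no speech are skipped rather than printed empty.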

Features

  • Transcribes audio files locally through a Python script
  • Automatically splits files that exceed the configured size limit
  • Saves progress after each processed chunk
  • Resumes interrupted transcriptions from the progress file
  • Generates raw text output
  • Generates timestamped JSON output
  • Groups transcript text into configurable time intervals
  • Supports common audio formats documented in the README: mp3, mp4, mpeg, mpga, m4a, wav, and webm
  • Can run directly on a machine or through Docker Compose

Technical notes

This is a utility script, not a SaaS product.

There is no documented hosted app, dashboard, authentication layer, billing, or public demo. The README gives local and Docker-based setup instructions.

The implementation depends on ffmpeg for audio probing and chunking, and uses requests to call the OpenAI audio transcription API directly.
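
A direct call with requests might look roughly like the sketch below. The endpoint URL and form fields follow OpenAI's public transcription API; the choice of the whisper-1 model, the word-level timestamp setting, the timeout, and the function names are assumptions, since the page does not say which options main.py sends.

```python
API_URL = "https://api.openai.com/v1/audio/transcriptions"


def build_request(api_key, model="whisper-1"):
    """Assemble the URL, headers, and form fields for one transcription call."""
    return {
        "url": API_URL,
        "headers": {"Authorization": f"Bearer {api_key}"},
        "data": {
            "model": model,                        # assumed model choice
            "response_format": "verbose_json",     # includes timing data
            "timestamp_granularities[]": "word",   # request word-level timestamps
        },
    }


def transcribe_chunk(chunk_path, api_key, timeout_s=600):
    """Upload one audio chunk as multipart form data and return the parsed JSON."""
    import requests  # third-party dependency, imported here so build_request works without it

    request = build_request(api_key)
    with open(chunk_path, "rb") as fh:
        response = requests.post(
            request["url"],
            headers=request["headers"],
            data=request["data"],
            files={"file": fh},
            timeout=timeout_s,
        )
    response.raise_for_status()  # surface HTTP errors so the chunk is not marked done
    return response.json()
```

Raising on a failed response matters for the resume flow: a chunk should only be recorded as complete when the API actually returned a transcript.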

What exists today

The repository contains:

  • README.md with setup, usage, outputs, configuration, and error-handling notes
  • main.py for chunking, transcription, progress tracking, merging, and output writing
  • process_transcription.py for grouping transcript text into intervals
  • requirements.txt with Python dependencies
  • docker-compose.yml for running the tool in a Python container with ffmpeg
  • gen_dot_env.sample.sh for generating a local .env

What does not exist yet

There is no documented web UI.

There is no hosted demo.

There is no package release.

There is no documented pricing or revenue model.

There are no documented users, customers, or production metrics.

There is no license file in the repo, so although the code is public, its reuse terms are undefined.

© 2026 Build with Matija. All rights reserved.