WhisperX — Automated Transcripts w/ Timestamps and Speaker Tagging

dgdft@lemmy.world · 1 day ago

Optional@lemmy.world · 1 day ago

I’ve also used it in a hacky script pipeline to bulk download podcast episodes with yt-dlp, create searchable transcripts, and scrub ads by having an LLM sniff out timestamps to cut with ffmpeg.

This is genius. Could you appify this? I’ll pay you in real or pretend currency, as you prefer.
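For anyone curious what the ad-cutting step might look like, here is a rough sketch, not the original poster's script. It assumes the LLM has already returned ad breaks as (start, end) pairs in seconds, and it only builds the ffmpeg command. The function names and file names are made up for illustration. The aselect/asetpts filter pair is one common way to drop time ranges from audio; the actual pipeline may cut differently.

```python
import subprocess


def merge_intervals(ads):
    """Sort ad intervals and merge any that overlap or touch."""
    merged = []
    for start, end in sorted(ads):
        if merged and start <= merged[-1][1]:
            # Extend the previous interval instead of adding a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def build_ffmpeg_command(src, dst, ads):
    """Build an ffmpeg argument list that drops the given ad intervals.

    `ads` is a list of (start_seconds, end_seconds) pairs. How those
    timestamps are obtained (e.g. from an LLM) is outside this sketch.
    """
    merged = merge_intervals(ads)
    if not merged:
        # Nothing to cut: copy the streams unchanged.
        return ["ffmpeg", "-i", src, "-c", "copy", dst]
    # aselect keeps audio frames whose timestamp is outside every ad interval;
    # asetpts regenerates timestamps so the output has no gaps.
    drop = "+".join(f"between(t,{s},{e})" for s, e in merged)
    audio_filter = f"aselect='not({drop})',asetpts=N/SR/TB"
    return ["ffmpeg", "-i", src, "-af", audio_filter, dst]


def cut_ads(src, dst, ads):
    """Run ffmpeg; requires ffmpeg to be installed and on PATH."""
    subprocess.run(build_ffmpeg_command(src, dst, ads), check=True)
```

Calling cut_ads("episode.mp3", "episode_clean.mp3", [(0, 45), (1200, 1290)]) would re-encode the episode without those two ranges. Merging the intervals first keeps the filter expression short when the LLM reports overlapping breaks.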

I’ve found it great for DMing TTRPGs — simply record your session with a conference mic, run a transcript with WhisperX, and pass the output to a long-context LLM for easy session summaries. It’s a great way to avoid slowing down the game by taking notes on minor events and NPCs.

Okay that’s just crazy. ;)
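A small sketch of the "pass the output to an LLM" step, in case it helps. It assumes the diarized WhisperX result has been loaded as a list of segment dicts with "start", "text" and (when diarization found one) "speaker" keys; check that against your own output before relying on it. The helper names are hypothetical, and the LLM call itself is left out.

```python
def format_timestamp(seconds):
    """Render a time offset in seconds as HH:MM:SS."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def segments_to_text(segments):
    """Turn speaker-tagged segments into one plain-text line per segment.

    Assumed segment shape (verify against your own WhisperX output):
    {"start": 12.3, "end": 15.0, "text": "...", "speaker": "SPEAKER_00"}
    """
    lines = []
    for seg in segments:
        # Segments without a speaker label fall back to a placeholder.
        speaker = seg.get("speaker", "UNKNOWN")
        stamp = format_timestamp(seg["start"])
        lines.append(f"[{stamp}] {speaker}: {seg['text'].strip()}")
    return "\n".join(lines)
```

The resulting text can be pasted into a summary prompt as-is. Keeping the timestamps in each line makes it easy to ask the model where in the recording a given event happened.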

Justin@lemmy.jlh.name · edit-2 1 day ago

Probably not that hard to build a simple Flask frontend around it.

Automatically processing files in an S3/WebDAV directory would also be useful.
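A minimal sketch of the directory idea, assuming the S3/WebDAV share is mounted as a local folder and that a transcript lives next to its audio file with a .txt suffix. Both are assumptions, as are the function names. The transcribe argument is a placeholder for whatever actually runs WhisperX; this is not the Flask part.

```python
from pathlib import Path

# File types treated as audio; adjust to taste.
AUDIO_SUFFIXES = {".mp3", ".wav", ".m4a", ".flac"}


def find_untranscribed(directory):
    """Return audio files in `directory` that have no sibling .txt transcript."""
    directory = Path(directory)
    return sorted(
        p for p in directory.iterdir()
        if p.is_file()
        and p.suffix.lower() in AUDIO_SUFFIXES
        and not p.with_suffix(".txt").exists()
    )


def process_directory(directory, transcribe):
    """Transcribe every pending file and write the result next to it.

    `transcribe` is a hypothetical callable: it takes a Path to an audio
    file and returns the transcript as a string.
    """
    done = []
    for audio in find_untranscribed(directory):
        text = transcribe(audio)
        audio.with_suffix(".txt").write_text(text, encoding="utf-8")
        done.append(audio)
    return done
```

Running process_directory from a cron job or a simple loop with a sleep would pick up new uploads. Files that already have a transcript are skipped, so repeated runs are harmless.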

GitHub - m-bain/whisperX: WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)