Whisper -- Open Source Transcription Solution

I have recently come across an excellent program for video/audio transcription.

This is an open source project that covers many languages and has amazing accuracy right out of the box. It generates text in various formats such as:

  • plain text,

  • plain text with timestamps,

  • Video Text Track (VTT),

  • SubRip Subtitle File (SRT), and others.

In addition, Whisper:

  • works with dozens of languages;

  • has automatic language detection;

  • can be run from the command line;

  • punctuates sentences correctly;

  • can even translate a transcription: for example, you can feed it a video or audio file in another language, say Japanese, and it will produce an English transcription;

  • it’s written in Python;

  • it’s covered under the MIT open source license;

  • it’s pretty quick: an hour-long podcast transcribes in under 15 minutes; and

  • it comes from OpenAI, the folks who produced ChatGPT.
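
For anyone who would rather drive it from Python than from the command line, here is a minimal sketch of transcribing (or translating) a file and turning the result into SRT text. It assumes the openai-whisper package and ffmpeg are installed. The helper names (srt_timestamp, segments_to_srt, transcribe_to_srt) and the "base" model choice are my own illustrations, not part of the project. Whisper ships its own output writers, so the formatting code here is only meant to show what an SRT file looks like under the hood.

```python
from typing import Dict, Iterable, List


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: Iterable[Dict]) -> str:
    """Turn segments with 'start', 'end' and 'text' keys into SRT text."""
    blocks: List[str] = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    # A blank line separates consecutive subtitle entries.
    return "\n".join(blocks)


def transcribe_to_srt(path: str, model_name: str = "base",
                      translate: bool = False) -> str:
    """Hypothetical wrapper: transcribe a file (or translate it to English)."""
    # Imported lazily so the formatting helpers work without Whisper installed.
    import whisper

    model = whisper.load_model(model_name)
    task = "translate" if translate else "transcribe"
    result = model.transcribe(path, task=task)
    return segments_to_srt(result["segments"])
```

Calling transcribe_to_srt("podcast.mp3") on a file of your own (the file name is just a placeholder) should return the subtitles as one string, and passing translate=True asks for English output from a non-English recording.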

Here is a screenshot of the text generated from the first 2 minutes of the latest Destination Linux podcast. It looks to be 100% accurate, and it even skips the music.

Here is the link to the project’s GitHub page: