speech recognition / automated transcribing

Apple Final Cut Pro Legacy

Stefan Hansen replied 13 years, 4 months ago 18 Members · 28 Replies

David Roth weiss
March 20, 2011 at 11:40 pm

I’ve been working on a review of GET for quite some time and will have that review here soon on the Cow. And yes, plenty of people are using GET, especially for documentary work. I consider it to be “an essential,” and recommend it highly.

Nothing yet completely replaces accurate timecoded transcripts when editing interviews for documentary work, but GET is by far the best tool of its type for FCP. If you know that any of your interviewees have spoken specific words or about a specific topic, GET can find every instance of them in just seconds. It does what computers are designed to do, and even with great transcripts, there’s no way a human being can come close to what it can do.

David Roth Weiss
Director/Editor/Colorist
David Weiss Productions, Inc.
Los Angeles
https://www.drwfilms.com

POST-PRODUCTION WITHOUT THE USUAL INSANITY ™

A forum host of Creative COW’s Business & Marketing and Apple Final Cut Pro forums. Formerly host of the Apple Final Cut Basics, Indie Film & Documentary, and Film History & Appreciations forums.
Benjamin Dewhurst
April 6, 2011 at 1:39 pm

David, as a silent observer I’m usually impressed with your posts and suggestions, and your tenacity as a COW leader, but this one takes the cake. Bravo on the find man. Include me in the ranks of those who have had ‘life saver’ moments on the COW thanks to you. GET’s unreal, and crucial.

——
The Hungry Frontiersmen – Sketch Comedy EXTREME!
https://youtube.com/hungryfrontiersmen
Terry Simpson
November 25, 2011 at 5:25 pm

The GET link leads nowhere now, and a search of their site turns up nothing.

Anyone else have a solution? I am taking a project over to a friend who has Premiere, which gets good reviews (particularly for its ability to edit to an edited text file, which sounds fantastic!). BTW, THIS is what FCP 8 should have had to catch up to the Adobe release of 2 years ago…

There is also this tidbit for folks: YouTube has a CC capability, from which you can download a text file. I then found an internet-based app (video-critters.org) which allows you to use keystrokes to pause the video while you type or correct the transcript, and then download the corrected transcription file (which could be uploaded to YouTube, if CC is what you are trying to do). If you are just trying to edit a talking heads tape and want to cut from a yellow-highlighted paper transcript with TC numbers on every line of text, this will do it.

This workflow inserts TC every 4 seconds (depending on how long you select for a line of text to be seen onscreen). I’ve used it and because I only type about 25 words per minute, making the corrections usually is time-efficient. For those who type 40wpm, I have the feeling that starting with the YouTube CC file is a waste of time because you could type the whole line or phrase quicker than you can highlight and replace the 40-50% of the words that are wrong. So just uploading clips to YouTube and then skipping straight to video-critters to type in all the transcription may be the way to go.

FWIW, here’s the workflow:
Upload clip(s) to YouTube (making the video private, of course)
1) On the bottom right corner, click on the [CC] and make sure it chooses “English-filename”
2) Download the “Automatic” or “Machine Transcription” version of the subtitles. (I corrected the first couple of paragraphs and re-uploaded it; this is what you will do with it on all the others.)
3) CC should now be red and the captions will appear at the bottom of the screen.
4) Click on the tab: [Edit captions/subtitles]
5) Click on [download] on English-filename

Then to correct the transcription with the Video Critters Captioning tool, see the tutorial at
https://www.youtube.com/watch?v=KvI-YjkxAFo

Actual tool at:
https://videocritter.org/
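
For anyone who wants the downloaded caption file as a plain timecoded transcript, here is a minimal Python sketch. It assumes the download is in an SBV-style layout (a "start,end" time line, then the caption text, then a blank line); check your own file, since the format may differ. The function name and the sample captions are made up for illustration. Note that the times are offsets from the start of the uploaded clip, not source timecode.

```python
import re

# Assumed SBV-style cue header: "H:MM:SS.mmm,H:MM:SS.mmm"
CUE_TIME = re.compile(
    r"^(\d+):(\d{2}):(\d{2})\.(\d{3}),(\d+):(\d{2}):(\d{2})\.(\d{3})$"
)


def captions_to_transcript(caption_text):
    """Return one 'HH:MM:SS  text' line per caption cue (hypothetical helper)."""
    lines = []
    # Cues are separated by blank lines.
    for block in re.split(r"\n\s*\n", caption_text.strip()):
        parts = block.strip().splitlines()
        if not parts:
            continue
        match = CUE_TIME.match(parts[0].strip())
        if not match:
            continue  # skip anything that is not a timed cue
        # Use only the cue's start time, rounded down to whole seconds.
        hours, minutes, seconds = (int(match.group(i)) for i in (1, 2, 3))
        text = " ".join(p.strip() for p in parts[1:])
        lines.append(f"{hours:02d}:{minutes:02d}:{seconds:02d}  {text}")
    return lines


# Invented sample, with a 4-second caption length as described above.
sample = """0:00:00.000,0:00:04.000
so when we first got to the village

0:00:04.000,0:00:08.000
nobody had seen a camera before
"""

for line in captions_to_transcript(sample):
    print(line)
```

The output is one line of text per caption, each prefixed with its start time, which can be printed and highlighted like a paper transcript.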

I offer this workflow without wholehearted endorsement. It doesn’t seem quite like a professional solution. I have the phone number of a transcriptionist who will produce a similar document for $40/hr, taking 1.5 to 2.5 hours for every hour of interview footage. So I always have to figure the difference between just ripping audio files and sending them to her to transcribe, then paying someone else to insert TC into her doc, and trying to accomplish the same thing myself in a $100 an hour edit suite.
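
To put rough numbers on that comparison, here is a small Python sketch using the rates quoted above ($40/hr transcriptionist, 1.5x to 2.5x of footage length, $100/hr edit suite). Two things are assumptions and not from my experience: that working in the suite takes the same 1.5x to 2.5x, and that the separate cost of inserting TC into the transcriptionist's document is left out because I have no figure for it. The function name is made up.

```python
def transcription_cost(footage_hours, hourly_rate, time_multiplier):
    """Cost of transcribing footage at a given rate and speed (hypothetical helper)."""
    return footage_hours * time_multiplier * hourly_rate


FOOTAGE_HOURS = 1  # one hour of interview footage

for label, rate in (("Transcriptionist", 40), ("Edit suite", 100)):
    # Assumption: both options take 1.5x to 2.5x of the footage length.
    low = transcription_cost(FOOTAGE_HOURS, rate, 1.5)
    high = transcription_cost(FOOTAGE_HOURS, rate, 2.5)
    print(f"{label}: ${low:.0f} to ${high:.0f} per {FOOTAGE_HOURS} hour(s) of footage")
```

Under those assumptions the transcriptionist costs $60 to $100 per hour of footage against $150 to $250 in the suite, before adding whatever the TC insertion costs.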

On low-budget projects, I have been somewhat successful requiring the client to produce the transcripts. They often pass on it on the next project, and gladly pay the $300-$500 it adds to a budget.

Still looking for a better solution,

Terry
David Mercer
December 14, 2011 at 2:15 am

Strangest thing happened today. Left a message for family in Connecticut on their home phone. I’m in Guatemala.

I get an email from them, from their iphone/blackberry/etc, which includes not just a .wav copy of my message, but a text transcription of my voice message as well. I’d say it’s about 85% accurate. Looks like it came from Vonage.

So how do I get my hour-long interview transcribed with similar accuracy? Do I need to call them on the phone and then play back the interview?

I had no idea how advanced this stuff was … but again I’m living in a highland village in Guatemala.
Erkan Karabacak
July 26, 2012 at 7:36 pm

Ernie,

No need to die. You can use VoxcribeCC for 60 video-minutes.

Take care,
Erkan Karabacak
July 26, 2012 at 7:50 pm

Mark,

You might want to check out VoxcribeCC.
You can use it for free for 60 video-minutes.
https://www.voxcribe.com

Have a good day.
Erkan
David Jones
November 24, 2012 at 12:26 pm

Automatic voice recognition technology is not yet mature enough to produce accurate transcription for non-American accents, for people speaking quickly, or for audio files with multiple speakers. If you have more than one voice, it is almost impossible to get a good transcript. To get accurate transcription, try a manual transcription service.
Stefan Hansen
January 6, 2013 at 8:48 pm

VoxcribeCC can handle speaker-independent or multiple-speaker media (audio/video) recognition tasks.

As you can guess, human performance is still superior. But I can easily say “VoxcribeCC is the most accurate speaker- and topic-independent desktop speech recognition application in the world.”

VoxcribeCC is the most innovative editor for media captioning and transcription. Click here to see the seamless integration of speech recognition and captioning. This innovative interface gives full control on the quality, and saves captioners tedious work.

You can easily check its accuracy: Click on this link to see a few videos that were captioned by VoxcribeCC without any human editing or correction.

You can test the accuracy yourself: Click on this link for the download page.

Best regards,
