Creative Communities of the World Forums

The peer to peer support community for media production professionals.


  • After Video Capture – Audio to Text

    Posted by Don Sommers on May 17, 2009 at 4:44 pm

    I routinely capture a LOT of corporate “talking head” or interview based video and then have to splice it together.

    Many times the video I shoot can end up being up to 1+ hours of different people that I have to condense down to a 5 minute video.

    Over the years, I have either watched all the snippets and hand-written what people have said .. or … watched the snippets and typed out what people have said .. or … watched the snippets and then dragged and dropped them into the timeline and edited.

    Using the text method allows me to quickly go through and highlight “what I want” and then pull the related snippets into the edit. I can also “search” for phrases and words after I’ve done this which is also helpful. Lastly, I can fire off the typed text to other individuals involved to get their input as well.

    Dragging clips into the timeline is great, but it presumes that you have the individual who is in charge of the project there with you .. and that can eat up a lot of the day. These people are not available for this (in my case) so most times, it’s not an option. That’s why the text version, rough edits and .wmv or .flv output files for final approval are what I normally (90%) end up using.

    I was hoping that when I took Premiere Pro CS4 for a run, its new “extract audio” feature would be revolutionary for me. Instead it has been a great disappointment: it has been nothing short of lame at “deciphering” audio to text.

    My Question:
    Other than Dragon Naturally Speaking …. does anyone have any tips, suggestions, ideas or magic potions that would “decently” extract text from audio files? Project by project, typing out over 1 hour’s worth of talking is killing my productivity.

    P.S. – due to the economy … assistants (to extract the audio and do the typing) are not an option .. 🙁

    Thanks in advance,…
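
The “text method” described in this post (search the typed transcript, then pull the matching snippets into the edit) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Snippet` record, the `find_phrase` helper, the clip names and the transcript lines are all hypothetical and do not come from the thread or from any Premiere Pro feature.

```python
from dataclasses import dataclass


@dataclass
class Snippet:
    clip: str     # hypothetical clip file name
    start: float  # seconds into the clip where the line begins
    text: str     # what was said, as typed into the transcript


def find_phrase(snippets, phrase):
    """Return every snippet whose text contains the phrase, ignoring case."""
    needle = phrase.casefold()
    return [s for s in snippets if needle in s.text.casefold()]


# Made-up transcript lines, for illustration only.
transcript = [
    Snippet("interview_01.avi", 12.0, "We cut our delivery time in half."),
    Snippet("interview_01.avi", 47.5, "The new process saved us money."),
    Snippet("interview_02.avi", 5.0, "Delivery time was our biggest problem."),
]

for hit in find_phrase(transcript, "delivery time"):
    print(f"{hit.clip} @ {hit.start:.1f}s: {hit.text}")
```

Keeping a clip name and start time next to each typed line is the design choice that matters here: a text search then points straight at the footage to pull into the timeline.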

    Jason Rock replied 11 years, 10 months ago 9 Members · 16 Replies
  • Vince Becquiot

    May 17, 2009 at 6:40 pm

    Yep, the speech to text is pretty bad in my experience as well. We’ve set up our nice VO mic next to the edit suite speakers and run that through a text to speech software when the client needs a transcript.

    Works very well, but don’t use a cheap desktop mic or you’ll get the same result as Adobe, and of course the place has to be very quiet.

    There are applications out there that will do the conversion, but in my experience the formats supported are very limited.

    Vince Becquiot

    Kaptis Studios
    San Francisco – Bay Area

  • Don Sommers

    May 17, 2009 at 7:05 pm

    Thanks for your response, Vince.

    Comment from your posting: “run that through a text to speech software when the client needs a transcript.”
    – I’m assuming you mean speech to text software in this comment…

    Can you tell me which speech to text software you have used (or are using)? I’ll try most anything at this point because I really need to get some time back.

    Thanks again…

  • Vince Becquiot

    May 17, 2009 at 9:03 pm

    Dragon Naturally Speaking is what we used.

    Vince Becquiot

    Kaptis Studios
    San Francisco – Bay Area

  • Don Sommers

    May 17, 2009 at 11:18 pm

    I’ll give it a try .. and cross my fingers … 🙂

    I use a Sennheiser Lav mic in what is usually a “studio like” environment (no “noise”, etc.) so my sound is always really good. Hopefully, good enough for the software to recognize.

    Thank you again for your help….

  • Adrian Sancho

    May 18, 2009 at 12:46 am

    Typically speech recognition software works best when it’s “trained”. This usually involves reading predefined text that the program “knows”, which it uses to decipher your speech patterns and create a profile of you.

    Needless to say, this is next to impossible when doing videos like you do, unless you can somehow get the speakers to read at least a short verse the program can analyze. Even that sounds like a long shot.

    I’ve never worked with Dragon, so I don’t know how “smart” it is out of the box. It’s been a while since I’ve dealt with speech recognition, and it would be very nice indeed if a program like Dragon can work right out of the box, or if Dragon has the ability to be “taught” from existing audio instead of predefined text reads. Then it would be a great boon to speech recognition indeed.

  • Don Sommers

    May 18, 2009 at 1:11 am

    I did think of that as well. Not sure what Vince’s success rate has been with it … but it just occurred to me that I could listen and repeat what was being said in the video into the software.

    Thereby, the software would be “trained” to my voice (as you suggest) and I could literally “dictate” to it via the video.

    Still very crude .. but a far cry better than the hours of typing that I am currently doing.

    I’m going to try it out …. It’ll cost me around $120. It will either be the best money I’ve spent in a long time … or, I’ll be back to typing again and out a few bucks…

  • Vince Becquiot

    May 18, 2009 at 1:39 am

    It’s not perfect but it’s light years away from the Adobe system. We tried the “repeating” way of doing things. I’d be ready to bet that you’ll hire a transcriptionist after 2 days of work 🙂

    Vince Becquiot

    Kaptis Studios
    San Francisco – Bay Area

  • Mike Cohen

    May 18, 2009 at 5:50 pm

    We use a medical transcription service. We just upload MP3 files to their FTP site and within a few days, or the next day if we ask for a rush, we have a 99% accurate transcript.
    I too tried the speech to text feature in Premiere, and it is pretty bad compared to the marketing.
    I am a pretty fast typist so for short clips I sometimes transcribe myself, but in most cases it is easier and more cost effective to let a trained professional do it.

    Remember that doing something yourself may seem like it is saving money, but let’s say you make $30/hour editing video. If you are spending 3 hours transcribing something, you are not billing $90 of editing work; instead you are doing work that someone earning $15/hour could do in half the time.

    Mike Cohen
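
The arithmetic in this post can be worked through as a short Python sketch. The rates and hours are the ones quoted above; the variable names are made up for illustration, and the assumption that a transcriptionist is paid by the hour is only for the sake of the example (a real service may well price differently).

```python
# Figures quoted in the post above.
editor_rate = 30.0               # dollars per hour the editor could bill
editor_hours = 3.0               # hours the editor would spend transcribing
typist_rate = 15.0               # dollars per hour for a transcriptionist
typist_hours = editor_hours / 2  # "in half the time"

lost_billing = editor_rate * editor_hours     # editing work not billed
outsourced_cost = typist_rate * typist_hours  # cost of handing the job off
net_gain = lost_billing - outsourced_cost     # what outsourcing frees up

print(f"Editing work not billed: ${lost_billing:.2f}")
print(f"Cost of outsourcing:     ${outsourced_cost:.2f}")
print(f"Net gain from handing it off: ${net_gain:.2f}")
```

With these numbers the three hours of self-transcription displace $90.00 of billable editing, while handing the job off would cost $22.50, a difference of $67.50.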

  • Don Sommers

    May 18, 2009 at 8:19 pm

    Thanks Mike,

    Going with your thoughts on “external transcription services” … Can you (or anyone else) make recommendations on such a service (or services) … or do privacy issues prevent that on this forum?

    As another quick note … This is one of the other links I found that uses a stop/start foot pedal so the typist can readily listen to a line, stop the audio to catch up, then start the audio again… https://www.nch.com.au/scribe/index.html

    It’s rather “secretarial” to say the least … but at least you can stop and start the audio WHILE typing instead of always clicking to stop … typing …. then clicking to start, and stop again.

  • Don Sommers

    June 22, 2009 at 2:06 am

    Well … I shelled out the money for Dragon Naturally Speaking, and … after about 20 minutes of “voice training” took on a 1.5 hour video.

    I did this by listening to the clips in media player, and then speaking them back into the program (which saves them in an .rtf Word format). Lo and behold, it worked!

    This software has come a long way since I first tested it, and it was about 95% accurate. Most times when it wasn’t, it ended up being my fault for getting a bit bored and not articulating. It also “learns” as it goes … picking up a revision of a company name once, and then recognizing it for the rest of the time.

    This is light years better than listening … stopping …. then typing. It’s STILL a pain in the butt to have to do this, however, this software does make it more bearable.

    I did (for the heck of it) attempt to just have the software decipher the audio directly. It got confused quickly, and only about 30% came through correctly. It NEEDS to be “trained” for the voice to be effective.

    And now, I can get anyone (especially bad typists) to use this, and pass the work off to others around me as well if required.

    Nice …. 🙂
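
Accuracy figures like the “95%” and “30%” in this post are rough impressions, but a similar number can be estimated by comparing a hand-checked passage against what the software produced. The sketch below is one possible way to do that, not the method used in the thread: it counts how many words of the reference text reappear in order in the recognized text, using Python’s standard `difflib`. The `word_accuracy` function name and the sample sentences are made up for illustration.

```python
from difflib import SequenceMatcher


def word_accuracy(reference, recognized):
    """Share of reference words that the recognizer reproduced, in order."""
    ref_words = reference.lower().split()
    rec_words = recognized.lower().split()
    if not ref_words:
        return 0.0
    # Compare word lists rather than characters; autojunk is disabled so
    # frequent words are not discarded on long transcripts.
    matcher = SequenceMatcher(None, ref_words, rec_words, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(ref_words)


# Made-up sample: one word out of nine was misheard.
reference = "we cut our delivery time in half last year"
recognized = "we cut our delivery dime in half last year"
print(f"{word_accuracy(reference, recognized):.0%} of words matched")
```

Checking a minute or two of dictated text this way is usually enough to tell whether a change in microphone or voice training is actually helping.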

Page 1 of 2
