Thanks! This is very good. The assumption for this workflow is that you already have a word-perfect transcript file (my case 🙂). This makes the AME voice-to-text engine FAR more accurate, because it only has to find word timings, with much less deciphering of words. I used this workflow to produce a valid .srt file for use elsewhere. For work inside PrPro CC, it is indeed the missing bridge between speech recognition and caption editing.
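For anyone curious what the "word timings to .srt" step amounts to, here is a rough Python sketch. It is not what AME or PrPro does internally. It assumes you have somehow obtained the timings as (word, start_seconds, end_seconds) tuples, which is a made-up input format for illustration, and the names `srt_timestamp`, `words_to_srt`, and `max_words` are my own. It only shows the .srt layout: a cue number, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` line, the text, then a blank line.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words, max_words=7):
    """Group (word, start, end) tuples into numbered SRT cues.

    The tuple layout is an assumption for this sketch, not an AME export format.
    Each cue simply takes the next `max_words` words; a real tool would also
    break on punctuation and pauses.
    """
    cues = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        start = chunk[0][1]   # start time of the first word in the cue
        end = chunk[-1][2]    # end time of the last word in the cue
        text = " ".join(word for word, _, _ in chunk)
        index = len(cues) + 1  # SRT cue numbers start at 1
        cues.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    # Cues are separated by a blank line.
    return "\n".join(cues)


if __name__ == "__main__":
    demo = [("Hello", 0.0, 0.4), ("world", 0.5, 1.0), ("again", 1.2, 1.75)]
    print(words_to_srt(demo, max_words=2))
```

Splitting every fixed number of words is the crudest possible choice; it is only there to keep the sketch short.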
Captiontube seemed promising, in that it is supposed to accept an ordinary .txt file as the transcript, but I could not get it to accept any movie URLs I entered.
If you do NOT have a literal transcript to start with, then the speech recognition in Camtasia Studio 8 (which uses the Windows speech recognition engine, I think) is about as (in)accurate as AME.