Speech to Text - Offline?

Is there any reliable software/service which can help convert recorded lectures and transcribe them?

I would even prefer paid software, but ideally it would run on Windows and transcribe offline.

Some of the lectures contain confidential data, so if it's an online service, I would want it to be HIPAA compliant.

Any suggestions welcome.

Comments

  • I would be looking for something like this as well.
    All I know of are services where people transcribe your recordings manually (billed per minute or per line of text). As it’s manual work, it doesn’t come cheap, at least by private-use standards.

    I was and am still hoping some day this could be done by a cloud service algorithm software thing, but I think to this day there is no such service.

  • Tion Member

    Afaik most of consumer Text to Speech still relies on Ivona and Nuance voices which are rather poor and should never be used in a commercial product. You can cheap out by hiring foreign semi professionals.

  • AlexanderM Member, Top Host, Host Rep

    Check out Jaws / Dragon

  • @Tion said:
    Afaik most of consumer Text to Speech still relies on Ivona and Nuance voices which are rather poor and should never be used in a commercial product. You can cheap out by hiring foreign semi professionals.

    Thanks. But with some confidential data at stake, I am not sure offshoring would work.

  • jvnadr Member
    edited March 2018

    plumberg said: Is there any reliable software/ service which can help convert recorded lectures and transcribe it?

    Dragon Speech Recognition Software is one of the best out there, with the ability to download language files and do the transcription on your PC without any internet connection. I don't know if it is HIPAA compliant, but the company also produces a lot of software for medical use, so it may be.
    A very nice free solution, though it needs an internet connection, is Google's (you have to use the Chrome browser and Google Docs to do the transcription).
    See this

    Tion said: Afaik most of consumer Text to Speech still relies on Ivona and Nuance voices

    I think OP is looking for the opposite (speech to text), not text to speech. And there are many more products that deploy their own engines for synthesizing voices; IIRC even a Greek company has released really good software for that, with many languages already working.

    P.S. I recently started using both solutions and I am really impressed by the accuracy they have achieved over years of development. Just yesterday I used Google's solution on a public argument at a conference, capturing the sound with my mobile phone's microphone several meters from the loudspeakers and with a lot of noise around. I was impressed by the results!

  • I don't know of anything offline, but Google Docs can help you online. Simply navigate to Tools > Voice typing; it's pretty self-explanatory.

  • AWS has an online service like this, which might qualify for HIPAA because of all the expensive certifications they have. I dunno how much the service costs. Speaker-independent STT is still generally pretty crappy, and tools like Dragon iirc want to be trained for specific speakers. You also want good, noise-free recordings so if you're sitting in a lecture hall with a recorder, you're probably out of luck no matter what software you use.

    If these are for class notes or something like that, transcribe them manually yourself, summarizing them rather than transcribing word for word. That will be a huge help to you in learning the material, while dumping the audio through a program will do nothing to help you.

  • @jvnadr said:
    Dragon Speech Recognition Software

    Wow. This video is pretty impressive, I must say.
    I tried dragon like 5-8 years ago and I must say, they seem to have come quite a long way since then.

    However, the problem here, as well as with the "Google Docs" -> Tools -> Voice typing solution, is that I have speeches in mp3 files that I need transcribed.

    So, as you seem to have some experience: is there a solution where the input can be a file instead of live dictation?
    This obviously means that it's not dictation of that sort, with spoken line breaks, full stops, commas etc., but rather simply a tool that dumps all the speech as text into one document and lets me sort out the rest.

    Also, I need it for German, not English.

    I used just yesterday google's solution on a public argument in a conference capturing the sound from my mobile phone's microphone in a distance of several meters from the loudspeakers and with a lot of noise there. i was impressed by the results!

    "Voice typing"? Or some way of "file upload"?

    BR southy

  • There are a bunch of cloud ML services -

    https://cloud.google.com/speech/

    https://dialogflow.com/

    https://azure.microsoft.com/en-us/services/cognitive-services/speech/

    Google is rated the highest. Offline open-source speech-to-text systems like "CMU Sphinx" and Kaldi are not going to be as good. CMU also has a lightweight version for mobile devices called PocketSphinx, I think.

  • @rincewind said:
    There are a bunch of cloud ML services -

    https://cloud.google.com/speech/

    Hmm - I don't get it: they say they can also take input from files, but there's just the button with the microphone = supporting live speech input.
    I can't find a function to upload a file anywhere.
    Of course there are references to the general google cloud platform services but: hey: I don't want to develop something, I just want my recording to be transcribed.

    How can I upload a file there?

  • @southy said:

    @rincewind said:
    There are a bunch of cloud ML services -

    https://cloud.google.com/speech/

    Hmm - I don't get it: they say they can also take input from files, but there's just the button with the microphone = supporting live speech input.
    I can't find a function to upload a file anywhere.
    Of course there are references to the general google cloud platform services but: hey: I don't want to develop something, I just want my recording to be transcribed.

    How can I upload a file there?

    Because that's a demo. You need to sign in to the console and have a look there.
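
For readers with the same question about file input: the hosted page is only a microphone demo, and files go through the API instead. Below is a minimal sketch of what a file-based request body might look like. The endpoint and field names (`config`, `encoding`, `sampleRateHertz`, `languageCode`, `audio.content`) are as I recall them from the public REST documentation and should be checked against the current docs. The function name, the FLAC encoding, the 16 kHz sample rate, and the `de-DE` default (chosen because German was asked for above) are assumptions for illustration. Authentication and actually sending the request are deliberately left out.

```python
import base64
import json

# Assumed REST endpoint for short, synchronous requests; verify in the docs.
RECOGNIZE_URL = "https://speech.googleapis.com/v1/speech:recognize"


def build_recognize_request(audio_bytes: bytes,
                            language_code: str = "de-DE",
                            encoding: str = "FLAC",
                            sample_rate_hertz: int = 16000) -> str:
    """Return the JSON body for a file-based recognition request.

    Hypothetical helper: it only builds the payload and sends nothing.
    """
    body = {
        "config": {
            "encoding": encoding,
            "sampleRateHertz": sample_rate_hertz,
            "languageCode": language_code,
        },
        # Audio is embedded as base64 text inside the JSON document.
        "audio": {"content": base64.b64encode(audio_bytes).decode("ascii")},
    }
    return json.dumps(body)


if __name__ == "__main__":
    # Placeholder bytes stand in for a real audio file read from disk.
    payload = build_recognize_request(b"placeholder audio bytes")
    print(payload[:80])
```

An mp3 recording would probably need converting to a supported format first, and as far as I recall long recordings have to go through a separate long-running request rather than this synchronous one.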

  • rpollestad Member
    edited March 2018

    Google is definitely the best of the best with regards to speaker independent speech to text. (We use it all the time.) They charge you in 15 second blocks, so it could get quite expensive, but I suppose it depends how much the project is worth to you. (3 hours of audio is ~$4.50.)

    Another solution not mentioned above is IBM's speech to text service, which we also use. It's not on the same level as Google but better than Sphinx or any homegrown solution. They do also give you 1000 free minutes per month, which is nice.

    Neither of those is an "offline" solution, though. They're API based. Reading above, the only recommended offline software that would work well is Dragon, but with some caveats. As others mentioned, that has to be trained to work effectively and last we checked, they wanted $10,000 upfront just to let us install a demo in our environment.
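
The billing arithmetic in this comment can be sketched as follows. The 15-second block size and the "3 hours is about $4.50" figure come from the post; the per-block price is simply derived from those two numbers and is not an official rate. Rounding each recording up to whole blocks is also an assumption.

```python
import math

BLOCK_SECONDS = 15  # billing granularity mentioned in the post


def billed_blocks(duration_seconds: float) -> int:
    """Number of 15-second blocks, assuming partial blocks round up."""
    return math.ceil(duration_seconds / BLOCK_SECONDS)


def estimate_cost(duration_seconds: float, price_per_block: float) -> float:
    """Estimated charge for one recording at a given per-block price."""
    return billed_blocks(duration_seconds) * price_per_block


# Price implied by the post's example: ~$4.50 for 3 hours of audio.
# This is a derived illustration, not a published rate.
IMPLIED_PRICE_PER_BLOCK = 4.50 / billed_blocks(3 * 3600)

if __name__ == "__main__":
    for hours in (1, 3, 10):
        cost = estimate_cost(hours * 3600, IMPLIED_PRICE_PER_BLOCK)
        print(f"{hours} h of audio: {billed_blocks(hours * 3600)} blocks, ~${cost:.2f}")
```

Three hours is 10,800 seconds, or 720 blocks, so the post's figure works out to a little over half a cent per block.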

  • willie said: Speaker-independent STT is still generally pretty crappy, and tools like Dragon iirc want to be trained for specific speakers. You also want good, noise-free recordings so if you're sitting in a lecture hall with a recorder, you're probably out of luck no matter what software you use.

    That was... last year. Such software is evolving incredibly fast thanks to new algorithms and computing power. That's why more and more automated systems (telephone support, automation, etc.) rely exclusively on voice recognition.
    If Dragon is trained, the results approach 100% accuracy. Even in a vanilla installation, accuracy exceeds 99% and the corrections needed are minimal. If you put your recorder in a hall and not close to the lecturer, then yes, the results will be mediocre or even poor. But even the human ear can miss words in a situation like that.
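
To make accuracy figures like these checkable on your own recordings, one common yardstick is word error rate (WER): the word-level edit distance between a hand-made reference transcript and the software's output, divided by the reference length. The sketch below assumes that "accuracy" means roughly 1 minus WER, which the posts above do not actually specify, and it does no normalization of case or punctuation.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # Classic dynamic-programming edit distance, kept to two rows.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from output
            insertion = cur[j - 1] + 1  # extra word in output
            cur.append(min(substitution, deletion, insertion))
        prev = cur
    return prev[-1] / len(ref)


if __name__ == "__main__":
    ref = "the cat sat on the mat"
    hyp = "the cat sat on a mat"
    print(f"WER: {word_error_rate(ref, hyp):.3f}")
```

Under that reading, 99% accuracy would mean about one wrong word in every hundred, so transcribing a few minutes by hand and comparing is enough to see how a given recording setup fares.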

  • jvnadr Member
    edited March 2018

    rpollestad said: that has to be trained to work effectively and last we checked, they wanted $10,000 upfront just to let us install a demo in our environment.

    Where did you find that? And when did you last check? 1985? The cost of the home edition (Dragon NaturallySpeaking - that's what I use) is ~90$ and for the pro version (Dragon Professional Individual 15) is ~400$ for PC, a really good price for what this offers. Just search a little.
    And the basic voice training of the software is just reading a ~100-word paragraph right after installation. Please search first, then post, and avoid false claims.

    EDIT: If you mean they asked for $10,000 upfront to integrate it into your own software, then I don't know the cost, but that is a totally different conversation and not about converting speech to text. It is paying royalties to use software in your own platform, something OP does not want.

  • @jvnadr said:

    rpollestad said: that has to be trained to work effectively and last we checked, they wanted $10,000 upfront just to let us install a demo in our environment.

    Where did you find that? And when did you last check? 1985? The cost of the home edition (Dragon NaturallySpeaking - that's what I use) is ~90$ and for the pro version (Dragon Professional Individual 15) is ~400$ for PC, a really good price for what this offers. Just search a little.

    It was 3-4 years ago, but this was for a corporate license demo/install, not a home user setup.

  • jvnadr said: Dragon NaturallySpeaking - ~90$ and for the pro version... ~400$

    What's the difference between those two versions? Thanks.

  • willie said: What's the difference between those two versions? Thanks.

    The main engine is the same (basic edition), but the pro version can control Excel and write into cells, create and write into PowerPoint, and has an extended palette of commands (editing pictures, controlling software, etc.) and extended functionality in custom commands (combining commands, adding extra custom phrases). But for most of us, the home edition works fine and does the job.

    rpollestad said: It was 3-4 years ago, but this was for a corporate license demo/install, not a home user setup.

    It was probably for a bunch of licenses (many dozens of them), because they have had a similar pricing model at least since the early 2010s, when I started using it (2012 in my case).
