What is a good or cheap audio to text transcription service?

I have around 10-15 hours of recorded lecture audio to transcribe to text. I've googled it and there are loads of options, from machine transcription to human transcription at £/$1 per minute. I just need the lectures transcribed; machine transcription would be OK, as I will need to listen to them anyway. Timestamps would be useful to make corrections easier.

What free or cheap sites/apps have people used or recommend?

Comments

  • unfortunately Member
    edited December 2022

    try
    https://tinywow.com/video/audio-to-text

    Just stumbled upon this on the internet.
    It's free.
    But I haven't had a chance to try it yet.

  • @unfortunately said:
    try
    https://tinywow.com/video/audio-to-text

    Just stumbled upon this on the internet.
    It's free.
    But I haven't had a chance to try it yet.

    Only 5 mins free!

    My lectures are from 30 mins to around 75 mins long.

  • Not really what I'm looking for. I'm looking for a service/website that people have used and can recommend.

  • ChatGPT, transcribe these audio files (?)

  • rpollestad Member
    edited December 2022

    There are a lot of players in the space, yes, but only a handful (Google, IBM) that would deliver quality + automation with no commitment, and those would come at a cost. It's very doubtful that you will find a quality service to do that many hours of audio for free (or cheap).

    Google does give you 60 minutes free per month, and if you haven't used it before, you can get $300 in trial credits for cloud services, which would cover you.

    You could roll your own with something like Sphinx/PocketSphinx -- haven't tried whisper yet -- but it would take a lot of work to get any decent transcription out of most audio. (But it would be free.)

    Of course, if you're not looking for any kind of self-service and just want to pass it off to another human to transcribe, most services range around $1/min like you noted. (I've seen some for like ~$0.80/min but with some kind of commitment.)

    Thanked by 1asterisk14
  • @rpollestad said:
    There are a lot of players in the space, yes, but only a handful (Google, IBM) that would deliver quality + automation with no commitment, and those would come at a cost. It's very doubtful that you will find a quality service to do that many hours of audio for free (or cheap).

    Google does give you 60 minutes free per month, and if you haven't used it before, you can get $300 in trial credits for cloud services, which would cover you.

    You could roll your own with something like Sphinx/PocketSphinx -- haven't tried whisper yet -- but it would take a lot of work to get any decent transcription out of most audio. (But it would be free.)

    Of course, if you're not looking for any kind of self-service and just want to pass it off to another human to transcribe, most services range around $1/min like you noted. (I've seen some for like ~$0.80/min but with some kind of commitment.)

    Thanks I'll look into those.

    I've seen Notta.ai that does something like 1800 mins a month for 13.99/month billed monthly which seems like an OK price for 1800 mins but obviously I don't know the quality of the service. So I'm quite happy to pay for a month as long as it covers around 15 hours (900 mins) and then I could cancel after doing the 15 hours.
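
As a rough sketch of the arithmetic above, using only the prices quoted in this thread (the function names are illustrative, and the subscription is assumed to bill whole months with a fixed minute quota):

```python
# Rough cost comparison for about 15 hours of audio.
# Prices are the ones quoted in the thread; currency is left unspecified.
MINUTES_NEEDED = 15 * 60  # 15 hours = 900 minutes


def per_minute_cost(minutes: int, rate: float) -> float:
    """Cost of a pay-per-minute service (e.g. human transcription)."""
    return minutes * rate


def subscription_cost(minutes: int, monthly_price: float, monthly_quota: int) -> float:
    """Cost of a monthly plan, assuming whole months and a fixed minute quota."""
    months = -(-minutes // monthly_quota)  # ceiling division
    return months * monthly_price


human = per_minute_cost(MINUTES_NEEDED, 1.00)
plan = subscription_cost(MINUTES_NEEDED, 13.99, 1800)
print(f"Per-minute service at 1.00/min: {human:.2f}")
print(f"Monthly plan (13.99 for 1800 min): {plan:.2f}")
```

With these numbers, 900 minutes fits inside a single month of the 1800-minute plan, so one month's fee would cover it.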

    So if anyone knows a GOOD service billed monthly then please speak up.

  • YouTube as a free alternative.

    Thanked by 1asterisk14
  • Which language(s)?

  • @asterisk14 said:

    Thanks I'll look into those.

    Just to follow up: I did finally try Whisper. It is pretty impressive for an out-of-the-box free solution. You have to do a bit of legwork (install Python, ffmpeg, etc.) and have a Linux box with decent RAM/CPU, but the transcription is superb. No need to mess with audio training and the like.

    Don't know about an audio size limit, though, so you may want to research that. But I remain impressed.

    Also, I found this while checking out the github for whisper: https://freesubtitles.ai/

    Limited to 1 hour duration and there is a file size limit, but might be worth trying out.
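
One way around the 1-hour limit for the longer lectures is to cut them into shorter pieces first. A minimal sketch, assuming a 55-minute chunk length chosen only to stay under the limit (the file size limit is not handled here); `chunk_bounds` and `build_split_commands` are illustrative names, and the ffmpeg commands are only built and printed, not run:

```python
import os
from typing import List, Tuple


def chunk_bounds(duration_s: int, max_chunk_s: int = 55 * 60) -> List[Tuple[int, int]]:
    """Return (start, end) second offsets covering the whole recording."""
    if max_chunk_s <= 0:
        raise ValueError("max_chunk_s must be positive")
    bounds = []
    start = 0
    while start < duration_s:
        end = min(start + max_chunk_s, duration_s)
        bounds.append((start, end))
        start = end
    return bounds


def build_split_commands(src: str, duration_s: int,
                         max_chunk_s: int = 55 * 60) -> List[List[str]]:
    """Build one ffmpeg command per chunk, copying the audio stream as-is."""
    stem, ext = os.path.splitext(src)
    commands = []
    for idx, (start, end) in enumerate(chunk_bounds(duration_s, max_chunk_s), start=1):
        out = f"{stem}_part{idx:02d}{ext}"
        commands.append([
            "ffmpeg", "-i", src,
            "-ss", str(start),       # start offset in seconds
            "-t", str(end - start),  # length of this chunk in seconds
            "-c", "copy",            # no re-encoding
            out,
        ])
    return commands


# Example: a 75-minute lecture becomes a 55-minute and a 20-minute part.
for cmd in build_split_commands("lecture.mp3", 75 * 60):
    print(" ".join(cmd))
```

Each printed command could be run by hand or passed to `subprocess.run`; a cut may land mid-sentence, so the transcript around the boundary is worth checking.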

  • Maounique Host Rep, Veteran
    edited December 2022

    @Dazzle said: Youtube for free alternative.

    Yeah, but it sucks. In general, lecture sound quality is kind of low, unlike someone speaking into a good-quality microphone. YouTube needs pretty clear speech with low ambient noise, preferably music or synthesized sound effects, which it can recognize. A lecture, unless recorded specifically with a good mic et al., usually does not meet the criteria.

    Thanked by 1asterisk14
  • asterisk14 Member
    edited December 2022

    @vyas11 said:
    Which language(s)?

    English (British accents)

    @Dazzle said:
    Youtube for free alternative.

    Didn't think of that. Can the subtitles/CC be downloaded?

    @rpollestad said:

    @asterisk14 said:

    Thanks I'll look into those.

    Just to follow up: I did finally try whisper. It is pretty impressive for an out-of-the-box free solution. You have to do a bit of legwork (install python, ffmpeg, etc.) and have a decent RAM/CPU linux box, but the transcription is superb. No need to mess with audio training and the like.

    Don't know about an audio size limit, though, so you may want to research that. But I remain impressed.

    I don't have time to spend on this type of "project" at the minute. I need the audio transcribed ASAP, and then I can check and correct any errors.

    Also, I found this while checking out the github for whisper: https://freesubtitles.ai/

    Limited to 1 hour duration and there is a file size limit, but might be worth trying out.

    I'll give that a look. Thanks so much. That's the best thing about LET: someone usually knows the answer!

  • @asterisk14 said:

    @Dazzle said:
    Youtube for free alternative.

    Didn't think of that. Can the subtitles/CC be downloaded?

    Yeah, sure. You can copy the subtitles along with the timestamps. Pretty good if the voice is clear.

    Thanked by 1asterisk14
  • @Dazzle said:

    @asterisk14 said:

    @Dazzle said:
    Youtube for free alternative.

    Didn't think of that. Can the subtitles/CC be downloaded?

    Yeah, sure. You can copy the subtitles along with timestamp. Pretty good if the voice is clear.

    I'll check it out. Some of the audio is good quality so maybe it'll work. Thanks

  • vyas11 Member
    edited December 2022

    @asterisk14 said:

    @Dazzle said:

    @asterisk14 said:

    @Dazzle said:
    Youtube for free alternative.

    Didn't think of that. Can the subtitles/CC be downloaded?

    Yeah, sure. You can copy the subtitles along with timestamp. Pretty good if the voice is clear.

    I'll check it out. Some of the audio is good quality so maybe it'll work. Thanks

    Based on my experience:

    (For the whole of November I recorded about 1.5 hrs/day and used speech-to-text to “write” 4 books.) Here is the workflow:

    Use Audacity or ocenaudio to clean up background noise, no matter what.

    Reduce the audio speed to about 92 to 95 percent of the original. A slower speaking speed will irritate the human ear, but STT loves it.

    Export as MP3 at 192 kbps minimum, or as OGG at a similar bitrate.

    Get an Otter.ai subscription for a month. It will be well worth it.

    The other suggestions above are good, but in 1.5-2 hours the above steps will give you what you need, for under 10 US dollars.
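
The slow-down and export steps can also be scripted. A minimal sketch, assuming ffmpeg is installed and using its `atempo` audio filter and `-b:a` bitrate option; the 0.93 default is just a value picked from the 92 to 95 percent range above, `build_ffmpeg_command` is an illustrative name, and the noise cleanup is still left to Audacity or ocenaudio:

```python
from typing import List


def build_ffmpeg_command(src: str, dst: str,
                         tempo: float = 0.93,
                         bitrate: str = "192k") -> List[str]:
    """Build an ffmpeg command that slows the audio down and sets the export bitrate.

    tempo is the playback speed relative to the original (0.93 = 93 percent);
    atempo changes speed without changing pitch.
    """
    if not 0.5 <= tempo <= 1.0:
        raise ValueError("tempo is expected to be between 0.5 and 1.0 here")
    return [
        "ffmpeg", "-i", src,
        "-filter:a", f"atempo={tempo}",  # slow down the speech
        "-b:a", bitrate,                 # audio bitrate for the export
        dst,
    ]


# Print the command; pass the list to subprocess.run(cmd, check=True) to run it.
cmd = build_ffmpeg_command("lecture_clean.wav", "lecture_slow.mp3")
print(" ".join(cmd))
```

The output format follows the file extension of `dst`, so an `.mp3` or `.ogg` name selects the container.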

  • rpollestad Member
    edited December 2022

    @asterisk14 said:

    I don't have time to spend on this type of "project" at the minute. I need the audio transcribed ASAP, and then I can check and correct any errors.

    It's pretty simple to get up and running if you have a spare VPS with 2+ cores lying around. It's just a bunch of yum commands and a pip install from git, and you're good to go. (It does also run on Windows, and there are tutorials for that which are easy to find.)

    For those interested:

    I tested this on AlmaLinux 8, just because having newer package versions makes it easier to get everything working. It should work on any distro, though, as long as you can install Python 3.8 and ffmpeg.

    update packages, install dev tools and python (required for whisper)

    yum update -y
    yum -y groupinstall "Development Tools"
    yum -y install python38 yum-utils

    add the EPEL and RPM Fusion repos and enable PowerTools (needed to install ffmpeg)

    yum -y install https://download.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm
    yum-config-manager --enable powertools
    yum install --nogpgcheck https://mirrors.rpmfusion.org/free/el/rpmfusion-free-release-8.noarch.rpm -y
    yum install --nogpgcheck https://mirrors.rpmfusion.org/nonfree/el/rpmfusion-nonfree-release-8.noarch.rpm -y

    install ffmpeg

    yum install -y ffmpeg ffmpeg-devel

    install whisper

    pip3.8 --no-cache-dir install git+https://github.com/openai/whisper.git

    And you're all set to start transcribing.

    [~]# whisper --task transcribe --model tiny.en --fp16 False /tmp/011-44-800055xxxx-2-2x.wav
    [00:00.000 --> 00:05.920] In today's interview, you will be asked several simple questions that only require accurate
    [00:05.920 --> 00:12.280] yes or no responses. Before you answer yes or no, please wait for the question to be
    [00:12.280 --> 00:18.760] completely finished. When the question is finished, you'll hear the following tone, which
    [00:18.760 --> 00:24.360] is the signal for you to answer either yes or no.
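
The same run can be driven from Python instead of the CLI. A minimal sketch, assuming the `whisper` package installed above; `load_model`, `transcribe` and the `segments` list with `start`, `end` and `text` keys are part of its Python interface, while the helper names here are illustrative. The output format imitates the CLI lines shown above:

```python
from typing import List


def format_timestamp(seconds: float) -> str:
    """Format seconds as MM:SS.mmm (minutes keep counting past 59)."""
    total_ms = round(seconds * 1000)
    minutes, rem = divmod(total_ms, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{minutes:02d}:{secs:02d}.{ms:03d}"


def format_segment(start: float, end: float, text: str) -> str:
    """One transcript line in the style of the CLI output."""
    return f"[{format_timestamp(start)} --> {format_timestamp(end)}] {text.strip()}"


def transcribe_file(path: str, model_name: str = "tiny.en") -> List[str]:
    """Transcribe one audio file and return timestamped lines."""
    import whisper  # imported here so the helpers above work without the package

    model = whisper.load_model(model_name)
    result = model.transcribe(path, fp16=False)  # fp16=False for CPU-only boxes
    return [format_segment(s["start"], s["end"], s["text"]) for s in result["segments"]]


# Example with made-up segment times, no model needed:
print(format_segment(0.0, 5.92, " In today's interview, you will be asked"))
```

Calling `transcribe_file("/path/to/lecture.wav")` would return the lines for a whole file; writing them to a text file gives a transcript with timestamps for later correction.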

    Thanked by 1TimboJones
  • ninjatk Signature Restricted

    I can only recommend YouTube. Their subtitle generator is a blessing, and it even includes timestamps. It has worked perfectly for me for the past two years.

  • vyas11 Member
    edited December 2022

    @rpollestad said:

    @asterisk14 said:

    I don't have time to spend on this type of "project" at the minute. I need the audio transcribed ASAP, and then I can check and correct any errors.

    It's pretty simple to get up and running if you have a spare VPS with 2+ cores laying around. It's just a bunch of yum commands and a git pull and you're good to go. (It does also run on Windows, and there are tutorials for that which are easily findable.)


    This might be useful. I came across a couple of projects that use Docker, so I stayed away for now (personal preference, let's leave it at that, thanks).
  • trycatchthis Member
    edited December 2022

    temi.com is paid, $0.25/min I think
    https://stacksocial.com/sales/vidtags-deluxe-plan-lifetime-subscription $79 lifetime

    I think there are other transcription services on StackSocial.

    As someone who edits transcriptions at times, I can tell you that you will need to do a lot of work on the transcript.
