Amazon offers a speech-to-text service called Transcribe, priced by the minute of audio transcribed. At the base rate this comes to roughly $1.24 per hour of audio.

OpenAI released an open-source model called Whisper [1]. Transcribing an hour of audio with the base model takes approximately 3 minutes on an Nvidia T4 GPU. At Replicate’s usage pricing of $0.033 per minute of compute, this works out to about 10 cents per hour of audio.
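
The arithmetic behind that comparison can be sketched in a few lines of Python. This is only a back-of-the-envelope check using the figures quoted above; the function and constant names are made up for illustration, and it ignores cold-start time, minimum billing increments, and any volume discounts.

```python
# Back-of-the-envelope cost comparison using the figures quoted in the text.
# All names here are illustrative; they are not part of any AWS or Replicate API.

TRANSCRIBE_USD_PER_AUDIO_HOUR = 1.24      # approximate base rate quoted above
WHISPER_COMPUTE_MIN_PER_AUDIO_HOUR = 3.0  # base model on an Nvidia T4, approximate
REPLICATE_USD_PER_COMPUTE_MIN = 0.033     # usage price quoted above


def cost_per_audio_hour(compute_minutes: float, usd_per_compute_minute: float) -> float:
    """Cost in USD to transcribe one hour of audio on metered compute."""
    return compute_minutes * usd_per_compute_minute


whisper_usd_per_audio_hour = cost_per_audio_hour(
    WHISPER_COMPUTE_MIN_PER_AUDIO_HOUR, REPLICATE_USD_PER_COMPUTE_MIN
)
# How many times more expensive the managed service is per hour of audio.
price_ratio = TRANSCRIBE_USD_PER_AUDIO_HOUR / whisper_usd_per_audio_hour

print(f"Whisper on Replicate: ${whisper_usd_per_audio_hour:.3f} per hour of audio")
print(f"Transcribe is roughly {price_ratio:.1f}x that price")
```

With these inputs the Whisper figure is just under 10 cents per hour of audio, which puts Transcribe at somewhere between 12 and 13 times the price.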

1. Radford, A. et al. Robust Speech Recognition via Large-Scale Weak Supervision (2022).