OpenAI Whisper tutorial: Creating OpenAI Whisper API in a Docker Container


Whisper is a state-of-the-art automatic speech recognition (ASR) system from OpenAI, trained on 680,000 hours of multilingual and multitask supervised data collected from the web. This large and diverse dataset makes it more robust to accents, background noise and technical language. It also enables transcription in multiple languages, as well as translation from those languages into English. OpenAI released the models and code to serve as a foundation for building useful applications that leverage speech recognition.

Table of Contents

1. How to start with Docker

2. So what is happening exactly in the Dockerfile?

3. How to create our route

4. How to run the container?

5. How to test the API?

6. How to deploy the API?

How to start with Docker

1. If you plan to run the container on your local machine, you first need to have Docker installed. Installation instructions are available in the official Docker documentation.

2. Create a folder for the files; let's call it whisper-api.

3. Create a file called requirements.txt and add flask to it.

4. Create a file called Dockerfile

In the Dockerfile we will add the following lines:

So what is happening exactly in the Dockerfile?

1. Choosing a Python 3.10 slim image as our base image.

2. Creating a working directory called python-docker

3. Copying our requirements.txt file to the working directory

4. Updating the apt package manager and installing git

5. Installing the requirements from the requirements.txt file

6. Installing the whisper package from GitHub.

7. Installing ffmpeg, which Whisper uses to decode audio files.

8. Exposing port 5000 and running the Flask server.
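The eight steps above correspond to a Dockerfile along the following lines. This is a minimal sketch rather than a definitive listing: it keeps the Dockerfile text in a Python string and writes it next to requirements.txt, so the whole folder can be recreated in one go. The helper name write_project_files, the `COPY . .` line (needed so that app.py ends up in the image) and the `--host=0.0.0.0` flag (so the server is reachable from outside the container) are assumptions that the step list does not spell out.

```python
from pathlib import Path

# requirements.txt only needs Flask; Whisper is installed from GitHub in the image.
REQUIREMENTS = "flask\n"

# Dockerfile text mirroring the eight steps, in the same order.
DOCKERFILE = """\
FROM python:3.10-slim

WORKDIR /python-docker

COPY requirements.txt requirements.txt
RUN apt-get update && apt-get install -y git
RUN pip3 install -r requirements.txt
RUN pip3 install "git+https://github.com/openai/whisper.git"
RUN apt-get install -y ffmpeg

# Assumption: copy the rest of the folder (app.py) into the image.
COPY . .

EXPOSE 5000
# Assumption: bind to 0.0.0.0 so the published port is reachable from the host.
CMD ["python3", "-m", "flask", "run", "--host=0.0.0.0"]
"""


def write_project_files(folder="whisper-api"):
    """Create the folder and write requirements.txt and Dockerfile into it."""
    target = Path(folder)
    target.mkdir(parents=True, exist_ok=True)
    (target / "requirements.txt").write_text(REQUIREMENTS)
    (target / "Dockerfile").write_text(DOCKERFILE)
    return target
```

Calling write_project_files() from the parent directory creates whisper-api/ with both files; the two strings can just as well be pasted into the files by hand.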

How to create our route

1. Create a file called app.py, where we import all the necessary packages and initialize the Flask app and Whisper.

2. Add the following lines to the file:

3. Now we need to create a route that accepts a POST request with a file in it.

4. Add the following lines to the app.py file:
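A minimal sketch of what app.py could look like, covering both the initialization and the route. Several details here are assumptions and not taken from the article: the "base" model size, the handler and get_model names, the shape of the JSON response, and the choice to load the model lazily on the first request (loading it at import time, as step 1 describes, works equally well). The route path /whisper matches the URL used later for testing.

```python
from functools import lru_cache
from tempfile import NamedTemporaryFile

from flask import Flask, abort, request

app = Flask(__name__)


@lru_cache(maxsize=1)
def get_model():
    # Imported lazily so the module loads without pulling in Whisper.
    # "base" is an assumed model size, not one named in the article.
    import whisper

    return whisper.load_model("base")


@app.route("/whisper", methods=["POST"])
def handler():
    # Reject requests that carry no uploaded file.
    if not request.files:
        abort(400)

    results = []
    for field_name, upload in request.files.items():
        # Whisper reads audio from a path, so spool the upload to disk first.
        with NamedTemporaryFile() as temp:
            upload.save(temp)
            temp.flush()  # make sure all bytes are on disk before ffmpeg reads them
            result = get_model().transcribe(temp.name)
        results.append({"filename": upload.filename, "transcript": result["text"]})

    # Flask serializes a returned dict as JSON.
    return {"results": results}
```

Because the file is named app.py, `flask run` picks it up without any extra configuration.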

How to run the container?

1. Open a terminal and navigate to the folder where you created the files.

2. Run the following command to build the container:

3. Run the following command to run the container:
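The build and run steps can also be scripted. The sketch below wraps the two Docker CLI calls in Python, with the equivalent shell command shown in a comment above each one. The image tag whisper-api and the function names are assumptions chosen for illustration; `docker build -t` and `docker run -p` are standard Docker CLI options.

```python
import subprocess

IMAGE = "whisper-api"  # assumed image tag, named after the project folder


def build_command(image=IMAGE):
    # Shell equivalent: docker build -t whisper-api .
    return ["docker", "build", "-t", image, "."]


def run_command(image=IMAGE, port=5000):
    # Shell equivalent: docker run -p 5000:5000 whisper-api
    # -p publishes the container's port 5000 on the same host port.
    return ["docker", "run", "-p", f"{port}:{port}", image]


def main():
    # Run from inside the folder that holds the Dockerfile.
    subprocess.run(build_command(), check=True)
    subprocess.run(run_command(), check=True)
```

Calling main() builds the image and then starts the container in the foreground; stop it with Ctrl+C.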

How to test the API?

1. You can test the API by sending a POST request with a file in it to the route http://localhost:5000/whisper. The request body should be sent as form-data (multipart/form-data).

2. You can use the following curl command to test the API:

3. As a result, you should get a JSON object with the transcript in it.
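For readers who prefer Python to curl, the request described above can be sketched with the standard library alone. The function names and the form field name "file" are assumptions made for this example, and the exact shape of the returned JSON depends on how the route is written. The comment at the top shows a curl call that sends the same kind of form-data request.

```python
import json
import urllib.request
import uuid
from pathlib import Path

# Shell equivalent: curl -F "file=@/path/to/file" http://localhost:5000/whisper


def build_multipart(field, filename, data):
    """Encode one file as a multipart/form-data body; returns (body, content_type)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def transcribe(path, url="http://localhost:5000/whisper"):
    """POST the audio file at `path` to the API and return the parsed JSON reply."""
    audio = Path(path)
    body, content_type = build_multipart("file", audio.name, audio.read_bytes())
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

With the container running, transcribe("/path/to/file") returns the decoded JSON object from the server.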

How to deploy the API?

This API can be deployed anywhere Docker can be used. Just keep in mind that this setup currently uses the CPU to process the audio files. If you want to use a GPU, you need to change the Dockerfile and share the GPU with the container. I won't go deeper into this, as this is an introduction; for details, see the Docker GPU documentation.
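On the application side, switching to a GPU mostly comes down to telling Whisper which device to load the model on. The sketch below is one way to pick the device at startup, assuming the container has been given GPU access (for example with `docker run --gpus all`, which requires the NVIDIA Container Toolkit on the host). The function name pick_device is hypothetical.

```python
def pick_device():
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch  # installed as a dependency of Whisper
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"
```

The result can then be passed along when loading the model, for example whisper.load_model("base", device=pick_device()), where "base" is again an assumed model size.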