3 minutes to read - Mar 30, 2023

OpenAI Whisper tutorial: Updating our Whisper API with GPT-3

Whisper is an automatic State-of-the-Art speech recognition system from OpenAI that has been trained on 680,000 hours of multilingual and multitask supervised data collected from the web. This large and diverse dataset leads to improved robustness to accents, background noise and technical language. In addition, it enables transcription in multiple languages, as well as translation from those languages into English. OpenAI released the models and code to serve as a foundation for building useful applications that leverage speech recognition.
Table of Contents
1. What is GPT-3?
2. OpenAI API key
3. Updates to requirements.txt
4. Creating a file for the gpt3 function
5. Update app.py
6. Update the /whisper route
7. How to run the container?
8. How to test the API?
9. How to deploy the API?

What is GPT-3?

GPT-3 is a language model from OpenAI that can generate text. It is trained on a large dataset of text from the web.

OpenAI API key

If you don't have an account already, go to OpenAI and create one, then create your API key. Never share your API key in a public repository!

Updates to requirements.txt

We are adding the openai package to our requirements file.
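If you followed the original Whisper API tutorial, requirements.txt would now look something like this (the first two entries are assumptions carried over from that setup; openai is the new line):

```
flask
git+https://github.com/openai/whisper.git
openai
```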

Creating a file for the gpt3 function

We will create a new file called gpt3.py and add the following code to it. In my prompt I asked for a summary of the text, but you can ask for anything you want, and you can tweak the parameters as well.
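A minimal sketch of what gpt3.py could look like, assuming the pre-1.0 openai Python package; the function names, prompt wording, model, and parameters below are my own choices, not fixed by the article:

```python
# gpt3.py -- a minimal sketch; prompt wording, model and parameters
# are assumptions and can be tweaked freely.


def build_prompt(transcript: str) -> str:
    """Wrap the transcript in a summarization instruction."""
    return f"Summarize the following text:\n\n{transcript}\n\nSummary:"


def gpt3_summarize(transcript: str) -> str:
    """Send the transcript to the GPT-3 completions endpoint and return the summary."""
    import openai  # deferred so this module imports even before openai is installed

    response = openai.Completion.create(
        engine="text-davinci-003",  # assumed model; any completions model works
        prompt=build_prompt(transcript),
        temperature=0.5,
        max_tokens=200,
    )
    return response["choices"][0]["text"].strip()
```

Keeping the prompt in its own small `build_prompt` helper makes it easy to swap the summary instruction for anything else.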

Update app.py

At the top, we will update our imports. Replace "MY_API_KEY" with the API key you created earlier.
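The top of app.py might then look like the following sketch (the Flask imports and the hypothetical `gpt3_summarize` helper are assumptions carried over from the earlier tutorial; this is a fragment of app.py, not a standalone script):

```python
# Top of app.py -- assumed imports; replace "MY_API_KEY" with your own key.
import tempfile

import openai
from flask import Flask, request, jsonify

from gpt3 import gpt3_summarize  # hypothetical helper defined in gpt3.py

openai.api_key = "MY_API_KEY"  # never commit a real key to a public repository
app = Flask(__name__)
```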

Update the /whisper route

We will integrate our new GPT-3 function into the route: when we get the result back from Whisper, we pass it to the gpt3 function and return both the transcript and the summary.
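The updated route might look like this sketch. It assumes a Whisper model was loaded earlier in app.py (e.g. `model = whisper.load_model("base")`, as in the original tutorial), that `tempfile` is imported, and that `gpt3_summarize` is the hypothetical helper from gpt3.py; the field names in the response are my own choices:

```python
@app.route("/whisper", methods=["POST"])
def whisper_handler():
    if not request.files:
        return jsonify({"error": "no file provided"}), 400

    results = []
    for filename, handle in request.files.items():
        # write the upload to a temporary file so Whisper can read it from disk
        with tempfile.NamedTemporaryFile(suffix=".wav") as temp:
            handle.save(temp.name)
            transcript = model.transcribe(temp.name)["text"]
        # pass the Whisper transcript on to GPT-3 and return both
        results.append({
            "filename": filename,
            "transcript": transcript,
            "summary": gpt3_summarize(transcript),
        })
    return jsonify({"results": results})
```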

How to run the container?

1. Open a terminal and navigate to the folder where you created the files.
2. Run the following command to build the container:

3. When the build has finished, run the following command to start the container:
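The two commands in steps 2 and 3 could look like this (the image name whisper-api is an assumption; use whatever tag you prefer):

```shell
# build the image from the Dockerfile in the current folder
docker build -t whisper-api .

# run it, mapping the container's port 5000 to the host
docker run -p 5000:5000 whisper-api
```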

How to test the API?

1. You can test the API by sending a POST request with an audio file to http://localhost:5000/whisper. The body should be form-data.
2. You can use the following curl command to test the API:

3. As a result, you should get a JSON object with the transcript and the summary in it.
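The curl command mentioned in step 2 could look like this (the file name test.mp3 and the form field name are assumptions; any audio file and field name will do, since the route reads every uploaded file):

```shell
curl -F "file=@test.mp3" http://localhost:5000/whisper
```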

How to deploy the API?

This API can be deployed anywhere Docker runs. Keep in mind that this setup currently uses the CPU to process the audio files. If you want to use a GPU, you need to change the Dockerfile and share the GPU with the container. I won't go deeper into that here, as this is an introduction; see the Docker GPU documentation for details.
