Transcribe Audio with Azure OpenAI Whisper
Transcribe audio files to text with Azure OpenAI's Whisper model. This example covers setup, authentication, and usage.
Why it matters
Integrate Azure OpenAI's Whisper model into your applications to transcribe audio files to text automatically.
Outcomes
What it gets done
Configure the OpenAI SDK for Azure
Authenticate with Azure OpenAI Service using API keys or Azure AD
Transcribe audio files using the `openai.Audio.transcribe` method
Extract text from audio streams
Install
Add it to your toolbox
Run in your project directory:
curl -fsSL https://spark.entire.vc/get/oai-whisper | bash
Steps
Steps in the chain
Install the necessary dependencies for the Azure OpenAI Whisper example.
Import libraries and configure the Python OpenAI SDK to work with the Azure OpenAI service. Set the following variables: OPENAI_API_BASE, OPENAI_API_KEY, OPENAI_API_TYPE, OPENAI_API_VERSION. For development, consider setting these as environment variables instead of in code.
Create the required resources in the Azure Portal to access the Azure OpenAI Service. Refer to Microsoft Docs for a detailed guide on how to create resources.
Get the endpoint from the 'Keys and Endpoints' section under 'Resource Management' in the Azure Portal. Set up the SDK using this endpoint information.
Set up the OpenAI SDK to use an Azure API Key by setting api_type to 'azure' and api_key to a key associated with your endpoint. Find the key in 'Keys and Endpoints' under 'Resource Management' in the Azure Portal.
Get an access token via Azure Active Directory authentication and use it in place of the API key. Refresh an expiring token by hooking into requests.auth to ensure a valid token is sent with every request.
Use the openai.Audio.transcribe method to transcribe an audio file stream to text. Get sample audio files from the Azure AI Speech SDK repository at GitHub.
Overview
Azure audio whisper (preview) example
What it does
This example demonstrates how to use the Azure OpenAI Whisper model to transcribe audio files. It shows how to convert spoken words into text.
How it connects
Use this example when you need to transcribe audio files using the Azure OpenAI Whisper model.
Source README
Azure audio whisper (preview) example
Note: There is a newer version of the openai library available. See https://github.com/openai/openai-python/discussions/742
The example shows how to use the Azure OpenAI Whisper model to transcribe audio files.
Setup
First, we install the necessary dependencies.
Next, we'll import our libraries and configure the Python OpenAI SDK to work with the Azure OpenAI service.
Note: In this example, we configured the library to use the Azure API by setting the variables in code. For development, consider setting the environment variables instead:
OPENAI_API_BASE
OPENAI_API_KEY
OPENAI_API_TYPE
OPENAI_API_VERSION
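As a minimal sketch of the environment-variable approach, the helper below reads the four variables listed above and fails early if any of them is unset. The name `load_azure_openai_settings` is hypothetical and is not part of the OpenAI SDK.

```python
import os

# The four variables named in this README.
REQUIRED_VARS = (
    "OPENAI_API_BASE",
    "OPENAI_API_KEY",
    "OPENAI_API_TYPE",
    "OPENAI_API_VERSION",
)


def load_azure_openai_settings(env=None):
    """Return the required settings as a dict, or raise if any are unset.

    Hypothetical helper for illustration; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Calling `load_azure_openai_settings()` at startup surfaces a missing variable immediately, rather than as an authentication error on the first request.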
To access the Azure OpenAI Service, we first need to create the required resources in the Azure Portal (the Microsoft Docs include a detailed guide on how to do this).
Once the resource is created, the first thing we need to use is its endpoint. You can get the endpoint by looking at the "Keys and Endpoints" section under the "Resource Management" section. Having this, we will set up the SDK using this information:
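The endpoint setup could look like the following sketch. It assumes the pre-1.0 `openai` package, whose module-level `api_base` and `api_version` attributes hold these settings. The helper name `configure_endpoint` is hypothetical, and the SDK module is passed in as an argument so the sketch can be tried without the package installed.

```python
def configure_endpoint(sdk, endpoint, api_version):
    """Point the SDK at an Azure OpenAI resource.

    `sdk` is expected to be the pre-1.0 `openai` module (an assumption);
    any object with settable attributes works for a dry run.
    """
    if not endpoint.startswith("https://"):
        raise ValueError("Expected an https:// endpoint from the Azure Portal")
    sdk.api_base = endpoint
    sdk.api_version = api_version
    return sdk
```

In a real project this would be called as `configure_endpoint(openai, endpoint, api_version)` after `import openai`, with the endpoint copied from the Azure Portal and an API version that supports Whisper for your resource.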
Authentication
The Azure OpenAI service supports multiple authentication mechanisms that include API keys and Azure credentials.
Authentication using API key
To set up the OpenAI SDK to use an Azure API key, set api_type to azure and api_key to a key associated with your endpoint (you can find this key in "Keys and Endpoints" under "Resource Management" in the Azure Portal).
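A sketch of API-key authentication under the same assumption (the pre-1.0 `openai` module, passed in as `sdk`). The helper name `use_api_key` is hypothetical. It falls back to the OPENAI_API_KEY environment variable so the key does not have to appear in source code.

```python
import os


def use_api_key(sdk, api_key=None):
    """Configure API-key authentication on the pre-1.0 `openai` module.

    Falls back to the OPENAI_API_KEY environment variable when no key is
    passed explicitly.
    """
    key = api_key or os.environ.get("OPENAI_API_KEY")
    if not key:
        raise ValueError("No API key provided and OPENAI_API_KEY is unset")
    sdk.api_type = "azure"
    sdk.api_key = key
    return sdk
```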
Authentication using Azure Active Directory
Let's now see how to get an access token via Azure Active Directory authentication and use it in place of the API key.
A token is valid for a period of time, after which it will expire. To ensure a valid token is sent with every request, you can refresh an expiring token by hooking into requests.auth:
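The refresh logic can be sketched as a callable that rewrites the Authorization header on each request. This is an illustration under stated assumptions, not the notebook's exact code: the credential is assumed to expose `get_token(*scopes)` and return an object with `token` and `expires_on` (epoch seconds), as azure-identity credentials do, and the 300-second margin is an arbitrary choice. In real use the class would normally subclass `requests.auth.AuthBase`; the base class is left out here only so the sketch runs without third-party packages.

```python
import time


class TokenRefresh:
    """Callable auth hook that refreshes a bearer token shortly before expiry."""

    def __init__(self, credential, scopes, refresh_margin=300):
        self.credential = credential
        self.scopes = scopes
        self.refresh_margin = refresh_margin  # seconds before expiry
        self.cached_token = None

    def __call__(self, req):
        # Fetch a new token if none is cached or the cached one is near expiry.
        if (
            self.cached_token is None
            or self.cached_token.expires_on - time.time() < self.refresh_margin
        ):
            self.cached_token = self.credential.get_token(*self.scopes)
        req.headers["Authorization"] = f"Bearer {self.cached_token.token}"
        return req
```

To attach it, one would assign an instance to the `auth` attribute of a `requests.Session`, set `openai.api_type` to `azure_ad`, and hand the session to the pre-1.0 SDK through `openai.requestssession`. The scope passed to the credential is typically `https://cognitiveservices.azure.com/.default`.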
Audio transcription
Audio transcription, or speech-to-text, is the process of converting spoken words into text. Use the openai.Audio.transcribe method to transcribe an audio file stream to text.
You can get sample audio files from the Azure AI Speech SDK repository at GitHub.
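A transcription call might be wrapped as in the sketch below. It assumes the pre-1.0 `openai` module has already been configured and is passed in as `sdk`; the helper name `transcribe_file` is hypothetical, and `deployment_id` stands for the name you gave your Whisper deployment in Azure.

```python
def transcribe_file(sdk, audio_path, deployment_id, model="whisper-1"):
    """Transcribe one audio file and return the recognized text.

    `sdk` is expected to be the configured pre-1.0 `openai` module.
    """
    # Open in binary mode: the method takes a file stream, not a path.
    with open(audio_path, "rb") as audio_file:
        result = sdk.Audio.transcribe(
            file=audio_file,
            model=model,
            deployment_id=deployment_id,
        )
    return result.text
```

For example, `print(transcribe_file(openai, "sample.wav", "my-whisper-deployment"))` would print the transcript, where both the file name and the deployment name are placeholders.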