Audio Transcription App using OpenAI and FastAPI

Welcome to the Audio Transcription App! This repository contains a Python application that uses the OpenAI API and the FastAPI framework to transcribe audio files. You upload an MP3 audio file, the app sends it to OpenAI's transcription model, and the transcribed text is returned to you.

Technologies Used

This application is built on the following technologies:

Python: A versatile programming language known for its simplicity and readability.
OpenAI API: Utilized to transcribe audio content and convert it into text.
FastAPI: A modern, high-performance web framework for building APIs with Python.
pydub: A library for audio file manipulation, used to handle audio conversion and processing.
tempfile: A built-in Python module used for managing temporary files.

Prerequisites

Before you begin, ensure you have the following:

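Python (>=3.6) installed on your system. Recent releases of FastAPI and the openai package require a newer Python version, so check their documentation if installation fails.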
An OpenAI API key. If you don't have one, you can sign up and obtain an API key from the OpenAI website.
FFmpeg (or libav) installed and available on your PATH. pydub relies on it to decode and encode MP3 files.
Basic familiarity with FastAPI, OpenAI API, audio file handling, and Python programming.

Setup

Clone this repository to your local machine:

git clone https://github.com/your-username/your-repo.git
cd your-repo

Install the required Python packages using pip (FastAPI needs python-multipart to parse uploaded files):
```
pip install fastapi uvicorn openai pydub python-multipart
```
Open the main.py file and replace 'OPENAI_API_KEY' with your actual OpenAI API key.
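
As an alternative to editing the key into main.py, the key can be read from an environment variable so that it never ends up in version control. The following is a minimal sketch, not the code in this repository: the helper name `load_api_key` and the use of an environment variable called OPENAI_API_KEY are assumptions made for illustration.

```python
import os


def load_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the API key stored in the given environment variable.

    Hypothetical helper: main.py in this repository may load the key differently.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        # Fail early with a clear message instead of a failed API call later on.
        raise RuntimeError(
            f"Set the {env_var} environment variable before starting the server."
        )
    return key
```

With a helper like this, main.py would call `load_api_key()` once at startup and pass the result to the OpenAI library.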

Usage

Run the FastAPI server using the following command:
```
uvicorn main:app
```
Once the server is running, you can access the FastAPI documentation at http://127.0.0.1:8000/docs. Here, you can test the /api/transcribe endpoint by uploading an MP3 audio file.
The uploaded audio file will be transcribed using the OpenAI API, and the resulting transcription will be returned as a JSON response.
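
The endpoint can also be called from a script instead of the documentation page. The sketch below uses only the Python standard library to send a multipart upload. The form field name `file` is an assumption (check the parameter name shown at /docs), and the helper names `build_multipart` and `transcribe_file` are made up for this example.

```python
import json
import urllib.request
import uuid
from pathlib import Path


def build_multipart(field_name: str, filename: str, data: bytes,
                    content_type: str = "audio/mpeg"):
    """Encode one file as a multipart/form-data body.

    Returns the body bytes and the matching Content-Type header value.
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; '
        f'filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def transcribe_file(path: str,
                    url: str = "http://127.0.0.1:8000/api/transcribe",
                    field_name: str = "file") -> dict:
    """POST an MP3 file to the running server and return the parsed JSON reply."""
    audio = Path(path)
    body, content_type = build_multipart(field_name, audio.name, audio.read_bytes())
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `transcribe_file("sample.mp3")` (where sample.mp3 stands for any local MP3 file) should return the same JSON that the documentation page shows.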

How It Works

When you upload an MP3 file to the /api/transcribe endpoint, the application reads the binary content of the audio file.
The binary content is converted into an AudioSegment object using the pydub library, which makes the audio data suitable for processing.
The AudioSegment is temporarily saved as an MP3 file using the tempfile.NamedTemporaryFile function.
The temporary MP3 file's content is read and passed to the transcription function, which sends an API call to the OpenAI transcription model via the OpenAI Python library.
The transcription response is extracted from the API response and returned as a JSON response to the user.
The temporary MP3 file is cleaned up by deleting it.
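
The temporary-file part of this flow (save, transcribe, clean up) can be sketched with the standard library alone. This is an illustration under stated assumptions rather than the code in main.py: the function name `transcribe_via_tempfile`, the injected `transcribe` callable (which in the app would wrap the OpenAI call), and the `transcription` key in the response are all hypothetical.

```python
import os
import tempfile
from typing import Callable


def transcribe_via_tempfile(audio_bytes: bytes,
                            transcribe: Callable[[str], str]) -> dict:
    """Write audio to a temporary MP3 file, transcribe it, then delete the file.

    `transcribe` receives the path of the temporary file and returns the text.
    """
    # delete=False lets the file be reopened by path after it is closed,
    # which matters on Windows; the finally block removes it explicitly.
    tmp = tempfile.NamedTemporaryFile(suffix=".mp3", delete=False)
    try:
        tmp.write(audio_bytes)
        tmp.close()
        text = transcribe(tmp.name)
    finally:
        tmp.close()  # harmless if the file is already closed
        os.remove(tmp.name)
    # Assumed response shape; the real endpoint may use a different key.
    return {"transcription": text}
```

Doing the cleanup in a `finally` block means the temporary file is deleted even when the transcription call raises an exception.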
