Poetry Pronunciation Learning App

The Poetry Pronunciation Learning App is an interactive, AI-powered tool that helps users practice and improve their pronunciation of poems. It combines real-time speech recognition, voice activity detection, and fuzzy word matching to give instant feedback on spoken verses. The app guides learners through two phases, first identifying the poem title and then reciting the poem line by line, while tracking progress and accuracy.
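The two-phase flow described above can be pictured as a small state machine. The following is only a minimal sketch: the names `Phase`, `Session`, `on_match`, and `progress` are hypothetical and chosen for illustration, and the app's own `AppState` enum in poetry/state.py may be organized differently.

```python
from enum import Enum, auto


class Phase(Enum):
    """Hypothetical phases; the real app defines its own AppState enum."""
    TITLE = auto()   # waiting for the learner to say the poem title
    RECITE = auto()  # learner recites the poem line by line
    DONE = auto()    # every line has been matched


class Session:
    """Tracks which phase the learner is in and how many lines are done."""

    def __init__(self, title, lines):
        self.title = title
        self.lines = lines
        self.phase = Phase.TITLE
        self.line_index = 0

    def on_match(self):
        """Advance after the expected title or line was recognized."""
        if self.phase is Phase.TITLE:
            self.phase = Phase.RECITE
        elif self.phase is Phase.RECITE:
            self.line_index += 1
            if self.line_index >= len(self.lines):
                self.phase = Phase.DONE

    def progress(self):
        """Fraction of lines recited so far, between 0.0 and 1.0."""
        return self.line_index / len(self.lines) if self.lines else 1.0


session = Session("Example title", ["first line", "second line"])
session.on_match()  # title recognized, recitation starts
session.on_match()  # first line recognized
print(session.phase.name, session.progress())
```

Keeping the phase and the line counter in one object makes it straightforward to report recitation status after every recognized chunk.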

✨ Features

🎙️ Real-time speech recognition powered by Whisper (via faster-whisper)
📝 Two-phase learning: recognize the poem title → recite the poem
✅ Word-by-word feedback with similarity scoring (Levenshtein distance)
📊 Progress tracking: shows accuracy and recitation status
🔊 Noise handling & VAD (Voice Activity Detection) for reliable recognition
🔄 Multi-word processing: handles short chunks of spoken words naturally
📂 Poem database in JSON, easy to extend with more poems or different languages
🧪 Unit tests for audio, similarity, and transcriber logic
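To illustrate the word-by-word similarity scoring mentioned in the feature list, here is a minimal pure-Python sketch of Levenshtein distance turned into a score between 0 and 1. The function names, the normalization by the longer word, and the 0.8 threshold are assumptions made for this example; the app itself relies on python-Levenshtein and may score and threshold words differently.

```python
def levenshtein(a: str, b: str) -> int:
    """Number of single-character edits needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def similarity(expected: str, spoken: str) -> float:
    """Case-insensitive score: 1.0 for identical words, lower otherwise."""
    expected, spoken = expected.lower(), spoken.lower()
    longest = max(len(expected), len(spoken))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(expected, spoken) / longest


def is_match(expected: str, spoken: str, threshold: float = 0.8) -> bool:
    """Accept the spoken word if it is close enough (assumed threshold)."""
    return similarity(expected, spoken) >= threshold


print(round(similarity("renard", "renart"), 2))
print(is_match("renard", "fromage"))
```

A fuzzy threshold of this kind lets a slightly mis-transcribed word still count as correct, while clearly different words are rejected.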

🛠️ Technologies Used

ASR (Automatic Speech Recognition): faster-whisper with the CTranslate2 backend
Deep learning framework: PyTorch (torch, torchaudio)
Audio processing: sounddevice, soundfile, resampy, ffmpeg-python, and webrtcvad-wheels (voice activity detection)
Text processing: python-Levenshtein (word similarity)
Testing: pytest, pytest-asyncio

To create the virtual environment:

Windows (PowerShell):

python -m venv venv
.\venv\Scripts\Activate.ps1

Linux/macOS (Bash):

python -m venv venv
source venv/bin/activate

To install the requirements:

Note: for GPU inference, make sure your CUDA, torch, and torchaudio versions are compatible (the author used CUDA 11.8), and update requirements.txt with the matching versions.

pip install -r requirements.txt
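After installing, it can help to confirm whether the GPU is actually visible to PyTorch before choosing a value for --device. The helper below is a small sketch; `pick_device` is a hypothetical name and is not part of the app. It falls back to the CPU when torch is missing or no CUDA device is found.

```python
def pick_device() -> str:
    """Return "cuda" if PyTorch can see a CUDA GPU, otherwise "cpu"."""
    try:
        import torch  # imported lazily so the check works without torch too
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


print(pick_device())
```

If this prints cpu on a machine with an NVIDIA GPU, the installed torch build and the CUDA version most likely do not match.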

Project structure:

poetry_app/
│── poetry_main.py               ## Entry point 
│── requirements.txt             ## Dependencies
│── README.md                    ## Project description
│
├── poetry/                      ## Main package
│   ├── config.py                ## Global constants & default settings
│   ├── audio.py                 ## AudioProducer, NoiseEstimator, VAD logic
│   ├── similarity.py            ## Levenshtein, matching logic
│   ├── utils.py                 ## Helper functions
│   ├── models.py                ## PoemData, WordMatch, MultiWordResult classes
│   ├── transcriber.py           ## PoetryTranscriber class (main app logic)
│   └── state.py                 ## AppState enum and related state management
│
├── data/
│   ├── poems_1.json             ## Poems dataset
│   └── ...                      ## Any additional poem files
│
└── tests/                       ## Unit tests
    ├── test_similarity.py
    ├── test_audio.py
    └── test_transcriber.py

To run the tests:

pytest tests/ -v

Example run on CPU:

python poetry_main.py --model small --device cpu --compute int8_float32 --chunk 1 --overlap 0.5 --lang fr --poems data/poems_1.json

Example run on GPU:

python poetry_main.py --model medium --device cuda --chunk 1 --overlap 0.5 --lang fr --poems data/poems_1.json
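Both commands pass --chunk 1 and --overlap 0.5. The sketch below shows one plausible reading of those values, assuming they are a chunk length and an overlap in seconds applied to 16 kHz audio (the sample rate Whisper models expect). This interpretation, the `chunk_bounds` name, and its parameters are assumptions for illustration; the app's real chunking logic may differ.

```python
def chunk_bounds(total_samples, sample_rate=16000, chunk_s=1.0, overlap_s=0.5):
    """Return (start, end) sample indices of overlapping audio windows."""
    if not 0 <= overlap_s < chunk_s:
        raise ValueError("overlap must be non-negative and shorter than the chunk")
    size = int(chunk_s * sample_rate)
    # Each new window starts (chunk - overlap) seconds after the previous one.
    step = max(1, int((chunk_s - overlap_s) * sample_rate))
    bounds = []
    start = 0
    while start < total_samples:
        bounds.append((start, min(start + size, total_samples)))
        if start + size >= total_samples:
            break  # this window already reaches the end of the audio
        start += step
    return bounds


# Two seconds of audio split into 1 s windows that overlap by 0.5 s.
for start, end in chunk_bounds(32000):
    print(start, end)
```

Under this reading, overlapping windows give a word that straddles a chunk boundary a second chance to be recognized in full.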

tahangz/Poetry-Practicing
