AudioTranscriber / README.md

PatienceIzere

metadata

title: Audio Transcriber
sdk: docker
emoji: 💻
colorFrom: green
colorTo: blue
short_description: Monophonic audio transcription

Audio to Sheet Music Transcriber

A web application that converts monophonic audio recordings into sheet music using machine learning. It accepts uploaded audio files (WAV, MP3) or audio recorded live in the browser, and converts the result to MIDI and MusicXML.
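As a rough illustration of what "audio to MIDI" means for a single melodic line, the sketch below maps a detected frequency to the nearest MIDI note number and note name. It is not taken from the app's code: the function names `hz_to_midi` and `midi_to_name` are hypothetical, and the 440 Hz reference for A4 is an assumption (it is the standard tuning, but the app may use a different one).

```python
import math

# Sharps-only spelling; a real score would pick sharps or flats from the key.
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def hz_to_midi(freq_hz: float, a4_hz: float = 440.0) -> int:
    """Round a frequency in Hz to the nearest MIDI note number (A4 = 69)."""
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    # 12 semitones per octave; one octave is a doubling of frequency.
    return int(round(69 + 12 * math.log2(freq_hz / a4_hz)))


def midi_to_name(midi_note: int) -> str:
    """Return scientific pitch notation, e.g. 60 -> 'C4'."""
    octave = midi_note // 12 - 1
    return f"{NOTE_NAMES[midi_note % 12]}{octave}"


if __name__ == "__main__":
    for freq in (261.63, 440.0, 659.25):
        note = hz_to_midi(freq)
        print(f"{freq} Hz -> MIDI {note} ({midi_to_name(note)})")
```

Running the script prints MIDI 60 (C4), 69 (A4), and 76 (E5) for the three example frequencies.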

Features

Upload audio files (WAV, MP3) for transcription
Record audio directly in the browser
Choose between different transcription models
Download MIDI and MusicXML files
View basic audio visualizations

How to Use

1. Input Audio:
   - Upload an audio file using the file uploader
   - OR record audio directly in the browser
2. Transcription Settings:
   - Select your preferred transcription model
   - Adjust audio parameters if needed
3. Process:
   - Click "Transcribe" to start the transcription
   - Wait for the processing to complete
4. Download:
   - Download the generated MIDI file
   - Download the MusicXML file for sheet music

Models

Facebook wav2vec2: Fast and accurate speech recognition
Microsoft SpeechT5: High-quality speech recognition with better intonation

Technical Details

This app uses:

PyTorch and Transformers for audio processing
Librosa for audio feature extraction
PrettyMIDI and Music21 for MIDI and MusicXML generation
Streamlit for the web interface
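Between feature extraction and MIDI generation, a monophonic transcriber typically has to turn a frame-by-frame pitch track into discrete notes. The sketch below shows one simple way to do that in plain Python. It is an assumption about the general approach, not the app's actual code: `NoteEvent`, `frames_to_notes`, `hop_seconds`, and `min_frames` are hypothetical names, and the input is assumed to be one MIDI pitch (or `None` for silence) per analysis frame.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NoteEvent:
    pitch: int    # MIDI note number
    start: float  # onset time in seconds
    end: float    # offset time in seconds


def frames_to_notes(
    frame_pitches: List[Optional[int]],
    hop_seconds: float,
    min_frames: int = 2,
) -> List[NoteEvent]:
    """Merge runs of identical frame pitches into note events.

    Runs shorter than `min_frames` are dropped as likely tracking glitches.
    """
    notes: List[NoteEvent] = []
    current: Optional[int] = None
    start_idx = 0
    # A trailing None acts as a sentinel so the final run is flushed.
    for i, pitch in enumerate(list(frame_pitches) + [None]):
        if pitch != current:
            if current is not None and i - start_idx >= min_frames:
                notes.append(
                    NoteEvent(current, start_idx * hop_seconds, i * hop_seconds)
                )
            current, start_idx = pitch, i
    return notes


if __name__ == "__main__":
    track = [None, 60, 60, 60, None, 62, 62, 64]
    for note in frames_to_notes(track, hop_seconds=0.1):
        print(note)
```

With the example track, the single-frame pitch 64 at the end is discarded and two notes (60 and 62) remain. Each resulting event carries the pitch, start, and end that a MIDI writer such as PrettyMIDI expects for a note.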

Limitations

Works best with clean, monophonic recordings
May have difficulty with fast passages or complex articulations
Performance depends on the quality of the input audio
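The README does not say why fast passages are difficult. One common cause in transcription pipelines generally is rhythmic quantization: note durations are snapped to a grid before being written as notation, and very short notes can round down to nothing. The sketch below illustrates that effect only; `quantize_steps`, the 120 BPM tempo, and the sixteenth-note grid are all assumptions made for the example, not settings documented for this app.

```python
def quantize_steps(
    duration_s: float,
    tempo_bpm: float = 120.0,
    steps_per_beat: int = 4,
) -> int:
    """Snap a duration in seconds to a whole number of grid steps.

    With 4 steps per beat, one step is a sixteenth note.
    A result of 0 means the note is too short to be notated on this grid.
    """
    if duration_s < 0 or tempo_bpm <= 0 or steps_per_beat <= 0:
        raise ValueError("duration must be >= 0; tempo and steps must be > 0")
    step_s = 60.0 / tempo_bpm / steps_per_beat  # seconds per grid step
    return round(duration_s / step_s)


if __name__ == "__main__":
    for duration in (0.05, 0.26, 0.5):
        print(f"{duration} s -> {quantize_steps(duration)} sixteenth-note steps")
```

At 120 BPM a sixteenth note lasts 0.125 s, so the 0.05 s note quantizes to 0 steps and would disappear from the score, while the 0.26 s and 0.5 s notes become 2 and 4 steps.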

License

MIT License - See the LICENSE file for details.