metadata
title: Audio Transcriber
sdk: docker
emoji: 💻
colorFrom: green
colorTo: blue
short_description: Monophonic audio transcription

Audio to Sheet Music Transcriber

A web application that uses machine learning to convert monophonic audio recordings into sheet music. It transcribes uploaded audio files (WAV, MP3) or audio recorded live in the browser, and exports the result as MIDI and MusicXML files.

Features

  • Upload audio files (WAV, MP3) for transcription
  • Record audio directly in the browser
  • Choose between different transcription models
  • Download MIDI and MusicXML files
  • View basic audio visualizations

How to Use

  1. Input Audio:

    • Upload an audio file using the file uploader
    • OR record audio directly in the browser
  2. Transcription Settings:

    • Select your preferred transcription model
    • Adjust audio parameters if needed
  3. Process:

    • Click "Transcribe" to start the transcription
    • Wait for the processing to complete
  4. Download:

    • Download the generated MIDI file
    • Download the MusicXML file for sheet music

Models

  • Facebook wav2vec2: Fast and accurate speech recognition
  • Microsoft SpeechT5: High-quality speech recognition with better intonation

Technical Details

This app uses:

  • PyTorch and Hugging Face Transformers for running the transcription models
  • Librosa for audio feature extraction
  • PrettyMIDI and Music21 for MIDI and MusicXML generation
  • Streamlit for the web interface
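
This README does not describe how these libraries are combined. As a rough illustration only, the sketch below shows one common intermediate step in monophonic transcription: turning a per-frame pitch track (such as the frequency estimates a Librosa pitch tracker could provide) into note events that a MIDI writer could then consume. It uses only the Python standard library. The function names `hz_to_midi` and `frames_to_notes`, the hop size, and the sample values are assumptions made for this example, not part of the app.

```python
import math

def hz_to_midi(freq_hz):
    """Convert a frequency in Hz to the nearest MIDI note number (A4 = 440 Hz = 69)."""
    return int(round(69 + 12 * math.log2(freq_hz / 440.0)))

def frames_to_notes(f0_hz, hop_seconds):
    """Group consecutive frames with the same rounded pitch into (midi, start, end) notes.

    f0_hz holds one frequency estimate per frame. None, NaN, or a
    non-positive value marks an unvoiced (silent) frame.
    """
    notes = []
    current = None   # MIDI pitch of the note being built, or None during silence
    start = 0.0      # start time of the note being built, in seconds
    for i, freq in enumerate(f0_hz):
        unvoiced = freq is None or math.isnan(freq) or freq <= 0
        pitch = None if unvoiced else hz_to_midi(freq)
        if pitch != current:
            # The pitch changed: close the previous note, if there was one.
            if current is not None:
                notes.append((current, start, i * hop_seconds))
            current = pitch
            start = i * hop_seconds
    # Close a note that is still sounding at the end of the track.
    if current is not None:
        notes.append((current, start, len(f0_hz) * hop_seconds))
    return notes

# Hypothetical pitch track: two frames of A4, one silent frame, three frames near C4.
example_track = [440.0, 440.0, None, 261.63, 261.63, 261.63]
example_notes = frames_to_notes(example_track, hop_seconds=0.5)
```

Each resulting tuple holds a MIDI pitch, a start time, and an end time in seconds. Rounding to the nearest semitone is a deliberate simplification; a real pipeline would likely also smooth the pitch track before grouping frames.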

Limitations

  • Works best with clean, monophonic recordings
  • May have difficulty with fast passages or complex articulations
  • Performance depends on the quality of the input audio
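
To make the trade-off behind the fast-passage limitation concrete, here is a minimal sketch of a post-processing step that is common in transcription pipelines: dropping very short note events, which are often pitch-tracking glitches. Whether this app applies such a filter is not stated in this README; the function name `drop_short_notes`, the note format `(midi_pitch, start, end)`, and the threshold are assumptions for illustration.

```python
def drop_short_notes(notes, min_duration):
    """Keep only (midi_pitch, start, end) notes lasting at least min_duration seconds."""
    return [(p, s, e) for (p, s, e) in notes if (e - s) >= min_duration]

# Hypothetical note list: the middle note lasts only 1/32 of a second.
notes = [(60, 0.0, 0.5), (62, 0.5, 0.53125), (64, 0.53125, 1.0)]
cleaned = drop_short_notes(notes, min_duration=0.0625)
# cleaned no longer contains the 62 note
```

A filter like this removes spurious blips from noisy input, but it cannot tell a glitch from a genuinely short note, so rapid runs and ornaments can be lost along with the noise.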

License

MIT License - See the LICENSE file for details.