metadata

title: Fillersark
emoji: 📉
colorFrom: pink
colorTo: blue
sdk: gradio
sdk_version: 5.28.0
app_file: app.py
pinned: false
license: other
short_description: filler

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

🎙️ CrisperWhisper Speech-to-Text

This Hugging Face Space provides a speech-to-text transcription service powered by the nyrahealth/CrisperWhisper model. Upload audio files and get transcribed text with word-level timestamps.

Features

Transcribe audio files to text with word-level timestamps
Support for multiple audio formats (MP3, WAV, M4A, OGG, FLAC)
Accepts files up to 30 MB
Simple web interface using Gradio
REST API endpoint for programmatic access
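
The format and size limits above can be checked locally before uploading. The following is a minimal sketch, not the Space's own validation code: the function name `check_audio_file` and the constants are hypothetical, and it assumes "30 MB" means 30 × 1024 × 1024 bytes.

```python
from pathlib import Path

# Limits taken from the feature list above; the names are illustrative only.
ALLOWED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".ogg", ".flac"}
MAX_SIZE_BYTES = 30 * 1024 * 1024  # assumption: "30 MB" is read as 30 MiB


def check_audio_file(path, size_bytes=None):
    """Return a list of problems; an empty list means the file looks acceptable."""
    p = Path(path)
    problems = []
    # Compare extensions case-insensitively so "clip.WAV" is accepted.
    if p.suffix.lower() not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported extension: {p.suffix or '(none)'}")
    # Use the caller-supplied size when given, otherwise read it from disk.
    size = p.stat().st_size if size_bytes is None else size_bytes
    if size > MAX_SIZE_BYTES:
        problems.append(f"file is {size} bytes, limit is {MAX_SIZE_BYTES}")
    return problems
```

Calling `check_audio_file("/path/to/your-audio-file.mp3")` reads the size from disk. Passing `size_bytes` explicitly is useful when the size is already known, for example from an upload form.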

How to Use

Upload an audio file using the interface
Click "Transcribe"
View both the plain text transcription and detailed JSON output with timestamps
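
The JSON output can be turned into a readable word-by-word listing. This sketch assumes the JSON follows the shape produced by the Transformers speech recognition pipeline with word timestamps (a `text` field plus a `chunks` list of `{"text", "timestamp"}` entries). The Space's actual output may differ, and `format_word_timestamps` is a hypothetical helper.

```python
def format_word_timestamps(result):
    """Turn an ASR result dict into one '[start - end] word' line per word."""
    lines = []
    for chunk in result.get("chunks", []):
        # Works for both tuples and the two-element lists that JSON decoding yields.
        start, end = chunk["timestamp"]
        # The end time may be missing (None) for the last word in some outputs.
        end_text = f"{end:.2f}s" if end is not None else "?"
        lines.append(f"[{start:.2f}s - {end_text}] {chunk['text'].strip()}")
    return lines


# Hand-written sample in the assumed shape, not real model output.
sample = {
    "text": " Hello world",
    "chunks": [
        {"text": " Hello", "timestamp": (0.0, 0.42)},
        {"text": " world", "timestamp": (0.5, None)},
    ],
}

for line in format_word_timestamps(sample):
    print(line)
```

If the Space returns a different structure, only the key names inside the loop need to change.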

API Usage

You can also use this Space programmatically via the REST API:

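import requests

# Replace your-space-name with the subdomain of your own Space.
url = "https://your-space-name.hf.space/api/predict"

# Open the file in a context manager so the handle is closed after the upload.
with open('/path/to/your-audio-file.mp3', 'rb') as audio_file:
    # The 300-second timeout is an arbitrary choice; long recordings may need more.
    response = requests.post(url, files={'audio_input': audio_file}, timeout=300)

# Fail loudly on HTTP errors instead of trying to parse an error page as JSON.
response.raise_for_status()
print(response.json())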

Model Details

This app uses the nyrahealth/CrisperWhisper model, a Whisper variant tuned for verbatim transcription (including fillers such as "um" and "uh") with accurate word-level timestamps.

System Requirements

For best performance, run this Space with:

GPU acceleration
At least 8GB RAM

tags:

speech-to-text
transcription
whisper
gradio
audio-processing