metadata
title: Fillersark
emoji: 📉
colorFrom: pink
colorTo: blue
sdk: gradio
sdk_version: 5.28.0
app_file: app.py
pinned: false
license: other
short_description: filler

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

🎙️ CrisperWhisper Speech-to-Text

This Hugging Face Space provides a speech-to-text transcription service powered by the nyrahealth/CrisperWhisper model. Upload audio files and get transcribed text with word-level timestamps.

Features

  • Transcribe audio files to text with word-level timestamps
  • Support for multiple audio formats (MP3, WAV, M4A, OGG, FLAC)
  • Supports files up to 30 MB
  • Simple web interface using Gradio
  • REST API endpoint for programmatic access
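
The format and size limits listed above can be checked locally before uploading. The following is a minimal sketch, not the app's own validation: the function name `check_audio_file` is hypothetical, and reading "30MB" as 30 MiB is an assumption.

```python
from pathlib import Path

# Limits taken from the feature list; how the app itself enforces them is an assumption.
ALLOWED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".ogg", ".flac"}
MAX_FILE_BYTES = 30 * 1024 * 1024  # assumes "30MB" means 30 MiB


def check_audio_file(path):
    """Return a list of problems with the file; an empty list means it looks uploadable."""
    file_path = Path(path)
    if not file_path.is_file():
        return [f"not a file: {file_path}"]
    problems = []
    # Compare extensions case-insensitively so "clip.MP3" is accepted.
    if file_path.suffix.lower() not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format: {file_path.suffix or '(no extension)'}")
    size = file_path.stat().st_size
    if size > MAX_FILE_BYTES:
        problems.append(f"file too large: {size} bytes (limit {MAX_FILE_BYTES})")
    return problems
```

Calling `check_audio_file("/path/to/your-audio-file.mp3")` returns an empty list when the extension and size are within these limits. The check looks only at the file name and size, not at the actual audio encoding.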

How to Use

  1. Upload an audio file using the interface
  2. Click "Transcribe"
  3. View both the plain text transcription and detailed JSON output with timestamps
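
The JSON output from step 3 can be turned into a readable word-by-word listing. This README does not document the exact JSON shape, so the sketch below assumes a layout similar to the Hugging Face `transformers` speech-recognition pipeline (a `chunks` list whose items have `text` and a `[start, end]` `timestamp`). The function name and the sample data are hypothetical.

```python
import json


def format_word_timestamps(result_json):
    """Turn a JSON transcription result into one 'start - end  word' line per word."""
    result = json.loads(result_json)
    lines = []
    for chunk in result.get("chunks", []):
        start, end = chunk["timestamp"]
        # Guard against a missing (null) end time.
        end_text = f"{end:6.2f}" if end is not None else "   ?  "
        lines.append(f"{start:6.2f} - {end_text}  {chunk['text'].strip()}")
    return lines


# Hypothetical sample shaped like the assumed output; not real output from the Space.
sample = json.dumps({
    "text": "hello world",
    "chunks": [
        {"text": "hello", "timestamp": [0.0, 0.42]},
        {"text": "world", "timestamp": [0.5, 0.93]},
    ],
})
for line in format_word_timestamps(sample):
    print(line)
```

If the Space returns different key names, only the lookups inside the loop need to change.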

API Usage

You can also use this Space programmatically via the REST API:

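import requests

# Replace with your Space's URL. The endpoint path and payload format
# depend on the Gradio version the Space runs.
url = "https://your-space-name.hf.space/api/predict"

# Open the file in a context manager so the handle is closed after the upload.
with open('/path/to/your-audio-file.mp3', 'rb') as audio_file:
    files = {'audio_input': audio_file}
    response = requests.post(url, files=files, timeout=300)

response.raise_for_status()  # raise on HTTP errors instead of parsing an error page
print(response.json())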
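
As an alternative to building raw HTTP requests, Gradio Spaces can usually be called through the `gradio_client` package. The sketch below is one possible approach, not this Space's documented API: the function name `transcribe_with_client`, the `space_id` value, and the `/predict` endpoint name are assumptions to verify against the Space's "Use via API" page.

```python
def transcribe_with_client(space_id, audio_path, api_name="/predict"):
    """Call a Gradio Space through gradio_client and return its raw result.

    space_id is a hypothetical "owner/space-name" identifier; api_name is an
    assumption and should be checked against the endpoints the Space exposes.
    """
    # Imported inside the function so this module loads without gradio_client installed.
    from gradio_client import Client, handle_file

    client = Client(space_id)
    # handle_file wraps a local path or URL so the client uploads it as a file input.
    return client.predict(handle_file(audio_path), api_name=api_name)
```

A call would look like `transcribe_with_client("your-username/your-space-name", "/path/to/your-audio-file.mp3")`. Running `Client(space_id).view_api()` prints the endpoints and parameters that the Space actually provides.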

Model Details

This app uses the nyrahealth/CrisperWhisper model, a Whisper variant tuned for verbatim transcription (including filler words) with word-level timestamps.

System Requirements

For optimal performance, this Space should be run with:

  • GPU acceleration
  • At least 8GB RAM

tags:

  • speech-to-text
  • transcription
  • whisper
  • gradio
  • audio-processing