Commit 5730f05
Parent(s): 2476aa6
Fixed Hugging Face Space configuration
README.md
CHANGED
@@ -1,101 +1,10 @@
-
-
-
-
-
-
+---
+title: Sesame AICSM
+emoji: π§
+colorFrom: indigo
+colorTo: blue
+sdk: gradio
+sdk_version: 3.50.2
+app_file: app.py
+pinned: false
 ---
-
-## Overview
-
-This notebook demonstrates how to set up and run **Sesame's CSM-1B** Text-to-Speech model on **Google Colab** using Gradio for a browser-based UI.
-
-- Input: Text
-- Output: Realistic speech audio via pretrained TTS model
-- Model: [`sesame/csm-1b`](https://www.google.com/search?q=site%3Ahuggingface.co+sesame%2Fcsm-1b)
-
----
-
-## Quick Start
-
-### Run in Google Colab
-Click the badge above to launch the notebook directly in Google Colab.
-
-### Steps to Execute
-
-1. **Install Gradio and dependencies**
-2. **Clone the CSM repo** and install additional Python libraries via `requirements.txt`
-3. **Authenticate** with HuggingFace using `notebook_login()`
-4. **Load the model** using the helper from `generator.py`
-5. **Launch Gradio** with either:
-   - Simple `gr.Interface`
-   - Full-featured `gr.Blocks` app
-
----
-
-## Requirements
-
-> All dependencies are pre-installed in the notebook via `pip install`
-
-Main libraries:
-
-- `gradio`
-- `torch`, `torchaudio`
-- `transformers`
-- `huggingface_hub`
-- `moshi`
-- `torchtune`
-- `torchao`
-- `silentcipher` (from GitHub)
-
----
-
-## Model Source
-
-- **Model**: [sesame/csm-1b](https://www.google.com/search?q=site%3Ahuggingface.co+sesame%2Fcsm-1b)
-- **Repository**: https://github.com/SesameAILabs/csm
-- **Audio Generation**: `generator.generate()` from cloned repo
-
----
-
-## UI Modes
-
-### Simple Interface
-
-```python
-gr.Interface(
-    fn=gradio_interface,
-    inputs=[gr.Textbox(...), gr.Slider(...)],
-    outputs=gr.Audio(...),
-    title="Sesame CSM-1B Text-to-Speech"
-).launch(share=True)
-```
-
-### Advanced Blocks UI
-
-- Text Input + File Upload
-- Speaker Selector
-- Audio Controls (play, pause, stop)
-- Volume Slider
-- Event Binding via `.click()`
-
----
-
-## Author
-
-- Malhar Ujawane
-- [Twitter](https://x.com/justmalhar)
-- [GitHub](https://github.com/justmalhar)
-
----
-
-## Notes
-
-- Ensure your HuggingFace account has access to the model before logging in.
-- If you encounter `Model.__init__() missing required argument: 'config'`, verify model loading code inside `generator.py`.
-
----
-
-## License
-
-MIT License (for the notebook). Model license terms apply as per [HuggingFace model card](https://huggingface.co/sesame/csm-1b).
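
The added lines form the YAML front matter that Hugging Face Spaces reads from the top of README.md. As a rough illustration of what this commit changes, the sketch below checks that a README begins with such a block and that the keys added here are present. It is only a sketch under stated assumptions: `parse_front_matter`, `missing_keys`, and `EXPECTED_KEYS` are hypothetical names that come neither from this commit nor from any Hugging Face tooling, the parser handles flat `key: value` lines only (not full YAML), and the emoji in the sample text is a placeholder.

```python
# Hypothetical helper, not part of this commit or of Hugging Face tooling.
# It understands only flat "key: value" lines, not full YAML.

# The keys this commit adds to README.md.
EXPECTED_KEYS = (
    "title", "emoji", "colorFrom", "colorTo",
    "sdk", "sdk_version", "app_file", "pinned",
)


def parse_front_matter(readme_text):
    """Return the key/value pairs of a leading '---' ... '---' block.

    Returns an empty dict if the text does not start with '---' or if
    the block is never closed.
    """
    lines = readme_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields  # closing delimiter reached
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return {}  # no closing delimiter found


def missing_keys(readme_text):
    """List the expected keys that the front matter does not define."""
    fields = parse_front_matter(readme_text)
    return [key for key in EXPECTED_KEYS if key not in fields]


# Sample text mirroring the added lines; the emoji is a placeholder.
NEW_README = """---
title: Sesame AICSM
emoji: 🎧
colorFrom: indigo
colorTo: blue
sdk: gradio
sdk_version: 3.50.2
app_file: app.py
pinned: false
---
"""

print(missing_keys(NEW_README))  # → []
```

A README that opens with headings instead of a `---` block reports every key as missing under this check, which is one plausible way for a Space to end up without a usable configuration.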