pallavi1428 committed
Commit 5730f05 · 1 Parent(s): 2476aa6

Fixed Hugging Face Space configuration

Files changed (1)
  1. README.md +9 -100
README.md CHANGED
@@ -1,101 +1,10 @@
1
- # 🧠 Sesame CSM-1B Google Colab Notebook
2
-
3
- [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Justmalhar/csm-google-collab/blob/main/Sesame_AI_CSM_Notebook.ipynb)
4
-
5
- > **Text-to-Speech Demo using Sesame's CSM-1B Model, Gradio UI, and HuggingFace Hub**
6
-
 
 
 
7
  ---
8
-
9
- ## 📌 Overview
10
-
11
- This notebook demonstrates how to set up and run **Sesame's CSM-1B** Text-to-Speech model on **Google Colab** using Gradio for a browser-based UI.
12
-
13
- - 🔊 Input: Text
14
- - 🎙️ Output: Realistic speech audio via pretrained TTS model
15
- - 🤖 Model: [`sesame/csm-1b`](https://www.google.com/search?q=site%3Ahuggingface.co+sesame%2Fcsm-1b)
16
-
17
- ---
18
-
19
- ## 🚀 Quick Start
20
-
21
- ### 🔗 Run in Google Colab
22
- Click the badge above to launch the notebook directly in Google Colab.
23
-
24
- ### 🧩 Steps to Execute
25
-
26
- 1. **Install Gradio and dependencies**
27
- 2. **Clone the CSM repo** and install additional Python libraries via `requirements.txt`
28
- 3. **Authenticate** with HuggingFace using `notebook_login()`
29
- 4. **Load the model** using the helper from `generator.py`
30
- 5. **Launch Gradio** with either:
31
- - ✅ Simple `gr.Interface`
32
- - 💡 Full-featured `gr.Blocks` app
33
-
34
- ---
35
-
36
- ## 🛠️ Requirements
37
-
38
- > All dependencies are pre-installed in the notebook via `pip install`
39
-
40
- Main libraries:
41
-
42
- - `gradio`
43
- - `torch`, `torchaudio`
44
- - `transformers`
45
- - `huggingface_hub`
46
- - `moshi`
47
- - `torchtune`
48
- - `torchao`
49
- - `silentcipher` (from GitHub)
50
-
51
- ---
52
-
53
- ## 🧪 Model Source
54
-
55
- - **Model**: [sesame/csm-1b](https://www.google.com/search?q=site%3Ahuggingface.co+sesame%2Fcsm-1b)
56
- - **Repository**: https://github.com/SesameAILabs/csm
57
- - **Audio Generation**: `generator.generate()` from cloned repo
58
-
59
- ---
60
-
61
- ## 🖼️ UI Modes
62
-
63
- ### Simple Interface
64
-
65
- ```python
66
- gr.Interface(
67
- fn=gradio_interface,
68
- inputs=[gr.Textbox(...), gr.Slider(...)],
69
- outputs=gr.Audio(...),
70
- title="Sesame CSM-1B Text-to-Speech"
71
- ).launch(share=True)
72
- ```
73
-
74
- ### Advanced Blocks UI
75
-
76
- - 🔤 Text Input + File Upload
77
- - 🎚️ Speaker Selector
78
- - 🎛️ Audio Controls (play, pause, stop)
79
- - 🔉 Volume Slider
80
- - 🔁 Event Binding via `.click()`
81
-
82
- ---
83
-
84
- ## 🧑‍💻 Author
85
-
86
- - 👤 Malhar Ujawane
87
- - 🐦 [Twitter](https://x.com/justmalhar)
88
- - 💻 [GitHub](https://github.com/justmalhar)
89
-
90
- ---
91
-
92
- ## ⚠️ Notes
93
-
94
- - Ensure your HuggingFace account has access to the model before logging in.
95
- - If you encounter `Model.__init__() missing required argument: 'config'`, verify model loading code inside `generator.py`.
96
-
97
- ---
98
-
99
- ## 🧬 License
100
-
101
- MIT License (for the notebook). Model license terms apply as per [HuggingFace model card](https://huggingface.co/sesame/csm-1b).
 
1
+ ---
2
+ title: Sesame AICSM
3
+ emoji: 🧠
4
+ colorFrom: indigo
5
+ colorTo: blue
6
+ sdk: gradio
7
+ sdk_version: 3.50.2
8
+ app_file: app.py
9
+ pinned: false
10
  ---
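
The new README consists only of the Space's YAML front matter, so a quick local check of that block can catch a missing key before pushing. The following is a minimal sketch using only the Python standard library. The helper `parse_front_matter` and the set of keys treated as required are assumptions made for illustration; they are not part of any Hugging Face tooling, and the parser handles only flat `key: value` lines like those in this commit.

```python
# Copy of the front matter added in this commit.
README_HEAD = """---
title: Sesame AICSM
emoji: 🧠
colorFrom: indigo
colorTo: blue
sdk: gradio
sdk_version: 3.50.2
app_file: app.py
pinned: false
---
"""


def parse_front_matter(text: str) -> dict[str, str]:
    """Hypothetical helper: return the flat key/value pairs between the '---' markers."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("README does not start with a '---' front matter marker")
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            # Closing marker reached: the block is complete.
            return fields
        if not line.strip():
            continue
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"not a 'key: value' line: {line!r}")
        fields[key.strip()] = value.strip()
    raise ValueError("front matter is missing its closing '---' marker")


config = parse_front_matter(README_HEAD)

# Assumption: these are the keys this check treats as required.
missing = {"title", "sdk", "app_file"} - config.keys()
print("missing keys:", sorted(missing))
print("entry point:", config["app_file"])
```

The check is deliberately strict about the opening and closing `---` markers, since the front matter must be the first thing in the README for the block to be recognized as metadata.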