Update README.md
README.md
CHANGED
@@ -1,86 +1,5 @@
 ---
 sdk: gradio
-
-
-
-We propose **SongBloom**, a novel framework for full-length song generation that leverages an interleaved paradigm of autoregressive sketching and diffusion-based refinement. SongBloom employs an autoregressive diffusion model that combines the high fidelity of diffusion models with the scalability of language models.
-Specifically, it gradually extends a musical sketch from short to long and refines the details from coarse to fine-grained. The interleaved generation paradigm effectively integrates prior semantic and acoustic context to guide the generation process.
-Experimental results demonstrate that SongBloom outperforms existing methods across both subjective and objective metrics and achieves performance comparable to the state-of-the-art commercial music generation platforms.
-
-
-
-Demo page: [https://cypress-yang.github.io/SongBloom_demo](https://cypress-yang.github.io/SongBloom_demo)
-
-ArXiv: [https://arxiv.org/abs/2506.07634](https://arxiv.org/abs/2506.07634)
-
-## Prepare Environments
-
-```bash
-conda create -n SongBloom python==3.8.12
-conda activate SongBloom
-
-# yum install libsndfile
-# pip install torch==2.2.0 torchaudio==2.2.0 --index-url https://download.pytorch.org/whl/cu118 # For different CUDA version
-pip install -r requirements.txt
-```
-
-## Data Preparation
-
-A .jsonl file, where each line is a json object:
-
-```json
-{
-"idx": "The index of each sample",
-"lyrics": "The lyrics to be generated",
-"prompt_wav": "The path of the style prompt audio",
-}
-```
-
-One example can be refered to as: [example/test.jsonl](example/test.jsonl)
-
-The prompt wav should be a 10-second, 48kHz audio clip.
-
-The details about lyric format can be found in [docs/lyric_format.md](docs/lyric_format.md).
-
-## Inference
-
-```bash
-source set_env.sh
-
-python3 infer.py --input-jsonl example/test.jsonl
-
-# For GPUs with low VRAM like RTX4090, you should set the dtype as bfloat16
-python3 infer.py --input-jsonl example/test.jsonl --dtype bfloat16
-
-# SongBloom also supports flash-attn (optional). To enable it, please install flash-attn (v2.6.3 is used during training) manually and set os.environ['DISABLE_FLASH_ATTN'] = "0" in infer.py:8
-```
-
-## Models
-
-| Name | Size | Max Length | Prompt type | 🤗 |
-| -------------------- | ---- | ---------- | ----------- | -------------------------------------------- |
-| songbloom_full_150s | 2B | 2m30s | 10s wav | [link](https://huggingface.co/CypressYang/SongBloom) |
-| songbloom_mulan_150s | 2B | 2m30s | 10s wav / text description | coming soon |
-| ... | | | | |
-
-
-
-## TODO List
-
-- [ ] Support Text Description
-- [ ] Full version
-
-## Citation
-
-```
-@article{yang2025songbloom,
-title={SongBloom: Coherent Song Generation via Interleaved Autoregressive Sketching and Diffusion Refinement},
-author={Yang, Chenyu and Wang, Shuai and Chen, Hangting and Tan, Wei and Yu, Jianwei and Li, Haizhou},
-journal={arXiv preprint arXiv:2506.07634},
-year={2025}
-}
-```
-
-## License
-
-SongBloom (codes and weights) is released under the [Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0).
+short_description: Online demo for Apple's DiffuCoder-7B-cpGRPO (Diffusion LLM)
+sdk_version: 5.38.0
+---