Storage Add
- README.md +100 -39
- app.py +31 -6
- modules/constants.py +40 -0
- modules/storage.md +156 -0
- modules/storage.py +327 -0
README.md
CHANGED
@@ -4,7 +4,7 @@ emoji: 🎼
|
|
4 |
colorFrom: gray
|
5 |
colorTo: red
|
6 |
sdk: gradio
|
7 |
-
sdk_version: 5.34.
|
8 |
python_version: 3.12.8
|
9 |
app_file: app.py
|
10 |
pinned: true
|
@@ -40,6 +40,16 @@ This allows us to follow the same arraingement of the original melody.
|
|
40 |
|
41 |
**Thank you Huggingface for the community grant to run this project**!!
|
42 |
|
43 |
# Audiocraft
|
44 |

|
45 |

|
@@ -67,8 +77,6 @@ We use 20K hours of licensed music to train MusicGen. Specifically, we rely on a
|
|
67 |
|
68 |
## Installation
|
69 |
Audiocraft requires Python 3.9, PyTorch 2.1.0, and a GPU with at least 16 GB of memory (for the medium-sized model). To install Audiocraft, you can run the following:
|
70 |
-
|
71 |
-
```shell
|
72 |
# Best to make sure you have torch installed first, in particular before installing xformers.
|
73 |
# Don't run this if you already have PyTorch installed.
|
74 |
pip install 'torch>=2.1'
|
@@ -76,8 +84,6 @@ pip install 'torch>=2.1'
|
|
76 |
pip install -U audiocraft # stable release
|
77 |
pip install -U git+https://git@github.com/facebookresearch/audiocraft#egg=audiocraft # bleeding edge
|
78 |
pip install -e . # or if you cloned the repo locally
|
79 |
-
```
|
80 |
-
|
81 |
## Usage
|
82 |
We offer a number of ways to interact with MusicGen:
|
83 |
1. A demo is also available on the [`facebook/MusicGen` HuggingFace Space](https://huggingface.co/spaces/Surn/UnlimitedMusicGen) (huge thanks to all the HF team for their support).
|
@@ -88,9 +94,46 @@ We offer a number of way to interact with MusicGen:
|
|
88 |
updated with contributions from @camenduru and the community.
|
89 |
6. Finally, MusicGen is available in 🤗 Transformers from v4.31.0 onwards, see section [🤗 Transformers Usage](#-transformers-usage) below.
|
90 |
|
91 |
-
###
|
92 |
-
|
93 |
|
94 |
|
95 |
Top-k: Top-k is a parameter used in text generation models, including music generation models. It determines the number of most likely next tokens to consider at each step of the generation process. The model ranks all possible tokens based on their predicted probabilities, and then selects the top-k tokens from the ranked list. The model then samples from this reduced set of tokens to determine the next token in the generated sequence. A smaller value of k results in a more focused and deterministic output, while a larger value of k allows for more diversity in the generated music.
|
96 |
|
@@ -102,7 +145,53 @@ Classifier-Free Guidance: Classifier-Free Guidance refers to a technique used in
|
|
102 |
|
103 |
These parameters, such as top-k, top-p, temperature, and classifier-free guidance, provide different ways to influence the output of a music generation model and strike a balance between creativity, diversity, coherence, and control. The specific values for these parameters can be tuned based on the desired outcome and user preferences.
|
104 |
|
105 |
-
## API
|
106 |
|
107 |
We provide a simple API and 10 pre-trained models. The pre-trained models are:
|
108 |
- `small`: 300M model, text to music only - [🤗 Hub](https://huggingface.co/facebook/musicgen-small)
|
@@ -121,14 +210,8 @@ In order to use MusicGen locally **you must have a GPU**. We recommend 16GB of m
|
|
121 |
GPUs will be able to generate short sequences, or longer sequences with the `small` model.
|
122 |
|
123 |
**Note**: Please make sure to have [ffmpeg](https://ffmpeg.org/download.html) installed when using a newer version of `torchaudio`.
|
124 |
-
You can install it with:
|
125 |
-
```
|
126 |
-
apt-get install ffmpeg
|
127 |
-
```
|
128 |
-
|
129 |
A quick example of using the API follows.
|
130 |
-
|
131 |
-
```python
|
132 |
import torchaudio
|
133 |
from audiocraft.models import MusicGen
|
134 |
from audiocraft.data.audio import audio_write
|
@@ -145,22 +228,14 @@ wav = model.generate_with_chroma(descriptions, melody[None].expand(3, -1, -1), s
|
|
145 |
|
146 |
for idx, one_wav in enumerate(wav):
|
147 |
# Will save under {idx}.wav, with loudness normalization at -14 db LUFS.
|
148 |
-
audio_write(f'{idx}', one_wav.cpu(), model.sample_rate, strategy="loudness", loudness_compressor=True)
|
149 |
-
```
|
150 |
-
## 🤗 Transformers Usage
|
151 |
|
152 |
MusicGen is available in the 🤗 Transformers library from version 4.31.0 onwards, requiring minimal dependencies
|
153 |
and additional packages. Steps to get started:
|
154 |
|
155 |
1. First install the 🤗 [Transformers library](https://github.com/huggingface/transformers) from main:
|
156 |
-
|
157 |
-
```
|
158 |
pip install git+https://github.com/huggingface/transformers.git
|
159 |
-
```
|
160 |
-
|
161 |
2. Run the following Python code to generate text-conditional audio samples:
|
162 |
-
|
163 |
-
```py
|
164 |
from transformers import AutoProcessor, MusicgenForConditionalGeneration
|
165 |
|
166 |
|
@@ -174,26 +249,16 @@ inputs = processor(
|
|
174 |
)
|
175 |
|
176 |
audio_values = model.generate(**inputs, max_new_tokens=256)
|
177 |
-
```
|
178 |
-
|
179 |
3. Listen to the audio samples either in an ipynb notebook:
|
180 |
-
|
181 |
-
```py
|
182 |
from IPython.display import Audio
|
183 |
|
184 |
sampling_rate = model.config.audio_encoder.sampling_rate
|
185 |
Audio(audio_values[0].numpy(), rate=sampling_rate)
|
186 |
-
```
|
187 |
-
|
188 |
Or save them as a `.wav` file using a third-party library, e.g. `scipy`:
|
189 |
-
|
190 |
-
```py
|
191 |
import scipy
|
192 |
|
193 |
sampling_rate = model.config.audio_encoder.sampling_rate
|
194 |
scipy.io.wavfile.write("musicgen_out.wav", rate=sampling_rate, data=audio_values[0, 0].numpy())
|
195 |
-
```
|
196 |
-
|
197 |
For more details on using the MusicGen model for inference using the 🤗 Transformers library, refer to the
|
198 |
[MusicGen docs](https://huggingface.co/docs/transformers/main/en/model_doc/musicgen) or the hands-on
|
199 |
[Google Colab](https://colab.research.google.com/github/sanchit-gandhi/notebooks/blob/main/MusicGen.ipynb).
|
@@ -237,16 +302,12 @@ Yes. We will soon release the training code for MusicGen and EnCodec.
|
|
237 |
|
238 |
Check [@camenduru tutorial on Youtube](https://www.youtube.com/watch?v=EGfxuTy9Eeo).
|
239 |
|
240 |
-
## Citation
|
241 |
-
```
|
242 |
-
@article{copet2023simple,
|
243 |
title={Simple and Controllable Music Generation},
|
244 |
author={Jade Copet and Felix Kreuk and Itai Gat and Tal Remez and David Kant and Gabriel Synnaeve and Yossi Adi and Alexandre Défossez},
|
245 |
year={2023},
|
246 |
journal={arXiv preprint arXiv:2306.05284},
|
247 |
}
|
248 |
-
```
|
249 |
-
|
250 |
## License
|
251 |
* The code in this repository is released under the MIT license as found in the [LICENSE file](LICENSE).
|
252 |
* The weights in this repository are released under the CC-BY-NC 4.0 license as found in the [LICENSE_weights file](LICENSE_weights).
|
|
|
4 |
colorFrom: gray
|
5 |
colorTo: red
|
6 |
sdk: gradio
|
7 |
+
sdk_version: 5.34.2
|
8 |
python_version: 3.12.8
|
9 |
app_file: app.py
|
10 |
pinned: true
|
|
|
40 |
|
41 |
**Thank you Huggingface for the community grant to run this project**!!
|
42 |
|
43 |
+
## Key Features
|
44 |
+
|
45 |
+
- **Unlimited Audio Generation**: Generate music of any length by seamlessly stitching together segments
|
46 |
+
- **User History**: Save and manage your generated music and access it later
|
47 |
+
- **File Storage**: Generated files are automatically stored in a Hugging Face repository with shareable URLs
|
48 |
+
- **Rich Metadata**: Each generated file includes detailed metadata about the generation parameters
|
49 |
+
- **API Access**: Generate music programmatically using the REST API
|
50 |
+
- **Background Customization**: Use custom images and settings for your music videos
|
51 |
+
- **Melody Conditioning**: Use existing music to guide the generation process
|
52 |
+
|
53 |
# Audiocraft
|
54 |

|
55 |

|
|
|
77 |
|
78 |
## Installation
|
79 |
Audiocraft requires Python 3.9, PyTorch 2.1.0, and a GPU with at least 16 GB of memory (for the medium-sized model). To install Audiocraft, you can run the following:
|
|
|
|
|
80 |
# Best to make sure you have torch installed first, in particular before installing xformers.
|
81 |
# Don't run this if you already have PyTorch installed.
|
82 |
pip install 'torch>=2.1'
|
|
|
84 |
pip install -U audiocraft # stable release
|
85 |
pip install -U git+https://git@github.com/facebookresearch/audiocraft#egg=audiocraft # bleeding edge
|
86 |
pip install -e . # or if you cloned the repo locally
|
|
|
|
|
87 |
## Usage
|
88 |
We offer a number of ways to interact with MusicGen:
|
89 |
1. A demo is also available on the [`facebook/MusicGen` HuggingFace Space](https://huggingface.co/spaces/Surn/UnlimitedMusicGen) (huge thanks to all the HF team for their support).
|
|
|
94 |
updated with contributions from @camenduru and the community.
|
95 |
6. Finally, MusicGen is available in 🤗 Transformers from v4.31.0 onwards, see section [🤗 Transformers Usage](#-transformers-usage) below.
|
96 |
|
97 |
+
### Advanced Usage
|
98 |
+
|
99 |
+
#### Programmatic Generation via API
|
100 |
+
|
101 |
+
The `predict_simple` API endpoint allows generating music without using the UI:
|
102 |
+
import requests
|
103 |
+
|
104 |
+
# Example API call
|
105 |
+
response = requests.post(
|
106 |
+
"https://huggingface.co/spaces/Surn/UnlimitedMusicGen/api/predict_simple",
|
107 |
+
json={
|
108 |
+
"model": "stereo-medium", # Choose your model
|
109 |
+
"text": "Epic orchestral soundtrack with dramatic strings and percussion",
|
110 |
+
"duration": 60, # Duration in seconds
|
111 |
+
"topk": 250,
|
112 |
+
"topp": 0, # 0 means use topk instead
|
113 |
+
"temperature": 0.8,
|
114 |
+
"cfg_coef": 4.0,
|
115 |
+
"seed": 42, # Use -1 for random seed
|
116 |
+
"overlap": 2, # Seconds of overlap between segments
|
117 |
+
"video_orientation": "Landscape" # or "Portrait"
|
118 |
+
}
|
119 |
+
)
|
120 |
|
121 |
+
# URLs to the generated content
|
122 |
+
video_url, audio_url, seed = response.json()
|
123 |
+
#### Custom Background Images
|
124 |
+
|
125 |
+
You can use your own background images for the music video:
|
126 |
+
|
127 |
+
1. Upload an image through the UI
|
128 |
+
2. Or specify an image URL in the API call:

response = requests.post(
|
129 |
+
"https://huggingface.co/spaces/Surn/UnlimitedMusicGen/api/predict_simple",
|
130 |
+
json={
|
131 |
+
# ... other parameters
|
132 |
+
"background": "https://example.com/your-image.jpg",
|
133 |
+
"video_orientation": "Landscape"
|
134 |
+
}
|
135 |
+
)
|
136 |
+
### More info about Top-k, Top-p, Temperature and Classifier Free Guidance from ChatGPT
|
137 |
|
138 |
Top-k: Top-k is a parameter used in text generation models, including music generation models. It determines the number of most likely next tokens to consider at each step of the generation process. The model ranks all possible tokens based on their predicted probabilities, and then selects the top-k tokens from the ranked list. The model then samples from this reduced set of tokens to determine the next token in the generated sequence. A smaller value of k results in a more focused and deterministic output, while a larger value of k allows for more diversity in the generated music.
|
139 |
|
|
|
145 |
|
146 |
These parameters, such as top-k, top-p, temperature, and classifier-free guidance, provide different ways to influence the output of a music generation model and strike a balance between creativity, diversity, coherence, and control. The specific values for these parameters can be tuned based on the desired outcome and user preferences.
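
As a rough illustration of how these knobs interact, the sketch below applies temperature, top-k, and top-p filtering to a toy list of logits using only the Python standard library. It is not MusicGen's sampling code: the names `filter_logits` and `sample` and the toy values are assumptions made for this example. Following the README comment that `topp: 0` means "use topk instead", top-p takes precedence only when it is greater than zero.

```python
import math
import random


def filter_logits(logits, top_k=0, top_p=0.0, temperature=1.0):
    """Return {token_index: probability} after temperature, then top-p or top-k."""
    # Temperature rescales the logits: values below 1.0 sharpen the distribution.
    scaled = [value / temperature for value in logits]
    # Softmax, subtracting the maximum for numerical stability.
    peak = max(scaled)
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    probs = [value / total for value in exps]

    # Rank token indices from most to least likely.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    if top_p > 0.0:
        # Nucleus (top-p): keep the smallest prefix whose mass reaches top_p.
        kept, mass = [], 0.0
        for index in ranked:
            kept.append(index)
            mass += probs[index]
            if mass >= top_p:
                break
    elif top_k > 0:
        # Top-k: keep only the k most likely tokens.
        kept = ranked[:top_k]
    else:
        kept = ranked

    # Renormalize so the surviving probabilities sum to 1.
    kept_mass = sum(probs[i] for i in kept)
    return {i: probs[i] / kept_mass for i in kept}


def sample(distribution, rng=random):
    """Draw one token index from a {token_index: probability} dict."""
    indices = list(distribution)
    weights = [distribution[i] for i in indices]
    return rng.choices(indices, weights=weights, k=1)[0]
```

For example, `filter_logits([2.0, 1.0, 0.5, -1.0], top_k=2)` keeps only the two most likely tokens, so a smaller `top_k` narrows the choices available to `sample`.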
|
147 |
|
148 |
+
## API and Storage Integration
|
149 |
+
|
150 |
+
UnlimitedMusicGen now offers enhanced API capabilities and file storage integration with Hugging Face repositories:
|
151 |
+
|
152 |
+
### REST API Access
|
153 |
+
|
154 |
+
The application exposes a simple REST API endpoint through Gradio that allows you to generate music programmatically:
|
155 |
+
import requests
|
156 |
+
|
157 |
+
# Basic API call example
|
158 |
+
response = requests.post(
|
159 |
+
"https://your-app-url/api/predict_simple",
|
160 |
+
json={
|
161 |
+
"model": "medium",
|
162 |
+
"text": "4/4 120bpm electronic music with driving bass",
|
163 |
+
"duration": 30,
|
164 |
+
"temperature": 0.7,
|
165 |
+
"cfg_coef": 3.75,
|
166 |
+
"title": "My API Generated Track"
|
167 |
+
}
|
168 |
+
)
|
169 |
+
|
170 |
+
# The response contains URLs to the generated audio/video
|
171 |
+
video_url, audio_url, seed = response.json()
|
172 |
+
print(f"Generated music video: {video_url}")
|
173 |
+
print(f"Generated audio file: {audio_url}")
|
174 |
+
print(f"Seed used: {seed}")
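
Before sending a request, it can help to validate the payload locally. The sketch below is a hypothetical helper (`build_payload` is not part of this repository) that assembles the JSON body from the parameter names used in the examples above. The default `duration=10` mirrors the `predict_simple` signature, `seed=-1` follows the "use -1 for random seed" comment, and the two orientation values come from the README; the validation rules themselves are assumptions.

```python
ALLOWED_ORIENTATIONS = {"Landscape", "Portrait"}


def build_payload(model, text, duration=10, seed=-1,
                  video_orientation="Landscape", **extra):
    """Assemble a request body for the predict_simple endpoint (illustrative)."""
    if not text:
        raise ValueError("text must be a non-empty prompt")
    if duration <= 0:
        raise ValueError("duration must be a positive number of seconds")
    if video_orientation not in ALLOWED_ORIENTATIONS:
        raise ValueError(f"video_orientation must be one of {sorted(ALLOWED_ORIENTATIONS)}")

    payload = {
        "model": model,
        "text": text,
        "duration": duration,
        "seed": seed,
        "video_orientation": video_orientation,
    }
    # Optional fields such as temperature, cfg_coef, title or background.
    payload.update(extra)
    return payload
```

The resulting dictionary can be passed as the `json=` argument of `requests.post` in the call shown above.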
|
175 |
+
### File Storage
|
176 |
+
|
177 |
+
Generated files are automatically uploaded to a Hugging Face dataset repository, providing:
|
178 |
+
|
179 |
+
- Persistent storage of your generated audio and video files
|
180 |
+
- Shareable URLs for easy distribution
|
181 |
+
- Organization by user, timestamp, and metadata
|
182 |
+
- Automatic handling of file paths and naming
|
183 |
+
|
184 |
+
The storage system supports various file types including audio (.wav, .mp3), video (.mp4), and images (.png, .jpg).
|
185 |
+
|
186 |
+
### Background Image Support
|
187 |
+
|
188 |
+
You can now provide custom background images for your music videos:
|
189 |
+
- Upload from your device
|
190 |
+
- Use URL links to images (automatically downloaded and processed)
|
191 |
+
- Choose between landscape and portrait orientations
|
192 |
+
- Add title and generation settings overlay with customizable fonts and colors
|
193 |
+
|
194 |
+
## Python API
|
195 |
|
196 |
We provide a simple API and 10 pre-trained models. The pre-trained models are:
|
197 |
- `small`: 300M model, text to music only - [🤗 Hub](https://huggingface.co/facebook/musicgen-small)
|
|
|
210 |
GPUs will be able to generate short sequences, or longer sequences with the `small` model.
|
211 |
|
212 |
**Note**: Please make sure to have [ffmpeg](https://ffmpeg.org/download.html) installed when using a newer version of `torchaudio`.
|
213 |
+
You can install it with:

apt-get install ffmpeg
|
|
|
|
|
|
|
|
|
214 |
A quick example of using the API follows.
|
|
|
|
|
215 |
import torchaudio
|
216 |
from audiocraft.models import MusicGen
|
217 |
from audiocraft.data.audio import audio_write
|
|
|
228 |
|
229 |
for idx, one_wav in enumerate(wav):
|
230 |
# Will save under {idx}.wav, with loudness normalization at -14 db LUFS.
|
231 |
+
audio_write(f'{idx}', one_wav.cpu(), model.sample_rate, strategy="loudness", loudness_compressor=True)

## 🤗 Transformers Usage
|
|
|
|
|
232 |
|
233 |
MusicGen is available in the 🤗 Transformers library from version 4.31.0 onwards, requiring minimal dependencies
|
234 |
and additional packages. Steps to get started:
|
235 |
|
236 |
1. First install the 🤗 [Transformers library](https://github.com/huggingface/transformers) from main:
|
|
|
|
|
237 |
pip install git+https://github.com/huggingface/transformers.git
|
|
|
|
|
238 |
2. Run the following Python code to generate text-conditional audio samples:
|
|
|
|
|
239 |
from transformers import AutoProcessor, MusicgenForConditionalGeneration
|
240 |
|
241 |
|
|
|
249 |
)
|
250 |
|
251 |
audio_values = model.generate(**inputs, max_new_tokens=256)
|
|
|
|
|
252 |
3. Listen to the audio samples either in an ipynb notebook:
|
|
|
|
|
253 |
from IPython.display import Audio
|
254 |
|
255 |
sampling_rate = model.config.audio_encoder.sampling_rate
|
256 |
Audio(audio_values[0].numpy(), rate=sampling_rate)
|
|
|
|
|
257 |
Or save them as a `.wav` file using a third-party library, e.g. `scipy`:
|
|
|
|
|
258 |
import scipy
|
259 |
|
260 |
sampling_rate = model.config.audio_encoder.sampling_rate
|
261 |
scipy.io.wavfile.write("musicgen_out.wav", rate=sampling_rate, data=audio_values[0, 0].numpy())
|
|
|
|
|
262 |
For more details on using the MusicGen model for inference using the 🤗 Transformers library, refer to the
|
263 |
[MusicGen docs](https://huggingface.co/docs/transformers/main/en/model_doc/musicgen) or the hands-on
|
264 |
[Google Colab](https://colab.research.google.com/github/sanchit-gandhi/notebooks/blob/main/MusicGen.ipynb).
|
|
|
302 |
|
303 |
Check [@camenduru tutorial on Youtube](https://www.youtube.com/watch?v=EGfxuTy9Eeo).
|
304 |
|
305 |
+
## Citation

@article{copet2023simple,
|
|
|
|
|
306 |
title={Simple and Controllable Music Generation},
|
307 |
author={Jade Copet and Felix Kreuk and Itai Gat and Tal Remez and David Kant and Gabriel Synnaeve and Yossi Adi and Alexandre Défossez},
|
308 |
year={2023},
|
309 |
journal={arXiv preprint arXiv:2306.05284},
|
310 |
}
|
|
|
|
|
311 |
## License
|
312 |
* The code in this repository is released under the MIT license as found in the [LICENSE file](LICENSE).
|
313 |
* The weights in this repository are released under the CC-BY-NC 4.0 license as found in the [LICENSE_weights file](LICENSE_weights).
|
app.py
CHANGED
@@ -34,10 +34,12 @@ import modules.user_history
|
|
34 |
from modules.version_info import versions_html, commit_hash, get_xformers_version
|
35 |
from modules.gradio import *
|
36 |
from modules.file_utils import get_file_parts, get_filename_from_filepath, convert_title_to_filename, get_unique_file_path, delete_file, download_and_save_image
|
|
|
|
|
37 |
|
38 |
MODEL = None
|
39 |
MODELS = None
|
40 |
-
IS_SHARED_SPACE = "Surn/UnlimitedMusicGen" in os.environ.get('SPACE_ID', '')
|
41 |
INTERRUPTED = False
|
42 |
UNLOAD_MODEL = False
|
43 |
MOVE_TO_CPU = False
|
@@ -640,12 +642,12 @@ def predict_simple(model: str, text: str, duration: int = 10, dimension: int = 2
|
|
640 |
settings_font_size (int, optional): Font size for settings text.
|
641 |
settings_animate_waveform (bool, optional): Animate waveform in video.
|
642 |
video_orientation (str, optional): Video orientation
|
643 |
-
return_history_json (bool, optional):
|
644 |
|
645 |
Returns:
|
646 |
tp.List[tp.Tuple[str, str, str]]: [waveform_video_path, wave_file_path, seed_used]
|
647 |
"""
|
648 |
-
profile_username_to_send = "
|
649 |
|
650 |
if not profile:
|
651 |
profile = modules.user_history.get_profile
|
@@ -663,12 +665,35 @@ def predict_simple(model: str, text: str, duration: int = 10, dimension: int = 2
|
|
663 |
profile_username_to_send = actual_profile_data
|
664 |
|
665 |
UMG_result = predict(model, text, melody_filepath=None, duration=duration, dimension=dimension, topk=topk, topp=topp, temperature=temperature, cfg_coef=cfg_coef, background=background, title=title, settings_font=settings_font, settings_font_color=settings_font_color, seed=seed, overlap=overlap, prompt_index=prompt_index, include_title=include_title, include_settings=include_settings, harmony_only=False, profile=profile, segment_length=segment_length, settings_font_size=settings_font_size, settings_animate_waveform=settings_animate_waveform, video_orientation=video_orientation, excerpt_duration=3.5, return_history_json=return_history_json)
|
|
|
|
|
|
|
666 |
if return_history_json:
|
667 |
-
#
|
668 |
-
|
669 |
-
|
|
670 |
content = UMG_result["video_path"], UMG_result["audio_path"], UMG_result["metadata"]["Seed"]
|
671 |
UMG_result = content
|
672 |
|
673 |
return UMG_result
|
674 |
|
|
|
34 |
from modules.version_info import versions_html, commit_hash, get_xformers_version
|
35 |
from modules.gradio import *
|
36 |
from modules.file_utils import get_file_parts, get_filename_from_filepath, convert_title_to_filename, get_unique_file_path, delete_file, download_and_save_image
|
37 |
+
from modules.constants import IS_SHARED_SPACE, HF_REPO_ID
|
38 |
+
from modules.storage import upload_files_to_repo
|
39 |
|
40 |
MODEL = None
|
41 |
MODELS = None
|
42 |
+
#IS_SHARED_SPACE = "Surn/UnlimitedMusicGen" in os.environ.get('SPACE_ID', '')
|
43 |
INTERRUPTED = False
|
44 |
UNLOAD_MODEL = False
|
45 |
MOVE_TO_CPU = False
|
|
|
642 |
settings_font_size (int, optional): Font size for settings text.
|
643 |
settings_animate_waveform (bool, optional): Animate waveform in video.
|
644 |
video_orientation (str, optional): Video orientation
|
645 |
+
return_history_json (bool, optional): Return history JSON instead of typical output. Defaults to False.
|
646 |
|
647 |
Returns:
|
648 |
tp.List[tp.Tuple[str, str, str]]: [waveform_video_path, wave_file_path, seed_used]
|
649 |
"""
|
650 |
+
profile_username_to_send = "default_user"
|
651 |
|
652 |
if not profile:
|
653 |
profile = modules.user_history.get_profile
|
|
|
665 |
profile_username_to_send = actual_profile_data
|
666 |
|
667 |
UMG_result = predict(model, text, melody_filepath=None, duration=duration, dimension=dimension, topk=topk, topp=topp, temperature=temperature, cfg_coef=cfg_coef, background=background, title=title, settings_font=settings_font, settings_font_color=settings_font_color, seed=seed, overlap=overlap, prompt_index=prompt_index, include_title=include_title, include_settings=include_settings, harmony_only=False, profile=profile, segment_length=segment_length, settings_font_size=settings_font_size, settings_animate_waveform=settings_animate_waveform, video_orientation=video_orientation, excerpt_duration=3.5, return_history_json=return_history_json)
|
668 |
+
|
669 |
+
# upload to storage and return urls
|
670 |
+
folder_name = f"user_uploads/{profile_username_to_send}"
|
671 |
if return_history_json:
|
672 |
+
# use modules.storage.upload_files_to_repo to get urls for image_path, video_path, audio_path
|
673 |
+
upload_result = upload_files_to_repo(
|
674 |
+
files=[UMG_result["video_path"],UMG_result["audio_path"], UMG_result["image_path"]],
|
675 |
+
repo_id=HF_REPO_ID, # constants.py value of dataset repo
|
676 |
+
folder_name=f"{folder_name}/{UMG_result['metadata']['title']}/{UMG_result['metadata']['Seed']}/{time.strftime('%Y%m%d%H%M%S')}",
|
677 |
+
create_permalink=False,
|
678 |
+
repo_type="dataset"
|
679 |
+
)
|
680 |
+
if upload_result:
|
681 |
+
UMG_result["video_path"] = upload_result[0][1] # Assuming [(response, link) for link in individual_links]
|
682 |
+
UMG_result["audio_path"] = upload_result[1][1]
|
683 |
+
UMG_result["image_path"] = upload_result[2][1]
|
684 |
content = UMG_result["video_path"], UMG_result["audio_path"], UMG_result["metadata"]["Seed"]
|
685 |
UMG_result = content
|
686 |
+
else:
|
687 |
+
# use modules.storage.upload_files_to_repo to get urls for video_path, audio_path
|
688 |
+
upload_result = upload_files_to_repo(
|
689 |
+
files=[UMG_result[0],UMG_result[1]],
|
690 |
+
repo_id=HF_REPO_ID, # constants.py value of dataset repo
|
691 |
+
folder_name=f"{folder_name}/{UMG_result[2]}/{time.strftime('%Y%m%d%H%M%S')}",
|
692 |
+
create_permalink=False,
|
693 |
+
repo_type="dataset"
|
694 |
+
)
|
695 |
+
if upload_result:
|
696 |
+
UMG_result = upload_result[0][1], upload_result[1][1], UMG_result[2]
|
697 |
|
698 |
return UMG_result
|
699 |
|
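
The upload folder in `predict_simple` is built by interpolating the username, title, seed, and a timestamp directly into a path. A minimal sketch of the same layout with each segment sanitized first is shown below. `safe_segment` and `build_upload_folder` are hypothetical names for this illustration and are not defined in `app.py`; the repository's own `convert_title_to_filename` helper may already cover this, but its behavior is not shown in this diff.

```python
import re
import time


def safe_segment(value):
    """Replace anything outside letters, digits, '.', '_' and '-' with underscores."""
    cleaned = re.sub(r"[^A-Za-z0-9._-]+", "_", str(value)).strip("_")
    return cleaned or "untitled"


def build_upload_folder(username, title, seed, when=None):
    """Return user_uploads/<user>/<title>/<seed>/<YYYYmmddHHMMSS>."""
    moment = when if when is not None else time.localtime()
    stamp = time.strftime("%Y%m%d%H%M%S", moment)
    segments = ["user_uploads", safe_segment(username), safe_segment(title),
                safe_segment(seed), stamp]
    return "/".join(segments)
```

Sanitizing each segment keeps spaces and slashes in a user-supplied title from creating unexpected nested folders in the dataset repository.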
modules/constants.py
ADDED
@@ -0,0 +1,40 @@
|
1 |
+
# modules/constants.py
|
2 |
+
# constants.py contains all the constants used in the project
|
3 |
+
import os
|
4 |
+
from pathlib import Path
|
5 |
+
from dotenv import load_dotenv
|
6 |
+
|
7 |
+
# Load environment variables from .env file
|
8 |
+
dotenv_path = Path(__file__).parent.parent / '.env'
|
9 |
+
load_dotenv(dotenv_path)
|
10 |
+
|
11 |
+
IS_SHARED_SPACE = "Surn/UnlimitedMusicGen" in os.environ.get('SPACE_ID', '')
|
12 |
+
|
13 |
+
HF_API_TOKEN = os.getenv("HF_API_TOKEN")
|
14 |
+
if not HF_API_TOKEN:
|
15 |
+
raise ValueError("HF_API_TOKEN is not set. Please check your .env file.")
|
16 |
+
try:
|
17 |
+
if os.environ['TMPDIR']:
|
18 |
+
TMPDIR = os.environ['TMPDIR']
|
19 |
+
else:
|
20 |
+
TMPDIR = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'tmp')
|
21 |
+
except KeyError:
|
22 |
+
TMPDIR = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'tmp')
|
23 |
+
|
24 |
+
os.makedirs(TMPDIR, exist_ok=True)
|
25 |
+
|
26 |
+
model_extensions = {".glb", ".gltf", ".obj", ".ply"}
|
27 |
+
model_extensions_list = list(model_extensions)
|
28 |
+
image_extensions = {".png", ".jpg", ".jpeg", ".webp"}
|
29 |
+
image_extensions_list = list(image_extensions)
|
30 |
+
audio_extensions = {".mp3", ".wav", ".ogg", ".flac", ".aac"}
|
31 |
+
audio_extensions_list = list(audio_extensions)
|
32 |
+
video_extensions = {".mp4"}
|
33 |
+
video_extensions_list = list(video_extensions)
|
34 |
+
upload_file_types = model_extensions_list + image_extensions_list + audio_extensions_list + video_extensions_list
|
35 |
+
|
36 |
+
# Constants for URL shortener
|
37 |
+
HF_REPO_ID = os.getenv("HF_REPO_ID")
|
38 |
+
if not HF_REPO_ID:
|
39 |
+
HF_REPO_ID = "Surn/Storage" # Replace with your Hugging Face repository ID
|
40 |
+
SHORTENER_JSON_FILE = "shortener.json" # The name of your JSON file in the repo
|
modules/storage.md
ADDED
@@ -0,0 +1,156 @@
|
1 |
+
# Storage Module (`modules/storage.py`) Usage Guide
|
2 |
+
|
3 |
+
The `storage.py` module provides helper functions for:
|
4 |
+
- Generating permalinks for 3D viewer projects.
|
5 |
+
- Uploading files in batches to a Hugging Face repository.
|
6 |
+
- Managing URL shortening by storing (short URL, full URL) pairs in a JSON file on the repository.
|
7 |
+
- Retrieving full URLs from short URL IDs and vice versa.
|
8 |
+
- Handling specific file types for 3D models, images, video, and audio.
|
9 |
+
|
10 |
+
## Key Functions
|
11 |
+
|
12 |
+
### 1. `generate_permalink(valid_files, base_url_external, permalink_viewer_url="surn-3d-viewer.hf.space")`
|
13 |
+
- **Purpose:**
|
14 |
+
Given a list of file paths, it looks for exactly one model file (with an extension defined in `model_extensions`) and exactly two image files (extensions defined in `image_extensions`). If the criteria are met, it returns a permalink URL built from the base URL and query parameters.
|
15 |
+
- **Usage Example:**

from modules.storage import generate_permalink
|
16 |
+
|
17 |
+
valid_files = [
|
18 |
+
"models/3d_model.glb",
|
19 |
+
"images/model_texture.png",
|
20 |
+
"images/model_depth.png"
|
21 |
+
]
|
22 |
+
base_url_external = "https://huggingface.co/datasets/Surn/Storage/resolve/main/saved_models/my_model"
|
23 |
+
permalink = generate_permalink(valid_files, base_url_external)
|
24 |
+
if permalink:
|
25 |
+
print("Permalink:", permalink)
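
The selection rule described above (exactly one model file and exactly two image files) can be sketched as follows. This is an illustration under stated assumptions, not the code in `modules/storage.py`: `pick_permalink_files` is a hypothetical name, and the extension sets are copied from `modules/constants.py`.

```python
from pathlib import Path

# Extension sets as defined in modules/constants.py.
MODEL_EXTENSIONS = {".glb", ".gltf", ".obj", ".ply"}
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp"}


def pick_permalink_files(paths):
    """Return (model_path, [image_path, image_path]), or None if the rule fails."""
    models = [p for p in paths if Path(p).suffix.lower() in MODEL_EXTENSIONS]
    images = [p for p in paths if Path(p).suffix.lower() in IMAGE_EXTENSIONS]
    # A permalink needs exactly one model and exactly two images.
    if len(models) != 1 or len(images) != 2:
        return None
    return models[0], images
```

With the three files from the usage example, this returns the `.glb` path and the two `.png` paths; with any other combination it returns `None`, which matches the documented behavior of producing no permalink.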
|
26 |
+
### 2. `generate_permalink_from_urls(model_url, hm_url, img_url, permalink_viewer_url="surn-3d-viewer.hf.space")`
|
27 |
+
- **Purpose:**
|
28 |
+
Constructs a permalink URL by combining individual URLs for a 3D model (`model_url`), height map (`hm_url`), and image (`img_url`) into a single URL with corresponding query parameters.
|
29 |
+
- **Usage Example:**

from modules.storage import generate_permalink_from_urls
|
30 |
+
|
31 |
+
model_url = "https://example.com/model.glb"
|
32 |
+
hm_url = "https://example.com/heightmap.png"
|
33 |
+
img_url = "https://example.com/source.png"
|
34 |
+
|
35 |
+
permalink = generate_permalink_from_urls(model_url, hm_url, img_url)
|
36 |
+
print("Generated Permalink:", permalink)
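
Judging by the example permalink later in this guide, the viewer URL carries the three inputs as the query parameters `3d`, `hm`, and `image`. A minimal sketch of that construction using the standard library is given below; `build_permalink` is a hypothetical stand-in, and the real `generate_permalink_from_urls` may differ in detail.

```python
from urllib.parse import urlencode


def build_permalink(model_url, hm_url, img_url,
                    viewer_host="surn-3d-viewer.hf.space"):
    """Combine three asset URLs into one viewer URL with percent-encoded values."""
    # urlencode percent-encodes ':' and '/' so each URL survives as a single value.
    query = urlencode({"3d": model_url, "hm": hm_url, "image": img_url})
    return f"https://{viewer_host}/?{query}"
```

Percent-encoding matters here because the values are themselves URLs; without it, an `&` or `?` inside an asset URL would be read as part of the viewer's own query string.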
|
37 |
+
### 3. `upload_files_to_repo(files, repo_id, folder_name, create_permalink=False, repo_type="dataset", permalink_viewer_url="surn-3d-viewer.hf.space")`
|
38 |
+
- **Purpose:**
|
39 |
+
Uploads a batch of files (each file represented as a path string) to a specified Hugging Face repository (e.g. `"Surn/Storage"`) under a given folder.
|
40 |
+
The function's return type is `Union[Dict[str, Any], List[Tuple[Any, str]]]`.
|
41 |
+
- When `create_permalink` is `True` and exactly three valid files (one model and two images) are provided, the function returns a dictionary:
```
|
42 |
+
{
|
43 |
+
"response": <upload_folder_response>,
|
44 |
+
"permalink": "<full_permalink_url>",
|
45 |
+
"short_permalink": "<shortened_permalink_url_with_sid>"
|
46 |
+
}
|
47 |
+
```
- Otherwise (if `create_permalink` is `False` or the conditions for permalink creation are not met), it returns a list of tuples, where each tuple is `(upload_folder_response, individual_file_link)`.
|
48 |
+
- If no valid files are provided, it returns an empty list `[]` (this case should ideally also return the dictionary with empty/None values for consistency, but currently returns `[]` as per the code).
|
49 |
+
- **Usage Example:**
|
50 |
+
|
51 |
+
**a. Uploading with permalink creation:**

from modules.storage import upload_files_to_repo
|
52 |
+
|
53 |
+
files_for_permalink = [
|
54 |
+
"local/path/to/model.glb",
|
55 |
+
"local/path/to/heightmap.png",
|
56 |
+
"local/path/to/image.png"
|
57 |
+
]
|
58 |
+
repo_id = "Surn/Storage" # Make sure this is defined, e.g., from constants or environment variables
|
59 |
+
folder_name = "my_new_model_with_permalink"
|
60 |
+
|
61 |
+
upload_result = upload_files_to_repo(
|
62 |
+
files_for_permalink,
|
63 |
+
repo_id,
|
64 |
+
folder_name,
|
65 |
+
create_permalink=True
|
66 |
+
)
|
67 |
+
|
68 |
+
if isinstance(upload_result, dict):
|
69 |
+
print("Upload Response:", upload_result.get("response"))
|
70 |
+
print("Full Permalink:", upload_result.get("permalink"))
|
71 |
+
print("Short Permalink:", upload_result.get("short_permalink"))
|
72 |
+
elif upload_result: # Check if list is not empty
|
73 |
+
print("Upload Response for individual files:")
|
74 |
+
for res, link in upload_result:
|
75 |
+
print(f" Response: {res}, Link: {link}")
|
76 |
+
else:
|
77 |
+
print("No files uploaded or error occurred.")
|
78 |
+
**b. Uploading without permalink creation (or if conditions for permalink are not met):**

from modules.storage import upload_files_to_repo
|
79 |
+
|
80 |
+
files_individual = [
|
81 |
+
"local/path/to/another_model.obj",
|
82 |
+
"local/path/to/texture.jpg"
|
83 |
+
]
|
84 |
+
repo_id = "Surn/Storage"
|
85 |
+
folder_name = "my_other_uploads"
|
86 |
+
|
87 |
+
upload_results_list = upload_files_to_repo(
|
88 |
+
files_individual,
|
89 |
+
repo_id,
|
90 |
+
folder_name,
|
91 |
+
create_permalink=False # Or if create_permalink=True but not 1 model & 2 images
|
92 |
+
)
|
93 |
+
|
94 |
+
if upload_results_list: # Will be a list of tuples
|
95 |
+
print("Upload results for individual files:")
|
96 |
+
for res, link in upload_results_list:
|
97 |
+
print(f" Upload Response: {res}, File Link: {link}")
|
98 |
+
else:
|
99 |
+
print("No files uploaded or error occurred.")
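
Because `upload_files_to_repo` can return either a dictionary or a list of tuples, callers may find it convenient to normalize the result before using it. The helper below is a hypothetical sketch (`extract_links` does not exist in the module) that relies only on the two return shapes documented above.

```python
def extract_links(upload_result):
    """Flatten either documented return shape into a list of URL strings."""
    if isinstance(upload_result, dict):
        # Permalink case: list the short link first, then the full permalink.
        candidates = [upload_result.get("short_permalink"),
                      upload_result.get("permalink")]
        return [link for link in candidates if link]
    # List case: each item is (upload_folder_response, individual_file_link).
    items = upload_result or []
    return [link for _response, link in items]
```

A caller can then check `len(extract_links(result))` before indexing, rather than assuming a fixed number of entries as in `upload_result[0][1]`.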
|
100 |
+
### 4. URL Shortening Functions: `gen_full_url(...)` and Helpers
|
101 |
+
The module also enables URL shortening by managing a JSON file (e.g. `shortener.json`) in a Hugging Face repository. It supports CRUD-like operations:
|
102 |
+
- **Read:** Look up the full URL using a provided short URL ID.
|
103 |
+
- **Create:** Generate a new short URL ID for a full URL if no existing mapping exists.
|
104 |
+
- **Update/Conflict Handling:**
|
105 |
+
If both short URL ID and full URL are provided, it checks consistency and either confirms or reports a conflict.
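
The read, create, and conflict cases above can be illustrated with an in-memory dictionary standing in for the JSON file. This is a sketch, not the module's implementation: `resolve_mapping` is a hypothetical name, the ID generator is an arbitrary choice, and only the statuses `success_retrieved_full`, `success_retrieved_short`, and `created_short` appear in this guide; the other status strings are assumptions.

```python
import secrets


def resolve_mapping(mapping, short_id=None, full_url=None):
    """mapping is a dict of short_id -> full_url; returns (status, value)."""
    if short_id and not full_url:
        # Read: look up the full URL for a known short ID.
        if short_id in mapping:
            return "success_retrieved_full", mapping[short_id]
        return "not_found", None  # assumed status name

    if full_url and not short_id:
        # Reuse an existing ID if this URL was already shortened.
        for known_id, known_url in mapping.items():
            if known_url == full_url:
                return "success_retrieved_short", known_id
        # Create: generate a fresh ID that is not already taken.
        new_id = secrets.token_urlsafe(6)
        while new_id in mapping:
            new_id = secrets.token_urlsafe(6)
        mapping[new_id] = full_url
        return "created_short", new_id

    if short_id and full_url:
        # Both given: confirm the pair or report a conflict.
        if mapping.get(short_id) == full_url:
            return "confirmed", short_id  # assumed status name
        return "conflict", None  # assumed status name

    return "missing_input", None  # assumed status name
```

In the real module the mapping lives in a JSON file in the repository, so a create step would also have to write the updated file back; that part is omitted here.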
|
106 |
+
|
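The read, create, and conflict cases listed above can be sketched against an in-memory list instead of a JSON file in a repository. This is an illustration under stated assumptions, not the module's implementation: `resolve` and `make_short_id` are hypothetical names, nothing is downloaded or uploaded, and the status strings are the ones documented for `gen_full_url`.

```python
import base64
import os
from typing import Dict, List, Optional, Tuple


def make_short_id(length: int = 8) -> str:
    """Random URL-safe ID built from base64-encoded random bytes."""
    return base64.urlsafe_b64encode(os.urandom(length * 2))[:length].decode("utf-8")


def resolve(
    entries: List[Dict[str, str]],
    short_url: Optional[str] = None,
    full_url: Optional[str] = None,
) -> Tuple[str, Optional[str]]:
    """Look up, create, or validate a mapping held in `entries` (mutated in place)."""
    by_short = {e["short_url"]: e["full_url"] for e in entries}
    by_full = {e["full_url"]: e["short_url"] for e in entries}

    if short_url and not full_url:  # Read
        found = by_short.get(short_url)
        return ("success_retrieved_full", found) if found else ("not_found_short", None)

    if full_url and not short_url:  # Create, unless a mapping already exists
        if full_url in by_full:
            return "success_retrieved_short", by_full[full_url]
        new_id = make_short_id()
        entries.append({"short_url": new_id, "full_url": full_url})
        return "created_short", new_id

    if short_url and full_url:  # Consistency check / conflict handling
        if by_short.get(short_url) == full_url:
            return "exists_match", short_url
        if short_url in by_short:
            return "error_conflict_short_exists_different_full", short_url
        if full_url in by_full:
            return "error_conflict_full_exists_different_short", by_full[full_url]
        entries.append({"short_url": short_url, "full_url": full_url})
        return "created_specific_pair", short_url

    return "error_no_input", None
```

Keeping the mapping as a list of `{"short_url", "full_url"}` records matches the layout the module stores in the JSON file; the two lookup dicts are rebuilt per call only to keep the sketch short.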
107 |
+
#### `gen_full_url(short_url=None, full_url=None, repo_id=None, repo_type="dataset", permalink_viewer_url="surn-3d-viewer.hf.space", json_file="shortener.json")`
|
108 |
+
- **Purpose:**
|
109 |
+
Based on which parameter is provided, it retrieves or creates a mapping between a short URL ID and a full URL.
|
110 |
+
- If only `short_url` (the ID) is given, it returns the corresponding `full_url`.
|
111 |
+
- If only `full_url` is given, it looks up an existing `short_url` ID or generates and stores a new one.
|
112 |
+
- If both are given, it validates and returns the mapping or an error status.
|
113 |
+
- **Returns:** A tuple `(status_message, result_url)`, where `status_message` indicates the outcome (e.g., `"success_retrieved_full"`, `"created_short"`) and `result_url` is the relevant URL (full or short ID).
|
114 |
+
- **Usage Examples:**
|
115 |
+
|
116 |
+
**a. Convert a full URL into a short URL ID:**
from modules.storage import gen_full_url
|
117 |
+
from modules.constants import HF_REPO_ID, SHORTENER_JSON_FILE # Assuming these are defined
|
118 |
+
|
119 |
+
full_permalink = "https://surn-3d-viewer.hf.space/?3d=https%3A%2F%2Fexample.com%2Fmodel.glb&hm=https%3A%2F%2Fexample.com%2Fheightmap.png&image=https%3A%2F%2Fexample.com%2Fsource.png"
|
120 |
+
|
121 |
+
status, short_id = gen_full_url(
|
122 |
+
full_url=full_permalink,
|
123 |
+
repo_id=HF_REPO_ID,
|
124 |
+
json_file=SHORTENER_JSON_FILE
|
125 |
+
)
|
126 |
+
print("Status:", status)
|
127 |
+
if status == "created_short" or status == "success_retrieved_short":
|
128 |
+
print("Shortened URL ID:", short_id)
|
129 |
+
# Construct the full short URL for sharing:
|
130 |
+
# permalink_viewer_url = "surn-3d-viewer.hf.space" # Or from constants
|
131 |
+
# shareable_short_url = f"https://{permalink_viewer_url}/?sid={short_id}"
|
132 |
+
# print("Shareable Short URL:", shareable_short_url)
|
133 |
+
**b. Retrieve the full URL from a short URL ID:**
from modules.storage import gen_full_url
|
134 |
+
from modules.constants import HF_REPO_ID, SHORTENER_JSON_FILE # Assuming these are defined
|
135 |
+
|
136 |
+
short_id_to_lookup = "aBcDeFg1" # Example short URL ID
|
137 |
+
|
138 |
+
status, retrieved_full_url = gen_full_url(
|
139 |
+
short_url=short_id_to_lookup,
|
140 |
+
repo_id=HF_REPO_ID,
|
141 |
+
json_file=SHORTENER_JSON_FILE
|
142 |
+
)
|
143 |
+
print("Status:", status)
|
144 |
+
if status == "success_retrieved_full":
|
145 |
+
print("Retrieved Full URL:", retrieved_full_url)
|
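The examples above pass around a full permalink whose query string carries three percent-encoded URLs, and a shareable short link of the form `?sid=<id>`. The sketch below shows one way such links can be composed and read back with the standard library. The function names are hypothetical; the parameter keys (`3d`, `hm`, `image`, `sid`) and the default viewer host are taken from this guide.

```python
import urllib.parse
from typing import Dict


def build_permalink(model_url: str, hm_url: str, img_url: str,
                    viewer_host: str = "surn-3d-viewer.hf.space") -> str:
    """Compose a viewer permalink; urlencode percent-encodes each value."""
    params = {"3d": model_url, "hm": hm_url, "image": img_url}
    return f"https://{viewer_host}/?{urllib.parse.urlencode(params)}"


def build_short_link(short_id: str,
                     viewer_host: str = "surn-3d-viewer.hf.space") -> str:
    """Compose the shareable short link for a short URL ID."""
    return f"https://{viewer_host}/?sid={short_id}"


def parse_permalink(permalink: str) -> Dict[str, str]:
    """Recover the decoded query parameters from a permalink."""
    query = urllib.parse.urlsplit(permalink).query
    return {key: values[0] for key, values in urllib.parse.parse_qs(query).items()}
```

Passing the three URLs as separate arguments keeps the height map and the main image in fixed positions, which matters because the viewer distinguishes `hm` from `image`.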
146 |
+
## Notes
|
147 |
+
- **Authentication:** All functions that interact with Hugging Face Hub use the HF API token defined as `HF_API_TOKEN` in `modules/constants.py`. Ensure this environment variable is correctly set.
|
148 |
+
- **Constants:** Functions like `gen_full_url` and `upload_files_to_repo` (when creating short links) rely on `HF_REPO_ID` and `SHORTENER_JSON_FILE` from `modules/constants.py` for the URL shortening feature.
|
149 |
+
- **File Types:** Only files with extensions included in `upload_file_types` (a combination of `model_extensions` and `image_extensions` from `modules/constants.py`) are processed by `upload_files_to_repo`.
|
150 |
+
- **Repository Configuration:** When using URL shortening and file uploads, ensure that the specified Hugging Face repository (e.g., defined by `HF_REPO_ID`) exists and that you have write permissions.
|
151 |
+
- **Temporary Directory:** `upload_files_to_repo` copies files into a temporary directory before uploading. The directory is created under the path given by the `TMPDIR` environment variable, falling back to `/tmp` when it is not set.
|
152 |
+
- **Error Handling:** Functions include basic error handling (e.g., catching `RepositoryNotFoundError`, `EntryNotFoundError`, JSON decoding errors, or upload issues) and print messages to the console for debugging. Review function return values to handle these cases appropriately in your application.
|
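The file-type note above amounts to a case-insensitive extension check. A minimal sketch follows, assuming placeholder extension sets: the real `model_extensions` and `image_extensions` live in `modules/constants.py` and may differ from the hypothetical values used here.

```python
import os
from typing import Iterable, List

# Hypothetical values for illustration only; see modules/constants.py for the real lists.
MODEL_EXTENSIONS = {".glb", ".obj"}
IMAGE_EXTENSIONS = {".png", ".jpg"}
UPLOAD_FILE_TYPES = MODEL_EXTENSIONS | IMAGE_EXTENSIONS


def filter_uploadable(paths: Iterable[str]) -> List[str]:
    """Keep only paths whose lowercased extension is in UPLOAD_FILE_TYPES."""
    return [
        path for path in paths
        if os.path.splitext(path)[1].lower() in UPLOAD_FILE_TYPES
    ]
```

Files without an extension, or with one outside the allowed set, are dropped silently, so it can be worth comparing the input and output lengths before uploading.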
153 |
+
|
154 |
+
---
|
155 |
+
|
156 |
+
This guide provides the essential usage examples for interacting with the storage and URL-shortening functionality. You can integrate these examples into your application or use them as a reference when extending functionality.
|
modules/storage.py
ADDED
@@ -0,0 +1,327 @@
1 |
+
# modules/storage.py
|
2 |
+
__version__ = "0.1.1" # Added version
|
3 |
+
import os
|
4 |
+
import urllib.parse
|
5 |
+
import tempfile
|
6 |
+
import shutil
|
7 |
+
import json
|
8 |
+
import base64
|
9 |
+
from huggingface_hub import login, upload_folder, hf_hub_download, HfApi
|
10 |
+
from huggingface_hub.utils import RepositoryNotFoundError, EntryNotFoundError
|
11 |
+
from modules.constants import HF_API_TOKEN, upload_file_types, model_extensions, image_extensions, audio_extensions, video_extensions, HF_REPO_ID, SHORTENER_JSON_FILE
|
12 |
+
from typing import Any, Dict, List, Tuple, Union
|
13 |
+
|
14 |
+
# see storage.md for detailed information about the storage module and its functions.
|
15 |
+
|
16 |
+
def generate_permalink(valid_files, base_url_external, permalink_viewer_url="surn-3d-viewer.hf.space"):
|
17 |
+
"""
|
18 |
+
Given a list of valid files, checks if they contain exactly 1 model file and 2 image files.
|
19 |
+
Constructs and returns a permalink URL with query parameters if the criteria are met.
|
20 |
+
Otherwise, returns None.
|
21 |
+
"""
|
22 |
+
model_link = None
|
23 |
+
images_links = []
|
24 |
+
audio_links = []
|
25 |
+
video_links = []
|
26 |
+
for f in valid_files:
|
27 |
+
filename = os.path.basename(f)
|
28 |
+
ext = os.path.splitext(filename)[1].lower()
|
29 |
+
if ext in model_extensions:
|
30 |
+
if model_link is None:
|
31 |
+
model_link = f"{base_url_external}/{filename}"
|
32 |
+
elif ext in image_extensions:
|
33 |
+
images_links.append(f"{base_url_external}/{filename}")
|
34 |
+
elif ext in audio_extensions:
|
35 |
+
audio_links.append(f"{base_url_external}/{filename}")
|
36 |
+
elif ext in video_extensions:
|
37 |
+
video_links.append(f"{base_url_external}/{filename}")
|
38 |
+
if model_link and len(images_links) == 2:
|
39 |
+
# Construct a permalink to the viewer project with query parameters.
|
40 |
+
permalink_viewer_url = f"https://{permalink_viewer_url}/"
|
41 |
+
params = {"3d": model_link, "hm": images_links[0], "image": images_links[1]}
|
42 |
+
query_str = urllib.parse.urlencode(params)
|
43 |
+
return f"{permalink_viewer_url}?{query_str}"
|
44 |
+
return None
|
45 |
+
|
46 |
+
def generate_permalink_from_urls(model_url, hm_url, img_url, permalink_viewer_url="surn-3d-viewer.hf.space"):
|
47 |
+
"""
|
48 |
+
Constructs and returns a permalink URL with query string parameters for the viewer.
|
49 |
+
Each parameter is passed separately so that the image positions remain consistent.
|
50 |
+
|
51 |
+
Parameters:
|
52 |
+
model_url (str): Processed URL for the 3D model.
|
53 |
+
hm_url (str): Processed URL for the height map image.
|
54 |
+
img_url (str): Processed URL for the main image.
|
55 |
+
permalink_viewer_url (str): The base viewer URL.
|
56 |
+
|
57 |
+
Returns:
|
58 |
+
str: The generated permalink URL.
|
59 |
+
"""
|
60 |
+
import urllib.parse
|
61 |
+
params = {"3d": model_url, "hm": hm_url, "image": img_url}
|
62 |
+
query_str = urllib.parse.urlencode(params)
|
63 |
+
return f"https://{permalink_viewer_url}/?{query_str}"
|
64 |
+
|
65 |
+
def upload_files_to_repo(
|
66 |
+
files: List[Any],
|
67 |
+
repo_id: str,
|
68 |
+
folder_name: str,
|
69 |
+
create_permalink: bool = False,
|
70 |
+
repo_type: str = "dataset",
|
71 |
+
permalink_viewer_url: str = "surn-3d-viewer.hf.space"
|
72 |
+
) -> Union[Dict[str, Any], List[Tuple[Any, str]]]:
|
73 |
+
"""
|
74 |
+
Uploads multiple files to a Hugging Face repository using a batch upload approach via upload_folder.
|
75 |
+
|
76 |
+
Parameters:
|
77 |
+
files (list): A list of file paths (str) to upload.
|
78 |
+
repo_id (str): The repository ID on Hugging Face for storage, e.g. "Surn/Storage".
|
79 |
+
folder_name (str): The subfolder within the repository where files will be saved.
|
80 |
+
create_permalink (bool): If True and if exactly three files are uploaded (1 model and 2 images),
|
81 |
+
returns a single permalink to the project with query parameters.
|
82 |
+
If False, returns individual permalinks for each file.
|
83 |
+
repo_type (str): Repository type ("space", "dataset", etc.). Default is "dataset".
|
84 |
+
permalink_viewer_url (str): The base viewer URL.
|
85 |
+
|
86 |
+
Returns:
|
87 |
+
Union[Dict[str, Any], List[Tuple[Any, str]]]:
|
88 |
+
If create_permalink is True (both permalink fields are None when the files do not match the criteria):
|
89 |
+
dict: {
|
90 |
+
"response": <upload response>,
|
91 |
+
"permalink": <full_permalink URL>,
|
92 |
+
"short_permalink": <shortened permalink URL>
|
93 |
+
}
|
94 |
+
Otherwise:
|
95 |
+
list: A list of tuples (response, permalink) for each file.
|
96 |
+
"""
|
97 |
+
# Log in using the HF API token.
|
98 |
+
login(token=HF_API_TOKEN) # Corrected from HF_TOKEN to HF_API_TOKEN
|
99 |
+
|
100 |
+
valid_files = []
|
101 |
+
permalink_short = None
|
102 |
+
|
103 |
+
# Ensure folder_name does not have a trailing slash.
|
104 |
+
folder_name = folder_name.rstrip("/")
|
105 |
+
|
106 |
+
# Filter for valid files based on allowed extensions.
|
107 |
+
for f in files:
|
108 |
+
file_name = f if isinstance(f, str) else f.name if hasattr(f, "name") else None
|
109 |
+
if file_name is None:
|
110 |
+
continue
|
111 |
+
ext = os.path.splitext(file_name)[1].lower()
|
112 |
+
if ext in upload_file_types:
|
113 |
+
valid_files.append(f)
|
114 |
+
|
115 |
+
if not valid_files:
|
116 |
+
# Return a dictionary with None values for permalinks if create_permalink was True
|
117 |
+
if create_permalink:
|
118 |
+
return {
|
119 |
+
"response": "No valid files to upload.",
|
120 |
+
"permalink": None,
|
121 |
+
"short_permalink": None
|
122 |
+
}
|
123 |
+
return []
|
124 |
+
|
125 |
+
# Create a temporary directory; copy valid files directly into it.
|
126 |
+
with tempfile.TemporaryDirectory(dir=os.getenv("TMPDIR", "/tmp")) as temp_dir:
|
127 |
+
for file_path in valid_files:
|
128 |
+
filename = os.path.basename(file_path)
|
129 |
+
dest_path = os.path.join(temp_dir, filename)
|
130 |
+
shutil.copy(file_path, dest_path)
|
131 |
+
|
132 |
+
# Batch upload all files in the temporary folder.
|
133 |
+
# Files will be uploaded under the folder (path_in_repo) given by folder_name.
|
134 |
+
response = upload_folder(
|
135 |
+
folder_path=temp_dir,
|
136 |
+
repo_id=repo_id,
|
137 |
+
repo_type=repo_type,
|
138 |
+
path_in_repo=folder_name,
|
139 |
+
commit_message="Batch upload files"
|
140 |
+
)
|
141 |
+
|
142 |
+
# Construct external URLs for each uploaded file.
|
143 |
+
base_url_external = f"https://huggingface.co/datasets/{repo_id}/resolve/main/{folder_name}"
|
144 |
+
individual_links = []
|
145 |
+
for file_path in valid_files:
|
146 |
+
filename = os.path.basename(file_path)
|
147 |
+
link = f"{base_url_external}/{filename}"
|
148 |
+
individual_links.append(link)
|
149 |
+
|
150 |
+
# If permalink creation is requested and exactly 3 valid files are provided,
|
151 |
+
# try to generate a permalink using generate_permalink().
|
152 |
+
if create_permalink: # No need to check len(valid_files) == 3 here, generate_permalink will handle it
|
153 |
+
permalink = generate_permalink(valid_files, base_url_external, permalink_viewer_url)
|
154 |
+
if permalink:
|
155 |
+
status, short_id = gen_full_url(
|
156 |
+
full_url=permalink,
|
157 |
+
repo_id=HF_REPO_ID, # This comes from constants
|
158 |
+
json_file=SHORTENER_JSON_FILE # This comes from constants
|
159 |
+
)
|
160 |
+
if status in ["created_short", "success_retrieved_short", "exists_match"]:
|
161 |
+
permalink_short = f"https://{permalink_viewer_url}/?sid={short_id}"
|
162 |
+
else: # Shortening failed or conflict not resolved to a usable short_id
|
163 |
+
permalink_short = None
|
164 |
+
print(f"URL shortening status: {status} for {permalink}")
|
165 |
+
|
166 |
+
return {
|
167 |
+
"response": response,
|
168 |
+
"permalink": permalink,
|
169 |
+
"short_permalink": permalink_short
|
170 |
+
}
|
171 |
+
else: # generate_permalink returned None (criteria not met)
|
172 |
+
return {
|
173 |
+
"response": response, # Still return upload response
|
174 |
+
"permalink": None,
|
175 |
+
"short_permalink": None
|
176 |
+
}
|
177 |
+
|
178 |
+
# Otherwise, return individual tuples for each file.
|
179 |
+
return [(response, link) for link in individual_links]
|
180 |
+
|
181 |
+
def _generate_short_id(length=8):
|
182 |
+
"""Generates a random base64 URL-safe string."""
|
183 |
+
return base64.urlsafe_b64encode(os.urandom(length * 2))[:length].decode('utf-8')
|
184 |
+
|
185 |
+
def _get_json_from_repo(repo_id, json_file_name, repo_type="dataset"):
|
186 |
+
"""Downloads and loads the JSON file from the repo. Returns empty list if not found or error."""
|
187 |
+
try:
|
188 |
+
login(token=HF_API_TOKEN)
|
189 |
+
json_path = hf_hub_download(
|
190 |
+
repo_id=repo_id,
|
191 |
+
filename=json_file_name,
|
192 |
+
repo_type=repo_type,
|
193 |
+
token=HF_API_TOKEN # Added token for consistency, though login might suffice
|
194 |
+
)
|
195 |
+
with open(json_path, 'r') as f:
|
196 |
+
data = json.load(f)
|
197 |
+
os.remove(json_path) # Clean up downloaded file
|
198 |
+
return data
|
199 |
+
except RepositoryNotFoundError:
|
200 |
+
print(f"Repository {repo_id} not found.")
|
201 |
+
return []
|
202 |
+
except EntryNotFoundError:
|
203 |
+
print(f"JSON file {json_file_name} not found in {repo_id}. Initializing with empty list.")
|
204 |
+
return []
|
205 |
+
except json.JSONDecodeError:
|
206 |
+
print(f"Error decoding JSON from {json_file_name}. Returning empty list.")
|
207 |
+
return []
|
208 |
+
except Exception as e:
|
209 |
+
print(f"An unexpected error occurred while fetching {json_file_name}: {e}")
|
210 |
+
return []
|
211 |
+
|
212 |
+
def _upload_json_to_repo(data, repo_id, json_file_name, repo_type="dataset"):
|
213 |
+
"""Uploads the JSON data to the specified file in the repo."""
|
214 |
+
try:
|
215 |
+
login(token=HF_API_TOKEN)
|
216 |
+
api = HfApi()
|
217 |
+
# Use a temporary directory specified by TMPDIR or default to system temp
|
218 |
+
temp_dir_for_json = os.getenv("TMPDIR", tempfile.gettempdir())
|
219 |
+
os.makedirs(temp_dir_for_json, exist_ok=True)
|
220 |
+
|
221 |
+
with tempfile.NamedTemporaryFile(mode="w+", delete=False, suffix=".json", dir=temp_dir_for_json) as tmp_file:
|
222 |
+
json.dump(data, tmp_file, indent=2)
|
223 |
+
tmp_file_path = tmp_file.name
|
224 |
+
|
225 |
+
api.upload_file(
|
226 |
+
path_or_fileobj=tmp_file_path,
|
227 |
+
path_in_repo=json_file_name,
|
228 |
+
repo_id=repo_id,
|
229 |
+
repo_type=repo_type,
|
230 |
+
commit_message=f"Update {json_file_name}"
|
231 |
+
)
|
232 |
+
os.remove(tmp_file_path) # Clean up temporary file
|
233 |
+
return True
|
234 |
+
except Exception as e:
|
235 |
+
print(f"Failed to upload {json_file_name} to {repo_id}: {e}")
|
236 |
+
if 'tmp_file_path' in locals() and os.path.exists(tmp_file_path):
|
237 |
+
os.remove(tmp_file_path) # Ensure cleanup on error too
|
238 |
+
return False
|
239 |
+
|
240 |
+
def _find_url_in_json(data, short_url=None, full_url=None):
|
241 |
+
"""
|
242 |
+
Searches the JSON data.
|
243 |
+
If short_url is provided, returns the corresponding full_url or None.
|
244 |
+
If full_url is provided, returns the corresponding short_url or None.
|
245 |
+
"""
|
246 |
+
if not data: # Handles cases where data might be None or empty
|
247 |
+
return None
|
248 |
+
if short_url:
|
249 |
+
for item in data:
|
250 |
+
if item.get("short_url") == short_url:
|
251 |
+
return item.get("full_url")
|
252 |
+
if full_url:
|
253 |
+
for item in data:
|
254 |
+
if item.get("full_url") == full_url:
|
255 |
+
return item.get("short_url")
|
256 |
+
return None
|
257 |
+
|
258 |
+
def _add_url_to_json(data, short_url, full_url):
|
259 |
+
"""Adds a new short_url/full_url pair to the data. Returns updated data."""
|
260 |
+
if data is None:
|
261 |
+
data = []
|
262 |
+
data.append({"short_url": short_url, "full_url": full_url})
|
263 |
+
return data
|
264 |
+
|
265 |
+
def gen_full_url(short_url=None, full_url=None, repo_id=None, repo_type="dataset", permalink_viewer_url="surn-3d-viewer.hf.space", json_file="shortener.json"):
|
266 |
+
"""
|
267 |
+
Manages short URLs and their corresponding full URLs in a JSON file stored in a Hugging Face repository.
|
268 |
+
|
269 |
+
- If short_url is provided, attempts to retrieve and return the full_url.
|
270 |
+
- If full_url is provided, attempts to retrieve an existing short_url or creates a new one, stores it, and returns the short_url.
|
271 |
+
- If both are provided, checks for consistency or creates a new entry.
|
272 |
+
- If neither is provided, or repo_id is missing, returns an error status.
|
273 |
+
|
274 |
+
Returns:
|
275 |
+
tuple: (status_message, result_url)
|
276 |
+
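status_message is one of "success_retrieved_full", "success_retrieved_short", "created_short", "created_specific_pair", "exists_match", "not_found_short", or an "error_..." code.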
|
277 |
+
result_url is the relevant URL (short or full) or None if an error occurs or not found.
|
278 |
+
"""
|
279 |
+
if not repo_id:
|
280 |
+
return "error_repo_id_missing", None
|
281 |
+
if not short_url and not full_url:
|
282 |
+
return "error_no_input", None
|
283 |
+
|
284 |
+
login(token=HF_API_TOKEN) # Ensure login at the beginning
|
285 |
+
url_data = _get_json_from_repo(repo_id, json_file, repo_type)
|
286 |
+
|
287 |
+
# Case 1: Only short_url provided (lookup full_url)
|
288 |
+
if short_url and not full_url:
|
289 |
+
found_full_url = _find_url_in_json(url_data, short_url=short_url)
|
290 |
+
return ("success_retrieved_full", found_full_url) if found_full_url else ("not_found_short", None)
|
291 |
+
|
292 |
+
# Case 2: Only full_url provided (lookup or create short_url)
|
293 |
+
if full_url and not short_url:
|
294 |
+
existing_short_url = _find_url_in_json(url_data, full_url=full_url)
|
295 |
+
if existing_short_url:
|
296 |
+
return "success_retrieved_short", existing_short_url
|
297 |
+
else:
|
298 |
+
# Create new short_url
|
299 |
+
new_short_id = _generate_short_id()
|
300 |
+
url_data = _add_url_to_json(url_data, new_short_id, full_url)
|
301 |
+
if _upload_json_to_repo(url_data, repo_id, json_file, repo_type):
|
302 |
+
return "created_short", new_short_id
|
303 |
+
else:
|
304 |
+
return "error_upload", None
|
305 |
+
|
306 |
+
# Case 3: Both short_url and full_url provided
|
307 |
+
if short_url and full_url:
|
308 |
+
found_full_for_short = _find_url_in_json(url_data, short_url=short_url)
|
309 |
+
found_short_for_full = _find_url_in_json(url_data, full_url=full_url)
|
310 |
+
|
311 |
+
if found_full_for_short == full_url:
|
312 |
+
return "exists_match", short_url
|
313 |
+
if found_full_for_short is not None and found_full_for_short != full_url:
|
314 |
+
return "error_conflict_short_exists_different_full", short_url
|
315 |
+
if found_short_for_full is not None and found_short_for_full != short_url:
|
316 |
+
return "error_conflict_full_exists_different_short", found_short_for_full
|
317 |
+
|
318 |
+
# If short_url is provided and not found, or full_url is provided and not found,
|
319 |
+
# or neither is found, then create a new entry with the provided short_url and full_url.
|
320 |
+
# This effectively allows specifying a custom short_url if it's not already taken.
|
321 |
+
url_data = _add_url_to_json(url_data, short_url, full_url)
|
322 |
+
if _upload_json_to_repo(url_data, repo_id, json_file, repo_type):
|
323 |
+
return "created_specific_pair", short_url
|
324 |
+
else:
|
325 |
+
return "error_upload", None
|
326 |
+
|
327 |
+
return "error_unhandled_case", None # Should not be reached
|
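The helpers `_get_json_from_repo` and `_upload_json_to_repo` above stage the shortener data in a local file around the Hub calls. The local half of that round trip can be sketched without any network access. This is an illustration only: `stage_json` and `load_json_or_empty` are hypothetical names, and the Hub download and upload steps are deliberately left out.

```python
import json
import os
import tempfile
from typing import Any, List, Optional


def stage_json(data: List[Any], directory: Optional[str] = None) -> str:
    """Write data to a named temporary .json file and return its path.

    delete=False keeps the file on disk after the context manager closes it,
    so the caller is responsible for removing it.
    """
    with tempfile.NamedTemporaryFile(
        mode="w", delete=False, suffix=".json", dir=directory
    ) as tmp:
        json.dump(data, tmp, indent=2)
        return tmp.name


def load_json_or_empty(path: str) -> List[Any]:
    """Load JSON from path, falling back to an empty list when missing or invalid."""
    try:
        with open(path, "r") as f:
            return json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        return []
```

Falling back to an empty list mirrors the module's behavior on a missing or unreadable file, with the same caveat: a transient read failure followed by an upload would overwrite the existing mappings, so callers may prefer to treat that case as an error.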