{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# AudioEditingCode Colab Demo\n",
"\n",
"This notebook demonstrates how to use the `AudioEditingCode` repository in Google Colab.\n",
"\n",
"## 1. Clone the repository\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!git clone https://github.com/HilaManor/AudioEditingCode.git\n",
"%cd AudioEditingCode\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. Install dependencies\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!pip install -r requirements.txt\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3. Demo Usage\n",
"\n",
"The examples below run the repository's editing scripts on a sample audio file. Start by downloading an input clip.\n",
"\n",
"### Download example audio\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!wget https://www.soundhelix.com/examples/mp3/SoundHelix-Song-1.mp3 -O input_audio.mp3\n"
]
},
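{
"cell_type": "markdown",
"metadata": {},
"source": [
"Before editing, it can help to confirm that the download succeeded. The cell below is a minimal sketch and not part of the repository: it checks that `input_audio.mp3` exists, prints its size, and embeds a player with `IPython.display.Audio`.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from pathlib import Path\n",
"from IPython.display import Audio, display\n",
"\n",
"# Path written by the wget command above\n",
"input_path = Path(\"input_audio.mp3\")\n",
"assert input_path.exists(), \"Download failed: input_audio.mp3 not found\"\n",
"print(f\"{input_path} ({input_path.stat().st_size / 1e6:.1f} MB)\")\n",
"\n",
"# Embed an audio player for the downloaded clip\n",
"display(Audio(filename=str(input_path)))\n"
]
},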
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Text-Based Editing Example\n",
"\n",
"This example uses `main_run.py` for text-based audio editing. The command below uses `cvssp/audioldm-s-full-v2`. A Hugging Face token is only needed for gated models such as Stable Audio Open; if you switch `--model_id` to one of those, create a token in your [Hugging Face settings](https://huggingface.co/settings/tokens) and replace `<YOUR_HF_TOKEN>` below.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
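"# Optional: only gated models need a token. Leave the placeholder as-is to skip this step,\n",
"# since exporting an invalid token can make Hub downloads fail.\n",
"hf_token = \"<YOUR_HF_TOKEN>\"\n",
"if hf_token != \"<YOUR_HF_TOKEN>\":\n",
"    os.environ[\"HF_TOKEN\"] = hf_token\n",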
"\n",
"!python code/main_run.py \\\n",
" --cfg_tar 1.5 \\\n",
" --cfg_src 0.5 \\\n",
" --init_aud input_audio.mp3 \\\n",
" --target_prompt \"a dog barking\" \\\n",
" --tstart 100 \\\n",
" --model_id cvssp/audioldm-s-full-v2 \\\n",
" --results_path results_text_based\n"
]
},
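{
"cell_type": "markdown",
"metadata": {},
"source": [
"To listen to the edited audio, the cell below searches the results folder for audio files and plays the first few. This is a hedged sketch: it assumes `main_run.py` saves its output as `.wav` files somewhere under `results_text_based`, which is not confirmed here. Adjust the pattern if the script writes a different format.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from pathlib import Path\n",
"from IPython.display import Audio, display\n",
"\n",
"# Assumption: edited audio is written as .wav files under the --results_path folder\n",
"results_dir = Path(\"results_text_based\")\n",
"wav_files = sorted(results_dir.rglob(\"*.wav\"))\n",
"print(f\"Found {len(wav_files)} .wav file(s) under {results_dir}\")\n",
"\n",
"# Play at most three results to keep the notebook small\n",
"for wav in wav_files[:3]:\n",
"    print(wav)\n",
"    display(Audio(filename=str(wav)))\n"
]
},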
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Unsupervised Editing Example\n",
"\n",
"First, extract the principal components:\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!python code/main_pc_extract_inv.py \\\n",
" --init_aud input_audio.mp3 \\\n",
" --model_id cvssp/audioldm-s-full-v2 \\\n",
" --results_path results_unsupervised_extract \\\n",
" --drift_start 0 \\\n",
" --drift_end 200 \\\n",
" --n_evs 5\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Then, apply the principal components:\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!python code/main_pc_apply_drift.py \\\n",
" --extraction_path results_unsupervised_extract/input_audio_cvssp_audioldm-s-full-v2_inversion_data.pt \\\n",
" --drift_start 0 \\\n",
" --drift_end 200 \\\n",
" --amount 1.0 \\\n",
" --evs 0\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"
}
},
"nbformat": 4,
"nbformat_minor": 4
}