File size: 3,791 Bytes
4dab247
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
386a259
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# AudioEditingCode Colab Demo\n",
    "\n",
    "This notebook demonstrates how to use the `AudioEditingCode` repository in Google Colab.\n",
    "\n",
    "## 1. Clone the repository\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!git clone https://github.com/HilaManor/AudioEditingCode.git\n",
    "%cd AudioEditingCode\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2. Install dependencies\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install -r requirements.txt\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3. Demo Usage\n",
    "\n",
    "Here you can add examples of how to use the code. You might need to download some audio files for demonstration.\n",
    "\n",
    "### Download example audio\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!wget https://www.soundhelix.com/examples/mp3/SoundHelix-Song-1.mp3 -O input_audio.mp3\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Text-Based Editing Example\n",
    "\n",
    "This example uses `main_run.py` for text-based audio editing. You will need a Hugging Face token to use models like Stable Audio Open. Please visit [Hugging Face](https://huggingface.co/settings/tokens) to get your token and replace `<YOUR_HF_TOKEN>` below.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "\n",
    "# Replace with your actual Hugging Face token\n",
    "os.environ[\"HF_TOKEN\"] = \"<YOUR_HF_TOKEN>\"\n",
    "\n",
    "!python code/main_run.py \\\n",
    "    --cfg_tar 1.5 \\\n",
    "    --cfg_src 0.5 \\\n",
    "    --init_aud input_audio.mp3 \\\n",
    "    --target_prompt \"a dog barking\" \\\n",
    "    --tstart 100 \\\n",
    "    --model_id cvssp/audioldm-s-full-v2 \\\n",
    "    --results_path results_text_based\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Unsupervised Editing Example\n",
    "\n",
    "First, extract the principal components:\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!python code/main_pc_extract_inv.py \\\n",
    "    --init_aud input_audio.mp3 \\\n",
    "    --model_id cvssp/audioldm-s-full-v2 \\\n",
    "    --results_path results_unsupervised_extract \\\n",
    "    --drift_start 0 \\\n",
    "    --drift_end 200 \\\n",
    "    --n_evs 5\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Then, apply the principal components:\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!python code/main_pc_apply_drift.py \\\n",
    "    --extraction_path results_unsupervised_extract/input_audio_cvssp_audioldm-s-full-v2_inversion_data.pt \\\n",
    "    --drift_start 0 \\\n",
    "    --drift_end 200 \\\n",
    "    --amount 1.0 \\\n",
    "    --evs 0\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}