{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# AudioEditingCode Colab Demo\n", "\n", "This notebook demonstrates how to use the `AudioEditingCode` repository in Google Colab.\n", "\n", "## 1. Clone the repository\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!git clone https://github.com/HilaManor/AudioEditingCode.git\n", "%cd AudioEditingCode\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2. Install dependencies\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!pip install -r requirements.txt\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3. Demo Usage\n", "\n", "The examples below run the repository's scripts on a sample audio file, which the next cell downloads.\n", "\n", "### Download example audio\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!wget https://www.soundhelix.com/examples/mp3/SoundHelix-Song-1.mp3 -O input_audio.mp3\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Text-Based Editing Example\n", "\n", "This example uses `main_run.py` for text-based audio editing. You will need a Hugging Face token to use models like Stable Audio Open. 
Visit [Hugging Face](https://huggingface.co/settings/tokens) to get your token, then set it as the value of `HF_TOKEN` in the cell below.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import os\n", "\n", "# Replace with your actual Hugging Face token\n", "os.environ[\"HF_TOKEN\"] = \"\"\n", "\n", "!python code/main_run.py \\\n", " --cfg_tar 1.5 \\\n", " --cfg_src 0.5 \\\n", " --init_aud input_audio.mp3 \\\n", " --target_prompt \"a dog barking\" \\\n", " --tstart 100 \\\n", " --model_id cvssp/audioldm-s-full-v2 \\\n", " --results_path results_text_based\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Unsupervised Editing Example\n", "\n", "First, extract the principal components:\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!python code/main_pc_extract_inv.py \\\n", " --init_aud input_audio.mp3 \\\n", " --model_id cvssp/audioldm-s-full-v2 \\\n", " --results_path results_unsupervised_extract \\\n", " --drift_start 0 \\\n", " --drift_end 200 \\\n", " --n_evs 5\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Then, apply the principal components:\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!python code/main_pc_apply_drift.py \\\n", " --extraction_path results_unsupervised_extract/input_audio_cvssp_audioldm-s-full-v2_inversion_data.pt \\\n", " --drift_start 0 \\\n", " --drift_end 200 \\\n", " --amount 1.0 \\\n", " --evs 0\n" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.12" } }, "nbformat": 4, "nbformat_minor": 4 }