reach-vb (HF Staff), mshukor committed
Commit d853002 · verified · 0 Parent(s)

Duplicate from lerobot/smolvla_base


Co-authored-by: Mustafa Shukor <mshukor@users.noreply.huggingface.co>

.gitattributes ADDED
@@ -0,0 +1,36 @@
+ *.7z filter=lfs diff=lfs merge=lfs -text
+ *.arrow filter=lfs diff=lfs merge=lfs -text
+ *.bin filter=lfs diff=lfs merge=lfs -text
+ *.bz2 filter=lfs diff=lfs merge=lfs -text
+ *.ckpt filter=lfs diff=lfs merge=lfs -text
+ *.ftz filter=lfs diff=lfs merge=lfs -text
+ *.gz filter=lfs diff=lfs merge=lfs -text
+ *.h5 filter=lfs diff=lfs merge=lfs -text
+ *.joblib filter=lfs diff=lfs merge=lfs -text
+ *.lfs.* filter=lfs diff=lfs merge=lfs -text
+ *.mlmodel filter=lfs diff=lfs merge=lfs -text
+ *.model filter=lfs diff=lfs merge=lfs -text
+ *.msgpack filter=lfs diff=lfs merge=lfs -text
+ *.npy filter=lfs diff=lfs merge=lfs -text
+ *.npz filter=lfs diff=lfs merge=lfs -text
+ *.onnx filter=lfs diff=lfs merge=lfs -text
+ *.ot filter=lfs diff=lfs merge=lfs -text
+ *.parquet filter=lfs diff=lfs merge=lfs -text
+ *.pb filter=lfs diff=lfs merge=lfs -text
+ *.pickle filter=lfs diff=lfs merge=lfs -text
+ *.pkl filter=lfs diff=lfs merge=lfs -text
+ *.pt filter=lfs diff=lfs merge=lfs -text
+ *.pth filter=lfs diff=lfs merge=lfs -text
+ *.rar filter=lfs diff=lfs merge=lfs -text
+ *.safetensors filter=lfs diff=lfs merge=lfs -text
+ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+ *.tar.* filter=lfs diff=lfs merge=lfs -text
+ *.tar filter=lfs diff=lfs merge=lfs -text
+ *.tflite filter=lfs diff=lfs merge=lfs -text
+ *.tgz filter=lfs diff=lfs merge=lfs -text
+ *.wasm filter=lfs diff=lfs merge=lfs -text
+ *.xz filter=lfs diff=lfs merge=lfs -text
+ *.zip filter=lfs diff=lfs merge=lfs -text
+ *.zst filter=lfs diff=lfs merge=lfs -text
+ *tfevents* filter=lfs diff=lfs merge=lfs -text
+ collage_small.gif filter=lfs diff=lfs merge=lfs -text
Finetune_SmolVLA_notebook.ipynb ADDED
@@ -0,0 +1,214 @@
+ {
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "NQUk3Y0WwYZ4"
+ },
+ "source": [
+ "# 🤗 x 🦾: Training SmolVLA with LeRobot Notebook\n",
+ "\n",
+ "Welcome to the **LeRobot SmolVLA training notebook**! This notebook provides a ready-to-run setup for training imitation learning policies using the [🤗 LeRobot](https://github.com/huggingface/lerobot) library.\n",
+ "\n",
+ "In this example, we train a `SmolVLA` policy using a dataset hosted on the [Hugging Face Hub](https://huggingface.co/), and optionally track training metrics with [Weights & Biases (wandb)](https://wandb.ai/).\n",
+ "\n",
+ "## ⚙️ Requirements\n",
+ "- A Hugging Face dataset repo ID containing your training data (`--dataset.repo_id=YOUR_USERNAME/YOUR_DATASET`)\n",
+ "- Optional: A [wandb](https://wandb.ai/) account if you want to enable training visualization\n",
+ "- Recommended: GPU runtime (e.g., NVIDIA A100) for faster training\n",
+ "\n",
+ "## ⏱️ Expected Training Time\n",
+ "Training with the `SmolVLA` policy for 20,000 steps typically takes **about 5 hours on an NVIDIA A100** GPU. On less powerful GPUs or CPUs, training may take significantly longer!\n",
+ "\n",
+ "## Example Output\n",
+ "Model checkpoints, logs, and training plots will be saved to the specified `--output_dir`. If `wandb` is enabled, progress will also be visualized in your wandb project dashboard.\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "MOJyX0CnwA5m"
+ },
+ "source": [
+ "## Install conda\n",
+ "This cell uses `condacolab` to bootstrap a full Conda environment inside Google Colab.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "QlKjL1X5t_zM"
+ },
+ "outputs": [],
+ "source": [
+ "!pip install -q condacolab\n",
+ "import condacolab\n",
+ "condacolab.install()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "DxCc3CARwUjN"
+ },
+ "source": [
+ "## Install LeRobot\n",
+ "This cell clones the `lerobot` repository from Hugging Face, installs FFmpeg (version 7.1.1), and installs the package in editable mode.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "dgLu7QT5tUik"
+ },
+ "outputs": [],
+ "source": [
+ "!git clone https://github.com/huggingface/lerobot.git\n",
+ "!conda install ffmpeg=7.1.1 -c conda-forge\n",
+ "!cd lerobot && pip install -e ."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "Q8Sn2wG4wldo"
+ },
+ "source": [
+ "## Weights & Biases login\n",
+ "This cell logs you into Weights & Biases (wandb) to enable experiment tracking and logging."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "PolVM_movEvp"
+ },
+ "outputs": [],
+ "source": [
+ "!wandb login"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "zTWQAgX9xseE"
+ },
+ "source": [
+ "## Install SmolVLA dependencies"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "DiHs0BKwxseE"
+ },
+ "outputs": [],
+ "source": [
+ "!cd lerobot && pip install -e \".[smolvla]\""
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "IkzTo4mNwxaC"
+ },
+ "source": [
+ "## Start training SmolVLA with LeRobot\n",
+ "\n",
+ "This cell runs the `train.py` script from the `lerobot` library to train a robot control policy. \n",
+ "\n",
+ "Make sure to adjust the following arguments to your setup:\n",
+ "\n",
+ "1. `--dataset.repo_id=YOUR_HF_USERNAME/YOUR_DATASET`: \n",
+ " Replace this with the Hugging Face Hub repo ID where your dataset is stored, e.g., `pepijn223/il_gym0`.\n",
+ "\n",
+ "2. `--batch_size=64`: The model processes 64 training samples in parallel before doing one gradient update. Reduce this number if you have a GPU with low memory.\n",
+ "\n",
+ "3. `--output_dir=outputs/train/...`: \n",
+ " Directory where training logs and model checkpoints will be saved.\n",
+ "\n",
+ "4. `--job_name=...`: \n",
+ " A name for this training job, used for logging and Weights & Biases.\n",
+ "\n",
+ "5. `--policy.device=cuda`: \n",
+ " Use `cuda` if training on an NVIDIA GPU. Use `mps` for Apple Silicon, or `cpu` if no GPU is available.\n",
+ "\n",
+ "6. `--wandb.enable=true`: \n",
+ " Enables Weights & Biases for visualizing training progress. You must be logged in via `wandb login` before running this."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "ZO52lcQtxseE"
+ },
+ "outputs": [],
+ "source": [
+ "!cd lerobot && python lerobot/scripts/train.py \\\n",
+ " --policy.path=lerobot/smolvla_base \\\n",
+ " --dataset.repo_id=${HF_USER}/mydataset \\\n",
+ " --batch_size=64 \\\n",
+ " --steps=20000 \\\n",
+ " --output_dir=outputs/train/my_smolvla \\\n",
+ " --job_name=my_smolvla_training \\\n",
+ " --policy.device=cuda \\\n",
+ " --wandb.enable=true"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "2PBu7izpxseF"
+ },
+ "source": [
+ "## Log in to the Hugging Face Hub\n",
+ "Now that training is done, log in to the Hugging Face Hub and upload the last checkpoint."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "8yu5khQGIHi6"
+ },
+ "outputs": [],
+ "source": [
+ "!huggingface-cli login"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "zFMLGuVkH7UN"
+ },
+ "outputs": [],
+ "source": [
+ "!huggingface-cli upload ${HF_USER}/my_smolvla \\\n",
+ " /content/lerobot/outputs/train/my_smolvla/checkpoints/last/pretrained_model"
+ ]
+ }
+ ],
+ "metadata": {
+ "accelerator": "GPU",
+ "colab": {
+ "gpuType": "A100",
+ "machine_shape": "hm",
+ "provenance": []
+ },
+ "kernelspec": {
+ "display_name": "Python 3",
+ "name": "python3"
+ },
+ "language_info": {
+ "name": "python"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 0
+ }
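
For reference, here is a minimal inference sketch for the checkpoint uploaded at the end of the notebook. It is not part of the committed files: the repo ID and task string are placeholders, the import path follows the `modeling_smolvla.py` module linked from the README, and the exact batch keys (in particular `task`) may differ across LeRobot versions.

```python
import torch
from lerobot.common.policies.smolvla.modeling_smolvla import SmolVLAPolicy

# Load the fine-tuned checkpoint uploaded above (hypothetical repo ID).
policy = SmolVLAPolicy.from_pretrained("YOUR_HF_USERNAME/my_smolvla")
policy.eval()

# One dummy observation matching the input features in config.json:
# a 6-dim state, three 3x256x256 camera views, and a language instruction.
batch = {
    "observation.state": torch.zeros(1, 6),
    "observation.image": torch.zeros(1, 3, 256, 256),
    "observation.image2": torch.zeros(1, 3, 256, 256),
    "observation.image3": torch.zeros(1, 3, 256, 256),
    "task": ["Pick up the cube and place it in the box."],
}

with torch.no_grad():
    action = policy.select_action(batch)  # next action from the predicted chunk
print(action.shape)
```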
README.md ADDED
@@ -0,0 +1,60 @@
+ ---
+ pipeline_tag: robotics
+ tags:
+ - lerobot
+ library_name: lerobot
+ datasets:
+ - lerobot/svla_so101_pickplace
+ ---
+
+ ## SmolVLA: A vision-language-action model for affordable and efficient robotics
+
+ Resources and technical documentation:
+
+ [SmolVLA Paper](https://huggingface.co/papers/2506.01844)
+
+ [SmolVLA Blogpost](https://huggingface.co/blog/smolvla)
+
+ [Code](https://github.com/huggingface/lerobot/blob/main/lerobot/common/policies/smolvla/modeling_smolvla.py)
+
+ [Train using Google Colab Notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/lerobot/training-smolvla.ipynb#scrollTo=ZO52lcQtxseE)
+
+ [SmolVLA HF Documentation](https://huggingface.co/docs/lerobot/smolvla)
+
+ Designed by Hugging Face.
+
+ This model has 450M parameters in total.
+ You can use it inside the [LeRobot library](https://github.com/huggingface/lerobot).
+
+ Before proceeding to the next steps, you need to properly install the environment by following the [Installation Guide](https://huggingface.co/docs/lerobot/installation) in the docs.
+
+ Install smolvla extra dependencies:
+ ```bash
+ pip install -e ".[smolvla]"
+ ```
+
+ Example of finetuning the smolvla pretrained model (`smolvla_base`):
+ ```bash
+ python lerobot/scripts/train.py \
+ --policy.path=lerobot/smolvla_base \
+ --dataset.repo_id=lerobot/svla_so101_pickplace \
+ --batch_size=64 \
+ --steps=20000 \
+ --output_dir=outputs/train/my_smolvla \
+ --job_name=my_smolvla_training \
+ --policy.device=cuda \
+ --wandb.enable=true
+ ```
+
+ Example of training the smolvla neural network with a pretrained VLM and the action expert
+ initialized from scratch:
+ ```bash
+ python lerobot/scripts/train.py \
+ --dataset.repo_id=lerobot/svla_so101_pickplace \
+ --batch_size=64 \
+ --steps=200000 \
+ --output_dir=outputs/train/my_smolvla \
+ --job_name=my_smolvla_training \
+ --policy.device=cuda \
+ --wandb.enable=true
+ ```
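
As a quick sanity check before launching either command, the demonstration dataset referenced above can be inspected directly. A small sketch (the import path assumes the LeRobot version installed by the steps above):

```python
from lerobot.common.datasets.lerobot_dataset import LeRobotDataset

# Download and inspect the dataset used in the finetuning example.
dataset = LeRobotDataset("lerobot/svla_so101_pickplace")
print(dataset.num_episodes, dataset.num_frames, dataset.fps)

# Each item is a dict whose keys line up with the policy's input/output
# features (observation.state, camera images, action, task, ...).
frame = dataset[0]
print({k: tuple(v.shape) for k, v in frame.items() if hasattr(v, "shape")})
```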
collage_small.gif ADDED

Git LFS Details

  • SHA256: c43f022bf1fdfbac82841ef95c7a99a7ead13ea67f2a7202f2fe57148f352c83
  • Pointer size: 132 Bytes
  • Size of remote file: 8.01 MB
config.json ADDED
@@ -0,0 +1,86 @@
+ {
+ "type": "smolvla",
+ "n_obs_steps": 1,
+ "normalization_mapping": {
+ "VISUAL": "IDENTITY",
+ "STATE": "MEAN_STD",
+ "ACTION": "MEAN_STD"
+ },
+ "input_features": {
+ "observation.state": {
+ "type": "STATE",
+ "shape": [
+ 6
+ ]
+ },
+ "observation.image2": {
+ "type": "VISUAL",
+ "shape": [
+ 3,
+ 256,
+ 256
+ ]
+ },
+ "observation.image": {
+ "type": "VISUAL",
+ "shape": [
+ 3,
+ 256,
+ 256
+ ]
+ },
+ "observation.image3": {
+ "type": "VISUAL",
+ "shape": [
+ 3,
+ 256,
+ 256
+ ]
+ }
+ },
+ "output_features": {
+ "action": {
+ "type": "ACTION",
+ "shape": [
+ 6
+ ]
+ }
+ },
+ "chunk_size": 50,
+ "n_action_steps": 50,
+ "max_state_dim": 32,
+ "max_action_dim": 32,
+ "resize_imgs_with_padding": [
+ 512,
+ 512
+ ],
+ "empty_cameras": 0,
+ "adapt_to_pi_aloha": false,
+ "use_delta_joint_actions_aloha": false,
+ "tokenizer_max_length": 48,
+ "num_steps": 10,
+ "use_cache": true,
+ "freeze_vision_encoder": true,
+ "train_expert_only": true,
+ "train_state_proj": true,
+ "optimizer_lr": 0.0001,
+ "optimizer_betas": [
+ 0.9,
+ 0.95
+ ],
+ "optimizer_eps": 1e-08,
+ "optimizer_weight_decay": 1e-10,
+ "optimizer_grad_clip_norm": 10,
+ "scheduler_warmup_steps": 1000,
+ "scheduler_decay_steps": 30000,
+ "scheduler_decay_lr": 2.5e-06,
+ "vlm_model_name": "HuggingFaceTB/SmolVLM2-500M-Video-Instruct",
+ "load_vlm_weights": true,
+ "attention_mode": "cross_attn",
+ "prefix_length": 0,
+ "pad_language_to": "max_length",
+ "num_expert_layers": 0,
+ "num_vlm_layers": 16,
+ "self_attn_every_n_layers": 2,
+ "expert_width_multiplier": 0.75
+ }
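
In this config, `chunk_size` and `n_action_steps` mean the policy predicts a 50-step action chunk and executes all 50 actions before predicting again. A minimal, self-contained sketch of that serving loop; the `predict_chunk` callable is a stand-in for the policy's forward pass, not the LeRobot API:

```python
from collections import deque

import torch


class ChunkedController:
    """Pop one action per control step; refill the queue with a new chunk when empty."""

    def __init__(self, predict_chunk, n_action_steps: int = 50):
        self.predict_chunk = predict_chunk
        self.n_action_steps = n_action_steps
        self.queue = deque()

    def select_action(self, observation) -> torch.Tensor:
        if not self.queue:
            chunk = self.predict_chunk(observation)  # (chunk_size, action_dim)
            self.queue.extend(chunk[: self.n_action_steps])
        return self.queue.popleft()


# Dummy predictor standing in for the SmolVLA forward pass (chunk_size=50, action_dim=6).
controller = ChunkedController(lambda obs: torch.zeros(50, 6))
print(controller.select_action(None).shape)  # torch.Size([6])
```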
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8f8dc071d5b933e79edd2b73b8d6b5cca482ef0437c099ea3ec13ab978a38fc8
+ size 906720008
+ size 906720008
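
The file above is a Git LFS pointer to the ~906 MB checkpoint. A short sketch for downloading and inspecting its tensors, assuming the weights are fetched from `lerobot/smolvla_base`, the repo this one was duplicated from:

```python
from huggingface_hub import hf_hub_download
from safetensors import safe_open

# Resolve the LFS pointer to the actual weights and list a few tensors.
path = hf_hub_download("lerobot/smolvla_base", "model.safetensors")
with safe_open(path, framework="pt", device="cpu") as f:
    for name in list(f.keys())[:10]:
        print(name, tuple(f.get_tensor(name).shape))
```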
train_config.json ADDED
@@ -0,0 +1,196 @@
+ {
+ "dataset": {
+ "repo_id": "satvikahuja/mixer_on_off_new_1,aergogo/so100_pick_place,andy309/so100_0314_fold_cloths,jchun/so100_pickplace_small_20250323_120056,astroyat/cube,Ofiroz91/so_100_cube2bowl,HappyPablo/dec3_data2,ZCM5115/so100_1210,francescocrivelli/orange_feeding,francescocrivelli/carrot_eating,0x00raghu/toffee_red,0x00raghu/toffee_red_2,0x00raghu/toffee_red_3__,0x00raghu/toffee_blue,0x00raghu/toffee_blue_2,0x00raghu/toffee_to_hand_1,0x00raghu/toffee_to_hand_2,liyitenga/so100_bi_hello,liyitenga/so100_bi_giveme5,ZCM5115/so100_2Arm3cameras_movebox,pranavsaroha/so100_carrot_1,pranavsaroha/so100_carrot_3,pranavsaroha/so100_carrot_4,maximilienroberti/so100_lego_red_box,pranavsaroha/so100_squishy,rabhishek100/so100_train_dataset,pranavsaroha/so100_squishy100,swarajgosavi/kikobot_pusht_real_v2,pandaRQ/pickmed,swarajgosavi/act_kikobot_pusht_real,pranavsaroha/so100_squishy2colors,pranavsaroha/so100_squishy2colors_1,Chojins/chess_game_001_white,jmrog/so100_sweet_pick,Chojins/chess_game_002_white,pranavsaroha/so100_squishy2colors_2_new,Chojins/chess_game_003_white,aractingi/pick_place_lego_cube,Chojins/chess_game_004_white,Chojins/chess_game_005_white,Chojins/chess_game_006_white,Chojins/chess_game_007_white,koenvanwijk/blue2,jlitch/so100multicam3,koenvanwijk/blue52,jlitch/so100multicam6,aractingi/pick_place_lego_cube_1,jlitch/so100multicam7,vladfatu/so100_ds,Chojins/chess_game_000_white,HITHY/so100-kiwi,HITHY/so100_peach1,HITHY/so100_redstrawberry,satvikahuja/orange_mixer_1,satvikahuja/mixer_on_off,satvikahuja/orange_pick_place_new1,satvikahuja/mixer_on_off_new,danmac1/real_real332,FeiYjf/Makalu_push,liyitenga/so100_pick_taffy1,chmadran/so100_dataset04,FeiYjf/Maklu_dataset,FeiYjf/new_Dataset,liyitenga/so100_pick_taffy2,satvikahuja/mixer_on_off_new_4,CSCSXX/pick_place_cube_1.17,liyitenga/so100_pick_taffy3,liyitenga/so100_pick_taffy4,yuz1wan/so100_pick_pink,yuz1wan/so100_pick_wahaha,yuz1wan/so100_pp_pink,yuz1wan/so100_pour_cup,liyitenga/so100_pick_taffy5,liyitenga/so100_pick_taffy6,yuz1wan/so100_button,yuz1wan/so100_pickplace,liyitenga/so100_pick_taffy7,FeiYjf/push_gg,FeiYjf/push_0094,swarajgosavi/act_kikobot_block_real,liyitenga/so100_pick_taffy8,phospho-ai/OrangeBrick3Cameras,vaishanthr/toy_pick_place,SeanLMH/so100_picknplace_v2,pepijn223/yellow_lego_in_box1,DimiSch/so100_50ep_2,DimiSch/so100_50ep_3,SeanLMH/so100_picknplace,nbaron99/so100_pick_and_place2,chmadran/so100_dataset08,vaishanthr/toy_pickplace_50ep,Beegbrain/pick_place_green_block_lr,Ityl/so100_recording1,vaishanthr/toy_pickplace,ad330/so100_box_pickPlace,Beegbrain/so100_put_cube_cup,aractingi/push_green_cube_hf,aractingi/push_green_cube_hf_cropped_resized,carpit680/giraffe_task,carpit680/giraffe_sock_demo_1,DimiSch/so100_terra_50_2,carpit680/giraffe_sock_demo_2,aractingi/push_cube_to_face_reward,aractingi/push_cube_to_face_reward_cropped_resized,aractingi/push_cube_reward_data,aractingi/push_cube_reward_data_cropped_resized,aractingi/push_cube_offline_data_cropped_resized,aractingi/push_cube_front_side_reward,aractingi/push_cube_front_side_reward_cropped_resized,aractingi/push_cube_front_side_reward_long,aractingi/push_cube_front_side_reward_long_cropped_resized,aractingi/push_cube_reward,aractingi/push_cube_reward_cropped_resized,aractingi/push_cube_square_reward_cropped_resized,aractingi/push_cube_square_reward_1,aractingi/push_cube_square_reward_1_cropped_resized,aractingi/push_cube_square_light_reward,aractingi/push_cube_square_light_offline_demo,aractingi/push_cube_square_light_offline_demo_cropped_resized,denghj/dataset_red_tape01,aractin
gi/push_cube_square_offline_demo,aractingi/push_cube_square_offline_demo_cropped_resized,Beegbrain/stack_two_cubes,FeiYjf/Test_NNNN,LegrandFrederic/Orange-brick-lower-resolution,aractingi/pick_place_lego_cube_cropped_resized,aractingi/push_cube_overfit,aractingi/push_cube_overfit_cropped_resized,HITHY/so100_peach,zaringleb/so100_cube_2,andreasBihlmaier/dual_arm_transfer_2025_02_16,zaringleb/so100_cube_4_binary,1g0rrr/reward_pickplace1,1g0rrr/reward_pickplace1_cropped_resized,FeiYjf/Hold_Pieces,FeiYjf/Grab_Pieces,hegdearyandev/so100_eraser_cup_v1,jbraumann/so100_1902,liyitenga/so100_pick_taffy10,mikechambers/block_cup_5,zaringleb/so100_cube_5_linear,yuz1wan/so100_pickplace_0223_2,yuz1wan/so100_pickplace_0223_3,samsam0510/mj_data_temp,samsam0510/tape_insert_1,samsam0510/tape_insert_2,pengjunkun/so100_push_to_hole,Deason11/Random_Kitchen,1g0rrr/reward_dataset_name2,1g0rrr/reward_dataset_name2_cropped_resized,1g0rrr/offline_dataset_name2,1g0rrr/offline_dataset_name2_cropped_resized,aractingi/push_cube_simp_cropped_resized,danielkr452/so100_work6,Loki0929/so100_100,yuz1wan/so100_fold_0227_1,yuz1wan/so100_fold_0227_2,speedyyoshi/so100_grasp_pink_block,lirislab/stack_two_red_cubes,lirislab/red_cube_into_mug,lirislab/green_lego_block_into_mug,lirislab/green_lego_block_into_mug_easy,kevin510/lerobot-cat-toy-placement,NONHUMAN-RESEARCH/SOARM100_TASK_VENDA_BOX,wangjl1512/pour_water,airthebear/so100_GL,zijian2022/noticehuman1,zijian2022/noticehuman2,kantine/so100_kapla_tower6,zijian2022/noticehuman5,zijian2022/llm40,Ashton3/lerobot-aloha,zijian2022/noticehuman50,AaronNewman/screwdriver_task_batch1,AaronNewman/screwdriver_task_batch2,AaronNewman/screwdriver_task_batch3,zijian2022/noticehuman60,zijian2022/noticehuman70,Bartm3/tape_to_bin,liuhuanjim013/so100_th_1,Pi-robot/barbecue_flip,Pi-robot/barbecue_put,wangjl1512/doll,sshh11/so100_orange_50ep_1,sshh11/so100_orange_50ep_2,DorayakiLin/so100_pick_cube_in_box,Bartm3/tape_to_bin2,luke250305/play_dice_250311.1,andy309/so100_0311_1152,sihyun77/suho_so100,sihyun77/si_so100,shreyasgite/so100_base_left,sihyun77/suho_red,liuhuanjim013/so100_block,andy309/so100_0313_no_wrist_camera,zijian2022/l9,zijian2022/n1_2,DorayakiLin/so100_stack_cube,andy309/so100_0313_no_wrist_camera_with_two_arms_cloths,joaoocruz00/so100_makeitD1,zijian2022/l10_1,zijian2022/l10_5,sihyun77/suho_red2,sihyun77/suho_angel,sihyun77/sihyun_king,acrampette/third_arm_01,Winster/so100_cube,1g0rrr/sam_openpi03,thedevansh/mar16_1336,hkphoooey/throw_stuffie,doujiangwang/task1_10epi_100000step,sihyun77/sihyun_3_17_1,acrampette/third_arm_02,imsyed00/so100_yellowbowl_pickplace_1,kumarhans/so100_tape_task,sihyun77/sihyun_main,doujiangwang/task2_10epi_100000step,kantine/industrial_robothon_buttons_expert,kantine/industrial_robothon_buttons_anomaly,kantine/industrial_robothon_hatchAndProbe_expert,kantine/industrial_robothon_hatchAndProbe_anomaly,Odog16/so100_tea_towel_folding_v1,zijian2022/so100_318,zijian2022/so100_318_1,Congying1112/so100_place_blue_bottle_with_two_cameras,Congying1112/so100_place_blue_bottle_with_two_cameras2,Congying1112/so100_place_blue_bottle_with_single_camera,pietroom/first_task_short,kantine/industrial_screws_sorting_expert,kantine/industrial_screws_sorting_anomaly,pietroom/second_task,zijian2022/c0,doujiangwang/task4_10epi_100000step,Congying1112/so100_switch_with_onhand_camera,HYAIYN/so100_get_orange_10epi,doujiangwang/task5_10epi_100000step,1g0rrr/sam_openpi_cube_low10,1g0rrr/sam_openpi_cube_top10,1g0rrr/sam_openpi_wire10,1g0rrr/sam_openpi_solder1,1g0rrr/sam_openpi_solder2,wco
de/so100_put_pen_50,jchun/so100_pickplace_small_20250322_193929,bnarin/so100_tic_tac_toe_we_do_it_live,dc2ac/so100-t5,chmadran/so100_home_dataset,baladhurgesh97/so100_final_picking_3,bnarin/so100_tic_tac_toe_move_0_0,bnarin/so100_tic_tac_toe_move_1_0,bnarin/so100_tic_tac_toe_move_2_1,bnarin/so100_tic_tac_toe_move_4_0,zaringleb/so100_cube_6_2d,andlyu/so100_indoor_0,andlyu/so100_indoor_2,Winster/so100_sim,badwolf256/so100_twin_cam_duck,Congying1112/so100_simplepick_with_2_cameras_from_top,andlyu/so100_indoor_4,Zak-Y/so100_grap_dataset,kantine/domotic_pouringCoffee_expert,kantine/domotic_pouringCoffee_anomaly,lucasngoo/so100_strawberry_grape,kantine/domotic_makingCoffee_expert,kantine/domotic_makingCoffee_anomaly,ZGGZZG/so100_drop1,kantine/industrial_soldering_expert,kantine/industrial_soldering_anomaly,Yotofu/so100_sweeper_shoes,kantine/domotic_dishTidyUp_expert,kantine/domotic_dishTidyUp_anomaly,kantine/domotic_groceriesSorting_expert,kantine/domotic_groceriesSorting_anomaly,badwolf256/so100_twin_cam_duck_v2,kantine/domotic_vegetagblesAndFruitsSorting_expert,kantine/domotic_vegetagblesAndFruitsSorting_anomaly,kantine/domotic_setTheTable_expert,kantine/domotic_setTheTable_anomaly,therarelab/so100_pick_place,abhisb/so100_51_ep,andlyu/so100_indoor_val_0,allenchienxxx/so100Test,lizi178119985/so100_jia,badwolf256/so100_twin_cam_duck_v3,andrewcole712/so100_tape_bin_place,Gano007/so100_lolo,Zak-Y/so100_three_cameras_dataset,Gano007/so100_doliprane,XXRRSSRR/so100_v3_num_episodes_50,zijian2022/assemblyarm2,ganker5/so100_action_20250403,andlyu/so100_indoor_val2,Gano007/so100_gano,paszea/so100_whale_grab,paszea/so100_whale,Clementppr/lerobot_pick_and_place_dataset_world_model,andlyu/so100_indoor_10,RasmusP/so100_dataset50ep_a,RasmusP/so100_dataset50ep,Gano007/so100_second,zaringleb/so100_cude_linear_and_2d_comb,dsfsg/grasp_pens,zijian2022/digitalfix,zijian2022/digitalfix2,zijian2022/digitalfix3,T1g3rGE/so100_pickplace_small_20250407_171912,sihyun77/mond_13,abokinala/sputnik_100_11_pick_place_container,dsfsg/bring_bottle,abokinala/sputnik_100_12_pick_place_container,Mwuqiu/so100_0408,AK51/4090_01,356c/so100_rope_reposition_1,paszea/so100_lego_mix,abokinala/sputnik_100_14_pick_place_container,abokinala/sputnik_100_23_pick_place_surface,jiajun001/eraser00_2,jlesein/TestBoulon2,duthvik/sputnik_100_31_pour_liquid,duthvik/sputnik_100_24_pick_place_surface,duthvik/sputnik_100_25_pick_place_surface,duthvik/sputnik_100_17_pick_place_container,duthvik/sputnik_100_26_pick_place_surface,VoicAndrei/so100_banana_to_plate_rebel_full,isadev/bougies1,danaaubakirova/so100_task_1,danaaubakirova/so100_task_2,danaaubakirova/so100_task_3,danaaubakirova/so100_task_4,sixpigs1/so100_pick_cube_in_box_error,sixpigs1/so100_push_cube_error,sixpigs1/so100_pull_cube_error,isadev/bougies2,therarelab/med_dis_rare_6,duthvik/sputnik_100_27_pick_place_surface,zijian2022/closer3,duthvik/sputnik_100_41_custom_tasks,duthvik/sputnik_100_42_custom_tasks,duthvik/sputnik_100_43_custom_tasks,duthvik/sputnik_100_44_custom_tasks,duthvik/sputnik_100_51_kitchen_tasks,duthvik/sputnik_100_52_kitchen_tasks,duthvik/sputnik_100_53_kitchen_tasks,duthvik/sputnik_100_45_custom_tasks,duthvik/sputnik_100_32_pour_liquid,duthvik/sputnik_100_29_pick_place_surface,duthvik/sputnik_100_18_pick_place_container,sixpigs1/so100_pull_cube_by_tool_error,sixpigs1/so100_insert_cylinder_error,abokinala/sputnik_100_54_kitchen_tasks,abokinala/sputnik_100_55_kitchen_tasks,m1b/so100_bluelego,abokinala/sputnik_100_46_custom_tasks,m1b/so100_bluelego_updt,kantine/flip_A0,kantine/f
lip_A1,kantine/flip_A2,kantine/flip_A3,lirislab/guess_who_no_cond,kantine/flip_A4,kantine/flip_A5,lirislab/guess_who_lighting,nguyen-v/so100_press_red_button,nguyen-v/so100_bimanual_grab_lemon_put_in_box2,pierfabre/cow,nguyen-v/press_red_button_new,nguyen-v/so100_rotate_red_button,Cidoyi/so100_all_notes,roboticshack/team10-red-block,Cidoyi/so100_all_notes_1,roboticshack/team_5-QuiEstCe_everyBox,roboticshack/team11_pianobot,roboticshack/team2-guess_who_so100,roboticshack/team2-guess_who_so100_light,roboticshack/team2-guess_who_so100_edge_case,roboticshack/team2-guess_who_less_ligth,Cidoyi/so100_all_notes_3,dsfsg/grasp_pen_and_bottle,abokinala/sputnik_100_60_kitchen_tasks,abokinala/sputnik_100_58_kitchen_tasks,danaaubakirova/so100_v2_task_1,danaaubakirova/so100_v2_task_2,danaaubakirova/so100_v2_task_3,danaaubakirova/so100_v2_task_4,zijian2022/force1,zijian2022/force2,zijian2022/force3,jiajun001/eraser00_3,zijian2022/bi2,zijian2022/bi1,zijian2022/hand1,Setchii/so100_grab_ball,MossProphet/so100_square-1-2-3.2,pierfabre/rabbit,bensprenger/right_arm_p_brick_in_box_with_y_noise_v0,pierfabre/horse,pierfabre/pig2,pierfabre/pig3,pierfabre/cow2,pierfabre/sheep,Chojins/chess_game_009_white,sihyun77/suho_3_17_1,sihyun77/sihyun_3_17_2,sihyun77/suho_3_17_3,sihyun77/sihyun_3_17_5,Odog16/so100_cube_drop_pick_v1,sihyun77/sihyun_main_2,sihyun77/suho_main_2,Bartm3/dice2,sihyun77/sihyun_main_3,Loki0929/so100_duck,pietroom/holdthis,pietroom/actualeasytask,Beegbrain/pick_lemon_and_drop_in_bowl,Beegbrain/sweep_tissue_cube,zijian2022/321,gxy1111/so100_pick_place,Odog16/so100_cube_stacking_v1,sihyun77/mond_1,andlyu/so100_indoor_1,andlyu/so100_indoor_3,frk2/so100large,lirislab/sweep_tissue_cube,lirislab/lemon_into_bowl,lirislab/red_cube_into_green_lego_block,lirislab/red_cube_into_blue_cube,00ri/so100_battery,frk2/so100largediffcam,FsqZ/so100_1,ZGGZZG/so100_drop0,Chojins/chess_game_000_white_red,smanni/train_so100_fluffy_box,ganker5/so100_push_20250328,ganker5/so100_dataline_0328,ganker5/so100_color_0328,CrazyYhang/A1234-B-C_mvA2B,RasmusP/so100_Orange2Green,sixpigs1/so100_pick_cube_in_box,ganker5/so100_push_20250331,ganker5/so100_dataline_20250331,lirislab/put_caps_into_teabox,lirislab/close_top_drawer_teabox,lirislab/open_top_drawer_teabox,lirislab/unfold_bottom_right,lirislab/push_cup_target,lirislab/put_banana_bowl,Chojins/chess_game_001_blue_stereo,Chojins/chess_game_001_red_stereo,ganker5/so100_toy_20250402,Gano007/so100_medic,00ri/so100_battery_bin_center,paszea/so100_whale_2,lirislab/fold_bottom_right,lirislab/put_coffee_cap_teabox,therarelab/so100_pick_place_2,paszea/so100_whale_3,paszea/so100_whale_4,paszea/so100_lego,LemonadeDai/so100_coca,zijian2022/backgrounda,zijian2022/backgroundb,356c/so100_nut_sort_1,Mwuqiu/so100_0408_muti,aimihat/so100_tape,lirislab/so100_demo,356c/so100_duck_reposition_1,zijian2022/sort1,weiye11/so100_410_zwy,VoicAndrei/so100_banana_to_plate_only,sixpigs1/so100_stack_cube_error,isadev/bougies3,zijian2022/close3,bensprenger/left_arm_yellow_brick_in_box_v0,lirislab/guess_who_so100,bensprenger/left_arm_yellow_brick_in_box_with_purple_noise_v0,roboticshack/team16-can-stacking,zijian2022/insert2,roboticshack/team-7-right-arm-grasp-tape,Jiangeng/so100_413,roboticshack/team9-pick_cube_place_static_plate,AndrejOrsula/lerobot_double_ball_stacking_random,roboticshack/left-arm-grasp-lego-brick,roboticshack/team-7-left-arm-grasp-motor,roboticshack/team9-pick_chicken_place_plate,roboticshack/team13-two-balls-stacking,tkc79/so100_lego_box_1,roboticshack/team13-three-balls-stacking,pierfabre/chick
en,roboticshack/team16-water-pouring,ad330/cubePlace,Jiafei1224/so100_pa222per,paszea/so100_lego_2cam,bensprenger/chess_game_001_blue_stereo,Mohamedal/put_banana,tkc79/so100_lego_box_2,samanthalhy/so100_herding_1,jlesein/TestBoulon7,pranavsaroha/so100_onelego2,pranavsaroha/so100_onelego3,pranavsaroha/so100_carrot_2,vladfatu/so100_above,koenvanwijk/orange50-1,CSCSXX/pick_place_cube_1.18,dragon-95/so100_sorting,dragon-95/so100_sorting_1,nbaron99/so100_pick_and_place4,Beegbrain/pick_place_green_block,dragon-95/so100_sorting_3,HITHY/so100_peach3,shreyasgite/so100_legocube_50,triton7777/so100_dataset_mix,NONHUMAN-RESEARCH/SOARM100_TASK_VENDA,mikechambers/block_cup_14,samsam0510/tooth_extraction_3,samsam0510/tooth_extraction_4,samsam0510/cube_reorientation_2,samsam0510/cube_reorientation_4,samsam0510/glove_reorientation_1,vladfatu/so100_office,pranavsaroha/so100_legos4,Ityl/so100_recording2,FeiYjf/new_GtoR,dragon-95/so100_sorting_2,HITHY/so100_peach4,jpata/so100_pick_place_tangerine,HITHY/so100_strawberry,shreyasgite/so100_base_env,koenvanwijk/orange50-variation-2,pranavsaroha/so100_carrot_5,pandaRQ/pick_med_1,aractingi/push_cube_offline_data,DorayakiLin/so100_pick_charger_on_tissue,zijian2022/noticehuman3,liuhuanjim013/so100_th",
+ "episodes": null,
+ "image_transforms": {
+ "enable": true,
+ "max_num_transforms": 10,
+ "random_order": false,
+ "transform_version": 0,
+ "image_size": 256,
+ "tfs": {
+ "resize_with_pad": {
+ "weight": 1.0,
+ "type": "ResizeWithPad",
+ "kwargs": {
+ "size": [
+ 256,
+ 256
+ ]
+ }
+ }
+ }
+ },
+ "local_files_only": true,
+ "use_imagenet_stats": false,
+ "video_backend": "pyav",
+ "sampling_weights": "",
+ "max_action_dim": 6,
+ "max_state_dim": 6,
+ "max_num_images": 3,
+ "max_image_dim": 256,
+ "train_on_all_features": true,
+ "features_version": 2,
+ "discard_first_n_frames": 0,
+ "min_fps": 30,
+ "max_fps": 30,
+ "discard_first_idle_frames": false,
+ "motion_threshold": 0.05,
+ "motion_window_size": 10,
+ "motion_buffer": 3
+ },
+ "env": null,
+ "policy": {
+ "type": "smolvla",
+ "n_obs_steps": 1,
+ "normalization_mapping": {
+ "VISUAL": "IDENTITY",
+ "STATE": "MEAN_STD",
+ "ACTION": "MEAN_STD"
+ },
+ "input_features": {
+ "observation.state": {
+ "type": "STATE",
+ "shape": [
+ 6
+ ]
+ },
+ "observation.image2": {
+ "type": "VISUAL",
+ "shape": [
+ 3,
+ 256,
+ 256
+ ]
+ },
+ "observation.image": {
+ "type": "VISUAL",
+ "shape": [
+ 3,
+ 256,
+ 256
+ ]
+ },
+ "observation.image3": {
+ "type": "VISUAL",
+ "shape": [
+ 3,
+ 256,
+ 256
+ ]
+ }
+ },
+ "output_features": {
+ "action": {
+ "type": "ACTION",
+ "shape": [
+ 6
+ ]
+ }
+ },
+ "chunk_size": 50,
+ "n_action_steps": 1,
+ "max_state_dim": 32,
+ "max_action_dim": 32,
+ "resize_imgs_with_padding": [
+ 512,
+ 512
+ ],
+ "empty_cameras": 0,
+ "adapt_to_pi_aloha": false,
+ "use_delta_joint_actions_aloha": false,
+ "tokenizer_max_length": 48,
+ "num_steps": 10,
+ "use_cache": true,
+ "freeze_vision_encoder": true,
+ "train_expert_only": true,
+ "train_state_proj": true,
+ "optimizer_lr": 0.0001,
+ "optimizer_betas": [
+ 0.9,
+ 0.95
+ ],
+ "optimizer_eps": 1e-08,
+ "optimizer_weight_decay": 1e-10,
+ "optimizer_grad_clip_norm": 10,
+ "scheduler_warmup_steps": 1000,
+ "scheduler_decay_steps": 30000,
+ "scheduler_decay_lr": 2.5e-06,
+ "vlm_model_name": "HuggingFaceTB/SmolVLM2-500M-Video-Instruct",
+ "load_vlm_weights": true,
+ "attention_mode": "cross_attn",
+ "prefix_length": 0,
+ "past_obs_keys": "image",
+ "pad_language_to": "max_length",
+ "num_expert_layers": 0,
+ "num_vlm_layers": 16,
+ "causal_action_attention_mask": true,
+ "self_attn_every_n_layers": 2,
+ "expert_width_multiplier": 0.75
+ },
+ "output_dir": "/lustre/fswork/projects/rech/dyf/ugz83ue/logs/lerobot/lerobot_so100_community_v1_v2_v3clean2_smolpi0_lr1e-4bs64steps400000gpus4freeze32_imgtoktrue_cross_attn_gap1_vlml16_causalacttrue_sa2_smolvlm2500_nobs1_expw0.75_feat2_lrvlm1e-4_trans0true_decaylr2.5e-630000_camfalse_fps3030_idlefalse",
+ "job_name": "smolvla",
+ "resume": false,
+ "overwrite": false,
+ "device": "cuda",
+ "use_amp": true,
+ "seed": 1000,
+ "num_workers": 4,
+ "batch_size": 64,
+ "eval_freq": 5000,
+ "log_freq": 200,
+ "save_checkpoint": true,
+ "save_freq": 20000,
+ "offline": {
+ "steps": 400000
+ },
+ "online": {
+ "steps": 0,
+ "rollout_n_episodes": 1,
+ "rollout_batch_size": 1,
+ "steps_between_rollouts": null,
+ "sampling_ratio": 0.5,
+ "env_seed": null,
+ "buffer_capacity": null,
+ "buffer_seed_size": 0,
+ "do_rollout_async": false
+ },
+ "use_policy_training_preset": true,
+ "optimizer": {
+ "type": "adamw",
+ "lr": 0.0001,
+ "weight_decay": 1e-10,
+ "grad_clip_norm": 10,
+ "betas": [
+ 0.9,
+ 0.95
+ ],
+ "eps": 1e-08
+ },
+ "scheduler": {
+ "type": "cosine_decay_with_warmup",
+ "num_warmup_steps": 1000,
+ "num_decay_steps": 30000,
+ "peak_lr": 0.0001,
+ "decay_lr": 2.5e-06
+ },
+ "eval": {
+ "n_episodes": 50,
+ "batch_size": 50,
+ "use_async_envs": false
+ },
+ "wandb": {
+ "enable": false,
+ "disable_artifact": false,
+ "project": "lerobot",
+ "entity": null,
+ "notes": null
+ },
+ "nccl_timeout": 9000,
+ "gradient_accumulation_steps": 1,
+ "torch_compile": true,
+ "save_on_eval": false,
+ "dataloader_drop_last": true,
+ "eval_mse": true,
+ "eval_mse_steps": 1000
+ }
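
For reference, the `cosine_decay_with_warmup` scheduler configured above (linear warmup over 1000 steps to a peak of 1e-4, cosine decay to 2.5e-6 over 30000 steps, then constant) can be sketched as a plain function. This illustrates the configured numbers and is not the LeRobot implementation:

```python
import math


def lr_at_step(step: int, peak_lr: float = 1e-4, decay_lr: float = 2.5e-6,
               num_warmup_steps: int = 1000, num_decay_steps: int = 30000) -> float:
    """Learning rate at a given training step under warmup + cosine decay."""
    if step < num_warmup_steps:
        return peak_lr * step / num_warmup_steps
    if step >= num_decay_steps:
        return decay_lr
    progress = (step - num_warmup_steps) / (num_decay_steps - num_warmup_steps)
    return decay_lr + 0.5 * (peak_lr - decay_lr) * (1 + math.cos(math.pi * progress))


for s in (0, 500, 1000, 15000, 30000, 400000):
    print(s, f"{lr_at_step(s):.2e}")
```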