metadata

license: apache-2.0
datasets:
  - masato-ka/smolvla_block_instruction
tags:
  - robotics
  - SmolVLA
  - lerobot
pipeline_tag: robotics

Model Card for smolvla_block_instruction

A SmolVLA policy trained for block handling from text instructions.

How to Get Started with the Model

See the LeRobot library.

We strongly recommend reproducing the environment shown in the video. I used the built-in camera of a MacBook Air M2, and inference also ran on a MacBook Air M2 (16 GB). You can run the model with the command below. Set the instruction through the control.single_task property.

python lerobot/scripts/control_robot.py \
  --robot.type=so100 \
  --control.type=record \
  --control.fps=30 \
  --control.single_task="Transfer the blue block onto the yellow plate." \
  --control.repo_id=<YOUR EVAL DATASET> \
  --control.warmup_time_s=5 \
  --control.episode_time_s=60 \
  --control.reset_time_s=10 \
  --control.num_episodes=1 \
  --control.push_to_hub=false \
  --control.policy.path=masato-ka/smolvla_block_instruct \
  --control.display_data=true \
  --control.policy.device=mps

This model was trained with the following instructions.

  - Transfer the blue block onto the yellow plate.
  - Position the blue block atop the yellow plate.
  - Set the blue block down on the yellow plate.
  - Place blue block on yellow plate.
  - Blue block goes on the yellow plate!
  - Put the blue one on the yellow thing.
  - Yellow plate for the blue block!
  - Completely remove the blue block from the yellow plate.
  - The blue block must be taken away from the yellow plate.
  - Dislodge the blue block from the yellow plate entirely.
  - Get that blue block off the yellow plate!
  - Take the blue thing away from the yellow one.
  - Blue's gotta go from the yellow plate!
  - Remove blue block from yellow plate.
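
Because the instruction is passed as a single command-line argument, switching between the "place" and "remove" tasks only changes the control.single_task value. A minimal sketch of assembling the evaluation command from Python, assuming the same flags as the command above; the helper name build_eval_command is hypothetical and not part of LeRobot:

```python
import shlex

# Two of the training instructions listed above, copied verbatim.
PLACE = "Transfer the blue block onto the yellow plate."
REMOVE = "Remove blue block from yellow plate."


def build_eval_command(instruction, eval_repo_id):
    """Return the argument list for one evaluation episode.

    Hypothetical helper: it only mirrors the flags shown in this card
    and does not call into LeRobot itself.
    """
    return [
        "python",
        "lerobot/scripts/control_robot.py",
        "--robot.type=so100",
        "--control.type=record",
        "--control.fps=30",
        f"--control.single_task={instruction}",
        f"--control.repo_id={eval_repo_id}",
        "--control.warmup_time_s=5",
        "--control.episode_time_s=60",
        "--control.reset_time_s=10",
        "--control.num_episodes=1",
        "--control.push_to_hub=false",
        "--control.policy.path=masato-ka/smolvla_block_instruct",
        "--control.display_data=true",
        "--control.policy.device=mps",
    ]


# The repo id stays a placeholder; replace it with your own dataset.
cmd = build_eval_command(REMOVE, "<YOUR EVAL DATASET>")

# shlex.join quotes the instruction so it survives as one shell argument.
print(shlex.join(cmd))
```

Keeping the arguments as a list, rather than one formatted string, means the list can be handed to subprocess.run(cmd) without any manual quoting of the instruction.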

Training Details

Trained with LeRobot@b536f47.

The model was trained with LeRobot's training script on the masato-ka/so100_nlact_block_instruct_v3 dataset, using this command:

!python lerobot/scripts/train.py \
  --dataset.repo_id=masato-ka/so100_nlact_block_instruct_v3 \
  --policy.path=lerobot/smolvla_base \
  --batch_size=8 \
  --output_dir=outputs/train/smolvla \
  --job_name=smolvla_exp03 \
  --policy.device=cuda \
  --steps=40000 \
  --save_freq=20000 \
  --wandb.enable=true \
  --wandb.project=smolvla_test

This took about 3h to train on an NVIDIA A100.
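
For a rough sense of scale, the training flags and the reported wall time can be combined into a throughput estimate. This is a back-of-the-envelope sketch: it treats "about 3h" as exactly 3 hours, and it assumes a checkpoint is written at every multiple of save_freq, which is an assumption about the training script rather than something stated here.

```python
# Values taken from the training command and the reported training time.
steps = 40_000
batch_size = 8
save_freq = 20_000
train_hours = 3.0  # "about 3h", treated as exact for this estimate

# Total samples drawn from the dataset over the whole run.
samples_seen = steps * batch_size

# Average optimizer steps and samples processed per second.
train_seconds = train_hours * 3600
steps_per_second = steps / train_seconds
samples_per_second = samples_seen / train_seconds

# Assumed checkpoint steps: every multiple of save_freq up to the last step.
checkpoint_steps = list(range(save_freq, steps + 1, save_freq))

print(f"samples seen:       {samples_seen}")
print(f"steps per second:   {steps_per_second:.2f}")
print(f"samples per second: {samples_per_second:.2f}")
print(f"checkpoint steps:   {checkpoint_steps}")
```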