---
title: RobotHub Arena Frontend
tags:
- robotics
- control
- simulation
- svelte
- frontend
- realtime
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 8000
pinned: true
license: mit
fullWidth: true
short_description: Web interface of the RobotHub platform
---
# 🤖 RobotHub Arena – Frontend

RobotHub is an **open-source, end-to-end robotics stack** that combines real-time communication, 3-D visualisation, and modern AI policies to control both simulated and physical robots.

**This repository contains the *Frontend*** – a SvelteKit web application that runs completely in the browser (or inside Electron / Tauri). It talks to two backend micro-services that live in their own repositories:

1. **[RobotHub Transport Server](https://github.com/julien-blanchon/RobotHub-TransportServer)**
   – WebSocket / WebRTC switchboard for video streams & robot joint messages.
2. **[RobotHub Inference Server](https://github.com/julien-blanchon/RobotHub-InferenceServer)**
   – FastAPI service that loads large language- and vision-based policies (ACT, Pi-0, SmolVLA, …) and turns camera images + state into joint commands.
```text
┌──────────────────────┐        ┌────────────────────────┐             ┌──────────────────────────┐
│  RobotHub Frontend   │  HTTP  │    Transport Server    │  WebSocket  │     Robot / Camera HW    │
│     (this repo)      │ ◄────► │  (rooms, WS, WebRTC)   │ ◄─────────► │    servo bus, USB…       │
│                      │        └────────────────────────┘             └──────────────────────────┘
│  3-D scene (Threlte) │
│  UI / Settings       │        ┌────────────────────────┐             ┌──────────────────────────┐
│  Svelte 5 runes      │  HTTP  │    Inference Server    │   HTTP/WS   │  GPU (Torch, HF models)  │
└──────────────────────┘ ◄────► │   (FastAPI, PyTorch)   │ ◄─────────► └──────────────────────────┘
                                └────────────────────────┘
```
---

## ✨ Key Features

• **Digital-Twin 3-D Scene** – inspect robots, cameras & AI compute blocks in real time.
• **Multi-Workspace Collaboration** – share a hash URL and others instantly join the *same* WS rooms.
• **Drag-&-Drop Add-ons** – spawn robots, cameras or AI models from the toolbar.
• **Transport-Agnostic** – control physical hardware over USB, or send/receive via WebRTC rooms.
• **Model-Agnostic** – any policy exposed by the Inference Server can be used (ACT, Diffusion, …).
• **Reactive Core** – built with *Svelte 5 runes*; state changes are pushed into the UI automatically.

---
## 📂 Repository Layout (short)

| Path | Purpose |
|-------------------------------|---------|
| `src/` | SvelteKit app (routes, components) |
| `src/lib/elements` | Runtime domain logic (robots, video, compute) |
| `external/RobotHub-*` | Git submodules for the backend services – used for generated clients & tests |
| `static/` | URDFs, STL meshes, textures, favicon |

A more in-depth component overview can be found in `/src/lib/components/**` – every major popup/modal has its own Svelte file.

---
## 🚀 Quick Start (dev)

```bash
# 1. clone with submodules (transport + inference)
$ git clone --recurse-submodules https://github.com/julien-blanchon/RobotHub-Frontend robothub-frontend
$ cd robothub-frontend

# 2. install deps (uses Bun)
$ bun install

# 3. start dev server (http://localhost:5173)
$ bun run dev -- --open
```
### Running the full stack locally

```bash
# 1. start Transport Server (rooms & streaming)
$ cd external/RobotHub-InferenceServer/external/RobotHub-TransportServer/server
$ uv run launch_with_ui.py          # → http://localhost:8000

# 2. start Inference Server (AI brains)
$ cd ../../..
$ python launch_simple.py           # → http://localhost:8001

# 3. frontend (separate terminal)
$ bun run dev -- --open             # → http://localhost:5173 (hash = workspace-id)
```

The **workspace-id** in the URL hash ties all three services together. Share `http://localhost:5173/#<id>` and a collaborator instantly joins the same set of rooms.

---
## 🛠️ Usage Walk-Through

1. **Open the web app** – a fresh *workspace* is created (the left corner shows the workspace ID).
2. Click *Add Robot* – spawns an SO-100 6-DoF arm (URDF).
3. Click *Add Sensor → Camera* – creates a virtual camera element.
4. Click *Add Model → ACT* – spawns a *Compute* block.
5. On the Compute block choose *Create Session* – select the model path (`./checkpoints/act_so101_beyond`) and cameras (`front`).
6. Connect:
   • *Video Input* – local webcam → `front` room.
   • *Robot Input* – robot → *joint-input* room (producer).
   • *Robot Output* – robot ← AI predictions (consumer).
7. Press *Start Inference* – the model predicts the next joint trajectory every few frames. 🎉

All modals (`AISessionConnectionModal`, `RobotInputConnectionModal`, …) expose precisely what is happening under the hood: which room ID is in use, whether you are *producer* or *consumer*, and the live status.

---
## 🧩 Package Relations

| Package | Role | Artifacts exposed to this repo |
|---------|------|--------------------------------|
| **Transport Server** | Low-latency switchboard (WS/WebRTC). Creates *rooms* for video & joint messages. | TypeScript & Python client libraries (imported from submodule) |
| **Inference Server** | Loads checkpoints (ACT, Pi-0, …) and manages *sessions*. Each session automatically asks the Transport Server to create dedicated rooms. | Generated TS SDK (`@robothub/inference-server-client`) – auto-called from `RemoteComputeManager` |
| **Frontend (this repo)** | UI + 3-D scene. Manages *robots*, *videos* & *compute* blocks and connects them to the correct rooms. | – |

> Because the two backend repos are included as git submodules you can develop & debug the whole trio in one repo clone.
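
If you cloned without `--recurse-submodules`, you can fetch the submodules afterwards with `git submodule update --init --recursive`.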
---
## 🔑 Important Components (frontend)

• `RemoteComputeManager` – wraps the Inference Server REST API.
• `RobotManager` – talks to the Transport Server and USB drivers.
• `VideoManager` – handles local/remote camera streams and WebRTC.

Each element is a small class with `$state` fields which Svelte 5 picks up automatically. The modals listed below are *thin* UI shells around those classes:

```
AISessionConnectionModal    → create / start / stop AI sessions
RobotInputConnectionModal   → joint-states → AI
RobotOutputConnectionModal  → AI commands → robot
VideoInputConnectionModal   → camera → AI or screen
ManualControlSheet          → slider control, runs when no consumer is connected
SettingsSheet               → configure base URLs of the two servers
```
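
For readers unfamiliar with that pattern, here is a minimal sketch of such a class (field and method names are illustrative, not the actual implementation). In Svelte 5, `$state` fields declared in a `.svelte.ts` module are deeply reactive, so any component that reads them re-renders automatically:

```ts
// robot-element.svelte.ts – hypothetical sketch of the manager pattern
export class RobotElement {
	// Reactive fields: Svelte 5 tracks reads and writes automatically.
	jointAngles = $state<number[]>([]);
	status = $state<'disconnected' | 'connecting' | 'connected'>('disconnected');

	// Derived state recomputes whenever its dependencies change.
	isReady = $derived(this.status === 'connected');

	setJoint(index: number, value: number) {
		this.jointAngles[index] = value; // deep mutation is picked up by the UI
	}
}
```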
---

## 🐳 Docker

A production-grade image is provided (multi-stage, 24 MB with the Bun runtime):

```bash
$ docker build -t robothub-frontend .
$ docker run -p 8000:8000 robothub-frontend  # served by vite-preview
```

See `Dockerfile` for the full build – it also performs `bun test` & `bun run build` for the TS clients inside the submodules so that the image is completely self-contained.

---
## 🧑‍💻 Contributing

PRs are welcome! The codebase is organised into **domain managers** (robot / video / compute) and **pure-UI** components. If you add a new feature, create a manager first so that the business logic can be unit-tested without the DOM.

1. `bun test` – unit tests.
2. `bun run typecheck` – strict TS config.

Please run `bun format` before committing – ESLint + Prettier configs are included.

---
## 🙏 Special Thanks

Huge gratitude to [Tim Qian](https://github.com/timqian) ([X/Twitter](https://x.com/tim_qian)) and the
[bambot project](https://bambot.org/) for open-sourcing **feetech.js** – the
delightful JS driver that powers our USB communication layer.

---

## 📄 License

MIT – see `LICENSE` in the root.
## 🌱 Project Philosophy

RobotHub follows a **separation-of-concerns** design:

* **Transport Server** is the single source of truth for *real-time* data – video frames, joint values, heartbeats. Every participant (browser, Python script, robot firmware) only needs one WebSocket/WebRTC connection, no matter how many peers join later.
* **Inference Server** is stateless with regard to connectivity; it spins up / tears down *sessions* that rely on rooms in the Transport Server. This lets heavy AI models live on a GPU box while cameras and robots stay on the edge.
* **Frontend** stays 100% in the browser – no secret keys or device drivers required – and simply wires together rooms that already exist.

> By decoupling the pipeline we can deploy each piece on separate hardware or even different clouds, swap in alternative implementations (e.g. a ROS bridge instead of WebRTC) and scale each micro-service independently.

---
## 🛰️ Transport Server – Real-Time Router

```
Browser / Robot ↔ 🌐 Transport Server ↔ Other Browser / AI / HW
```

* **Creates rooms** – `POST /robotics/workspaces/{ws}/rooms` or `POST /video/workspaces/{ws}/rooms`.
* **Manages roles** – every WebSocket identifies as *producer* (source) or *consumer* (sink).
* **Does zero processing** – it only forwards JSON (robotics) or WebRTC SDP/ICE (video).
* **Health-check** – `GET /api/health` returns a JSON heartbeat.
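
As a rough illustration of calling these endpoints from TypeScript (the paths come from the bullets above, but the request/response bodies are assumptions, so treat this as a sketch):

```ts
// Sketch: create a robotics room and check server health.
const BASE = 'http://localhost:8000';
const ws = crypto.randomUUID(); // workspace id

const res = await fetch(`${BASE}/robotics/workspaces/${ws}/rooms`, {
	method: 'POST',
	headers: { 'Content-Type': 'application/json' },
	body: JSON.stringify({}) // room options, if any, would go here
});
console.log('room:', await res.json());

const health = await fetch(`${BASE}/api/health`).then((r) => r.json());
console.log('health:', health); // JSON heartbeat
```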
Why is this useful?

* You never expose robot hardware directly to the internet – it only talks to the Transport Server.
* Multiple followers can subscribe to the *same* producer without extra bandwidth on the producer side (the server fans out messages).
* Works across NAT thanks to WebRTC TURN support.
## 🏢 Workspaces – Lightweight Multi-Tenant Isolation

A **workspace** is simply a UUID namespace in the Transport Server. Every room URL starts with:

```
/robotics/workspaces/{workspace_id}/rooms/{room_id}
/video/workspaces/{workspace_id}/rooms/{room_id}
```

Why bother?

1. **Privacy / Security** – clients in workspace *A* can neither list nor join rooms from workspace *B*. The workspace ID acts like a shared password: rooms are only reachable by clients that know it.
2. **Organisation** – keep each class, project or experiment separated without spinning up extra servers.
3. **Zero-config sharing** – the Frontend stores the workspace ID in the URL hash (e.g. `/#d742e85d-c9e9-4f7b-…`). Send that link to a teammate and they automatically connect to the *same* namespace – all existing video feeds, robot rooms and AI sessions become visible.
4. **Stateless scale-out** – the Transport Server holds no global state; deleting a workspace removes all its rooms in one call.
Typical lifecycle:

* **Create** – the Frontend generates `crypto.randomUUID()` if the hash is empty (see the sketch below). Backend rooms are lazily created when the first producer/consumer calls the REST API.
* **Share** – click the *#workspace* badge → *Copy URL* (handled by `WorkspaceIdButton.svelte`).
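
A minimal sketch of that create/share step, assuming the ID lives in `window.location.hash` as described above (the helper name is illustrative):

```ts
// Hypothetical helper: reuse the workspace ID from the URL hash,
// or mint a fresh UUID and make the URL shareable right away.
function getOrCreateWorkspaceId(): string {
	const fromHash = window.location.hash.slice(1); // drop the leading '#'
	if (fromHash) return fromHash;
	const id = crypto.randomUUID();
	window.location.hash = id;
	return id;
}

const shareUrl = `${window.location.origin}/#${getOrCreateWorkspaceId()}`;
```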
> Practical tip: use one workspace per demo to prevent collisions, then recycle it afterwards.
---

## 🧠 Inference Server – Session Lifecycle

1. **Create session**
   `POST /api/sessions` with JSON:

   ```jsonc
   {
     "session_id": "pick_place_demo",
     "policy_path": "./checkpoints/act_so101_beyond",
     "camera_names": ["front", "wrist"],
     "transport_server_url": "http://localhost:8000",
     "workspace_id": "<existing-or-new>" // optional
   }
   ```

2. **Receive response**

   ```jsonc
   {
     "workspace_id": "ws-uuid",
     "camera_room_ids": { "front": "room-id-a", "wrist": "room-id-b" },
     "joint_input_room_id": "room-id-c",
     "joint_output_room_id": "room-id-d"
   }
   ```

3. **Wire connections**
   * The camera PC joins the `front` / `wrist` rooms as **producer** (WebRTC).
   * The robot joins `joint_input_room_id` as **producer** (joint states).
   * The robot (or a simulator) joins `joint_output_room_id` as **consumer** (commands).
4. **Start inference**
   `POST /api/sessions/{id}/start` – the server loads the model and begins publishing commands.
5. **Stop / delete** as needed. Stats & health are available via `GET /api/sessions`.

The Frontend automates steps 1–4 via the *AI Session* modal – you only click buttons.
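
For scripting the same lifecycle outside the UI, a rough sketch against the endpoints and JSON shapes shown above (plain `fetch`, not the generated SDK; error handling omitted):

```ts
// Sketch of steps 1–4, assuming the local ports from the Quick Start.
const INFERENCE = 'http://localhost:8001';

// 1. Create the session.
const session = await fetch(`${INFERENCE}/api/sessions`, {
	method: 'POST',
	headers: { 'Content-Type': 'application/json' },
	body: JSON.stringify({
		session_id: 'pick_place_demo',
		policy_path: './checkpoints/act_so101_beyond',
		camera_names: ['front', 'wrist'],
		transport_server_url: 'http://localhost:8000'
	})
}).then((r) => r.json());

// 2. The response lists the Transport Server rooms to join (step 3).
console.log(session.camera_room_ids, session.joint_input_room_id);

// 4. Start inference once producers/consumers are wired up.
await fetch(`${INFERENCE}/api/sessions/pick_place_demo/start`, { method: 'POST' });
```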
---

## 🌐 Hosted Demo End-Points

| Service | URL | Status |
|---------|-----|--------|
| Transport Server | <https://blanchon-robothub-transportserver.hf.space/api> | Public & healthy |
| Inference Server | <https://blanchon-robothub-inferenceserver.hf.space/api> | `{"status":"healthy"}` |
| Frontend (read-only preview) | <https://blanchon-robothub-frontend.hf.space> | latest `main` |

Point the *Settings → Server Configuration* panel to these URLs and you can play without any local backend.
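
To verify the hosted Transport Server is reachable before pointing the UI at it, a quick probe using the `GET /api/health` route documented above:

```ts
// Reachability probe for the hosted Transport Server.
const health = await fetch(
	'https://blanchon-robothub-transportserver.hf.space/api/health'
).then((r) => r.json());
console.log(health); // expected: a JSON heartbeat
```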
---
## 🎯 Main Use-Cases

Below are typical connection patterns you can set up **entirely from the UI**. Each example lists the raw data flow (→ = producer to consumer/AI) plus a video placeholder you can swap for a screen capture.

### Direct Tele-Operation (Leader → Follower)

*Leader PC* `USB` → **Robot A** → `Remote producer` → **Transport room** → `Remote consumer` → **Robot B** (`USB`)

> One human moves Robot A; Robot B mirrors the motion in real time. Works with any number of followers – just add more consumers to the same room.
>
> 📺 *demo-teleop-1.mp4*

### Web-UI Manual Control

**Browser sliders** (`ManualControlSheet`) → `Remote producer` → **Robot (USB)**

> No physical master arm needed – drive joints from any device.
>
> 📺 *demo-webui.mp4*

### AI Inference Loop

**Robot (USB)** → `Remote producer` → **joint-input room**
**Camera PC** → `Video producer` → **camera room(s)**
**Inference Server** (consumer) → processes → publishes to **joint-output room** → `Remote consumer` → **Robot**

> Lets a low-power robot PC stream data while a beefy GPU node does the heavy lifting.
>
> 📺 *demo-inference.mp4*

### Hybrid Classroom (Multi-Follower AI)

*Same as AI Inference Loop*, with additional **Robot C, D…** subscribing to `joint_output_room_id` to run the same policy in parallel.

> Useful for swarm behaviours or classroom demonstrations.
>
> 📺 *demo-classroom.mp4*

### Split Video / Robot Across Machines

**Laptop A** (near the cameras) – streams video → Transport
**Laptop B** (near the robot) – joins the joint rooms
**Browser** anywhere – watches the video consumer & sends manual overrides

> Ideal when the camera PC stays close to the sensors and you want minimal upstream bandwidth.
>
> 📺 *demo-splitio.mp4*
> πΊ *demo-splitio.mp4* | |