---
title: RobotHub Arena Frontend
tags:
- robotics
- control
- simulation
- svelte
- frontend
- realtime
emoji: πŸ€–
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 8000
pinned: true
license: mit
fullWidth: true
short_description: Web interface of the RobotHub platform
---
# πŸ€– RobotHub Arena – Frontend
RobotHub is an **open-source, end-to-end robotics stack** that combines real-time communication, 3-D visualisation, and modern AI policies to control both simulated and physical robots.
**This repository contains the *Frontend*** – a SvelteKit web application that runs completely in the browser (or inside Electron / Tauri). It talks to two backend micro-services that live in their own repositories:
1. **[RobotHub Transport Server](https://github.com/julien-blanchon/RobotHub-TransportServer)**
– WebSocket / WebRTC switch-board for video streams & robot joint messages.
2. **[RobotHub Inference Server](https://github.com/julien-blanchon/RobotHub-InferenceServer)**
– FastAPI service that loads large language- and vision-based policies (ACT, Pi-0, SmolVLA, …) and turns camera images + state into joint commands.
```text
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”          β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”              β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ RobotHub Frontend    β”‚   HTTP   β”‚ Transport Server     β”‚  WebSocket   β”‚ Robot / Camera HW      β”‚
β”‚ (this repo)          β”‚ ◄──────► β”‚ (rooms, WS, WebRTC)  β”‚ ◄──────────► β”‚ – servo bus, USB…      β”‚
β”‚ 3-D scene (Threlte)  β”‚          β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜              β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”‚ UI / Settings        β”‚          β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”              β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Svelte 5 runes       β”‚   HTTP   β”‚ Inference Server     β”‚   HTTP/WS    β”‚ GPU (Torch, HF models) β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ ◄──────► β”‚ (FastAPI, PyTorch)   β”‚ ◄──────────► β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                                   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
```
---
## ✨ Key Features
β€’ **Digital-Twin 3-D Scene** – inspect robots, cameras & AI compute blocks in real time.
β€’ **Multi-Workspace Collaboration** – share a hash URL and others join the *same* WS rooms instantly.
β€’ **Drag-&-Drop Add-ons** – spawn robots, cameras or AI models from the toolbar.
β€’ **Transport-Agnostic** – control physical hardware over USB, or send/receive via WebRTC rooms.
β€’ **Model Agnostic** – any policy exposed by the Inference Server can be used (ACT, Diffusion, …).
β€’ **Reactive Core** – built with *Svelte 5 runes* – state is automatically pushed into the UI.
---
## πŸ“‚ Repository Layout (short)
| Path | Purpose |
|-------------------------------|---------|
| `src/` | SvelteKit app (routes, components) |
| `src/lib/elements` | Runtime domain logic (robots, video, compute) |
| `external/RobotHub-*` | Git sub-modules for the backend services – used for generated clients & tests |
| `static/` | URDFs, STL meshes, textures, favicon |
A more in-depth component overview can be found in `/src/lib/components/**` – every major popup/modal has its own Svelte file.
---
## πŸš€ Quick Start (dev)
```bash
# 1. clone with submodules (transport + inference)
$ git clone --recurse-submodules https://github.com/julien-blanchon/RobotHub-Frontend robothub-frontend
$ cd robothub-frontend
# 2. install deps (uses Bun)
$ bun install
# 3. start dev server (http://localhost:5173)
$ bun run dev -- --open
```
### Running the full stack locally
```bash
# 1. start Transport Server (rooms & streaming)
$ cd external/RobotHub-InferenceServer/external/RobotHub-TransportServer/server
$ uv run launch_with_ui.py # β†’ http://localhost:8000
# 2. start Inference Server (AI brains)
$ cd ../../..
$ python launch_simple.py # β†’ http://localhost:8001
# 3. frontend (separate terminal)
$ bun run dev -- --open # β†’ http://localhost:5173 (hash = workspace-id)
```
The **workspace-id** in the URL hash ties all three services together. Share `http://localhost:5173/#<id>` and a collaborator instantly joins the same set of rooms.
---
## πŸ› οΈ Usage Walk-Through
1. **Open the web-app** β†’ a fresh *workspace* is created (the 🌐 badge in the top-left corner shows its ID).
2. Click *Add Robot* β†’ spawns an SO-100 6-DoF arm (URDF).
3. Click *Add Sensor β†’ Camera* β†’ creates a virtual camera element.
4. Click *Add Model β†’ ACT* β†’ spawns a *Compute* block.
5. On the Compute block choose *Create Session* – select model path (`./checkpoints/act_so101_beyond`) and cameras (`front`).
6. Connect:
β€’ *Video Input* – local webcam β†’ `front` room.
β€’ *Robot Input* – robot β†’ *joint-input* room (producer).
β€’ *Robot Output* – robot ← AI predictions (consumer).
7. Press *Start Inference* – the model will predict the next joint trajectory every few frames. πŸŽ‰
All modals (`AISessionConnectionModal`, `RobotInputConnectionModal`, …) expose precisely what is happening under the hood: which room ID, whether you are *producer* or *consumer*, and the live status.
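Step 6's *Video Input* relies on the standard browser media API to grab the local webcam before the `VideoManager` publishes it as a WebRTC producer. A minimal sketch – the `publishToRoom` helper is hypothetical, the actual wiring lives inside `VideoManager`:
```ts
// Sketch: capture the local webcam before handing it to the video pipeline.
// `publishToRoom` is hypothetical – the real publish call lives in VideoManager.
async function attachLocalWebcam(): Promise<MediaStream> {
  const stream = await navigator.mediaDevices.getUserMedia({
    video: { width: 1280, height: 720 },
    audio: false,
  });
  // Hand the stream to the manager, which acts as the WebRTC producer
  // for the `front` camera room created by the AI session.
  // await videoManager.publishToRoom('front', stream);
  return stream;
}
```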
---
## 🧩 Package Relations
| Package | Role | Artifacts exposed to this repo |
|---------|------|--------------------------------|
| **Transport Server** | Low-latency switch-board (WS/WebRTC). Creates *rooms* for video & joint messages. | TypeScript & Python client libraries (imported from sub-module) |
| **Inference Server** | Loads checkpoints (ACT, Pi-0, …) and manages *sessions*. Each session automatically asks the Transport Server to create dedicated rooms. | Generated TS SDK (`@robothub/inference-server-client`) – auto-called from `RemoteComputeManager` |
| **Frontend (this repo)** | UI + 3-D scene. Manages *robots*, *videos* & *compute* blocks and connects them to the correct rooms. | – |
> Because the two backend repos are included as git sub-modules you can develop & debug the whole trio in one repo clone.
---
## πŸ“œ Important Components (frontend)
β€’ `RemoteComputeManager` – wraps the Inference Server REST API.
β€’ `RobotManager` – talks to Transport Server and USB drivers.
β€’ `VideoManager` – handles local/remote camera streams and WebRTC.
Each element is a small class with `$state` fields which Svelte 5 picks up automatically. The modals listed below are *thin* UI shells around those classes:
```
AISessionConnectionModal – create / start / stop AI sessions
RobotInputConnectionModal – joint-states β†’ AI
RobotOutputConnectionModal – AI commands β†’ robot
VideoInputConnectionModal – camera β†’ AI or screen
ManualControlSheet – slider control, runs when no consumer connected
SettingsSheet – configure base URLs of the two servers
```
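For illustration, a minimal sketch of such a reactive element class – the names (`RobotElement`, `jointValues`) are hypothetical, and the file must be a `.svelte.ts` module for runes to compile:
```ts
// Minimal sketch of a domain class using Svelte 5 runes (hypothetical names).
// Components that read these fields re-render automatically – no stores needed.
export class RobotElement {
  name = $state('so100-arm');
  connected = $state(false);
  jointValues = $state<Record<string, number>>({});

  connect() {
    this.connected = true; // any UI bound to `connected` updates instantly
  }

  setJoint(joint: string, value: number) {
    this.jointValues = { ...this.jointValues, [joint]: value };
  }
}
```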
---
## 🐳 Docker
A production-grade image is provided (multi-stage, 24 MB with Bun runtime):
```bash
$ docker build -t robothub-frontend .
$ docker run -p 8000:8000 robothub-frontend # served by vite-preview
```
See `Dockerfile` for the full build – it also performs `bun test` & `bun run build` for the TS clients inside the sub-modules so that the image is completely self-contained.
---
## πŸ§‘β€πŸ’» Contributing
PRs are welcome! The codebase is organised into **domain managers** (robot / video / compute) and **pure-UI** components. If you add a new feature, create a manager first so that the business logic can be unit-tested without the DOM.
1. `bun test` – unit tests.
2. `bun run typecheck` – strict TS config.
Please run `bun format` before committing – ESLint + Prettier configs are included.
---
## πŸ™ Special Thanks
Huge gratitude to [Tim Qian](https://github.com/timqian) ([X/Twitter](https://x.com/tim_qian)) and the
[bambot project](https://bambot.org/) for open-sourcing **feetech.js** – the
delightful js driver that powers our USB communication layer.
---
## πŸ“„ License
MIT – see `LICENSE` in the root.
## 🌱 Project Philosophy
RobotHub follows a **separation-of-concerns** design:
* **Transport Server** is the single source of truth for *real-time* data – video frames, joint values, heart-beats. Every participant (browser, Python script, robot firmware) only needs one WebSocket/WebRTC connection, no matter how many peers join later.
* **Inference Server** is stateless with regard to connectivity; it spins up / tears down *sessions* that rely on rooms in the Transport Server. This lets heavy AI models live on a GPU box while cameras and robots stay on the edge.
* **Frontend** stays 100 % in the browser – no secret keys or device drivers required – and simply wires together rooms that already exist.
> By decoupling the pipeline we can deploy each piece on separate hardware or even different clouds, swap alternative implementations (e.g. ROS bridge instead of WebRTC) and scale each micro-service independently.
---
## πŸ›° Transport Server – Real-Time Router
```
Browser / Robot ⟷ 🌐 Transport Server ⟷ Other Browser / AI / HW
```
* **Creates rooms** – `POST /robotics/workspaces/{ws}/rooms` or `POST /video/workspaces/{ws}/rooms`.
* **Manages roles** – every WebSocket identifies as *producer* (source) or *consumer* (sink).
* **Does zero processing** – it only forwards JSON (robotics) or WebRTC SDP/ICE (video).
* **Health-check** – `GET /api/health` returns a JSON heartbeat.
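A minimal TypeScript sketch of hitting those endpoints with plain `fetch` – only the paths come from the list above; the response field `id` is an assumption:
```ts
// Sketch: create a robotics room and ping the health endpoint.
const TRANSPORT_URL = 'http://localhost:8000';

async function createRoboticsRoom(workspaceId: string): Promise<string> {
  const res = await fetch(
    `${TRANSPORT_URL}/robotics/workspaces/${workspaceId}/rooms`,
    { method: 'POST' },
  );
  const room = await res.json();
  return room.id; // assumed field name – check the Transport Server API docs
}

async function isTransportHealthy(): Promise<boolean> {
  const res = await fetch(`${TRANSPORT_URL}/api/health`);
  return res.ok; // the body is a JSON heartbeat
}
```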
Why useful?
* You never expose robot hardware directly to the internet – it only speaks to the Transport Server.
* Multiple followers can subscribe to the *same* producer without extra bandwidth on the producer side (server fans out messages).
* Works across NAT thanks to WebRTC TURN support.
## 🏒 Workspaces – Lightweight Multi-Tenant Isolation
A **workspace** is simply a UUID namespace in the Transport Server. Every room URL starts with:
```
/robotics/workspaces/{workspace_id}/rooms/{room_id}
/video/workspaces/{workspace_id}/rooms/{room_id}
```
Why bother?
1. **Privacy / Security** – clients in workspace *A* can neither list nor join rooms from workspace *B*. The workspace ID acts like a shared secret: only clients that know it can reach the rooms inside that workspace.
2. **Organisation** – keep each class, project or experiment separated without spinning up extra servers.
3. **Zero-config sharing** – the Frontend stores the workspace ID in the URL hash (e.g. `/#d742e85d-c9e9-4f7b-…`). Send that link to a teammate and they automatically connect to the *same* namespace – all existing video feeds, robot rooms and AI sessions become visible.
4. **Stateless Scale-out** – Transport Server holds no global state; deleting a workspace removes all rooms in one call.
Typical lifecycle:
* **Create** – Frontend generates `crypto.randomUUID()` if the hash is empty. Back-end rooms are lazily created when the first producer/consumer calls the REST API.
* **Share** – click the *#workspace* badge β†’ *Copy URL* (handled by `WorkspaceIdButton.svelte`).
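A minimal sketch of that create/share logic (simplified – the real implementation lives in the frontend routing and `WorkspaceIdButton.svelte`):
```ts
// Sketch: resolve the workspace ID from the URL hash, creating one if missing.
function resolveWorkspaceId(): string {
  const fromHash = window.location.hash.replace(/^#/, '');
  if (fromHash) return fromHash;

  const id = crypto.randomUUID();
  window.location.hash = id; // the URL can now be copied and shared
  return id;
}

// Sharing is just copying the current URL:
async function copyWorkspaceUrl(): Promise<void> {
  await navigator.clipboard.writeText(window.location.href);
}
```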
> Practical tip: Use one workspace per demo to prevent collisions, then recycle it afterwards.
---
## 🧠 Inference Server – Session Lifecycle
1. **Create session**
`POST /api/sessions` with JSON:
```jsonc
{
"session_id": "pick_place_demo",
"policy_path": "./checkpoints/act_so101_beyond",
"camera_names": ["front", "wrist"],
"transport_server_url": "http://localhost:8000",
"workspace_id": "<existing-or-new>" // optional
}
```
2. **Receive response**
```jsonc
{
"workspace_id": "ws-uuid",
"camera_room_ids": { "front": "room-id-a", "wrist": "room-id-b" },
"joint_input_room_id": "room-id-c",
"joint_output_room_id": "room-id-d"
}
```
3. **Wire connections**
* Camera PC joins `front` / `wrist` rooms as **producer** (WebRTC).
* Robot joins `joint_input_room_id` as **producer** (joint states).
* Robot (or simulator) joins `joint_output_room_id` as **consumer** (commands).
4. **Start inference**
`POST /api/sessions/{id}/start` – server loads the model and begins publishing commands.
5. **Stop / delete** as needed. Stats & health are available via `GET /api/sessions`.
The Frontend automates steps 1-4 via the *AI Session* modal – you only click buttons.
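If you prefer scripting the same lifecycle, here is a minimal sketch using plain `fetch` against the endpoints above – field names follow the JSON examples, and the generated TS SDK wraps the same calls:
```ts
// Sketch of steps 1–4 without the UI; assumes the Inference Server on port 8001.
const INFERENCE_URL = 'http://localhost:8001';

interface SessionRooms {
  workspace_id: string;
  camera_room_ids: Record<string, string>;
  joint_input_room_id: string;
  joint_output_room_id: string;
}

async function createAndStartSession(): Promise<SessionRooms> {
  const res = await fetch(`${INFERENCE_URL}/api/sessions`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      session_id: 'pick_place_demo',
      policy_path: './checkpoints/act_so101_beyond',
      camera_names: ['front', 'wrist'],
      transport_server_url: 'http://localhost:8000',
    }),
  });
  const rooms: SessionRooms = await res.json();

  // Step 3 happens out of band: cameras and the robot join the returned rooms.

  await fetch(`${INFERENCE_URL}/api/sessions/pick_place_demo/start`, { method: 'POST' });
  return rooms;
}
```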
---
## 🌐 Hosted Demo End-Points
| Service | URL | Status |
|---------|-----|--------|
| Transport Server | <https://blanchon-robothub-transportserver.hf.space/api> | Public & healthy |
| Inference Server | <https://blanchon-robothub-inferenceserver.hf.space/api> | `{"status":"healthy"}` |
| Frontend (read-only preview) | <https://blanchon-robothub-frontend.hf.space> | latest `main` |
Point the *Settings β†’ Server Configuration* panel to these URLs and you can play without any local backend.
---
## 🎯 Main Use-Cases
Below are typical connection patterns you can set up **entirely from the UI**. Each example lists the raw data flow (β†’ = producer to consumer/AI) plus a video placeholder you can swap for a screen capture.
### Direct Tele-Operation (Leader ➜ Follower)
*Leader PC* `USB` ➜ **Robot A** ➜ `Remote producer` β†’ **Transport room** β†’ `Remote consumer` ➜ **Robot B** (`USB`)
> One human moves Robot A and Robot B mirrors the motion in real time. Works with any number of followers – just add more consumers to the same room.
>
> πŸ“Ί *demo-teleop-1.mp4*
### Web-UI Manual Control
**Browser sliders** (`ManualControlSheet`) β†’ `Remote producer` β†’ **Robot (USB)**
> No physical master arm needed – drive joints from any device.
>
> πŸ“Ί *demo-webui.mp4*
### AI Inference Loop
**Robot (USB)** ➜ `Remote producer` β†’ **joint-input room**
**Camera PC** ➜ `Video producer` β†’ **camera room(s)**
**Inference Server** (consumer) β†’ processes β†’ publishes to **joint-output room** β†’ `Remote consumer` ➜ **Robot**
> Lets a low-power robot PC stream data while a beefy GPU node does the heavy lifting.
>
> πŸ“Ί *demo-inference.mp4*
### Hybrid Classroom (Multi-Follower AI)
*Same as AI Inference Loop* with additional **Robot C, D…** subscribing to `joint_output_room_id` to run the same policy in parallel.
> Useful for swarm behaviours or classroom demonstrations.
>
> πŸ“Ί *demo-classroom.mp4*
### Split Video / Robot Across Machines
**Laptop A** (near cameras) β†’ streams video β†’ Transport
**Laptop B** (near robot) β†’ joins joint rooms
**Browser** anywhere β†’ watches video consumer & sends manual overrides
> Ideal when the camera PC stays close to sensors and you want minimal upstream bandwidth.
>
> πŸ“Ί *demo-splitio.mp4*