---
title: RobotHub Arena Frontend
tags:
  - robotics
  - control
  - simulation
  - svelte
  - frontend
  - realtime
emoji: πŸ€–
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 8000
pinned: true
license: mit
fullWidth: true
short_description: Web interface of the RobotHub platform 
---

# πŸ€– RobotHub Arena – Frontend

RobotHub is an **open-source, end-to-end robotics stack** that combines real-time communication, 3-D visualisation, and modern AI policies to control both simulated and physical robots.

**This repository contains the *Frontend*** – a SvelteKit web application that runs completely in the browser (or inside Electron / Tauri).  It talks to two backend micro-services that live in their own repositories:

1. **[RobotHub Transport Server](https://github.com/julien-blanchon/RobotHub-TransportServer)**  
   – WebSocket / WebRTC switch-board for video streams & robot joint messages.
2. **[RobotHub Inference Server](https://github.com/julien-blanchon/RobotHub-InferenceServer)**  
   – FastAPI service that loads large language- and vision-based policies (ACT, Pi-0, SmolVLA, …) and turns camera images + state into joint commands.

```text
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”        β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”             β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  RobotHub Frontend   β”‚  HTTP  β”‚  Transport Server      β”‚  WebSocket  β”‚  Robot / Camera HW       β”‚
β”‚  (this repo)         β”‚ ◄─────►│  (rooms, WS, WebRTC)   β”‚ ◄──────────►│  servo bus, USB, …       β”‚
β”‚                      β”‚        β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜             β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”‚  3-D scene (Threlte) β”‚
β”‚  UI / Settings       β”‚        β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”             β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  Svelte 5 runes      β”‚  HTTP  β”‚  Inference Server      β”‚  HTTP/WS    β”‚  GPU (Torch, HF models)  β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ ◄─────►│  (FastAPI, PyTorch)    β”‚ ◄──────────►└──────────────────────────┘
                                β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
```

---

## ✨ Key Features

β€’ **Digital-Twin 3-D Scene** – inspect robots, cameras & AI compute blocks in real-time.  
β€’ **Multi-Workspace Collaboration** – share a hash URL and others join the *same* WS rooms instantly.  
β€’ **Drag-&-Drop Add-ons** – spawn robots, cameras or AI models from the toolbar.  
β€’ **Transport-Agnostic** – control physical hardware over USB, or send/receive via WebRTC rooms.  
β€’ **Model Agnostic** – any policy exposed by the Inference Server can be used (ACT, Diffusion, …).  
β€’ **Reactive Core** – built with *Svelte 5 runes* – state is automatically pushed into the UI.

---

## πŸ“‚ Repository Layout (short)

| Path                          | Purpose |
|-------------------------------|---------|
| `src/`                        | SvelteKit app (routes, components) |
| `src/lib/elements`            | Runtime domain logic (robots, video, compute) |
| `external/RobotHub-*`         | Git sub-modules for the backend services – used for generated clients & tests |
| `static/`                     | URDFs, STL meshes, textures, favicon |

A more in-depth component overview can be found in `/src/lib/components/**` – every major popup/modal has its own Svelte file.

---

## πŸš€ Quick Start (dev)

```bash
# 1. clone with submodules (transport + inference)
$ git clone --recurse-submodules https://github.com/julien-blanchon/RobotHub-Frontend robothub-frontend
$ cd robothub-frontend

# 2. install deps (uses Bun)
$ bun install

# 3. start dev server (http://localhost:5173)
$ bun run dev -- --open
```

### Running the full stack locally

```bash
# 1. start Transport Server (rooms & streaming)
$ cd external/RobotHub-InferenceServer/external/RobotHub-TransportServer/server
$ uv run launch_with_ui.py  # β†’  http://localhost:8000

# 2. start Inference Server (AI brains)
$ cd ../../..
$ python launch_simple.py   # β†’  http://localhost:8001

# 3. frontend (separate terminal)
$ bun run dev -- --open     # β†’  http://localhost:5173  (hash = workspace-id)
```

The **workspace-id** in the URL hash ties all three services together.  Share `http://localhost:5173/#<id>` and a collaborator instantly joins the same set of rooms.

---

## πŸ› οΈ Usage Walk-Through

1. **Open the web-app** β†’ a fresh *workspace* is created (the 🌐 badge in the top-left corner shows its ID).  
2. Click *Add Robot* β†’ spawns an SO-100 6-DoF arm (URDF).  
3. Click *Add Sensor β†’ Camera* β†’ creates a virtual camera element.  
4. Click *Add Model β†’ ACT* β†’ spawns a *Compute* block.
5. On the Compute block choose *Create Session* – select model path (`./checkpoints/act_so101_beyond`) and cameras (`front`).
6. Connect:  
   β€’ *Video Input* – local webcam β†’ `front` room.  
   β€’ *Robot Input* – robot β†’ *joint-input* room (producer).  
   β€’ *Robot Output* – robot ← AI predictions (consumer).
7. Press *Start Inference* – the model will predict the next joint trajectory every few frames. πŸŽ‰

All modals (`AISessionConnectionModal`, `RobotInputConnectionModal`, …) expose precisely what is happening under the hood: which room ID, whether you are *producer* or *consumer*, and the live status.

---

## 🧩 Package Relations

| Package | Role | Artifacts exposed to this repo |
|---------|------|--------------------------------|
| **Transport Server** | Low-latency switch-board (WS/WebRTC).  Creates *rooms* for video & joint messages. | TypeScript & Python client libraries (imported from sub-module) |
| **Inference Server** | Loads checkpoints (ACT, Pi-0, …) and manages *sessions*.  Each session automatically asks the Transport Server to create dedicated rooms. | Generated TS SDK (`@robothub/inference-server-client`) – auto-called from `RemoteComputeManager` |
| **Frontend (this repo)** | UI + 3-D scene.  Manages *robots*, *videos* & *compute* blocks and connects them to the correct rooms. | – |

> Because the two backend repos are included as git sub-modules you can develop & debug the whole trio in one repo clone.

---

## πŸ“œ Important Components (frontend)

β€’ `RemoteComputeManager` – wraps the Inference Server REST API.  
β€’ `RobotManager` – talks to Transport Server and USB drivers.  
β€’ `VideoManager` – handles local/remote camera streams and WebRTC.

Each element is a small class with `$state` fields which Svelte 5 picks up automatically.  The modals listed below are *thin* UI shells around those classes:

```
AISessionConnectionModal     – create / start / stop AI sessions
RobotInputConnectionModal    – joint-states β†’ AI
RobotOutputConnectionModal   – AI commands β†’ robot
VideoInputConnectionModal    – camera β†’ AI or screen
ManualControlSheet           – slider control, runs when no consumer connected
SettingsSheet                – configure base URLs of the two servers
```
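
For illustration, a manager class in this style might look like the sketch below – the field and method names are invented for the example (the real managers live under `src/lib/elements`), but the `$state`/`$derived` mechanics are exactly what Svelte 5 provides in `.svelte.ts` files:

```ts
// robot.svelte.ts — hypothetical sketch of a domain manager using Svelte 5 runes.
// Field and method names are illustrative, not the actual RobotManager API.
export class Robot {
  // $state makes these fields deeply reactive; any component reading them re-renders.
  jointValues = $state<Record<string, number>>({});
  connectionStatus = $state<'disconnected' | 'connecting' | 'connected'>('disconnected');

  // Derived values recompute automatically whenever jointValues changes.
  jointCount = $derived(Object.keys(this.jointValues).length);

  setJoint(name: string, value: number) {
    this.jointValues[name] = value; // plain mutation is tracked — no store boilerplate
  }
}
```

Because the class itself owns the reactive state, the modals above can stay stateless and simply bind to these fields.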

---

## 🐳 Docker

A production-grade image is provided (multi-stage, 24 MB with Bun runtime):

```bash
$ docker build -t robothub-frontend .
$ docker run -p 8000:8000 robothub-frontend  # served by vite-preview
```

See `Dockerfile` for the full build – it also performs `bun test` & `bun run build` for the TS clients inside the sub-modules so that the image is completely self-contained.

---

## πŸ§‘β€πŸ’» Contributing

PRs are welcome!  The codebase is organised into **domain managers** (robot / video / compute) and **pure-UI** components.  If you add a new feature, create a manager first so that the business logic can be unit-tested without the DOM.

1. `bun test` – unit tests.  
2. `bun run typecheck` – strict TS config.

Please run `bun format` before committing – ESLint + Prettier configs are included.

---

## πŸ™ Special Thanks

Huge gratitude to [Tim Qian](https://github.com/timqian) ([X/Twitter](https://x.com/tim_qian)) and the
[bambot project](https://bambot.org/) for open-sourcing **feetech.js** – the
delightful js driver that powers our USB communication layer.

---

## πŸ“„ License

MIT – see `LICENSE` in the root.

## 🌱 Project Philosophy

RobotHub follows a **separation-of-concerns** design:

* **Transport Server** is the single source of truth for *real-time* data – video frames, joint values, heart-beats.  Every participant (browser, Python script, robot firmware) only needs one WebSocket/WebRTC connection, no matter how many peers join later.
* **Inference Server** is stateless with regard to connectivity; it spins up / tears down *sessions* that rely on rooms in the Transport Server.  This lets heavy AI models live on a GPU box while cameras and robots stay on the edge.
* **Frontend** stays 100 % in the browser – no secret keys or device drivers required – and simply wires together rooms that already exist.

> By decoupling the pipeline we can deploy each piece on separate hardware or even different clouds, swap alternative implementations (e.g. ROS bridge instead of WebRTC) and scale each micro-service independently.

---

## πŸ›°  Transport Server – Real-Time Router

```
Browser / Robot ⟷  🌐 Transport Server  ⟷  Other Browser / AI / HW
```

* **Creates rooms** – `POST /robotics/workspaces/{ws}/rooms` or `POST /video/workspaces/{ws}/rooms`.
* **Manages roles** – every WebSocket identifies as *producer* (source) or *consumer* (sink).
* **Does zero processing** – it only forwards JSON (robotics) or WebRTC SDP/ICE (video).
* **Health-check** – `GET /api/health` returns a JSON heartbeat.
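
For illustration, hitting these endpoints with plain `fetch` might look like this – the exact request/response shapes are defined by the Transport Server, so the empty body and the fields you get back are assumptions here:

```ts
const TRANSPORT_URL = 'http://localhost:8000'; // or the hosted endpoint

// Health check — GET /api/health returns a JSON heartbeat.
const health = await fetch(`${TRANSPORT_URL}/api/health`).then((r) => r.json());
console.log(health);

// Create a robotics room inside a workspace (request/response shape assumed).
const workspaceId = crypto.randomUUID();
const room = await fetch(`${TRANSPORT_URL}/robotics/workspaces/${workspaceId}/rooms`, {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: '{}',
}).then((r) => r.json());
```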

Why useful?

* You never expose robot hardware directly to the internet – it only speaks to the Transport Server.
* Multiple followers can subscribe to the *same* producer without extra bandwidth on the producer side (server fans out messages).
* Works across NAT thanks to WebRTC TURN support.

## 🏒  Workspaces – Lightweight Multi-Tenant Isolation

A **workspace** is simply a UUID namespace in the Transport Server.  Every room URL starts with:

```
/robotics/workspaces/{workspace_id}/rooms/{room_id}
/video/workspaces/{workspace_id}/rooms/{room_id}
```

Why bother?

1. **Privacy / Security** – clients in workspace *A* can neither list nor join rooms from workspace *B*. The workspace id acts as a shared secret: only clients that know it can reach the rooms inside.
2. **Organisation** – keep each class, project or experiment separated without spinning up extra servers.
3. **Zero-config sharing** – the Frontend stores the workspace ID in the URL hash (e.g. `/#d742e85d-c9e9-4f7b-…`).  Send that link to a teammate and they automatically connect to the *same* namespace – all existing video feeds, robot rooms and AI sessions become visible.
4. **Stateless Scale-out** – Transport Server holds no global state; deleting a workspace removes all rooms in one call.

Typical lifecycle:

* **Create** – Frontend generates `crypto.randomUUID()` if the hash is empty.  Back-end rooms are lazily created when the first producer/consumer calls the REST API.
* **Share** – click the *#workspace* badge β†’ *Copy URL* (handled by `WorkspaceIdButton.svelte`).
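
A minimal sketch of the *Create* step (the helper name is invented; the real logic lives in the Frontend):

```ts
// Hypothetical helper: resolve the current workspace id from the URL hash.
function resolveWorkspaceId(): string {
  const fromHash = window.location.hash.slice(1); // drop the leading '#'
  if (fromHash) return fromHash;                  // joining an existing workspace
  const id = crypto.randomUUID();                 // otherwise start a fresh one
  window.location.hash = id;                      // makes the URL instantly shareable
  return id;
}
```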

> Practical tip: Use one workspace per demo to prevent collisions, then recycle it afterwards.

---

## 🧠  Inference Server – Session Lifecycle

1. **Create session**  
   `POST /api/sessions` with JSON:
   ```jsonc
   {
     "session_id": "pick_place_demo",
     "policy_path": "./checkpoints/act_so101_beyond",
     "camera_names": ["front", "wrist"],
     "transport_server_url": "http://localhost:8000",
     "workspace_id": "<existing-or-new>"  // optional
   }
   ```
2. **Receive response**  
   ```jsonc
   {
     "workspace_id": "ws-uuid",
     "camera_room_ids": { "front": "room-id-a", "wrist": "room-id-b" },
     "joint_input_room_id":  "room-id-c",
     "joint_output_room_id": "room-id-d"
   }
   ```
3. **Wire connections**
   * Camera PC joins `front` / `wrist` rooms as **producer** (WebRTC).
   * Robot joins `joint_input_room_id` as **producer** (joint states).
   * Robot (or simulator) joins `joint_output_room_id` as **consumer** (commands).
4. **Start inference**  
   `POST /api/sessions/{id}/start` – server loads the model and begins publishing commands.
5. **Stop / delete** as needed.  Stats & health are available via `GET /api/sessions`.

The Frontend automates steps 1-4 via the *AI Session* modal – you only click buttons.
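
For reference, steps 1 and 4 as raw HTTP calls – a sketch using the request fields shown above, with error handling omitted:

```ts
const INFERENCE_URL = 'http://localhost:8001';

// Step 1 — create a session (fields match the JSON example above).
const session = await fetch(`${INFERENCE_URL}/api/sessions`, {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    session_id: 'pick_place_demo',
    policy_path: './checkpoints/act_so101_beyond',
    camera_names: ['front', 'wrist'],
    transport_server_url: 'http://localhost:8000',
  }),
}).then((r) => r.json());

// Step 4 — start inference once producers/consumers are wired up.
await fetch(`${INFERENCE_URL}/api/sessions/pick_place_demo/start`, { method: 'POST' });
```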

---

## 🌐 Hosted Demo End-Points

| Service | URL | Status |
|---------|-----|--------|
| Transport Server | <https://blanchon-robothub-transportserver.hf.space/api> | Public & healthy |
| Inference Server | <https://blanchon-robothub-inferenceserver.hf.space/api> | `{"status":"healthy"}` |
| Frontend (read-only preview) | <https://blanchon-robothub-frontend.hf.space> | latest `main` |

Point the *Settings β†’ Server Configuration* panel to these URLs and you can play without any local backend.

---

## 🎯 Main Use-Cases

Below are typical connection patterns you can set up **entirely from the UI**.  Each example lists the raw data-flow (β†’ = producer to consumer/AI) plus a video placeholder you can swap for a screen-capture.

### Direct Tele-Operation (Leader ➜ Follower)
*Leader PC*  `USB` ➜ **Robot A** ➜ `Remote producer` β†’ **Transport room** β†’ `Remote consumer` ➜ **Robot B**  (`USB`)

> One human moves Robot A, Robot B mirrors the motion in real-time. Works with any number of followers – just add more consumers to the same room.
>
> πŸ“Ί *demo-teleop-1.mp4*
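
Schematically, the flow above boils down to one producer and N consumers on the same room.  The sketch below is purely illustrative – the actual wire format and connection parameters are defined by the Transport Server client libraries, and the `role` query parameter and message shape here are invented:

```ts
const ROOM = 'ws://localhost:8000/robotics/workspaces/<ws-id>/rooms/<room-id>';

// Leader side: publish Robot A's joint states into the room (producer).
const producer = new WebSocket(`${ROOM}?role=producer`);
producer.onopen = () => {
  producer.send(JSON.stringify({ joints: { shoulder: 0.42, elbow: -0.13 } }));
};

// Follower side: receive the same states and apply them to Robot B (consumer).
const consumer = new WebSocket(`${ROOM}?role=consumer`);
consumer.onmessage = (ev) => {
  const { joints } = JSON.parse(ev.data);
  // robotB.applyJoints(joints); // hypothetical driver call
};
```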

### Web-UI Manual Control
**Browser sliders** (`ManualControlSheet`) β†’ `Remote producer` β†’ **Robot (USB)**

> No physical master arm needed – drive joints from any device.
>
> πŸ“Ί *demo-webui.mp4*

### AI Inference Loop
**Robot (USB)** ➜ `Remote producer` β†’ **joint-input room**  
**Camera PC** ➜ `Video producer` β†’ **camera room(s)**  
**Inference Server** (consumer) β†’ processes β†’ publishes to **joint-output room** β†’ `Remote consumer` ➜ **Robot**

> Lets a low-power robot PC stream data while a beefy GPU node does the heavy lifting.
>
> πŸ“Ί *demo-inference.mp4*

### Hybrid Classroom (Multi-Follower AI)
*Same as AI Inference Loop* with additional **Robot C, D…** subscribing to `joint_output_room_id` to run the same policy in parallel.

> Useful for swarm behaviours or classroom demonstrations.
>
> πŸ“Ί *demo-classroom.mp4*

### Split Video / Robot Across Machines
**Laptop A** (near cameras) β†’ streams video β†’ Transport  
**Laptop B** (near robot)   β†’ joins joint rooms  
**Browser** anywhere        β†’ watches video consumer & sends manual overrides

> Ideal when the camera PC stays close to sensors and you want minimal upstream bandwidth.
>
> πŸ“Ί *demo-splitio.mp4*