Upload README.md with huggingface_hub
README.md
CHANGED
@@ -18,7 +18,7 @@ For ablation studies and additional insights, see our detailed [blog post]()!
 
 # Performance
 
-We evaluate on benchmarks ScreenSpot-V2, ScreenSpotPro and OS-World-G for grounding as well an agentic benchmark OS-World. For the latter we use an evaluation harness combining our grounding model with a planner (GPT-5)
+We evaluate on benchmarks ScreenSpot-V2, ScreenSpotPro and OS-World-G for grounding as well an agentic benchmark OS-World. For the latter we use an [evaluation harness](https://github.com/xlang-ai/OSWorld/blob/main/mm_agents/gta1/gta1_agent.py) combining our grounding model with a planner (GPT-5):
 
 | **Model** | **Size** | **Open Source** | **ScreenSpot-V2** | **ScreenSpotPro** | **OSWORLD-G** |
 |-------------------|:--------:|:---------------:|:-----------------:|:-----------------:|:-----------------:|