aylinakkus committed on
Commit
04619df
·
verified ·
1 Parent(s): 1700f98

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -18,7 +18,7 @@ For ablation studies and additional insights, see our detailed [blog post]()!
 
 # Performance
 
- We evaluate on benchmarks ScreenSpot-V2, ScreenSpotPro and OS-World-G for grounding as well an agentic benchmark OS-World. For the latter we use an evaluation harness combining our grounding model with a planner (GPT-5) inspired by GTA1 Test-Time Scaling GUI Agents.
+ We evaluate on benchmarks ScreenSpot-V2, ScreenSpotPro and OS-World-G for grounding as well an agentic benchmark OS-World. For the latter we use an [evaluation harness](https://github.com/xlang-ai/OSWorld/blob/main/mm_agents/gta1/gta1_agent.py) combining our grounding model with a planner (GPT-5):
 
 | **Model** | **Size** | **Open Source** | **ScreenSpot-V2** | **ScreenSpotPro** | **OSWORLD-G** |
 |-------------------|:--------:|:---------------:|:-----------------:|:-----------------:|:-----------------:|