One Vision-Language-Action Model for GUI Agent
Qinghong (Kevin) Lin
KevinQHLin
AI & ML interests
Vision-Language Model, Video Understanding, Human-AI Interaction
Recent Activity
upvoted a paper
5 days ago
Show-o2: Improved Native Unified Multimodal Models
new activity
13 days ago
VideoGUI/VideoGUI-High-Plan: Update README.md
new activity
13 days ago
VideoGUI/VideoGUI-Mid-Plan: Update README.md