zzfive's Collections
			 
		
			
		agent
		
			
 
				
				
	
	
	
			
			AgentOhana: Design Unified Data and Training Pipeline for Effective
  Agent Learning
		
			Paper
			
•
			2402.15506
			
•
			Published
				
			•
				
				18
			
 
	
	 
	
	
	
			
			AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web
  Navigating Agent
		
			Paper
			
•
			2404.03648
			
•
			Published
				
			•
				
				29
			
 
	
	 
	
	
	
			
			Similarity is Not All You Need: Endowing Retrieval Augmented Generation
  with Multi Layered Thoughts
		
			Paper
			
•
			2405.19893
			
•
			Published
				
			•
				
				33
			
 
	
	 
	
	
	
			
			Parrot: Efficient Serving of LLM-based Applications with Semantic
  Variable
		
			Paper
			
•
			2405.19888
			
•
			Published
				
			•
				
				7
			
 
	
	 
	
	
	
			
			Mobile-Agent-v2: Mobile Device Operation Assistant with Effective
  Navigation via Multi-Agent Collaboration
		
			Paper
			
•
			2406.01014
			
•
			Published
				
			•
				
				34
			
 
	
	 
	
	
	
			
			AgentGym: Evolving Large Language Model-based Agents across Diverse
  Environments
		
			Paper
			
•
			2406.04151
			
•
			Published
				
			•
				
				23
			
 
	
	 
	
	
	
			
			τ-bench: A Benchmark for Tool-Agent-User Interaction in Real-World
  Domains
		
			Paper
			
•
			2406.12045
			
•
			Published
				
			•
				
				9
			
 
	
	 
	
	
	
			
			Agentless: Demystifying LLM-based Software Engineering Agents
		
			Paper
			
•
			2407.01489
			
•
			Published
				
			•
				
				64
			
 
	
	 
	
	
	
			
			Internet of Agents: Weaving a Web of Heterogeneous Agents for
  Collaborative Intelligence
		
			Paper
			
•
			2407.07061
			
•
			Published
				
			•
				
				27
			
 
	
	 
	
	
	
			
			Spider2-V: How Far Are Multimodal Agents From Automating Data Science
  and Engineering Workflows?
		
			Paper
			
•
			2407.10956
			
•
			Published
				
			•
				
				7
			
 
	
	 
	
	
	
			
			Sibyl: Simple yet Effective Agent Framework for Complex Real-world
  Reasoning
		
			Paper
			
•
			2407.10718
			
•
			Published
				
			•
				
				19
			
 
	
	 
	
	
	
			
			POGEMA: A Benchmark Platform for Cooperative Multi-Agent Navigation
		
			Paper
			
•
			2407.14931
			
•
			Published
				
			•
				
				22
			
 
	
	 
	
	
	
			
			AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?
		
			Paper
			
•
			2407.15711
			
•
			Published
				
			•
				
				9
			
 
	
	 
	
	
	
			
			CoD, Towards an Interpretable Medical Agent using Chain of Diagnosis
		
			Paper
			
•
			2407.13301
			
•
			Published
				
			•
				
				55
			
 
	
	 
	
	
	
			
			OpenDevin: An Open Platform for AI Software Developers as Generalist
  Agents
		
			Paper
			
•
			2407.16741
			
•
			Published
				
			•
				
				73
			
 
	
	 
	
	
	
			
			LAMBDA: A Large Model Based Data Agent
		
			Paper
			
•
			2407.17535
			
•
			Published
				
			•
				
				37
			
 
	
	 
	
	
	
			
			AppWorld: A Controllable World of Apps and People for Benchmarking
  Interactive Coding Agents
		
			Paper
			
•
			2407.18901
			
•
			Published
				
			•
				
				35
			
 
	
	 
	
	
	
			
			MindSearch: Mimicking Human Minds Elicits Deep AI Searcher
		
			Paper
			
•
			2407.20183
			
•
			Published
				
			•
				
				43
			
 
	
	 
	
	
	
			
			GPUDrive: Data-driven, multi-agent driving simulation at 1 million FPS
		
			Paper
			
•
			2408.01584
			
•
			Published
				
			•
				
				10
			
 
	
	 
	
	
	
			
			Optimus-1: Hybrid Multimodal Memory Empowered Agents Excel in
  Long-Horizon Tasks
		
			Paper
			
•
			2408.03615
			
•
			Published
				
			•
				
				31
			
 
	
	 
	
	
	
			
			CodexGraph: Bridging Large Language Models and Code Repositories via
  Code Graph Databases
		
			Paper
			
•
			2408.03910
			
•
			Published
				
			•
				
				18
			
 
	
	 
	
	
	
			
			Automated Design of Agentic Systems
		
			Paper
			
•
			2408.08435
			
•
			Published
				
			•
				
				40
			
 
	
	 
	
	
	
			
			Paper Copilot: A Self-Evolving and Efficient LLM System for Personalized
  Academic Assistance
		
			Paper
			
•
			2409.04593
			
•
			Published
				
			•
				
				26
			
 
	
	 
	
	
	
		
			Paper
			
•
			2409.07429
			
•
			Published
				
			•
				
				31
			
 
	
	 
	
	
	
			
			SUPER: Evaluating Agents on Setting Up and Executing Tasks from Research
  Repositories
		
			Paper
			
•
			2409.07440
			
•
			Published
				
			•
				
				8
			
 
	
	 
	
	
	
			
			HyperAgent: Generalist Software Engineering Agents to Solve Coding Tasks
  at Scale
		
			Paper
			
•
			2409.16299
			
•
			Published
				
			•
				
				12
			
 
	
	 
	
	
	
			
			MSI-Agent: Incorporating Multi-Scale Insight into Embodied Agents for
  Superior Planning and Decision-Making
		
			Paper
			
•
			2409.16686
			
•
			Published
				
			•
				
				10
			
 
	
	 
	
	
	
			
			Tutor CoPilot: A Human-AI Approach for Scaling Real-Time Expertise
		
			Paper
			
•
			2410.03017
			
•
			Published
				
			•
				
				29
			
 
	
	 
	
	
	
			
			Agent S: An Open Agentic Framework that Uses Computers Like a Human
		
			Paper
			
•
			2410.08164
			
•
			Published
				
			•
				
				26
			
 
	
	 
	
	
	
			
			MTU-Bench: A Multi-granularity Tool-Use Benchmark for Large Language
  Models
		
			Paper
			
•
			2410.11710
			
•
			Published
				
			•
				
				20
			
 
	
	 
	
	
	
			
			Agent-as-a-Judge: Evaluate Agents with Agents
		
			Paper
			
•
			2410.10934
			
•
			Published
				
			•
				
				23
			
 
	
	 
	
	
	
			
			Revealing the Barriers of Language Agents in Planning
		
			Paper
			
•
			2410.12409
			
•
			Published
				
			•
				
				27
			
 
	
	 
	
	
	
			
			MobA: A Two-Level Agent System for Efficient Mobile Task Automation
		
			Paper
			
•
			2410.13757
			
•
			Published
				
			•
				
				33
			
 
	
	 
	
	
	
			
			Web Agents with World Models: Learning and Leveraging Environment
  Dynamics in Web Navigation
		
			Paper
			
•
			2410.13232
			
•
			Published
				
			•
				
				44
			
 
	
	 
	
	
	
			
			AgentStore: Scalable Integration of Heterogeneous Agents As Specialized
  Generalist Computer Assistant
		
			Paper
			
•
			2410.18603
			
•
			Published
				
			•
				
				32
			
 
	
	 
	
	
	
			
			AutoKaggle: A Multi-Agent Framework for Autonomous Data Science
  Competitions
		
			Paper
			
•
			2410.20424
			
•
			Published
				
			•
				
				40
			
 
	
	 
	
	
	
			
			OpenWebVoyager: Building Multimodal Web Agents via Iterative Real-World
  Exploration, Feedback and Optimization
		
			Paper
			
•
			2410.19609
			
•
			Published
				
			•
				
				18
			
 
	
	 
	
	
	
			
			Teaching Embodied Reinforcement Learning Agents: Informativeness and
  Diversity of Language Use
		
			Paper
			
•
			2410.24218
			
•
			Published
				
			•
				
				6
			
 
	
	 
	
	
	
			
			OS-ATLAS: A Foundation Action Model for Generalist GUI Agents
		
			Paper
			
•
			2410.23218
			
•
			Published
				
			•
				
				49
			
 
	
	 
	
	
	
			
			Adapting While Learning: Grounding LLMs for Scientific Problems with
  Intelligent Tool Usage Adaptation
		
			Paper
			
•
			2411.00412
			
•
			Published
				
			•
				
				10
			
 
	
	 
	
	
	
			
			AndroidLab: Training and Systematic Benchmarking of Android Autonomous
  Agents
		
			Paper
			
•
			2410.24024
			
•
			Published
				
			•
				
				49
			
 
	
	 
	
	
	
			
			WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum
  Reinforcement Learning
		
			Paper
			
•
			2411.02337
			
•
			Published
				
			•
				
				36
			
 
	
	 
	
	
	
			
			Thanos: Enhancing Conversational Agents with Skill-of-Mind-Infused Large
  Language Model
		
			Paper
			
•
			2411.04496
			
•
			Published
				
			•
				
				23
			
 
	
	 
	
	
	
			
			GazeGen: Gaze-Driven User Interaction for Visual Content Generation
		
			Paper
			
•
			2411.04335
			
•
			Published
				
			•
				
				15
			
 
	
	 
	
	
	
			
			The Dawn of GUI Agent: A Preliminary Case Study with Claude 3.5 Computer
  Use
		
			Paper
			
•
			2411.10323
			
•
			Published
				
			•
				
				34
			
 
	
	 
	
	
	
			
			Is Your LLM Secretly a World Model of the Internet? Model-Based Planning
  for Web Agents
		
			Paper
			
•
			2411.06559
			
•
			Published
				
			•
				
				16
			
 
	
	 
	
	
	
			
			BALROG: Benchmarking Agentic LLM and VLM Reasoning On Games
		
			Paper
			
•
			2411.13543
			
•
			Published
				
			•
				
				19
			
 
	
	 
	
	
	
			
			SketchAgent: Language-Driven Sequential Sketch Generation
		
			Paper
			
•
			2411.17673
			
•
			Published
				
			•
				
				19
			
 
	
	 
	
	
	
			
			Interleaved Scene Graph for Interleaved Text-and-Image Generation
  Assessment
		
			Paper
			
•
			2411.17188
			
•
			Published
				
			•
				
				21
			
 
	
	 
	
	
	
			
			Large Language Model-Brained GUI Agents: A Survey
		
			Paper
			
•
			2411.18279
			
•
			Published
				
			•
				
				31
			
 
	
	 
	
	
	
			
			MALT: Improving Reasoning with Multi-Agent LLM Training
		
			Paper
			
•
			2412.01928
			
•
			Published
				
			•
				
				45
			
 
	
	 
	
	
	
			
			Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction
		
			Paper
			
•
			2412.04454
			
•
			Published
				
			•
				
				71
			
 
	
	 
	
	
	
			
			Unraveling the Complexity of Memory in RL Agents: an Approach for
  Classification and Evaluation
		
			Paper
			
•
			2412.06531
			
•
			Published
				
			•
				
				72
			
 
	
	 
	
	
	
			
			The BrowserGym Ecosystem for Web Agent Research
		
			Paper
			
•
			2412.05467
			
•
			Published
				
			•
				
				22
			
 
	
	 
	
	
	
			
			AgentTrek: Agent Trajectory Synthesis via Guiding Replay with Web
  Tutorials
		
			Paper
			
•
			2412.09605
			
•
			Published
				
			•
				
				30
			
 
	
	 
	
	
	
			
			Large Action Models: From Inception to Implementation
		
			Paper
			
•
			2412.10047
			
•
			Published
				
			•
				
				36
			
 
	
	 
	
	
	
			
			Evaluation Agent: Efficient and Promptable Evaluation Framework for
  Visual Generative Models
		
			Paper
			
•
			2412.09645
			
•
			Published
				
			•
				
				36
			
 
	
	 
	
	
	
			
			Proposer-Agent-Evaluator(PAE): Autonomous Skill Discovery For Foundation
  Model Internet Agents
		
			Paper
			
•
			2412.13194
			
•
			Published
				
			•
				
				12
			
 
	
	 
	
	
	
			
			TheAgentCompany: Benchmarking LLM Agents on Consequential Real World
  Tasks
		
			Paper
			
•
			2412.14161
			
•
			Published
				
			•
				
				51
			
 
	
	 
	
	
	
		
			Paper
			
•
			2412.13501
			
•
			Published
				
			•
				
				29
			
 
	
	 
	
	
	
			
			PC Agent: While You Sleep, AI Works -- A Cognitive Journey into Digital
  World
		
			Paper
			
•
			2412.17589
			
•
			Published
				
			•
				
				14
			
 
	
	 
	
	
	
			
			Agent-SafetyBench: Evaluating the Safety of LLM Agents
		
			Paper
			
•
			2412.14470
			
•
			Published
				
			•
				
				13
			
 
	
	 
	
	
	
			
			Training Software Engineering Agents and Verifiers with SWE-Gym
		
			Paper
			
•
			2412.21139
			
•
			Published
				
			•
				
				24
			
 
	
	 
	
	
	
			
			OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse
  Task Synthesis
		
			Paper
			
•
			2412.19723
			
•
			Published
				
			•
				
				87
			
 
	
	 
	
	
	
			
			A3: Android Agent Arena for Mobile GUI Agents
		
			Paper
			
•
			2501.01149
			
•
			Published
				
			•
				
				22
			
 
	
	 
	
	
	
			
			Agent Laboratory: Using LLM Agents as Research Assistants
		
			Paper
			
•
			2501.04227
			
•
			Published
				
			•
				
				94
			
 
	
	 
	
	
	
			
			Search-o1: Agentic Search-Enhanced Large Reasoning Models
		
			Paper
			
•
			2501.05366
			
•
			Published
				
			•
				
				102
			
 
	
	 
	
	
	
			
			InfiGUIAgent: A Multimodal Generalist GUI Agent with Native Reasoning
  and Reflection
		
			Paper
			
•
			2501.04575
			
•
			Published
				
			•
				
				25
			
 
	
	 
	
	
	
			
			PaSa: An LLM Agent for Comprehensive Academic Paper Search
		
			Paper
			
•
			2501.10120
			
•
			Published
				
			•
				
				52
			
 
	
	 
	
	
	
			
			Agent-R: Training Language Model Agents to Reflect via Iterative
  Self-Training
		
			Paper
			
•
			2501.11425
			
•
			Published
				
			•
				
				108
			
 
	
	 
	
	
	
			
			UI-TARS: Pioneering Automated GUI Interaction with Native Agents
		
			Paper
			
•
			2501.12326
			
•
			Published
				
			•
				
				65
			
 
	
	 
	
	
	
			
			Mobile-Agent-E: Self-Evolving Mobile Assistant for Complex Tasks
		
			Paper
			
•
			2501.11733
			
•
			Published
				
			•
				
				28
			
 
	
	 
	
	
	
			
			FilmAgent: A Multi-Agent Framework for End-to-End Film Automation in
  Virtual 3D Spaces
		
			Paper
			
•
			2501.12909
			
•
			Published
				
			•
				
				71
			
 
	
	 
	
	
	
			
			IntellAgent: A Multi-Agent Framework for Evaluating Conversational AI
  Systems
		
			Paper
			
•
			2501.11067
			
•
			Published
				
			•
				
				13
			
 
	
	 
	
	
	
			
			CowPilot: A Framework for Autonomous and Human-Agent Collaborative Web
  Navigation
		
			Paper
			
•
			2501.16609
			
•
			Published
				
			•
				
				7
			
 
	
	 
	
	
	
			
			QLASS: Boosting Language Agent Inference via Q-Guided Stepwise Search
		
			Paper
			
•
			2502.02584
			
•
			Published
				
			•
				
				17
			
 
	
	 
	
	
	
			
			Rethinking Mixture-of-Agents: Is Mixing Different Large Language Models
  Beneficial?
		
			Paper
			
•
			2502.00674
			
•
			Published
				
			•
				
				13
			
 
	
	 
	
	
	
			
			MetaChain: A Fully-Automated and Zero-Code Framework for LLM Agents
		
			Paper
			
•
			2502.05957
			
•
			Published
				
			•
				
				16
			
 
	
	 
	
	
	
			
			InSTA: Towards Internet-Scale Training For Agents
		
			Paper
			
•
			2502.06776
			
•
			Published
				
			•
				
				9
			
 
	
	 
	
	
	
			
			Hephaestus: Improving Fundamental Agent Capabilities of Large Language
  Models through Continual Pre-Training
		
			Paper
			
•
			2502.06589
			
•
			Published
				
			•
				
				20
			
 
	
	 
	
	
	
			
			EmbodiedBench: Comprehensive Benchmarking Multi-modal Large Language
  Models for Vision-Driven Embodied Agents
		
			Paper
			
•
			2502.09560
			
•
			Published
				
			•
				
				35
			
 
	
	 
	
	
	
			
			OctoTools: An Agentic Framework with Extensible Tools for Complex
  Reasoning
		
			Paper
			
•
			2502.11271
			
•
			Published
				
			•
				
				18
			
 
	
	 
	
	
	
			
			Autellix: An Efficient Serving Engine for LLM Agents as General Programs
		
			Paper
			
•
			2502.13965
			
•
			Published
				
			•
				
				19
			
 
	
	 
	
	
	
			
			TAG: A Decentralized Framework for Multi-Agent Hierarchical
  Reinforcement Learning
		
			Paper
			
•
			2502.15425
			
•
			Published
				
			•
				
				9
			
 
	
	 
	
	
	
			
			Self-Taught Agentic Long Context Understanding
		
			Paper
			
•
			2502.15920
			
•
			Published
				
			•
				
				3
			
 
	
	 
	
	
	
			
			WebGames: Challenging General-Purpose Web-Browsing AI Agents
		
			Paper
			
•
			2502.18356
			
•
			Published
				
			•
				
				14
			
 
	
	 
	
	
	
			
			ViDoRAG: Visual Document Retrieval-Augmented Generation via Dynamic
  Iterative Reasoning Agents
		
			Paper
			
•
			2502.18017
			
•
			Published
				
			•
				
				21
			
 
	
	 
	
	
	
			
			PodAgent: A Comprehensive Framework for Podcast Generation
		
			Paper
			
•
			2503.00455
			
•
			Published
				
			•
				
				6
			
 
	
	 
	
	
	
			
			MPO: Boosting LLM Agents with Meta Plan Optimization
		
			Paper
			
•
			2503.02682
			
•
			Published
				
			•
				
				28
			
 
	
	 
	
	
	
			
			Agent models: Internalizing Chain-of-Action Generation into Reasoning
  models
		
			Paper
			
•
			2503.06580
			
•
			Published
				
			•
				
				19
			
 
	
	 
	
	
	
			
			API Agents vs. GUI Agents: Divergence and Convergence
		
			Paper
			
•
			2503.11069
			
•
			Published
				
			•
				
				37
			
 
	
	 
	
	
	
			
			STEVE: A Step Verification Pipeline for Computer-use Agent Training
		
			Paper
			
•
			2503.12532
			
•
			Published
				
			•
				
				17
			
 
	
	 
	
	
	
			
			Survey on Evaluation of LLM-based Agents
		
			Paper
			
•
			2503.16416
			
•
			Published
				
			•
				
				95
			
 
	
	 
	
	
	
			
			Verbal Process Supervision Elicits Better Coding Agents
		
			Paper
			
•
			2503.18494
			
•
			Published
				
			•
				
				2
			
 
	
	 
	
	
	
			
			Large Language Model Agent: A Survey on Methodology, Applications and
  Challenges
		
			Paper
			
•
			2503.21460
			
•
			Published
				
			•
				
				83
			
 
	
	 
	
	
	
			
			UI-R1: Enhancing Action Prediction of GUI Agents by Reinforcement
  Learning
		
			Paper
			
•
			2503.21620
			
•
			Published
				
			•
				
				62
			
 
	
	 
	
	
	
			
			Classical Planning with LLM-Generated Heuristics: Challenging the State
  of the Art with Python Code
		
			Paper
			
•
			2503.18809
			
•
			Published
				
			•
				
				9
			
 
	
	 
	
	
	
			
			Agent S2: A Compositional Generalist-Specialist Framework for Computer
  Use Agents
		
			Paper
			
•
			2504.00906
			
•
			Published
				
			•
				
				26
			
 
	
	 
	
	
	
			
			Advances and Challenges in Foundation Agents: From Brain-Inspired
  Intelligence to Evolutionary, Collaborative, and Safe Systems
		
			Paper
			
•
			2504.01990
			
•
			Published
				
			•
				
				300
			
 
	
	 
	
	
	
			
			AgentRewardBench: Evaluating Automatic Evaluations of Web Agent
  Trajectories
		
			Paper
			
•
			2504.08942
			
•
			Published
				
			•
				
				28
			
 
	
	 
	
	
	
			
			Breaking the Data Barrier -- Building GUI Agents Through Task
  Generalization
		
			Paper
			
•
			2504.10127
			
•
			Published
				
			•
				
				17
			
 
	
	 
	
	
	
			
			SocioVerse: A World Model for Social Simulation Powered by LLM Agents
  and A Pool of 10 Million Real-World Users
		
			Paper
			
•
			2504.10157
			
•
			Published
				
			•
				
				17
			
 
	
	 
	
	
	
			
			The AI Scientist-v2: Workshop-Level Automated Scientific Discovery via
  Agentic Tree Search
		
			Paper
			
•
			2504.08066
			
•
			Published
				
			•
				
				14
			
 
	
	 
	
	
	
		
			Paper
			
•
			2504.11442
			
•
			Published
				
			•
				
				29
			
 
	
	 
	
	
	
			
			MLRC-Bench: Can Language Agents Solve Machine Learning Research
  Challenges?
		
			Paper
			
•
			2504.09702
			
•
			Published
				
			•
				
				18
			
 
	
	 
	
	
	
			
			Exploring Expert Failures Improves LLM Agent Tuning
		
			Paper
			
•
			2504.13145
			
•
			Published
				
			•
				
				12
			
 
	
	 
	
	
	
			
			UFO2: The Desktop AgentOS
		
			Paper
			
•
			2504.14603
			
•
			Published
				
			•
				
				29
			
 
	
	 
	
	
	
			
			InfiGUI-R1: Advancing Multimodal GUI Agents from Reactive Actors to
  Deliberative Reasoners
		
			Paper
			
•
			2504.14239
			
•
			Published
				
			•
				
				13
			
 
	
	 
	
	
	
			
			LLMs are Greedy Agents: Effects of RL Fine-tuning on Decision-Making
  Abilities
		
			Paper
			
•
			2504.16078
			
•
			Published
				
			•
				
				21
			
 
	
	 
	
	
	
			
			Paper2Code: Automating Code Generation from Scientific Papers in Machine
  Learning
		
			Paper
			
•
			2504.17192
			
•
			Published
				
			•
				
				120
			
 
	
	 
	
	
	
			
			LLM-Powered GUI Agents in Phone Automation: Surveying Progress and
  Prospects
		
			Paper
			
•
			2504.19838
			
•
			Published
				
			•
				
				22
			
 
	
	 
	
	
	
			
			Mem0: Building Production-Ready AI Agents with Scalable Long-Term Memory
		
			Paper
			
•
			2504.19413
			
•
			Published
				
			•
				
				28
			
 
	
	 
	
	
	
			
			RAGEN: Understanding Self-Evolution in LLM Agents via Multi-Turn
  Reinforcement Learning
		
			Paper
			
•
			2504.20073
			
•
			Published
				
			•
				
				13
			
 
	
	 
	
	
	
			
			Agentic Reasoning and Tool Integration for LLMs via Reinforcement
  Learning
		
			Paper
			
•
			2505.01441
			
•
			Published
				
			•
				
				39
			
 
	
	 
	
	
	
			
			Think on your Feet: Adaptive Thinking via Reinforcement Learning for
  Social Agents
		
			Paper
			
•
			2505.02156
			
•
			Published
				
			•
				
				18
			
 
	
	 
	
	
	
			
			Multi-Agent System for Comprehensive Soccer Understanding
		
			Paper
			
•
			2505.03735
			
•
			Published
				
			•
				
				25
			
 
	
	 
	
	
	
			
			OSUniverse: Benchmark for Multimodal GUI-navigation AI Agents
		
			Paper
			
•
			2505.03570
			
•
			Published
				
			•
				
				8
			
 
	
	 
	
	
	
			
			LLM-Independent Adaptive RAG: Let the Question Speak for Itself
		
			Paper
			
•
			2505.04253
			
•
			Published
				
			•
				
				13
			
 
	
	 
	
	
	
			
			AI Agents vs. Agentic AI: A Conceptual Taxonomy, Applications and
  Challenge
		
			Paper
			
•
			2505.10468
			
•
			Published
				
			•
				
				9
			
 
	
	 
	
	
	
			
			Creating General User Models from Computer Use
		
			Paper
			
•
			2505.10831
			
•
			Published
				
			•
				
				5
			
 
	
	 
	
	
	
			
			Visual Agentic Reinforcement Fine-Tuning
		
			Paper
			
•
			2505.14246
			
•
			Published
				
			•
				
				32
			
 
	
	 
	
	
	
			
			NovelSeek: When Agent Becomes the Scientist -- Building Closed-Loop
  System from Hypothesis to Verification
		
			Paper
			
•
			2505.16938
			
•
			Published
				
			•
				
				120
			
 
	
	 
	
	
	
			
			Distilling LLM Agent into Small Models with Retrieval and Code Tools
		
			Paper
			
•
			2505.17612
			
•
			Published
				
			•
				
				81
			
 
	
	 
	
	
	
			
			UI-Genie: A Self-Improving Approach for Iteratively Boosting MLLM-based
  Mobile GUI Agents
		
			Paper
			
•
			2505.21496
			
•
			Published
				
			•
				
				38
			
 
	
	 
	
	
	
			
			WebDancer: Towards Autonomous Information Seeking Agency
		
			Paper
			
•
			2505.22648
			
•
			Published
				
			•
				
				33
			
 
	
	 
	
	
	
			
			Open CaptchaWorld: A Comprehensive Web-based Platform for Testing and
  Benchmarking Multimodal LLM Agents
		
			Paper
			
•
			2505.24878
			
•
			Published
				
			•
				
				22
			
 
	
	 
	
	
	
			
			GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents
		
			Paper
			
•
			2506.03143
			
•
			Published
				
			•
				
				52
			
 
	
	 
	
	
	
			
			TRiSM for Agentic AI: A Review of Trust, Risk, and Security Management
  in LLM-based Agentic Multi-Agent Systems
		
			Paper
			
•
			2506.04133
			
•
			Published
				
			•
				
				3
			
 
	
	 
	
	
	
			
			ComfyUI-Copilot: An Intelligent Assistant for Automated Workflow
  Development
		
			Paper
			
•
			2506.05010
			
•
			Published
				
			•
				
				79
			
 
	
	 
	
	
	
			
			Surfer-H Meets Holo1: Cost-Efficient Web Agent Powered by Open Weights
		
			Paper
			
•
			2506.02865
			
•
			Published
				
			•
				
				33
			
 
	
	 
	
	
	
			
			MedAgentGym: Training LLM Agents for Code-Based Medical Reasoning at
  Scale
		
			Paper
			
•
			2506.04405
			
•
			Published
				
			•
				
				7
			
 
	
	 
	
	
	
			
			Agents of Change: Self-Evolving LLM Agents for Strategic Planning
		
			Paper
			
•
			2506.04651
			
•
			Published
				
			•
				
				8
			
 
	
	 
	
	
	
			
			DeepResearch Bench: A Comprehensive Benchmark for Deep Research Agents
		
			Paper
			
•
			2506.11763
			
•
			Published
				
			•
				
				71
			
 
	
	 
	
	
	
			
			Scaling Test-time Compute for LLM Agents
		
			Paper
			
•
			2506.12928
			
•
			Published
				
			•
				
				63
			
 
	
	 
	
	
	
			
			OAgents: An Empirical Study of Building Effective Agents
		
			Paper
			
•
			2506.15741
			
•
			Published
				
			•
				
				35
			
 
	
	 
	
	
	
			
			SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via
  Multi-Agent Multi-Turn Reinforcement Learning
		
			Paper
			
•
			2506.24119
			
•
			Published
				
			•
				
				50
			
 
	
	 
	
	
	
			
			WebSailor: Navigating Super-human Reasoning for Web Agent
		
			Paper
			
•
			2507.02592
			
•
			Published
				
			•
				
				121
			
 
	
	 
	
	
	
			
			PresentAgent: Multimodal Agent for Presentation Video Generation
		
			Paper
			
•
			2507.04036
			
•
			Published
				
			•
				
				10
			
 
	
	 
	
	
	
			
			Agent KB: Leveraging Cross-Domain Experience for Agentic Problem Solving
		
			Paper
			
•
			2507.06229
			
•
			Published
				
			•
				
				75
			
 
	
	 
	
	
	
			
			MIRIX: Multi-Agent Memory System for LLM-Based Agents
		
			Paper
			
•
			2507.07957
			
•
			Published
				
			•
				
				75
			
 
	
	 
	
	
	
			
			GUI-G^2: Gaussian Reward Modeling for GUI Grounding
		
			Paper
			
•
			2507.15846
			
•
			Published
				
			•
				
				132
			
 
	
	 
	
	
	
			
			MCPEval: Automatic MCP-based Deep Evaluation for AI Agent Models
		
			Paper
			
•
			2507.12806
			
•
			Published
				
			•
				
				20
			
 
	
	 
	
	
	
			
			LLM Economist: Large Population Models and Mechanism Design in
  Multi-Agent Generative Simulacra
		
			Paper
			
•
			2507.15815
			
•
			Published
				
			•
				
				6
			
 
	
	 
	
	
	
			
			MMBench-GUI: Hierarchical Multi-Platform Evaluation Framework for GUI
  Agents
		
			Paper
			
•
			2507.19478
			
•
			Published
				
			•
				
				30
			
 
	
	 
	
	
	
			
			A Survey of Self-Evolving Agents: On Path to Artificial Super
  Intelligence
		
			Paper
			
•
			2507.21046
			
•
			Published
				
			•
				
				81
			
 
	
	 
	
	
	
			
			GenoMAS: A Multi-Agent Framework for Scientific Discovery via
  Code-Driven Gene Expression Analysis
		
			Paper
			
•
			2507.21035
			
•
			Published
				
			•
				
				3
			
 
	
	 
	
	
	
			
			ScreenCoder: Advancing Visual-to-Code Generation for Front-End
  Automation via Modular Multimodal Agents
		
			Paper
			
•
			2507.22827
			
•
			Published
				
			•
				
				98
			
 
	
	 
	
	
	
			
			Cognitive Kernel-Pro: A Framework for Deep Research Agents and Agent
  Foundation Models Training
		
			Paper
			
•
			2508.00414
			
•
			Published
				
			•
				
				91
			
 
	
	 
	
	
	
			
			SWE-Debate: Competitive Multi-Agent Debate for Software Issue Resolution
		
			Paper
			
•
			2507.23348
			
•
			Published
				
			•
				
				11
			
 
	
	 
	
	
	
			
			CellForge: Agentic Design of Virtual Cell Models
		
			Paper
			
•
			2508.02276
			
•
			Published
				
			•
				
				39
			
 
	
	 
	
	
	
			
			RoboMemory: A Brain-inspired Multi-memory Agentic Framework for Lifelong
  Learning in Physical Embodied Systems
		
			Paper
			
•
			2508.01415
			
•
			Published
				
			•
				
				7
			
 
	
	 
	
	
	
			
			AgentTTS: Large Language Model Agent for Test-time Compute-optimal
  Scaling Strategy in Complex Tasks
		
			Paper
			
•
			2508.00890
			
•
			Published
				
			•
				
				6
			
 
	
	 
	
	
	
			
			LiveMCPBench: Can Agents Navigate an Ocean of MCP Tools?
		
			Paper
			
•
			2508.01780
			
•
			Published
				
			•
				
				19
			
 
	
	 
	
	
	
			
			HyCodePolicy: Hybrid Language Controllers for Multimodal Monitoring and
  Decision in Embodied Agents
		
			Paper
			
•
			2508.02629
			
•
			Published
				
			•
				
				5
			
 
	
	 
	
	
	
			
			Efficient Agents: Building Effective Agents While Reducing Cost
		
			Paper
			
•
			2508.02694
			
•
			Published
				
			•
				
				85
			
 
	
	 
	
	
	
			
			SEAgent: Self-Evolving Computer Use Agent with Autonomous Learning from
  Experience
		
			Paper
			
•
			2508.04700
			
•
			Published
				
			•
				
				52
			
 
	
	 
	
	
	
			
			Training Long-Context, Multi-Turn Software Engineering Agents with
  Reinforcement Learning
		
			Paper
			
•
			2508.03501
			
•
			Published
				
			•
				
				56
			
 
	
	 
	
	
	
			
			Enhancing Vision-Language Model Training with Reinforcement Learning in
  Synthetic Worlds for Real-World Success
		
			Paper
			
•
			2508.04280
			
•
			Published
				
			•
				
				35
			
 
	
	 
	
	
	
			
			Agent Lightning: Train ANY AI Agents with Reinforcement Learning
		
			Paper
			
•
			2508.03680
			
•
			Published
				
			•
				
				96
			
 
	
	 
	
	
	
			
			Web-CogReasoner: Towards Knowledge-Induced Cognitive Reasoning for Web
  Agents
		
			Paper
			
•
			2508.01858
			
•
			Published
				
			•
				
				20
			
 
	
	 
	
	
	
			
			CoAct-1: Computer-using Agents with Coding as Actions
		
			Paper
			
•
			2508.03923
			
•
			Published
				
			•
				
				14
			
 
	
	 
	
	
	
			
			OS Agents: A Survey on MLLM-based Agents for General Computing Devices
  Use
		
			Paper
			
•
			2508.04482
			
•
			Published
				
			•
				
				9
			
 
	
	 
	
	
	
			
			WideSearch: Benchmarking Agentic Broad Info-Seeking
		
			Paper
			
•
			2508.07999
			
•
			Published
				
			•
				
				109
			
 
	
	 
	
	
	
			
			A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm
  Bridging Foundation Models and Lifelong Agentic Systems
		
			Paper
			
•
			2508.07407
			
•
			Published
				
			•
				
				97
			
 
	
	 
	
	
	
			
			BrowseComp-Plus: A More Fair and Transparent Evaluation Benchmark of
  Deep-Research Agent
		
			Paper
			
•
			2508.06600
			
•
			Published
				
			•
				
				40
			
 
	
	 
	
	
	
			
			WebWatcher: Breaking New Frontier of Vision-Language Deep Research Agent
		
			Paper
			
•
			2508.05748
			
•
			Published
				
			•
				
				137
			
 
	
	 
	
	
	
			
			Beyond Ten Turns: Unlocking Long-Horizon Agentic Search with Large-Scale
  Asynchronous RL
		
			Paper
			
•
			2508.07976
			
•
			Published
				
			•
				
				51
			
 
	
	 
	
	
	
			
			OpenCUA: Open Foundations for Computer-Use Agents
		
			Paper
			
•
			2508.09123
			
•
			Published
				
			•
				
				31
			
 
	
	 
	
	
	
			
			Seeing, Listening, Remembering, and Reasoning: A Multimodal Agent with
  Long-Term Memory
		
			Paper
			
•
			2508.09736
			
•
			Published
				
			•
				
				56
			
 
	
	 
	
	
	
			
			AWorld: Dynamic Multi-Agent System with Stable Maneuvering for Robust
  GAIA Problem Solving
		
			Paper
			
•
			2508.09889
			
•
			Published
				
			•
				
				32
			
 
	
	 
	
	
	
			
			UI-Venus Technical Report: Building High-performance UI Agents with RFT
		
			Paper
			
•
			2508.10833
			
•
			Published
				
			•
				
				43
			
 
	
	 
	
	
	
			
			Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent
  Distillation and Agentic RL
		
			Paper
			
•
			2508.13167
			
•
			Published
				
			•
				
				127
			
 
	
	 
	
	
	
			
			MM-BrowseComp: A Comprehensive Benchmark for Multimodal Browsing Agents
		
			Paper
			
•
			2508.13186
			
•
			Published
				
			•
				
				18
			
 
	
	 
	
	
	
			
			CAMAR: Continuous Actions Multi-Agent Routing
		
			Paper
			
•
			2508.12845
			
•
			Published
				
			•
				
				7
			
 
	
	 
	
	
	
			
			Atom-Searcher: Enhancing Agentic Deep Research via Fine-Grained Atomic
  Thought Reward
		
			Paper
			
•
			2508.12800
			
•
			Published
				
			•
				
				5
			
 
	
	 
	
	
	
			
			MCP-Universe: Benchmarking Large Language Models with Real-World Model
  Context Protocol Servers
		
			Paper
			
•
			2508.14704
			
•
			Published
				
			•
				
				42
			
 
	
	 
	
	
	
			
			Mobile-Agent-v3: Fundamental Agents for GUI Automation
		
			Paper
			
•
			2508.15144
			
•
			Published
				
			•
				
				64
			
 
	
	 
	
	
	
			
			AgentFly: Fine-tuning LLM Agents without Fine-tuning LLMs
		
			Paper
			
•
			2508.16153
			
•
			Published
				
			•
				
				154
			
 
	
	 
	
	
	
			
			PosterGen: Aesthetic-Aware Paper-to-Poster Generation via Multi-Agent
  LLMs
		
			Paper
			
•
			2508.17188
			
•
			Published
				
			•
				
				17
			
 
	
	 
	
	
	
			
			Training Language Model Agents to Find Vulnerabilities with CTF-Dojo
		
			Paper
			
•
			2508.18370
			
•
			Published
				
			•
				
				3
			
 
	
	 
	
	
	
			
			ReportBench: Evaluating Deep Research Agents via Academic Survey Tasks
		
			Paper
			
•
			2508.15804
			
•
			Published
				
			•
				
				15
			
 
	
	 
	
	
	
			
			MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World
  Tasks via MCP Servers
		
			Paper
			
•
			2508.20453
			
•
			Published
				
			•
				
				63
			
 
	
	 
	
	
	
			
			AWorld: Orchestrating the Training Recipe for Agentic AI
		
			Paper
			
•
			2508.20404
			
•
			Published
				
			•
				
				38
			
 
	
	 
	
	
	
			
			UItron: Foundational GUI Agent with Advanced Perception and Planning
		
			Paper
			
•
			2508.21767
			
•
			Published
				
			•
				
				12
			
 
	
	 
	
	
	
			
			The Landscape of Agentic Reinforcement Learning for LLMs: A Survey
		
			Paper
			
•
			2509.02547
			
•
			Published
				
			•
				
				220
			
 
	
	 
	
	
	
			
			UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn
  Reinforcement Learning
		
			Paper
			
•
			2509.02544
			
•
			Published
				
			•
				
				123
			
 
	
	 
	
	
	
			
			AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making
  through Multi-Turn Reinforcement Learning
		
			Paper
			
•
			2509.08755
			
•
			Published
				
			•
				
				56
			
 
	
	 
	
	
	
			
			MCP-AgentBench: Evaluating Real-World Language Agent Performance with
  MCP-Mediated Tools
		
			Paper
			
•
			2509.09734
			
•
			Published
				
			•
				
				15
			
 
	
	 
	
	
	
			
			QuantAgent: Price-Driven Multi-Agent LLMs for High-Frequency Trading
		
			Paper
			
•
			2509.09995
			
•
			Published
				
			•
				
				14
			
 
	
	 
	
	
	
			
			WebWeaver: Structuring Web-Scale Evidence with Dynamic Outlines for
  Open-Ended Deep Research
		
			Paper
			
•
			2509.13312
			
•
			Published
				
			•
				
				105
			
 
	
	 
	
	
	
			
			Scaling Agents via Continual Pre-training
		
			Paper
			
•
			2509.13310
			
•
			Published
				
			•
				
				113
			
 
	
	 
	
	
	
			
			WebSailor-V2: Bridging the Chasm to Proprietary Agents via Synthetic
  Data and Scalable Reinforcement Learning
		
			Paper
			
•
			2509.13305
			
•
			Published
				
			•
				
				89
			
 
	
	 
	
	
	
			
			Towards General Agentic Intelligence via Environment Scaling
		
			Paper
			
•
			2509.13311
			
•
			Published
				
			•
				
				70
			
 
	
	 
	
	
	
			
			WebResearcher: Unleashing unbounded reasoning capability in Long-Horizon
  Agents
		
			Paper
			
•
			2509.13309
			
•
			Published
				
			•
				
				67
			
 
	
	 
	
	
	
			
			ReSum: Unlocking Long-Horizon Search Intelligence via Context
  Summarization
		
			Paper
			
•
			2509.13313
			
•
			Published
				
			•
				
				78
			
 
	
	 
	
	
	
			
			ScaleCUA: Scaling Open-Source Computer Use Agents with Cross-Platform
  Data
		
			Paper
			
•
			2509.15221
			
•
			Published
				
			•
				
				109
			
 
	
	 
	
	
	
			
			Towards Human-like Multimodal Conversational Agent by Generating
  Engaging Speech
		
			Paper
			
•
			2509.14627
			
•
			Published
				
			•
				
				1
			
 
	
	 
	
	
	
			
			LIMI: Less is More for Agency
		
			Paper
			
•
			2509.17567
			
•
			Published
				
			•
				
				100
			
 
	
	 
	
	
	
			
			ARE: Scaling Up Agent Environments and Evaluations
		
			Paper
			
•
			2509.17158
			
•
			Published
				
			•
				
				34
			
 
	
	 
	
	
	
			
			SWE-Bench Pro: Can AI Agents Solve Long-Horizon Software Engineering
  Tasks?
		
			Paper
			
•
			2509.16941
			
•
			Published
				
			•
				
				20
			
 
	
	 
	
	
	
		
			Paper
			
•
			2509.17336
			
•
			Published
				
			•
				
				10
			
 
	
	 
	
	
	
			
			GEM: A Gym for Agentic LLMs
		
			Paper
			
•
			2510.01051
			
•
			Published
				
			•
				
				87
			
 
	
	 
	
	
	
			
			Flash-Searcher: Fast and Effective Web Agents via DAG-Based Parallel
  Execution
		
			Paper
			
•
			2509.25301
			
•
			Published
				
			•
				
				17
			
 
	
	 
	
	
	
			
			JoyAgent-JDGenie: Technical Report on the GAIA
		
			Paper
			
•
			2510.00510
			
•
			Published
				
			•
				
				3
			
 
	
	 
	
	
	
			
			Multi-Agent Tool-Integrated Policy Optimization
		
			Paper
			
•
			2510.04678
			
•
			Published
				
			•
				
				30
			
 
	
	 
	
	
	
			
			Don't Just Fine-tune the Agent, Tune the Environment
		
			Paper
			
•
			2510.10197
			
•
			Published
				
			•
				
				28
			
 
	
	 
	
	
	
			
			AlphaQuanter: An End-to-End Tool-Orchestrated Agentic Reinforcement
  Learning Framework for Stock Trading
		
			Paper
			
•
			2510.14264
			
•
			Published
				
			•
				
				9
			
 
	
	 
	
	
	
			
			DeepAnalyze: Agentic Large Language Models for Autonomous Data Science
		
			Paper
			
•
			2510.16872
			
•
			Published
				
			•
				
				90