Models
Datasets
Spaces
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2510.21618

about 9 hours ago

Describe What You See with Multimodal Large Language Models to Enhance Video Recommendations

Paper • 2508.09789 • Published Aug 13 • 5
MM-BrowseComp: A Comprehensive Benchmark for Multimodal Browsing Agents

Paper • 2508.13186 • Published Aug 14 • 18
ZARA: Zero-shot Motion Time-Series Analysis via Knowledge and Retrieval Driven LLM Agents

Paper • 2508.04038 • Published Aug 6 • 1
Prompt Orchestration Markup Language

Paper • 2508.13948 • Published Aug 19 • 48

DeepAgent: A General Reasoning Agent with Scalable Toolsets

Paper • 2510.21618 • Published 12 days ago • 92
Agent Learning via Early Experience

Paper • 2510.08558 • Published 27 days ago • 260

GUI-G^2: Gaussian Reward Modeling for GUI Grounding

Paper • 2507.15846 • Published Jul 21 • 132
WebWatcher: Breaking New Frontier of Vision-Language Deep Research Agent

Paper • 2508.05748 • Published Aug 7 • 137
Mobile-Agent-v3: Foundamental Agents for GUI Automation

Paper • 2508.15144 • Published Aug 21 • 64
AgentFly: Fine-tuning LLM Agents without Fine-tuning LLMs

Paper • 2508.16153 • Published Aug 22 • 154

ruslanmv/Medical-Llama3-8B

Text Generation • 8B • Updated May 15, 2024 • 992 • • 104
Robust Speech Recognition via Large-Scale Weak Supervision

Paper • 2212.04356 • Published Dec 6, 2022 • 40
Mykes/medicus

Text Generation • 3B • Updated Feb 18 • 72 • 5
DeepAgent: A General Reasoning Agent with Scalable Toolsets

Paper • 2510.21618 • Published 12 days ago • 92

Emu3.5: Native Multimodal Models are World Learners

Paper • 2510.26583 • Published 6 days ago • 98
RECALL: REpresentation-aligned Catastrophic-forgetting ALLeviation via Hierarchical Model Merging

Paper • 2510.20479 • Published 14 days ago • 10
A Definition of AGI

Paper • 2510.18212 • Published 16 days ago • 33
Video-As-Prompt: Unified Semantic Control for Video Generation

Paper • 2510.20888 • Published 13 days ago • 44

HoloScene: Simulation-Ready Interactive 3D Worlds from a Single Video

Paper • 2510.05560 • Published 30 days ago • 7
TaTToo: Tool-Grounded Thinking PRM for Test-Time Scaling in Tabular Reasoning

Paper • 2510.06217 • Published 29 days ago • 62
Less is More: Recursive Reasoning with Tiny Networks

Paper • 2510.04871 • Published about 1 month ago • 464
Fast-dLLM v2: Efficient Block-Diffusion LLM

Paper • 2509.26328 • Published Sep 30 • 49

Writing in the Margins: Better Inference Pattern for Long Context Retrieval

Paper • 2408.14906 • Published Aug 27, 2024 • 144
Training Language Models to Self-Correct via Reinforcement Learning

Paper • 2409.12917 • Published Sep 19, 2024 • 140
Towards a Unified View of Preference Learning for Large Language Models: A Survey

Paper • 2409.02795 • Published Sep 4, 2024 • 72
Attention Heads of Large Language Models: A Survey

Paper • 2409.03752 • Published Sep 5, 2024 • 92

AI Paper of the Day

A collection of papers that I think are interesting, one added each day

about 6 hours ago

Can Large Language Models Understand Context?

Paper • 2402.00858 • Published Feb 1, 2024 • 23
OLMo: Accelerating the Science of Language Models

Paper • 2402.00838 • Published Feb 1, 2024 • 84
Self-Rewarding Language Models

Paper • 2401.10020 • Published Jan 18, 2024 • 151
SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity

Paper • 2401.17072 • Published Jan 30, 2024 • 25

about 9 hours ago

Describe What You See with Multimodal Large Language Models to Enhance Video Recommendations

Paper • 2508.09789 • Published Aug 13 • 5
MM-BrowseComp: A Comprehensive Benchmark for Multimodal Browsing Agents

Paper • 2508.13186 • Published Aug 14 • 18
ZARA: Zero-shot Motion Time-Series Analysis via Knowledge and Retrieval Driven LLM Agents

Paper • 2508.04038 • Published Aug 6 • 1
Prompt Orchestration Markup Language

Paper • 2508.13948 • Published Aug 19 • 48

Emu3.5: Native Multimodal Models are World Learners

Paper • 2510.26583 • Published 6 days ago • 98
RECALL: REpresentation-aligned Catastrophic-forgetting ALLeviation via Hierarchical Model Merging

Paper • 2510.20479 • Published 14 days ago • 10
A Definition of AGI

Paper • 2510.18212 • Published 16 days ago • 33
Video-As-Prompt: Unified Semantic Control for Video Generation

Paper • 2510.20888 • Published 13 days ago • 44

DeepAgent: A General Reasoning Agent with Scalable Toolsets

Paper • 2510.21618 • Published 12 days ago • 92
Agent Learning via Early Experience

Paper • 2510.08558 • Published 27 days ago • 260

HoloScene: Simulation-Ready Interactive 3D Worlds from a Single Video

Paper • 2510.05560 • Published 30 days ago • 7
TaTToo: Tool-Grounded Thinking PRM for Test-Time Scaling in Tabular Reasoning

Paper • 2510.06217 • Published 29 days ago • 62
Less is More: Recursive Reasoning with Tiny Networks

Paper • 2510.04871 • Published about 1 month ago • 464
Fast-dLLM v2: Efficient Block-Diffusion LLM

Paper • 2509.26328 • Published Sep 30 • 49

GUI-G^2: Gaussian Reward Modeling for GUI Grounding

Paper • 2507.15846 • Published Jul 21 • 132
WebWatcher: Breaking New Frontier of Vision-Language Deep Research Agent

Paper • 2508.05748 • Published Aug 7 • 137
Mobile-Agent-v3: Foundamental Agents for GUI Automation

Paper • 2508.15144 • Published Aug 21 • 64
AgentFly: Fine-tuning LLM Agents without Fine-tuning LLMs

Paper • 2508.16153 • Published Aug 22 • 154

Writing in the Margins: Better Inference Pattern for Long Context Retrieval

Paper • 2408.14906 • Published Aug 27, 2024 • 144
Training Language Models to Self-Correct via Reinforcement Learning

Paper • 2409.12917 • Published Sep 19, 2024 • 140
Towards a Unified View of Preference Learning for Large Language Models: A Survey

Paper • 2409.02795 • Published Sep 4, 2024 • 72
Attention Heads of Large Language Models: A Survey

Paper • 2409.03752 • Published Sep 5, 2024 • 92

ruslanmv/Medical-Llama3-8B

Text Generation • 8B • Updated May 15, 2024 • 992 • • 104
Robust Speech Recognition via Large-Scale Weak Supervision

Paper • 2212.04356 • Published Dec 6, 2022 • 40
Mykes/medicus

Text Generation • 3B • Updated Feb 18 • 72 • 5
DeepAgent: A General Reasoning Agent with Scalable Toolsets

Paper • 2510.21618 • Published 12 days ago • 92

AI Paper of the Day

A collection of papers that I think are interesting, one added each day

about 6 hours ago

Can Large Language Models Understand Context?

Paper • 2402.00858 • Published Feb 1, 2024 • 23
OLMo: Accelerating the Science of Language Models

Paper • 2402.00838 • Published Feb 1, 2024 • 84
Self-Rewarding Language Models

Paper • 2401.10020 • Published Jan 18, 2024 • 151
SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity

Paper • 2401.17072 • Published Jan 30, 2024 • 25

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs