Collections including paper arXiv:2510.16872

AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning

Paper • 2402.15506 • Published Feb 23, 2024 • 18
AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent

Paper • 2404.03648 • Published Apr 4, 2024 • 30
Similarity is Not All You Need: Endowing Retrieval Augmented Generation with Multi Layered Thoughts

Paper • 2405.19893 • Published May 30, 2024 • 33
Parrot: Efficient Serving of LLM-based Applications with Semantic Variable

Paper • 2405.19888 • Published May 30, 2024 • 7

Read Later Stack

Demystifying Reinforcement Learning in Agentic Reasoning

Paper • 2510.11701 • Published 26 days ago • 31
Self-Improving LLM Agents at Test-Time

Paper • 2510.07841 • Published about 1 month ago • 9
Making Mathematical Reasoning Adaptive

Paper • 2510.04617 • Published Oct 6 • 22
DocReward: A Document Reward Model for Structuring and Stylizing

Paper • 2510.11391 • Published 27 days ago • 26

Pass@k Training for Adaptively Balancing Exploration and Exploitation of Large Reasoning Models

Paper • 2508.10751 • Published Aug 14 • 28
Reinforcement Pre-Training

Paper • 2506.08007 • Published Jun 9 • 262
MCP-Universe: Benchmarking Large Language Models with Real-World Model Context Protocol Servers

Paper • 2508.14704 • Published Aug 20 • 42
AgentFly: Fine-tuning LLM Agents without Fine-tuning LLMs

Paper • 2508.16153 • Published Aug 22 • 154

GenEx: Generating an Explorable World

Paper • 2412.09624 • Published Dec 12, 2024 • 97
Segmenting Text and Learning Their Rewards for Improved RLHF in Language Model

Paper • 2501.02790 • Published Jan 6 • 9
Who's Your Judge? On the Detectability of LLM-Generated Judgments

Paper • 2509.25154 • Published Sep 29 • 29
TruthRL: Incentivizing Truthful LLMs via Reinforcement Learning

Paper • 2509.25760 • Published Sep 30 • 54

RUC-DataLab/DeepAnalyze-8B

Text Generation • 8B • Updated 14 days ago • 3.74k • 56
RUC-DataLab/DataScience-Instruct-500K

Viewer • Updated 19 days ago • 26.2k • 7.62k • 53
RUC-DataLab/DABStep-Research

Preview • Updated 22 days ago • 57
DeepAnalyze: Agentic Large Language Models for Autonomous Data Science

Paper • 2510.16872 • Published 21 days ago • 92

The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

Paper • 2509.02547 • Published Sep 2 • 220
Demystifying Reinforcement Learning in Agentic Reasoning

Paper • 2510.11701 • Published 26 days ago • 31
DeepAnalyze: Agentic Large Language Models for Autonomous Data Science

Paper • 2510.16872 • Published 21 days ago • 92

GUI-G^2: Gaussian Reward Modeling for GUI Grounding

Paper • 2507.15846 • Published Jul 21 • 132
WebWatcher: Breaking New Frontier of Vision-Language Deep Research Agent

Paper • 2508.05748 • Published Aug 7 • 137
Mobile-Agent-v3: Foundamental Agents for GUI Automation

Paper • 2508.15144 • Published Aug 21 • 64
AgentFly: Fine-tuning LLM Agents without Fine-tuning LLMs

Paper • 2508.16153 • Published Aug 22 • 154

DocLLM: A layout-aware generative language model for multimodal document understanding

Paper • 2401.00908 • Published Dec 31, 2023 • 189
Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models

Paper • 2401.04658 • Published Jan 9, 2024 • 27
Weaver: Foundation Models for Creative Writing

Paper • 2401.17268 • Published Jan 30, 2024 • 45
Efficient Tool Use with Chain-of-Abstraction Reasoning

Paper • 2401.17464 • Published Jan 30, 2024 • 21
