Models
Datasets
Spaces
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2506.11763

DeepResearch Bench: A Comprehensive Benchmark for Deep Research Agents

Paper • 2506.11763 • Published 13 days ago • 58

DeepResearch Bench: A Comprehensive Benchmark for Deep Research Agents

Paper • 2506.11763 • Published 13 days ago • 58
A Comprehensive Survey of Deep Research: Systems, Methodologies, and Applications

Paper • 2506.12594 • Published 12 days ago • 1

Confidence Is All You Need: Few-Shot RL Fine-Tuning of Language Models

Paper • 2506.06395 • Published 20 days ago • 123
Magistral

Paper • 2506.10910 • Published 14 days ago • 59
Overclocking LLM Reasoning: Monitoring and Controlling Thinking Path Lengths in LLMs

Paper • 2506.07240 • Published 18 days ago • 4
Multiverse: Your Language Models Secretly Decide How to Parallelize and Merge Generation

Paper • 2506.09991 • Published 15 days ago • 56

Scaling Computer-Use Grounding via User Interface Decomposition and Synthesis

Paper • 2505.13227 • Published May 19 • 45
facebook/natural_reasoning

Viewer • Updated Feb 21 • 1.15M • 1.3k • 506
nvidia/OpenMathReasoning

Viewer • Updated 30 days ago • 5.68M • 15.4k • 288
Search Arena: Analyzing Search-Augmented LLMs

Paper • 2506.05334 • Published 21 days ago • 17

A Comparative Study on Reasoning Patterns of OpenAI's o1 Model

Paper • 2410.13639 • Published Oct 17, 2024 • 19
MobA: A Two-Level Agent System for Efficient Mobile Task Automation

Paper • 2410.13757 • Published Oct 17, 2024 • 33
Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions

Paper • 2411.14405 • Published Nov 21, 2024 • 62
Beyond Examples: High-level Automated Reasoning Paradigm in In-Context Learning via MCTS

Paper • 2411.18478 • Published Nov 27, 2024 • 38

about 5 hours ago

FLAME: Factuality-Aware Alignment for Large Language Models

Paper • 2405.01525 • Published May 2, 2024 • 29
DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data

Paper • 2405.14333 • Published May 23, 2024 • 41
Transformers Can Do Arithmetic with the Right Embeddings

Paper • 2405.17399 • Published May 27, 2024 • 55
EasyAnimate: A High-Performance Long Video Generation Method based on Transformer Architecture

Paper • 2405.18991 • Published May 29, 2024 • 12

Eval Leaderboards

Running

4.48k

4.48k

Chatbot Arena Leaderboard

🏆

Display chatbot leaderboard and stats
Running on CPU Upgrade

13.2k

13.2k

Open LLM Leaderboard

🏆

Track, rank and evaluate open LLMs and chatbots
Running on CPU Upgrade

5.91k

5.91k

MTEB Leaderboard

🥇

Embedding Leaderboard
Running

523

523

LLM-Perf Leaderboard

🏆

Explore LLM performance across hardware

about 4 hours ago

AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning

Paper • 2402.15506 • Published Feb 23, 2024 • 17
AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent

Paper • 2404.03648 • Published Apr 4, 2024 • 29
Similarity is Not All You Need: Endowing Retrieval Augmented Generation with Multi Layered Thoughts

Paper • 2405.19893 • Published May 30, 2024 • 32
Parrot: Efficient Serving of LLM-based Applications with Semantic Variable

Paper • 2405.19888 • Published May 30, 2024 • 7

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs