Launching Agent Leaderboard v2: The Enterprise-Grade Benchmark for AI Agents By pratikbhavsar and 1 other • 19 days ago
🎲 [ICLR 2025] DICE: Data Influence Cascade in Decentralized Learning By TongtianZhu • 20 days ago • 1
Unlocking Healthcare AI: I'm Releasing State-of-the-Art Medical Models for Free. Forever. By MaziyarPanahi • 20 days ago • 131
A Survey of Small Language Models in the Era of LLMs: Techniques, Enhancements, Applications, Collaboration with LLMs, and Trustworthiness By FairyFali • 20 days ago • 1
5 Things You Need to Know About Moonshot AI and Kimi K2, the New #1 model on the Hub By fdaudens and 1 other • 21 days ago • 21
Take Control of What Your LLM Knows and Does — with the EasyEdit Tool Series By xzwnlp and 3 others • 22 days ago • 4
LG AI Research Partners with FriendliAI to Launch EXAONE 4.0 for Fast, Scalable API By FriendliAI • 22 days ago • 3
MultiTalk Levelled Up - Way Better Animation Compared to Before with New Workflows - Image to Video By MonsterMMORPG • 22 days ago • 1
Understanding the AGI Seed Prompt: Multi-Layered Cognitive Initialization for Advanced AI Systems By kanaria007 • 22 days ago
Bourbaki (7b): SOTA 7B Algorithms for Putnam Bench (Part I: Reasoning MDPs) By hba123 and 2 others • 23 days ago • 11
Seeing Isn’t Understanding: The Spatial Reasoning Gap in Vision-Language Models By KBayoud • 24 days ago • 8
Announcing UA-Code-Bench: a New Benchmark for Evaluating LLMs on Competitive Programming Tasks in Ukrainian By anon-researcher-ua • 24 days ago