R-4B: Incentivizing General-Purpose Auto-Thinking Capability in MLLMs via Bi-Mode Annealing and Reinforce Learning • Paper • 2508.21113 • Published Aug 28 • 109
OneRec: Unifying Retrieve and Rank with Generative Recommender and Iterative Preference Alignment • Paper • 2502.18965 • Published Feb 26 • 28