Microsoft

company

Verified

https://www.microsoft.com/en-us/research/

microsoft

Recent Activity

yangwang92 authored a paper 8 days ago

SwS: Self-aware Weakness-driven Problem Synthesis in Reinforcement Learning for LLM Reasoning

yangwang92 authored a paper 8 days ago

Reinforcement Learning with Verifiable Rewards Implicitly Incentivizes Correct Reasoning in Base LLMs

curiousT

updated a dataset about 15 hours ago

microsoft/Updesh_beta

Viewer • Updated about 15 hours ago • 8.85M

XumengWen

authored a paper 8 days ago

Reinforcement Learning with Verifiable Rewards Implicitly Incentivizes Correct Reasoning in Base LLMs

Paper • 2506.14245 • Published 9 days ago • 35

unilm

authored 4 papers 16 days ago

Think Only When You Need with Large Hybrid-Reasoning Models

Paper • 2505.14631 • Published May 20 • 19

On-Policy RL with Optimal Reward Baseline

Paper • 2505.23585 • Published 28 days ago • 14

Rectified Sparse Attention

Paper • 2506.04108 • Published 22 days ago • 10

Reinforcement Pre-Training

Paper • 2506.08007 • Published 16 days ago • 227

tricktreat

authored a paper 21 days ago

TimeHC-RL: Temporal-aware Hierarchical Cognitive Reinforcement Learning for Enhancing LLMs' Social Intelligence

Paper • 2505.24500 • Published 27 days ago • 12

szanella

authored 3 papers 24 days ago

Closed-Form Bounds for DP-SGD against Record-level Inference

Paper • 2402.14397 • Published Feb 22, 2024

Analyzing Leakage of Personally Identifiable Information in Language Models

Paper • 2302.00539 • Published Feb 1, 2023

Securing AI Agents with Information-Flow Control

Paper • 2505.23643 • Published 28 days ago • 1

lx865712528

authored a paper 27 days ago

How does Alignment Enhance LLMs' Multilingual Capabilities? A Language Neurons Perspective

Paper • 2505.21505 • Published 29 days ago • 18

tricktreat

authored a paper 29 days ago

ViewSpatial-Bench: Evaluating Multi-perspective Spatial Localization in Vision-Language Models

Paper • 2505.21500 • Published 29 days ago • 12

nushib

authored 5 papers 30 days ago

Mitigating Spurious Correlations in Multi-modal Models during Fine-tuning

Paper • 2304.03916 • Published Apr 8, 2023

Diversity of Thought Improves Reasoning Abilities of Large Language Models

Paper • 2310.07088 • Published Oct 11, 2023 • 5

Elephants Never Forget: Memorization and Learning of Tabular Data in Large Language Models

Paper • 2404.06209 • Published Apr 9, 2024 • 5

Eureka: Evaluating and Understanding Large Foundation Models

Paper • 2409.10566 • Published Sep 13, 2024

BENCHAGENTS: Automated Benchmark Creation with Agent Interaction

Paper • 2410.22584 • Published Oct 29, 2024 • 1

tricktreat

authored 3 papers about 1 month ago

Let LLMs Break Free from Overthinking via Self-Braking Tuning

Paper • 2505.14604 • Published May 20 • 23

Mind the Gap: Bridging Thought Leap for Improved Chain-of-Thought Tuning

Paper • 2505.14684 • Published May 20 • 23

VerifyBench: Benchmarking Reference-based Reward Systems for Large Language Models

Paper • 2505.15801 • Published May 21 • 17