Nitay Calderon

nitay

nitaytech

AI & ML interests

NLP, Controllable Generation, Counterfactual Generation, Domain Adaptation

Recent Activity

authored 5 papers about 18 hours ago

The Alternative Annotator Test for LLM-as-a-Judge: How to Statistically Justify Replacing Human Annotators with LLMs

Paper • 2501.10970 • Published Jan 19, 2025 • 1

Multi-Domain Explainability of Preferences

Paper • 2505.20088 • Published May 26, 2025 • 20

LIBERTy: A Causal Framework for Benchmarking Concept-Based Explanations of LLMs with Structural Counterfactuals

Paper • 2601.10700 • Published Jan 15 • 18

Empty Shelves or Lost Keys? Recall Is the Bottleneck for Parametric Factuality

Paper • 2602.14080 • Published Feb 15 • 21

A Matter of TASTE: Improving Coverage and Difficulty of Agent Benchmarks

Paper • 2605.28556 • Published 7 days ago • 55

upvoted a paper about 23 hours ago

A Matter of TASTE: Improving Coverage and Difficulty of Agent Benchmarks

Paper • 2605.28556 • Published 7 days ago • 55

upvoted 2 papers 20 days ago

Predicting Decisions of AI Agents from Limited Interaction through Text-Tabular Modeling

Paper • 2605.12411 • Published 22 days ago • 49

MulTaBench: Benchmarking Multimodal Tabular Learning with Text and Image

Paper • 2605.10616 • Published 23 days ago • 140

upvoted a paper 28 days ago

Hallucinations Undermine Trust; Metacognition is a Way Forward

Paper • 2605.01428 • Published May 2 • 24

upvoted a paper about 1 month ago

Efficient Agent Evaluation via Diversity-Guided User Simulation

Paper • 2604.21480 • Published Apr 23 • 15

upvoted a paper about 2 months ago

Masked by Consensus: Disentangling Privileged Knowledge in LLM Correctness

Paper • 2604.12373 • Published Apr 14 • 9

upvoted 3 papers 3 months ago

commented on a paper 3 months ago

Empty Shelves or Lost Keys? Recall Is the Bottleneck for Parametric Factuality

Paper • 2602.14080 • Published Feb 15 • 21

upvoted a paper 3 months ago

Empty Shelves or Lost Keys? Recall Is the Bottleneck for Parametric Factuality

Paper • 2602.14080 • Published Feb 15 • 21

submitted a paper to Daily Papers 3 months ago

Empty Shelves or Lost Keys? Recall Is the Bottleneck for Parametric Factuality

Paper • 2602.14080 • Published Feb 15 • 21

upvoted 2 papers 4 months ago

STATe-of-Thoughts: Structured Action Templates for Tree-of-Thoughts

Paper • 2602.14265 • Published Feb 15 • 21

LIBERTy: A Causal Framework for Benchmarking Concept-Based Explanations of LLMs with Structural Counterfactuals

Paper • 2601.10700 • Published Jan 15 • 18

submitted a paper to Daily Papers 4 months ago

LIBERTy: A Causal Framework for Benchmarking Concept-Based Explanations of LLMs with Structural Counterfactuals

Paper • 2601.10700 • Published Jan 15 • 18
