arXiv:2511.03718

Grounded Misunderstandings in Asymmetric Dialogue: A Perspectivist Annotation Scheme for MapTask

Published on Nov 5 · Submitted by Nan Li on Nov 6
Abstract

A perspectivist annotation scheme for the HCRC MapTask corpus reveals how understanding emerges, diverges, and is repaired in collaborative dialogue, highlighting the role of multiplicity discrepancies in referential misalignment.

AI-generated summary

Collaborative dialogue relies on participants incrementally establishing common ground, yet in asymmetric settings they may believe they agree while referring to different entities. We introduce a perspectivist annotation scheme for the HCRC MapTask corpus (Anderson et al., 1991) that separately captures the speaker's and the addressee's grounded interpretations of each reference expression, enabling us to trace how understanding emerges, diverges, and is repaired over time. Using a scheme-constrained LLM annotation pipeline, we obtain 13k annotated reference expressions with reliability estimates and analyze the resulting understanding states. The results show that full misunderstandings are rare once lexical variants are unified, but multiplicity discrepancies systematically induce divergences, revealing how apparent grounding can mask referential misalignment. Our framework provides both a resource and an analytic lens for studying grounded misunderstanding and for evaluating (V)LLMs' capacity to model perspective-dependent grounding in collaborative dialogue.
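To make the perspectivist idea concrete, here is a minimal Python sketch. It is not the paper's released schema: the field names, label set, and example values are hypothetical, and the scheme is collapsed to the two referent fields needed for the comparison. Each reference expression records the speaker's intended referent and the addressee's interpretation separately, and the understanding state is derived by comparing them:

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class UnderstandingState(Enum):
    GROUNDED = "grounded"            # both perspectives resolve to the same landmark
    MISUNDERSTOOD = "misunderstood"  # both resolve, but to different landmarks
    UNRESOLVED = "unresolved"        # at least one perspective has no referent


@dataclass
class ReferenceAnnotation:
    """One reference expression, annotated from both perspectives."""
    utterance_id: str
    expression: str                    # surface form, e.g. "the mill"
    speaker_referent: Optional[str]    # landmark ID on the speaker's map, if any
    addressee_referent: Optional[str]  # landmark ID on the addressee's map, if any

    def state(self) -> UnderstandingState:
        if self.speaker_referent is None or self.addressee_referent is None:
            return UnderstandingState.UNRESOLVED
        if self.speaker_referent == self.addressee_referent:
            return UnderstandingState.GROUNDED
        return UnderstandingState.MISUNDERSTOOD


# A multiplicity discrepancy: the giver's map shows two mills, the
# follower's map only one, so "the mill" resolves to different landmarks
# even though both participants believe the reference succeeded.
ann = ReferenceAnnotation(
    utterance_id="g.12",
    expression="the mill",
    speaker_referent="mill_north",
    addressee_referent="mill_single",
)
print(ann.state())  # UnderstandingState.MISUNDERSTOOD
```

Keeping the two referents as separate fields, rather than a single agreed label, is what lets apparent grounding and actual referential alignment come apart in the data.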

Community

Paper submitter

We operationalize reference grounding as perspective-dependent alignment, introducing a five-attribute hierarchy that separately tracks speaker-intended vs. addressee-interpreted referents. Our LLM-assisted annotation pipeline achieves high accuracy on 13k MapTask reference expressions. Analyses reveal that multiplicity discrepancies, where the same landmark appears twice on one map but only once on the other, systematically induce misunderstandings (a 12% rate vs. the 1.8% corpus average), while participants show a strong preference for reusing already-grounded landmarks. Our work establishes both a resource and a benchmark for evaluating (V)LLMs on incremental grounding.
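As a loose illustration of what "scheme-constrained" could mean in such a pipeline (the paper does not publish this code; `call_llm`, the prompt wording, and the label set below are assumptions for the sketch), one common pattern is to re-prompt until the model's output parses as JSON and uses only labels from the scheme:

```python
import json

# Labels permitted by the scheme; hypothetical, for illustration only.
ALLOWED_STATES = {"grounded", "misunderstood", "unresolved"}


def annotate_reference(call_llm, expression: str, context: str, max_retries: int = 3) -> dict:
    """Scheme-constrained annotation: re-prompt until the model returns JSON
    whose `state` label belongs to the scheme. `call_llm` is any function
    mapping a prompt string to a model response string."""
    prompt = (
        "Annotate the reference expression from both the speaker's and the "
        "addressee's perspective. Reply with JSON only: "
        '{"speaker_referent": <id or null>, "addressee_referent": <id or null>, '
        f'"state": one of {sorted(ALLOWED_STATES)}}}\n\n'
        f"Dialogue context:\n{context}\n\nExpression: {expression}"
    )
    for _ in range(max_retries):
        try:
            out = json.loads(call_llm(prompt))
        except (json.JSONDecodeError, TypeError):
            continue  # malformed output; retry
        if isinstance(out, dict) and out.get("state") in ALLOWED_STATES:
            return out
    raise ValueError(f"no scheme-conformant annotation for {expression!r}")


# Example with a stub model that always returns a valid annotation:
stub = lambda p: ('{"speaker_referent": "mill_north", '
                  '"addressee_referent": null, "state": "unresolved"}')
print(annotate_reference(stub, "the mill", "GIVER: go past the mill"))
```

Validating against the closed label set is what keeps the pipeline scheme-constrained: free-form model output never enters the corpus.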
