Neural Theory-of-Mind? On the Limits of Social Intelligence in Large LMs

Sap, Maarten; LeBras, Ronan; Fried, Daniel; Choi, Yejin

arXiv:2210.13312 (cs)

[Submitted on 24 Oct 2022 (v1), last revised 3 Apr 2023 (this version, v2)]

Abstract: Social intelligence and Theory of Mind (ToM), i.e., the ability to reason about the different mental states, intents, and reactions of all people involved, allow humans to effectively navigate and understand everyday social interactions. As NLP systems are used in increasingly complex social situations, their ability to grasp social dynamics becomes crucial. In this work, we examine the open question of social intelligence and Theory of Mind in modern NLP systems from an empirical and theory-based perspective. We show that one of today's largest language models (GPT-3; Brown et al., 2020) lacks this kind of social intelligence out of the box, using two tasks: SocialIQa (Sap et al., 2019), which measures models' ability to understand the intents and reactions of participants in social interactions, and ToMi (Le et al., 2019), which measures whether models can infer the mental states and realities of participants in situations. Our results show that models struggle substantially at these Theory of Mind tasks, with well-below-human accuracies of 55% and 60% on SocialIQa and ToMi, respectively. To conclude, we draw on theories from pragmatics to contextualize this shortcoming of large language models, by examining the limitations stemming from their data, neural architecture, and training paradigms. Challenging the prevalent narrative that only scale is needed, we posit that person-centric NLP approaches might be more effective towards neural Theory of Mind.
In our updated version, we also analyze newer instruction-tuned and RLHF models for neural ToM. We find that even ChatGPT and GPT-4 do not display emergent Theory of Mind; strikingly, even GPT-4 reaches only 60% accuracy on the ToMi questions related to mental states and realities.

Comments:	Originally published at EMNLP 2022, extended to include ChatGPT and GPT-4 models on March 30th 2023 (extension not peer reviewed)
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2210.13312 [cs.CL]
	(or arXiv:2210.13312v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2210.13312

Submission history

From: Maarten Sap
[v1] Mon, 24 Oct 2022 14:58:58 UTC (2,811 KB)
[v2] Mon, 3 Apr 2023 15:26:20 UTC (2,940 KB)
