> Mahowald, Kyle, et al. _Dissociating Language and Thought in Large Language Models: A Cognitive Perspective_. arXiv:2301.06627, arXiv, 16 Jan. 2023. _arXiv.org_, [https://doi.org/10.48550/arXiv.2301.06627](https://doi.org/10.48550/arXiv.2301.06627).
# Dissociating language and thought in large language models: A cognitive perspective
## Introduction
- The Turing test has led to fallacies and misconceptions related to the language-thought relationship
- **==Good at language does not imply good at thinking==**
- Bad at thinking does not imply bad at language \[it's just the contrapositive though\]
- **==Formal vs functional linguistic competence==**: language rules vs ability to use language in real world
- **==Language and thought in humans are robustly dissociable==**
- Evidence stems from individuals with aphasia and brain imaging studies
## Formal linguistic competence
- **==LLMs learn hierarchical structure and abstraction==**
- Even on semantically empty ("jabberwocky"-style) sentences, LLMs produce syntactically well-formed continuations with varying accuracy, but consistently above chance (see the probe sketch after this list)
- LLMs pick up on statistical regularities without necessarily learning linguistic information -> **=="right for the wrong reason"==**
- Combination of word co-occurrence knowledge and abstract morphosyntactic rules
- Open question: which inductive biases do model architectures introduce, and do they match human ones?
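A minimal sketch of the kind of targeted syntactic evaluation referred to above: comparing a model's probabilities on minimal pairs of semantically empty sentences that differ only in subject-verb agreement. The model choice (GPT-2 via HuggingFace `transformers`) and the nonce sentences are illustrative assumptions, not the paper's actual setup.

```python
# Minimal-pairs probe for subject-verb agreement on semantically empty
# ("jabberwocky"-style) sentences. Illustrative only: sentences and model
# choice are assumptions, not taken from the paper.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def sentence_log_prob(sentence: str) -> float:
    """Total log-probability the model assigns to a sentence."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        # Passing labels = input_ids returns the mean next-token NLL,
        # averaged over the (seq_len - 1) predicted positions.
        loss = model(ids, labels=ids).loss
    return -loss.item() * (ids.shape[1] - 1)

# Each pair differs only in verb number; the content words are nonsense.
minimal_pairs = [
    ("The blicket near the daxes glorps.", "The blicket near the daxes glorp."),
    ("The wugs that the fep saw snarf.", "The wugs that the fep saw snarfs."),
]

correct = sum(
    sentence_log_prob(good) > sentence_log_prob(bad)
    for good, bad in minimal_pairs
)
print(f"Grammatical variant preferred in {correct}/{len(minimal_pairs)} pairs")
```

If the grammatical variant gets the higher log-probability more often than 50% of the time, the model is "above chance" on agreement even though the content words carry no meaning.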
## Functional linguistic competence
- **==LLMs are great at pretending to think==**
- Struggle to come up with creative solutions to novel, unseen tasks
- **Formal reasoning**: dissociated from language in cognitive systems
- **Semantic knowledge** is likewise dissociated from language (evidence from aphasia and semantic dementia)
- LLM world knowledge is brittle and biased
- **Situation modeling** might be performed by the default network rather than the language network
- Current LLMs, with fixed context windows, struggle to track information over long stretches of text
- **Social reasoning**: supported by the theory-of-mind network, distinct from the language network
- LLMs are unable to interpret sarcasm or complete jokes, and they lack communicative intent
- They have "nothing to say", since their objective is simply maximizing next-word predictive accuracy (the loss is written out after this list)
- **==LLMs' behaviour highlights the difference between being good at language and being good at thought==**
- **Is it reasonable to use a _single_ system with a _single_ objective function to model _diverse_ functional language capabilities?**
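For reference, the "next-word predictive accuracy" objective mentioned above is the standard autoregressive language-modeling loss (standard notation, not quoted from the paper): minimize the negative log-likelihood of each token given its left context,

$$
\mathcal{L}(\theta) = -\sum_{t=1}^{T} \log p_\theta\left(w_t \mid w_1, \ldots, w_{t-1}\right)
$$

Nothing in this objective rewards reasoning, situation modeling, or communicative intent except insofar as they improve next-token prediction, which is exactly the single-objective concern raised above.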
## Building human-like models
- **==Modularity==**
- **Architectural modularity**: pairing transformers with memory modules (a toy sketch appears at the end of this section)
- **Emergent modularity**: end-to-end training that allows specialized modules to develop within the model (e.g. attention heads attending to different input features)
- **==Training==**
- **Training on naturalistic data** is **biased** towards low-level input properties, **does not reflect the world** faithfully, and incentivizes models to learn surface patterns, which **limits their ability to generalize**
- **Need to adjust ==training data== and ==objective function==**
- Counterintuitively, mastering language may _require_ a general intelligence model
- **==Benchmarks==**
- **No single benchmark** for evaluating functional linguistic competence
- Developing **comprehensive and separate assessments of formal and functional linguistic competence in LLMs** will enable the development of better models
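A rough sketch of what "architectural modularity" could look like in code: a standard transformer layer whose output reads from a separate, learned memory bank via cross-attention. All names, sizes, and the overall design here are illustrative assumptions, not an architecture proposed in the paper.

```python
# Illustrative sketch of architectural modularity: a transformer layer paired
# with an external memory module via cross-attention. Names, sizes, and the
# design are assumptions for illustration only.
import torch
import torch.nn as nn

class MemoryAugmentedBlock(nn.Module):
    def __init__(self, d_model: int = 256, n_heads: int = 4, memory_slots: int = 32):
        super().__init__()
        # "Language" module: a standard self-attention transformer layer.
        self.encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, batch_first=True
        )
        # Separate memory module: a learned bank of slots plus cross-attention.
        self.memory = nn.Parameter(torch.randn(memory_slots, d_model))
        self.cross_attention = nn.MultiheadAttention(
            embed_dim=d_model, num_heads=n_heads, batch_first=True
        )
        self.norm = nn.LayerNorm(d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        h = self.encoder_layer(x)
        # Read from the memory bank: queries come from the sequence,
        # keys/values from the memory slots (shared across the batch).
        mem = self.memory.unsqueeze(0).expand(x.size(0), -1, -1)
        read, _ = self.cross_attention(query=h, key=mem, value=mem)
        return self.norm(h + read)

block = MemoryAugmentedBlock()
tokens = torch.randn(2, 10, 256)  # dummy batch of embeddings
print(block(tokens).shape)        # torch.Size([2, 10, 256])
```

The point of the sketch is only the separation of concerns: the self-attention layer plays the role of the "language" module, while the memory bank and cross-attention form a distinct component that could in principle be trained or swapped independently.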
## Conclusion
- **LLMs are very ==successful on formal linguistic competence tasks==, but ==struggle at functional linguistic competence==**
- **Next-word-prediction models are not enough to develop an AGI. On the contrary, ==an AGI may be needed to excel at real-life language use==**
- **Future advances in AGI will probably require ==combining LLMs with models that perform well at functional tasks== (reasoning, abstract modeling…)**