Intelligence is easy; cognition is hard
Distinguishing intelligent systems from cognitive systems
Transformer models have moderately complex organization and are readily treated as intelligent; C. elegans has a well-understood neural organization that we nonetheless tend to dismiss as meaningfully intelligent; and many systems perform strongly on benchmarks despite having very simple organizations.
It is tempting to assume that complex performance requires complex internal structure: that if a system does something that looks hard, it must contain rich representations and functions that explain the performance. Yet this confounds intelligence with cognition, treating the ability to solve complex tasks as evidence of complex internal structures, representations, and functions. Many algorithms are better understood as competent procedures under constraints. Simple computational methods can be highly effective in many contexts while failing to display the breadth, transfer, robustness, and flexibility that characterize the cognition of highly adaptable biological systems.
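To make this concrete, here is a minimal sketch of the point (my own illustration, not part of the original argument; the use of scikit-learn, the synthetic two-class task, and the size of the distribution shift are all assumptions of the example). A 1-nearest-neighbour lookup that simply memorizes its training set scores well on the distribution it memorized, yet degrades sharply when that distribution shifts:

```python
# Toy illustration: competence without structure that supports transfer.
# A pure memorization strategy (1-nearest-neighbour lookup) does well on the
# benchmark distribution it memorized and degrades under a modest shift.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)

def sample_task(n, shift=0.0):
    """Two Gaussian classes in 2D; `shift` moves the whole distribution."""
    x0 = rng.normal(loc=0.0 + shift, scale=1.0, size=(n, 2))
    x1 = rng.normal(loc=2.0 + shift, scale=1.0, size=(n, 2))
    return np.vstack([x0, x1]), np.array([0] * n + [1] * n)

X_train, y_train = sample_task(500)
lookup = KNeighborsClassifier(n_neighbors=1).fit(X_train, y_train)  # memorize exemplars

X_iid, y_iid = sample_task(500)           # same distribution as training
X_shift, y_shift = sample_task(500, 3.0)  # shifted distribution

print("in-distribution accuracy:", lookup.score(X_iid, y_iid))      # high
print("shifted accuracy:        ", lookup.score(X_shift, y_shift))  # markedly lower
```

The in-distribution score licenses a (narrow) intelligence claim in the sense developed below; nothing about the lookup's internals supports a cognitive one.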
Intelligence can emerge from trivial algorithms. What should concern the scientific community is not whether AI systems are intelligent but whether they can birth cognition. Can systems develop internal organization that warrants cognitive interpretation? This question cannot be answered while we conflate intelligence with cognition. Without principled criteria for when cognitive interpretation becomes justified, debates about AI capability remain unfalsifiable and the different communities involved (neuroscientists, AI researchers, philosophers, cognitive scientists) will lack common ground for what would count as cognition.
Here, I start by marking the distinction between (i) externally evaluated competence (intelligence) and (ii) internally explanatory organization (cognition). A system can score highly on tasks without justifying a cognitive interpretation of its internal variables (for example, a competent classifier need not rely on a cognitively meaningful internal architecture); conversely, a system can instantiate a rich internal organization and still score poorly on a given benchmark (for example, a human participant subject to biases and resource constraints).
I then propose a research program for interpreting whether an organization meets criteria for cognition, which I call Cognitive Interpretability. These definitions are intended to separate intelligence claims from cognitive claims, clarify the levels of description at which each operates, and make explicit what kinds of evidence would be relevant to evaluating them. The goal is to make "Can AI birth cognition?" empirically tractable.
Intelligence as competence under constraints
Intelligence is a property of a system’s capacity to achieve objectives across a class of tasks and environments under constraints. This means an intelligent system has generalizable capabilities and is not merely mapping inputs to outputs for a single task.
Achieving objectives means performing successfully relative to an objective (e.g., explicit reward, task success, error minimization). Intelligence is about doing well by the criterion. Some intelligent systems can be modeled as approximately optimizing an objective (e.g., reward, cost, accuracy, utility) under constraints, but intelligence does not require any explicit human-derived objective function. On this definition, intelligence is a notion of competence: it can apply to simple estimators and dimensionality-reduction procedures as well as to complex learned models, provided the evaluation is defined over a task class and constraints.
Intelligence can be attributed either to an agentic or to a non-agentic system, depending on what is being evaluated. A non-agentic system is assessed for task competence: it maps inputs to outputs under a performance criterion without initiating actions or managing objectives over time (for example, a translation model or image classifier queried episodically). An agentic system is assessed for goal-directed behavior: it generates actions in pursuit of objectives, allocates resources over time, and arbitrates among candidate strategies. A system can be intelligent without being agentic, but when intelligence is evaluated in settings involving sustained, self-initiated goal pursuit and action selection, it is attributed at the level of an agentic system rather than an input-output map. In sum, a system does not need to be agentic to be considered intelligent, consistent with the widely accepted notion of artificial intelligence.
Because it is defined by tasks, objectives, and constraints rather than by mechanisms, intelligence can be compared across architectures (animals, plants, humans, machines). The comparison class is defined by the tasks and the outcomes, not by the mechanisms. This implies multiple realizability: different mechanisms can yield similar scores, and high scores do not identify a unique architecture. Conversely, a complex and very general architecture can, because of its information-processing mechanisms, score poorly on a given task, while strong competence can be achieved by mechanisms that do not support robust generalization.
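A toy illustration of multiple realizability (again my own sketch; scikit-learn and the synthetic task are assumed ingredients, not anything proposed in the text): two systems with entirely different internal organization, a learned linear decision boundary and a memorized exemplar table, reach comparable scores on the same benchmark, so the score alone cannot identify the mechanism.

```python
# Toy multiple realizability: very different mechanisms, similar benchmark scores.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

linear_boundary = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)   # learned weights
exemplar_table = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr)  # stored training points

print("logistic regression accuracy:", round(linear_boundary.score(X_te, y_te), 3))
print("nearest-neighbour accuracy:  ", round(exemplar_table.score(X_te, y_te), 3))
```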
Finally, this definition is compatible with psychological approaches to intelligence, as long as we understand intelligence as describing how well a system performs across different types of tasks. Psychometric theories model such performance with competing frameworks: a general-factor model (the g factor), dual-factor models (crystallized vs. fluid intelligence), hierarchical models (the three-stratum theory), or domain-specific constructs (social and emotional intelligence). These theories measure how individuals perform across task batteries such as verbal comprehension, numerical reasoning, spatial awareness, and processing speed, and they produce scores that describe what a system can do across related tasks.
Psychometric models of intelligence are tailored to human intelligence, which spans a very large set of tasks. Analogous models for artificial systems would require task batteries appropriate to the scope of intelligence being evaluated: narrow task families for specialized systems, broader batteries for systems claiming general intelligence. In either case, while any performance must arise from some internal organization, calling a system intelligent does not require specifying which one.
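The mechanism-agnostic character of this framing can be written down directly. The sketch below is my own formalization (the task names and the lookup-table "system" are invented for illustration): the evaluation sees only the system's input-output behaviour over a task battery, which is why even a lookup table can max out a sufficiently narrow battery.

```python
# Sketch: intelligence as a score over a task battery, blind to internals.
from typing import Callable, Dict

System = Callable[[str], str]     # a black box: query in, answer out
Task = Callable[[System], float]  # a task scores a black-box system

def evaluate(system: System, battery: Dict[str, Task]) -> Dict[str, float]:
    """Score a system on every task in the battery; internals never enter."""
    return {name: task(system) for name, task in battery.items()}

# A plain lookup table is enough to max out this deliberately narrow battery.
lookup: System = {"2+2": "4", "capital of France": "Paris"}.get
battery: Dict[str, Task] = {
    "arithmetic": lambda s: float(s("2+2") == "4"),
    "geography":  lambda s: float(s("capital of France") == "Paris"),
}
print(evaluate(lookup, battery))  # {'arithmetic': 1.0, 'geography': 1.0}
```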
Cognition as explanatory organization of information processing
While intelligence concerns external performance, understanding how systems achieve that performance requires a different lens: cognition.
There is limited consensus on the definition of cognition, and any definition involves specifying which features are treated as central (Bayne et al., 2019; Siemens et al., 2022). The goal here is to adopt a broad umbrella definition that aligns with standard usage in cognitive science, while making explicit that different paradigms impose stricter criteria.
In cognitive science, a broad definition of cognition refers to the organized set of information-processing mechanisms through which inputs are acquired, transformed, stored, retrieved, and used in flexible, context-sensitive ways. “Information” here is used deliberately: what is stored and retrieved consists of internal informational states that can later contribute to many different outputs, including offline cognition (imagination, planning, rehearsal). By “organization,” I mean the structural and dynamic elements: what components exist (circuits, modules, memory stores), what internal states they maintain (representations, stored information), what operations transform those states (computations, update rules), and how components connect. Cognitive organization specifically supports flexible, context-sensitive information use across different tasks and contexts.
Cognition, then, refers to how a system internally solves problems; organization refers to what constitutes the system (its components, states, operations, and connections). For example, when you remember where you parked your car, cognition involves maintaining a spatial representation of the parking lot (internal state), retrieving and updating that representation based on visual cues (operations), and using it to guide your walking (output). The cognitive account explains not just that you found your car, but how your internal information processing made it possible.
A cognitive model is therefore explanatory: it specifies what problems the organization solves, how computations are performed, what information is encoded where, and how that information is used. It also has predictive power: the organization produces systematic behavioral signatures including regularities, characteristic failures, tradeoffs, and biases.
Cognitive functions as explanatory constructs for stable routines
The point of cognition as a scientific construct is to characterize the internal organization of computations into functionally relevant parts, to identify the representations those computations operate over, and to explain systematic regularities and irregularities. This is why cognitive science developed constructs such as working memory, attention, cognitive control, or predictive coding. These are explanatory theoretical constructs that predict what happens when environments are manipulated, resources are limited, information is occluded, or parts of the system are perturbed.
They are also commitments about internal organization that generate testable predictions, including characteristic tradeoffs and failure modes. Positing the existence of a given construct anchors a model to a given paradigm (e.g., Baddeley’s model of working memory, dual-process theories of reasoning, the predictive brain hypothesis). The model inherits the constraints of that paradigm, including incompatibility with constructs from rival paradigms.
A cognitive function is thus a stable subroutine. It is identifiable because it predicts regularities and irregularities. For instance, Baddeley’s model of working memory predicts limits on how much information people can hold temporarily (on the order of Miller’s 7±2 chunks) and implies the ability to chunk representations. One could hope to identify representations in an artificial system, characterize the internal organization of its computations, and explain systematic regularities and irregularities with such constructs. In computer science and AI, these constructs are often drawn from human cognitive science, because those are the most developed mechanistic taxonomies available. But it may be necessary to develop new constructs if one aims to distinguish classes of specifically machine cognition (granted that the relevant internal variables and organizational structure can be identified).
Systems of intelligence are not necessarily systems of cognition
This means that cognition is not a synonym for intelligence. Intelligence is the externally assessed capacity to achieve objectives. Cognition is the internally assessed organization of information processing. Many forms of robust, flexible intelligence are supported by cognitive organization, but narrow task competence can be achieved with minimal internal structure. Human cognition produces intelligent behavior, but intelligent behavior can also arise from minimal internal organization, such as simple linear models, lookup tables, or heuristics. Conversely, systems with rich cognitive organization can exhibit bounded intelligence due to uncertainty, resource constraints, and biases introduced by approximations.
Whether a given system counts as cognitive depends on the mechanistic criteria we adopt. If cognition is defined as any internal transformation of inputs, then trivial algorithms qualify. If cognition requires structured representational and control organization that supports flexible, context-sensitive reuse of information across situations and time, then entire families of algorithms are excluded. Different paradigms impose stricter constraints. Some require concept-like representations, others require decoupling from immediate perception. To make a cognitive attribution and close disagreements, the choice of criteria must be made explicit.
So, how do we determine whether a given system meets these criteria? By our definitions of cognition and organization, humans and most complex animals are cognitive systems. What about simpler organisms, like cells and plants? The answer depends on which paradigm's constraints we adopt. The same applies to artificial systems. For biological systems, cognitive neuroscience employs lesion studies, neural recordings, and behavioral manipulations to test whether internal organization supports cognitive functions. For artificial systems, we need different methods.
Interpreting whether an organization meets criteria for cognition: towards Cognitive Interpretability
Interpreting whether an artificial system instantiates cognition requires more than identifying what the system computes. Mechanistic interpretability (MechInterp) provides a descriptive model of the system’s organization by reverse-engineering how it works: what components and circuits exist, what internal states they maintain, what operations transform those states, and how information flows between components. A Cognitive Interpretability program (CogInterp) would use this foundation to provide an explanatory model, asking whether the organization meets criteria for cognition by linking the described organization to the problems it solves and evaluating whether it supports flexible, context-sensitive information use across tasks.
This distinction can be understood using Marr’s three levels of analysis. MechInterp operates primarily at the algorithmic level (what representations and procedures exist) and the implementational level (what components realize them). CogInterp would coordinate these mechanistic descriptions with the computational level: what problems does the organization solve, why do those solutions matter functionally, and does the organization enable flexible, generalizable problem-solving across contexts?
Demonstrating that a system learns useful features (an implementational finding) does not establish that it uses those features in cognitively meaningful ways (an algorithmic and computational claim). For instance, MechInterp might reveal that a language model has neurons that activate for syntactic structures. This does not tell us what computational problem the organization solves, whether it enables flexible problem-solving across contexts, or whether the problem is nontrivial. Likewise, showing that a system achieves task competence (computational-level performance) does not tell us whether its solution involves cognitive organization or some other competent procedure.
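A toy version of this gap (my own construction, not a claim about any real model or any published interpretability result): in the synthetic "hidden states" below, a feature is linearly decodable with high probe accuracy, yet the downstream readout never uses it, so removing the feature direction leaves the output essentially unchanged. Decodability is an implementational finding; whether the feature is used is a separate, causal question.

```python
# Toy probing caveat: a feature can be decodable without being used downstream.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n, d = 2000, 32

feature = rng.integers(0, 2, size=n)  # hypothetical binary feature, e.g. "subordinate clause?"

encode_dir = rng.normal(size=d)
encode_dir /= np.linalg.norm(encode_dir)
readout_dir = rng.normal(size=d)
readout_dir -= (readout_dir @ encode_dir) * encode_dir  # make readout orthogonal to the feature

hidden = rng.normal(size=(n, d)) + 3.0 * np.outer(feature, encode_dir)  # feature is encoded...
outputs = hidden @ readout_dir                                          # ...but never read out

probe = LogisticRegression(max_iter=1000).fit(hidden, feature)
print("probe accuracy:", round(probe.score(hidden, feature), 3))  # high: the feature is "there"

ablated = hidden - (hidden @ encode_dir)[:, None] * encode_dir    # remove the feature direction
print("max output change after ablation:",
      float(np.abs(ablated @ readout_dir - outputs).max()))       # ~0: it was never used
```

The natural remedy, already common in MechInterp practice, is to pair probing with interventions such as ablation or activation patching before drawing algorithmic- or computational-level conclusions.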
Debates about AI involve many communities, and participants may treat evidence from one level as settling questions at another. A Cognitive Interpretability research program must establish explicit, testable criteria that coordinate findings across levels. Without such a program, claims about machine cognition remain unfalsifiable. We won’t distinguish systems that instantiate cognitive organization from those that achieve competence through simpler means, nor will we establish whether we’re building toward artificial general intelligence or sophisticated narrow tools.
The work ahead is to articulate Cognitive Interpretability as a research program: What principles define cognitive organization? What testable criteria distinguish cognitive from non-cognitive systems? What methods evaluate whether organization meets those criteria? Addressing these questions is the subject of ongoing work.
References
Bayne, T., Brainard, D., Byrne, R. W., Chittka, L., Clayton, N., Heyes, C., ... & Webb, B. (2019). What is cognition? Current Biology, 29(13), R608–R615.
Siemens, G., Marmolejo-Ramos, F., Gabriel, F., Medeiros, K., Marrone, R., Joksimovic, S., & de Laat, M. (2022). Human and artificial cognition. Computers and Education: Artificial Intelligence, 3, 100107.

