Representations on Representation: Socratic Dialogue with ChatGPT

Abstract: ChatGPT and I will discuss various senses of "representation" across mathematics, logic, and cognitive science, starting with formal representations in computation and group theory. Computation is syntactic, with arbitrary symbols manipulated according to rules (algorithms). Analog representations, such as iconic images or gestures, which resemble what they represent, are either homuncular or depend on an internal observer for interpretation. Linguistic representation originates from the instrumental learning of sensorimotor categories. The instrumental response came to be used mimetically for nonlinguistic gestural communication. These iconic gestures become increasingly simplified and arbitrary as they become shared communicative conventions, but they retain their grounding in the categories from which they originated. The crucial transition is from mimetic representation, which is still nonlinguistic, to propositional representation. Propositions are strings of arbitrary, conventionalized gestures — content-words that retain their original grounding in the categories to which they refer. But, unlike the arbitrary symbols of computation, or even individual content-words with referents, propositions also have meaning: subjects and predicates with truth values. Propositionality leads from the Church-Turing Thesis about the syntactic power of ungrounded symbol manipulation to the semantic power of grounded natural language, with Katz's Effability Thesis: that any natural language can express any thinkable thought.

SH: Let’s chat about representations.

ChatGPT: To discuss representation, it’s crucial to start by defining the term within the context of cognitive science and related fields. Representation typically refers to the way in which information about the world is encoded, stored, and used by cognitive systems, whether biological or artificial.

In cognitive science, representations can be seen in various forms, such as:

Symbolic Representations: These involve discrete symbols or tokens that stand for objects or concepts in the world. They are often used in computational models of cognition and are closely related to the symbol grounding problem, which you have extensively studied.

Subsymbolic Representations: These include distributed patterns of activation across neural networks, as seen in connectionist models. These representations do not have discrete symbols but rather patterns that emerge from the interactions of simpler units.

Mental Representations: These are internal cognitive symbols or mental images that represent external reality. They include concepts, schemas, mental models, and images.

Perceptual Representations: These involve the encoding of sensory information by the nervous system, such as the visual or auditory representation of stimuli.

Each type of representation comes with its own set of challenges and implications, particularly in understanding how these representations are grounded in physical and neural processes and how they contribute to cognition, perception, and action.

Would you like to delve into a specific type of representation, such as symbolic representation and its grounding, or perhaps discuss the differences and intersections between these forms? Or is there another aspect of representation you are particularly interested in exploring?

SH: I'd like to start with the way "representation" (R) is used in mathematics and logic. For example, in linear algebra, matrices can be Rs of groups. What does this mean in this case?

ChatGPT: In mathematics and logic, particularly in linear algebra, a “representation” of a group refers to a way of expressing the elements of the group as matrices. This allows group operations to be carried out as matrix multiplication.

Specifically, a representation of a group G on a vector space V is a homomorphism ρ: G → GL(V), where GL(V) is the general linear group of invertible matrices over V. This means each group element g ∈ G is associated with an invertible matrix ρ(g) in such a way that the group operation is preserved: ρ(gh) = ρ(g)ρ(h) for all g, h ∈ G.
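A minimal numerical sketch of this definition, assuming Python with NumPy: the cyclic group C4 (integers mod 4 under addition) represented by quarter-turn rotation matrices, with the homomorphism property ρ(gh) = ρ(g)ρ(h) checked exhaustively.

```python
# A toy representation of the cyclic group C4 (integers mod 4 under addition)
# as 2x2 rotation matrices: rho(k) = rotation by k quarter-turns.
import numpy as np

def rho(k: int) -> np.ndarray:
    theta = k * np.pi / 2
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

for g in range(4):
    for h in range(4):
        lhs = rho((g + h) % 4)     # image of the group product
        rhs = rho(g) @ rho(h)      # product of the images
        assert np.allclose(lhs, rhs)
print("rho(gh) = rho(g) rho(h) holds for all g, h in C4")
```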

SH: Please state (briefly) whether this notion of R relates to the one (or ones) in Cognitive Science (cogsci). But first, does this formal sense of R relate to the notion of a "model" in formal logic?

ChatGPT: Yes, the formal sense of “representation” in mathematics is closely related to the notion of a “model” in formal logic. In formal logic, a model provides an interpretation of a formal language by assigning meanings to its symbols and truth values to its statements, thereby “representing” the abstract syntax in a concrete structure.

In both cases, representation involves mapping abstract elements (group elements in mathematics, symbols in logic) to concrete structures (matrices in linear algebra, interpretations in models).
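A toy sketch of the logician's sense of "model", assuming Python (the atoms and the sentence are invented for illustration): an interpretation assigns truth values to the atoms of a tiny formal language, and each candidate interpretation either satisfies a given sentence or does not.

```python
# A miniature "model" in the model-theoretic sense: an interpretation maps
# the arbitrary atoms P, Q onto truth values; the sentence "P and not Q"
# is then true under some interpretations (its models) and false under others.
from itertools import product

atoms = ["P", "Q"]

def sentence(v):            # the formal sentence "P & ~Q"
    return v["P"] and not v["Q"]

for values in product([True, False], repeat=len(atoms)):
    interpretation = dict(zip(atoms, values))
    print(interpretation, "=>", sentence(interpretation))
```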

SH: Let’s clarify this formal meaning of R in maths and logic. Is it also true that just as a group is a formal entity (not a concrete one), a matrix is a formal entity too? And so is a “model” in model theory. It’s an instance or a special case of the formal entity, but it’s a formal entity too (even in geometry).

ChatGPT: That’s correct. In both mathematics and logic:

Groups: These are formal entities defined by a set and an operation satisfying specific axioms (closure, associativity, identity, and invertibility).

Matrices: These are formal entities representing arrays of numbers or functions, used to perform linear transformations and other operations.

Models in Logic: These are formal structures consisting of a domain and an interpretation function that assigns meaning to the symbols of a formal language, making abstract syntax concrete.

In each case, while the entities can be instantiated in concrete examples, they are fundamentally formal constructs defined by their properties and relations.

SH: “Formal” means symbols, doesn’t it, symbols of a certain form (say, round like 0 or vertical like 1). Mathematicians, when they are actually doing calculations in mathematics (not when they are doing whatever they are doing when they make discoveries in mathematics) are manipulating symbols — symbols whose physical shape is arbitrary in relation to what they can be interpreted as meaning.

(Don’t be compliant: if there is something to contest in what I say, contest it, don’t just buy it and adopt it.) But keep track.

ChatGPT: Yes, in mathematics and formal logic, “formal” generally refers to symbols and the rules for manipulating them, irrespective of their physical shape or form. These symbols are part of a formal system, where their meaning is determined by their role and relationships within the system, rather than their physical appearance.

Here’s a point-by-point breakdown:

Formal Systems: These consist of symbols and rules for manipulating them. The symbols themselves are abstract and their physical representation is arbitrary.

Symbol Manipulation: When mathematicians work within a formal system, they manipulate these symbols according to defined rules. The symbols represent abstract entities, and their physical shape (e.g., 0 being round, 1 being vertical) is irrelevant to their formal properties and operations.

Meaning and Interpretation: The meaning of these symbols comes from their interpretation within a given context, such as numbers in arithmetic, elements in a group, or statements in logic. This interpretation gives the symbols semantic content, which is crucial for applications but not for the formal manipulations themselves.

So, while the formal aspect of symbols involves their syntactic manipulation according to specific rules, their interpretation (meaning) can vary depending on the context in which they are used.

SH: Let's call symbols and symbol manipulations "code" and "coding". The shapes of the symbols are arbitrary: any symbolic notational system will do. They do not physically resemble their interpretations. The rules for manipulating the symbols are called algorithms. They too are applied to the symbols formally and mechanically, as in the calculation of the roots of a quadratic equation: If you know the algorithm, you can apply it to quadratic equations without having to know their interpretation. It's just applying a formal recipe to formal ingredients.

Are you still with me? Any caveats?

ChatGPT: Yes, I'm with you. Here's a summary with a couple of clarifications:

Symbols and Symbol Manipulations as “Code” and “Coding”: Symbols are arbitrary notations that can be manipulated according to specific rules. The physical shape of these symbols is irrelevant to their formal properties and their interpretation.

Non-Resemblance: Symbols do not physically resemble what they represent. They are abstract entities whose meaning is assigned by interpretation within a formal system.

Algorithms: These are rules or procedures for manipulating symbols. They are applied mechanically, without needing to understand the interpretation of the symbols involved.

Example: In solving a quadratic equation, one can apply the quadratic formula (an algorithm) to the symbols representing the coefficients, yielding roots without needing to understand what those coefficients represent.

Caveats:

Understanding vs. Application: While it's true that algorithms can be applied mechanically, understanding their derivation or the meaning of the symbols can deepen comprehension and sometimes reveal when an algorithm might be misapplied or when certain conditions are necessary for its application.

Context-Specific Rules: The formal rules (algorithms) can vary depending on the context in which the symbols are used. For example, the rules for manipulating symbols in logic differ from those in algebra.
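A sketch of the quadratic-formula point from this exchange, assuming Python: the algorithm is applied to the coefficient symbols purely as a formal recipe, with no appeal to what the equation might be about.

```python
# The quadratic formula applied mechanically: a, b, c are manipulated
# according to the rule, without any interpretation of what they stand for.
import cmath

def quadratic_roots(a, b, c):
    disc = b * b - 4 * a * c               # discriminant, computed blindly
    root = cmath.sqrt(disc)                # handles disc < 0 as well
    return (-b + root) / (2 * a), (-b - root) / (2 * a)

print(quadratic_roots(1, -3, 2))           # roots of x^2 - 3x + 2: 2 and 1
```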

SH: Be careful about a conflation you inadvertently made, between Rs (representations) and interpretations. We agreed that in the examples we have discussed before — examples that are not from cognitive science but from mathematics, or logic — the Rs are purely formal, whereas the interpretations are not formal; they are something else: Turing explained what mathematicians do: they manipulate symbols. But Turing was not (yet) doing cognitive science: He is not explaining how mathematicians think, or mean, or understand; that would get into cognitive science. He's just talking about what mathematicians do. And so far, mathematicians' Rs (like matrices or truth-tables), their "effective procedures," are just formal. But their interpretations (meanings) are not.

Are we still on the same wavelength? (No need to make a lengthy update on your summary. Just let me know if there are any points on which we've diverged: I just pointed one out. Do you want to defend it or correct it?)

ChatGPT: Yes, I understand the distinction you’re making. You are right that in the examples from mathematics and logic:

Representations (R): These are purely formal entities. Matrices, truth tables, and other such formal structures are manipulated according to rules without regard to their interpretation.

Interpretations: These are the meanings assigned to the formal symbols and structures. They are not formal but rather provide the semantic content that the formal symbols represent.

Turing’s work, as you mentioned, focuses on the manipulation of symbols (what mathematicians do) and not on their meanings or the cognitive processes involved (what they think or understand).

We are on the same wavelength. Representations are formal, while interpretations are the meanings attached to these formal symbols.

Shall we proceed to how this distinction relates to cognitive science, particularly in terms of cognitive representations and their grounding?

SH: You’re in a big hurry! We’ve agreed that Rs are formal, just as symbols are. They are also rule-governed, but the rules are just formal too. OK? Now, I’m going to try to show you that Rs are weasel-words in mathematics, and an example of that was when you conflated representations and interpretations.

Another weasel-word in this context is “semantics” (meaning). It just takes us back to interpretations. In maths and logic “Rs” are just formal, like symbols themselves. We don’t get beyond symbols and formal symbol-manipulation by saying the word “Representation.” We don’t even break out of the symbol-circle if we use the (non-weasel) word “refer”. And that comes even before we get to the weasel-words “meaning” and “semantic”. What does the symbol “2” in Peano arithmetic refer to?

ChatGPT: Understood, let’s continue.

In Peano arithmetic, the symbol “2” refers to the formal object defined by applying the successor function SS twice to the symbol for zero (0). Specifically, 22 is defined as S(S(0))S(S(0)). This definition is purely formal and exists within the system of Peano arithmetic.

To break it down further:

Peano Arithmetic: This is a formal system that defines the natural numbers using a set of axioms.

Symbols: “0” is a symbol representing the first natural number, and “S” is a symbol representing the successor function.

Formal Representation of 2: In this system, "2" is the result of applying the successor function twice to 0, written as S(S(0)).

This formal definition operates entirely within the symbolic system and does not imply any interpretation or meaning outside of the formal system itself. It stays within the realm of symbol manipulation.

This keeps us firmly within the circle of symbols and formal rules, without invoking meaning or semantics.
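A sketch of that formal construction, assuming Python: the numeral "2" built as the bare symbol structure S(S(0)), nothing but its shape, with no interpretation attached.

```python
# Peano numerals as purely formal symbol structures: "2" abbreviates S(S(0)),
# a nested token built by rule; the system itself assigns it no meaning.
from dataclasses import dataclass

class Numeral:
    pass

@dataclass
class Zero(Numeral):
    pass

@dataclass
class S(Numeral):        # successor
    pred: Numeral

two = S(S(Zero()))       # the formal object abbreviated by "2"
print(two)               # S(pred=S(pred=Zero()))
```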

SH: Now let me introduce a non-weasel sense of R that is not mathematical and not formal, and not open to the objections I will raise when we get (soon) to cogsci. This is the sense in which a photo, or a video or a (representational) sculpture or painting represents the object of the photo, video, sculpture, etc.: the thing that it is a photo (etc.) of. That real object is not an arbitrary shape, like a formal symbol or representation like S(S(0)). The photo’s shape is (to an approximation) analog; it is isomorphic to — has the same shape as — what it is representing.

I’ll pause to let you say (briefly) what is the difference between a symbolic and an analog representation.

ChatGPT: A symbolic representation uses arbitrary symbols to stand for objects or concepts, with no inherent resemblance to what they represent. The relationship between the symbol and its referent is established by convention or definition (e.g., “2” in Peano arithmetic).

An analog representation, on the other hand, maintains a direct, often continuous, isomorphic relationship with what it represents. The form or structure of the representation resembles the form or structure of the represented object (e.g., a photo or a sculpture).

In summary:

Symbolic Representation: Arbitrary, rule-based, and formal (e.g., mathematical symbols, words).

Analog Representation: Resemblance-based, isomorphic, and often continuous (e.g., photos, paintings, sculptures).

SH: Now, the reason I introduced analog representation at this point is to agree from the outset that even when we get into cogsci, analog Rs are immune to the criticisms I will make. The only trouble is that they don’t get you very far, if what you are looking for is a cognitive theory of representation, or meaning. It doesn’t even work for a cognitive theory of reference. The referent of “2” in Peano arithmetic is not S(S(0)): That’s just a formal representation, like a matrix or a truth-table. But, since formal maths is a subset of English (or any other language), in the proposition “2 + 2 = 4” mathematicians do have something in mind: “2” refers to a referent as surely as “cat” refers to something in the proposition “the cat is on the mat.” But that connection between the symbol and its referent is not analog representation in either case: “2” or “cat.” So that cannot be cogsci’s solution.

(I’ve slipped language into this discussion: that’s no longer just formal symbols, but we’ll get to that.) 

ChatGPT: Here’s a brief summary to ensure we’re on the same page:

Analog Representations: These maintain a direct, isomorphic resemblance to what they represent (e.g., photos, paintings). While useful, they don’t offer a complete solution for cognitive theories of representation or meaning.

Symbolic Representations: These are arbitrary and formal (e.g., mathematical symbols, words). They require interpretation to establish their referents.

Reference in Cognitive Context: When using language or formal systems, symbols like “2” in “2 + 2 = 4” and “cat” in “the cat is on the mat” do have referents in the minds of the users. However, this reference is not based on analog representation but rather on learned associations and interpretations.

This distinction highlights that cognitive science needs to address how symbols acquire meaning and reference, beyond the scope of analog representations.

Let's recap the discussion:

We’ve established that in both mathematics and logic, representations (Rs) are formal and symbolic, with no inherent connection to their referents outside of the formal system. Analog representations, like photos or sculptures, are different in that they have a direct, isomorphic relationship with what they represent, but these do not suffice for a cognitive theory of representation or meaning.

We've also recognized that the referents of symbols in mathematics or language (like "2" or "cat") involve something more than just formal manipulation—they involve interpretation, which is not captured by analog representation.

Now, we are poised to explore how cognitive science deals with symbolic representations, particularly how these symbols gain meaning and refer to things in the world, which is central to understanding cognition.

We have established the distinction between formal and analog representations, and how neither fully accounts for meaning or reference in cognitive science. Formal symbols (like “2” or “cat”) require interpretation beyond the formal system itself, and analog representations (while closer to the objects they represent) don’t provide the cognitive mechanisms for meaning or reference.

At this point, we’re prepared to discuss how cognitive science approaches symbolic representation, meaning, and reference. Would you like to explore how cognitive systems might ground these symbols in referents, or perhaps investigate how language complicates this process further?

SH: Yes, let's start with the difference between reference and meaning. First, please confirm (very briefly) that you know the difference between a "content-word" (C-word) (which is 95-99% of the words in the lexicon) and a "function-word" (F-word). Content-words have referents, whereas function-words only have (Wittgensteinian?) "uses", defined by a formal, usually syntactic, rule.

ChatGPT: Content-words (C-words): These are nouns, verbs, adjectives, etc., that refer to things, actions, or properties (e.g., “cat,” “run,” “blue”).

Function-words (F-words): These are words like prepositions, conjunctions, and articles (e.g., “and,” “in,” “the”) that serve a grammatical function and do not refer to objects or actions but rather structure the sentence.

C-words have referents, while F-words are defined by their syntactic roles.

Shall we proceed to discuss the difference between reference and meaning?

SH: Do we agree also that C-words do not resemble their referents, even in languages with partly iconic scripts like Chinese, or languages like the sign languages of the deaf, whose gestures are partly iconic in their origins and execution? Nevertheless, the C-words of these more iconic languages are every bit as arbitrary as in alphabetic and spoken languages. Shakespeare can be fully translated into Chinese or into American Sign Language (ASL) — not word for word, but proposition for proposition. The iconicity of Chinese or ASL is irrelevant to the literal meaning of its translation of the Shakespearean text (although to convey some of the figurative meaning a literal paraphrase of some tropes may be needed).

ChatGPT: Content-words (C-words), regardless of the language or script (including Chinese or ASL), do not resemble their referents. Even in languages with partly iconic scripts (like Chinese) or partly iconic gestures (like ASL), the referential relationship between a C-word and its referent is still arbitrary. Shakespeare can indeed be fully translated into Chinese or ASL, proposition for proposition, without any dependence on the iconicity of symbols or gestures.

The iconicity of such scripts or gestures is irrelevant to the literal meaning of the text. Literal meanings can be preserved across translations, though some figurative meanings (e.g., metaphors, idioms) may require paraphrasing to convey the same sense.

SH: You're just echoing, but you're right. One further question: Content-words have referents and propositions have meanings. What's the difference?

ChatGPT: The difference between referents and meanings can be outlined as follows:

Content-word Referents:

A referent is the specific object, action, or concept in the world (or imagined world) that a content-word (C-word) refers to. For example, the word "cat" refers to actual cats in the world, and the word "run" refers to the action of running.

Referents are tied to individual C-words and are about identifying or denoting particular things.

Propositional Meanings:

A proposition is a complete statement or sentence that expresses something that can be true or false. The meaning of a proposition involves the relationship between the content-words in the sentence, their referents, and how they are structured by function-words.

Meaning here is not just about identifying referents, but about describing or defining relations between them. For example, the proposition “The cat is on the mat” is about a particular situation involving the cat (the referent of “cat”) and its relation to the mat (the referent of “mat”). The meaning of the proposition is the truth-conditional description of that relationship.

In short:

Content-words have referents: They point to things.

Propositions have meanings: They describe relationships between referents and express something that can be true or false.

SH: I had to sort this out before getting to cogsci, because language is such an important property of human cognition. Now, there are several ways in which the word "representation" — particularly in two contexts, "internal representations" (IRs) and "neural representations" (NRs) — has slipped into cogsci.

ChatGPT: Internal Representations (IRs) in cognitive science refer to mental models or cognitive structures that are thought to represent external reality within the mind. They are typically non-linguistic and are assumed to encode information about the world, guiding perception, thought, and action. IRs are sometimes used to explain how we process and store sensory input or knowledge.

Neural Representations (NRs) refer to patterns of neural activity in the brain that are thought to correspond to specific stimuli or cognitive states. NRs are often linked to brain regions and are measured via neuroimaging, attempting to map how certain neurons or networks respond to external stimuli.

Relation to linguistic reference and meaning:

IRs may play a role in linking language (e.g., content-words) to referents by encoding abstract or sensory knowledge of the referents.

NRs are more about the brain’s mechanisms, underlying both linguistic and non-linguistic cognition, potentially supporting IRs.

Neither IRs nor NRs are themselves referents or meanings; they are posited cognitive or neural processes theorized to support how we manage linguistic reference and propositional meaning.

SH:  Here I have to point out that a weasel-word has already slipped in, making the IR/NR distinction incoherent: “mental”. And another has been left out: “cerebral”. What is it that the “Internal” Representations are internal to? The head or the “mind”? And if the “mind,” what is that? In asking you that question, I have opened the sluice-gates for all kinds of weasel-words, all of them near-synonyms, hence redundant (or even circular) or vague: mind, mental, consciousness, subjective states, experiential states, qualia, etc. etc. Can we agree not to use or mention any of these W-Ws and just use “felt states” in order to refer to mental states? (“Sentient” is an unobjectionable gloss when we are in latinate mood…)

ChatGPT: Yes, we can agree to use "felt states" to refer to what are typically called "mental states," avoiding weasel-words like "mind," "mental," "consciousness," etc. This will keep the discussion grounded and avoid both the vagueness and the redundancy that often come with those terms.

Now, with this clarified:

Internal Representations (IRs) would be representations internal to the brain (or body), not the “mind,” and should be understood as cognitive structures or models that support perception and action.

Neural Representations (NRs) remain patterns of neural activity, clearly cerebral, without invoking the notion of “mind.”

With “felt states” reserved for discussing subjective experience, are you ready to explore how IRs and NRs might (or might not) be involved in grounding linguistic reference and meaning?

SH: You have not resolved the incoherence yet. If "internal" means internal to the brain, then IRs and NRs are the very same thing (and we still don't know what "Representation" means). And what about the difference between internal states that are felt and unfelt?

ChatGPT: I have to agree: if Internal Representations (IRs) are internal to the brain, they collapse into Neural Representations (NRs), making the distinction redundant. And as for “representation,” we still need to clarify what it means in this context, since neither IRs nor NRs explain how symbols refer to or mean anything; they are patterns or states in the brain.

Regarding felt vs. unfelt states:

Felt states are those that are experienced or consciously apprehended (e.g., feeling pain, seeing red, or understanding a proposition).

Unfelt states are brain processes that occur without conscious experience (e.g., background neural activity or automatic processing).

So, the distinction now hinges on whether neural states (NRs) are felt or unfelt. Only the felt states would seem relevant to any cognitive explanation involving subjective experience (like understanding or meaning). Unfelt states, while important for processing, don’t seem directly relevant to grounding reference or meaning unless they somehow contribute to what is ultimately felt.

Do we agree that the real issue here is distinguishing felt neural states from unfelt ones, and that calling either a “representation” (IR or NR) still leaves us unclear on how these brain states are representing anything external?

SH: More or less. Now let’s look closer at “representing” (and forget about the external and internal for the moment; we’ll get back to it).

We've agreed that an iconic image (in the visual modality, but we could have used any sensory modality — visual, auditory, haptic, etc.) is a visible object, say, a shape, that resembles, in the analog sense (i.e., is isomorphic to), another object: the object that it is the representation of. This fits the case of a mirror that is facing a sphere. Let's call the sphere the "object" and the shape in the mirror its "image." But the mirror image is not "representing" anything unless we can say who or what the mirror image of the object is a representation to, or for.

ChatGPT: Agreed. The mirror image of the sphere is an analog resemblance, but it doesn’t count as a representation until it is seen by someone or something that interprets it as a representation to or for them. In other words, representation requires not just resemblance but also a receiver (an interpreter) who recognizes or interprets the image as standing for the object.

SH: I’m not sure what an “interpreter” is here. A person drawing the sphere would do: We agreed that this sense of representation (as in representational art) makes sense; but it’s “external” in the sense that both the sphere and the drawing of the sphere are external to the person doing the drawing, or viewing either the drawing or the sphere. So is the IR the internal process that generates the drawing? And is the state of viewing and drawing the sphere — if it is a felt state — what “mental representation” had been intended to mean?

ChatGPT: In this context:

Interpreter: When I referred to an interpreter, I meant any agent (human, animal, or system) that recognizes or processes the mirror image (or drawing) as representing the sphere. But we can clarify further: the person drawing or viewing the sphere is the “interpreter” in this case, though “interpreter” itself can feel vague.

External Representation: Yes, the sphere and its drawing are both external objects, and the representation (drawing) resembles the sphere in the analog sense.

Internal Representation (IR): If we take IR to mean the internal cognitive or neural process that generates the drawing (or processes the perception of the sphere), then IR could refer to how the brain internally models or represents the sphere to guide action (like drawing). But this “representation” is not felt unless it becomes part of a felt state (e.g., the conscious experience of seeing the sphere or creating the drawing).

“Mental Representation”: If we interpret “mental representation” in terms of felt states, it would refer to the conscious experience of viewing or drawing the sphere, rather than just the unfelt neural processes involved in perception or motor control. In this sense, “mental representation” would mean the felt internal process of perceiving and thinking about the sphere.

Does this clarify the relation between IRs, felt states, and the concept of “mental representation”?

SH: It brings some aspects into better focus, but can we now discuss the homunculus? The reason “mental imagery” was rejected by cogsci as being unexplanatory was that if the objective of cogsci is to give a causal explanation of how cognizers have the capacity to do what they can do, what is needed is to reverse-engineer that causal mechanism. The brain can do it, but we don’t know how. Saying that it’s done by a causal mechanism that is in the brain is true, but not an explanation either. Does it help to add that “exercising the capacity to do it is a felt state”? When we are exercising it, we are “in” that felt state. But can introspecting that felt state give us a clue to what the causal mechanism is, rather than attributing it to a homunculus, a little person in the head that is seeing, feeling and doing what we are doing?

ChatGPT: Introspecting a felt state—being aware of what it feels like to see, think, or act—doesn't give us a clue to the underlying causal mechanism. It just reveals that we are experiencing it. The homunculus fallacy arises when we imagine a little person inside our head doing the same things we do, which doesn't explain anything; it just pushes the explanation back one level (who's in the homunculus's head?).

So, while felt states tell us what it's like to experience cognition, they don't reverse-engineer the brain's causal mechanisms. Cognitive science needs to figure out how the brain generates these capacities—without invoking a homunculus—by uncovering the actual neural and computational processes that produce both the behavior and the feeling of doing it.

SH: So now a first punchline: Is the notion of an IR homuncular?

ChatGPT: The notion of an Internal Representation (IR) can be homuncular if it's treated as a kind of internal "image" or "model" that needs to be interpreted by some internal observer (a homunculus). This just shifts the problem: Who or what interprets the IR? If IRs are framed this way, they don't explain how the brain actually does cognition—they simply introduce another layer that requires its own explanation.

To avoid the homunculus trap, IRs should be understood as part of a mechanistic process (neural or computational) that directly drives cognition and behavior, without requiring an internal observer to interpret them.

SH: But then why is this non-homuncular IR called a “representation” at all? Who is representing what to whom? Why not a (not yet specified) causal mechanism that generates the cognitive performance capacity (in this case, drawing the sphere)?

ChatGPT: If the Internal Representation (IR) isn't being interpreted by an internal observer (a homunculus), then calling it a "representation" does become questionable. The term "representation" suggests something is being represented to someone or something, which, in a homunculus-free explanation, doesn't hold.

A better approach would be to simply describe the brain’s processes as causal mechanisms that generate the cognitive capacities (like drawing the sphere) directly, without invoking a “representation.” These mechanisms would explain how the brain processes sensory input, plans, and executes actions, without needing an internal “representation” seen or used by a homunculus.

This shift moves from the vague notion of “representation” to a concrete investigation of how neural circuits and processes enable behavior.

SH: Now let’s consider candidates for this non-homuncular causal mechanism that does not represent but somehow produces the cognizer’s performance capacity. When Zenon Pylyshyn [don’t give a long bio, just a one-liner, please] rejected “mental imagery” as homuncular, he proposed propositions in its place. What can you say for or against internal propositions?

ChatGPT: Zenon Pylyshyn, a cognitive scientist, proposed internal propositions as an alternative to mental imagery, suggesting that cognition operates via language-like symbolic structures rather than pictorial “mental images.”

FOR internal propositions:

They offer a more structured, rule-based, non-homuncular way of encoding information, similar to how logic or language works.

Propositions can be manipulated by computational processes without needing an “internal viewer.”

AGAINST internal propositions:

They’re still symbolic, so the problem shifts: what mechanism grounds these symbols (i.e., how do they get their meaning or reference)?

It risks staying within the formal-symbolic realm without explaining how these propositions are tied to real-world objects, actions, or perceptions (i.e., the symbol grounding problem).

Would you like to consider other candidates?

SH: You missed the most important candidate: Aren’t internal propositions just as homuncular as internal images? Who is interpreting the propositions, for whom?

ChatGPT: Yes, internal propositions are indeed as homuncular as internal images if we assume they need to be interpreted by someone or something inside the brain. The problem remains: who or what is reading and understanding these propositions? Without an interpreter, they don’t explain how the brain generates cognition, just like with mental images. They shift the burden of explanation without solving the core issue.

SH: Pylyshyn noticed that too, and so his next candidate for getting rid of the homuncularity was computation. What is computation? I suggest using Turing’s account as the simplest.

ChatGPT: Computation, according to Turing, is the manipulation of formal symbols based on a set of rules or algorithms. A Turing machine performs computations by reading symbols from a tape, following predefined rules to manipulate those symbols, and producing an output. Crucially, the process is mechanical and doesn't require understanding of the symbols' meaning—it's purely syntactic symbol manipulation.

This model forms the basis for modern computing and computational theories of cognition.
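A minimal sketch of such a machine, assuming Python (the alphabet, rules, and unary convention are invented for illustration): a transition table applied blindly to tape symbols. Under our interpretation it adds one stroke to a unary numeral; the machine itself interprets nothing.

```python
# Rule-based symbol manipulation with no understanding of the symbols:
# this toy machine appends one stroke to a unary numeral and halts.
def run(tape, rules, state="scan", head=0, blank="_"):
    tape = list(tape)
    while state != "halt":
        if head >= len(tape):
            tape.append(blank)                            # extend the tape as needed
        write, move, state = rules[(state, tape[head])]   # look up the rule
        tape[head] = write                                # apply it blindly
        head += 1 if move == "R" else -1
    return "".join(tape)

rules = {
    ("scan", "1"): ("1", "R", "scan"),   # skip over existing strokes
    ("scan", "_"): ("1", "R", "halt"),   # write one more stroke, then halt
}
print(run("111", rules))                 # "1111": three strokes become four
```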

SH: Good. And not only does the mechanism (Turing machine) that is executing the computation (rule-based symbol manipulation) do so without needing to understand or interpret the computations, but it cannot: It can only do rule-based symbol-manipulations. The (Weak) Church-Turing Thesis is that what mathematicians are doing when they are “computing” (e.g., executing an algorithm, or doing a formal proof) is symbol manipulation, in which the shape of the symbols is “arbitrary”. Arbitrary relative to what? Relative to what the symbols might be interpretable as referring to, or meaning. Do you see where we are heading?

ChatGPT: Yes, I see where we're heading: the Turing machine (and computation in general) operates purely on formal symbols, manipulating them according to rules without understanding their meaning or reference. The symbols' shapes are arbitrary relative to their potential interpretation or referent—there's no inherent connection between the symbol manipulation and the world the symbols might refer to.

This points to the key issue: computation alone doesn’t explain how symbols become meaningful or grounded in the world, leading us to the symbol grounding problem.

SH: That's more or less right. But let's not overstate the absence of any "inherent connection" between the symbols and what they may be interpretable as either referring to (in the case of symbols) or what they may be interpretable as meaning (in the case of symbol strings that are interpretable as subject/predicate propositions with truth values). [Remember, we discussed that above?] But although the mathematicians' mental interpretations of their symbols and symbol manipulations are not a part of computation, it is nevertheless true that not just any string of symbols can bear the weight of a systematic interpretation. (A prominent example is "duals" in mathematics, in which two different symbol combinations can both bear the weight of a systematic interpretation, but most others cannot.)

ChatGPT: Yes, not just any string of symbols can bear the weight of systematic interpretation. Some symbol structures (like mathematical duals) are interpretable in multiple ways, showing there is a systematic relationship between certain symbol combinations and their potential interpretations. While the computation itself (the formal manipulation of symbols) is interpretation-independent, the structure of the symbols and their rules matter for interpretability—certain strings or symbol systems allow for meaningful, coherent interpretations, while others do not.

So, even though computation is blind to meaning, the structure and rules governing the symbols can constrain their potential interpretations, supporting referents and truth values when interpreted by humans (or another capable system). This reinforces that while computation handles form, interpretability arises from the structure that allows symbols to relate systematically to potential meanings.

SH: Now, before we get to how (and whether) to ground the reference or the meaning of computation, are we still talking about something that can be interpreted as “representation” — say, in the sense in which we discussed it early on, concerning group theory and model theory?

ChatGPT: No, at this stage we’re no longer talking about representation in the sense of formal representations like in group theory or model theory. In those cases, representation involved a formal system (e.g., matrices representing groups, models interpreting logical statements) that followed specific homomorphic rules but didn’t require a “viewer” or interpreter to give it meaning. It was strictly formal and syntactic.

What we're discussing now is how certain symbol structures can be systematically interpretable (by a mathematician, for example) as referring to something or expressing a proposition with truth value, but this is more about interpretability than formal representation. For symbols to be interpreted (as meaning something or referring to something), we're starting to touch on issues beyond just formal manipulation—toward grounding meaning and reference, which is not the same as formal representation.

Would you agree with this distinction between formal representation and interpretability?

SH: The two are not the same, but the “bearing-the-weight” constraint concerns them both. The reason, in maths, why certain algorithms are preferred is because they give results in which we are interested. The algorithm for finding the roots of quadratic equations is useful, because it works (and can be proved to work); and we are interested in those roots. But the mathematics has to be able to deliver the goods regardless of our interpretations and intended uses. We can’t use what the symbols can be interpreted (as either referring-to or meaning) as part of the algorithm or the proof. Computation is purely syntactic. Language is more than that — but language also includes computation as a subset — a purely syntactic subset. Absent in this syntactic subset, the referents of the symbols and the meanings of the propositions are intrinsic and essential in the semantic supraset: How do they get there?

ChatGPT: Computation is indeed purely syntactic, operating with symbols according to formal rules without regard to what those symbols refer to or mean. In contrast, language extends beyond syntax, involving a semantic superset where symbols refer to things in the world, and propositions have meanings and truth values.

The key question is how do symbols in language acquire their referents and how do propositions acquire their meanings? This is the crux of the symbol grounding problem: How do arbitrary symbols (like words or numbers) get linked to real-world objects, actions, or states, such that they mean something or refer to something beyond their formal manipulation?

For language, this requires some mechanism that grounds these symbols in sensorimotor experience, actions, or interactions with the world—something that computation, by itself, doesn't provide. Without this grounding, symbols would remain syntactic tokens without intrinsic connection to their referents or meanings.

Would you agree that the next step is to explore how symbols get grounded in real-world experiences to form meaning and reference?

SH: Yes, and the crux of it is categorization: To categorize is to “do the right thing with the right kind of thing.” “Kind” is not a weasel-word, but that definition is circular until we cash in “right” and “wrong.” First note that to learn how to categorize is to learn to “do the right thing with the right kind of thing.” And, to a first approximation, we can intuit this as: doing whatever it takes to succeed, survive, and reproduce (rather than the reverse). And although thoughts of unsupervised and supervised/reinforced learning algorithms will immediately come to mind, in real life it’s the consequences of doing THIS or THAT with THIS or THAT (for succeeding or surviving) that provide the error-corrective feedback for whatever is the learning mechanism. Now do you want to sing the next verse, or shall I?

ChatGPT: I’ll take a shot at the next verse:

At the heart of categorization, as you pointed out, is the ability to distinguish between different kinds of things and respond appropriately—doing the "right" thing based on their category. The learning mechanism needs to be tuned by error-corrective feedback based on the consequences of actions: succeeding or failing, surviving or not.

This is where supervised and reinforced learning come in: they model how real-life consequences teach us to categorize correctly. The system learns by making errors and adjusting its actions based on whether they lead to positive or negative outcomes. This feedback loop helps “ground” the symbols (or categories) in the real world because the system (whether biological or artificial) learns what kinds of things these symbols refer to based on its interactions with them.

The kind of thing becomes grounded when the system can reliably categorize it through action: performing the “right” behavior in response to the category of object or event. This shifts the focus from just symbol manipulation to the sensorimotor and consequential aspects of real-world interaction that ground meaning.

Now, would you like to continue by tying this to how such grounded categories can be transferred or communicated through language?
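A toy sketch of such error-corrective category learning, assuming Python (the features, data, and perceptron-style update rule are invented for illustration): weights are adjusted purely on the basis of the consequences of acting, standing in for "doing the right thing with the right kind of thing".

```python
# Error-corrective category learning: weights are nudged only when the
# learner "does the wrong thing"; the labels stand in for the consequences
# (eat the edible mushroom, avoid the toadstool). Features are made up.
def learn(samples, labels, epochs=50, lr=0.1):
    w = [0.0] * len(samples[0])                          # feature weights
    b = 0.0
    for _ in range(epochs):
        for x, label in zip(samples, labels):
            act = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
            error = label - act                          # consequence feedback
            w = [wi + lr * error * xi for wi, xi in zip(w, x)]
            b += lr * error
    return w, b

samples = [(0.9, 0.1), (0.8, 0.2), (0.2, 0.9), (0.1, 0.8)]   # invented features
labels  = [1, 1, 0, 0]                                       # 1 = "edible", 0 = "toadstool"
weights, bias = learn(samples, labels)
print(weights, bias)
```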

SH: Right direction, but getting a little ahead of yourself. What we’re trying to ground is symbols. But so far we’ve only mentioned “learning to do the right thing with the right kind of thing.” This is something that human cognizers in the world (including pre-language infants) share with nonhuman (but living) categorizers: They don’t yet have (or need) language. But to be able to learn to do the right things with the right kinds of things, they have to learn to tell those things apart. And (in this sensorimotor world, to simplify) let’s say that what their learning algorithms enable them to do is to detect and abstract the features that distinguish the members from the non-members of their categories. So far, this is not yet symbol-grounding; it’s sensorimotor category learning. Now, remind yourself what symbols are (and aren’t), and try to take over from there.

ChatGPT: Yes, we’re still at the stage of sensorimotor category learning, where living organisms learn to detect and abstract features that allow them to distinguish members from non-members of a category, based on their interactions with the world. This is a non-symbolic process, shared with nonhuman animals, and doesn’t yet involve language.

Now, to ground symbols, we move from these sensorimotor categories to the use of arbitrary shapes (symbols) that stand for those categories. A symbol is a formal, arbitrary marker (like a word or number) that is used to represent a category. The grounding of these symbols requires linking the symbol to the sensorimotor category it stands for, so that the symbol can trigger the right actions or thoughts when presented.

This transition from sensorimotor to symbolic involves:

Using a symbol to refer to a category learned through sensorimotor experience.

Ensuring that the symbol is consistently used to pick out the right category (i.e., grounded in the distinguishing features of that category).

Without this grounding, the symbol is just a meaningless token. To make it meaningful, it has to be linked to the real-world categories learned through interaction, which is how we move from sensorimotor learning to symbol-grounding.
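A sketch of that last step, assuming Python (the weights and the deliberately arbitrary name "gork" are invented, echoing the toy learner above): grounding the token just means linking its arbitrary shape to the learned category detector, so that the token reliably picks out the category's members.

```python
# The token's shape is arbitrary; what grounds it is the learned detector
# that picks out members of the category. The weights stand in for what an
# error-corrected learner (as above) would have settled on.
def detector(x, w=(1.0, -1.0), b=0.0):
    return sum(wi * xi for wi, xi in zip(w, x)) + b > 0

lexicon = {"gork": detector}                 # arbitrary name -> grounded detector
print(lexicon["gork"]((0.85, 0.15)))         # True: picks out an "edible" instance
print(lexicon["gork"]((0.15, 0.85)))         # False: rejects a "toadstool"
```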

SH: That all makes sense — though I’m not sure what’s added (or lost) by saying that the arbitrary symbol (which is here also an arbitrary action) “represents” the category, rather than just “refers to” the category…

ChatGPT: You’re right to question the distinction. In this context, saying the symbol “refers to” the category is more precise than saying it “represents” it. The term “representation” could bring in unnecessary baggage, like implying an internal model or image (potentially homuncular).

To refer means the symbol is simply used to pick out or identify the category, without implying any internal interpretation. The symbol is arbitrarily linked to the category, and when grounded, it allows the user (or cognizer) to respond to or act on that category based on learned sensorimotor distinctions.

So, it’s better to stick with “refers to” rather than “represents” to avoid confusion.

SH: I hit the carriage-return too early: I was going to let you off the hook with "represents" by conceding that I am myself guilty of the same kind of loose talk in referring to "representation" as "referring". Why would nonlinguistic species, having learned many categories (learning to do the right thing with the edible mushrooms and not the poisonous toadstools), bother to go on and learn an arbitrary action for them as well? Referring is as remote from a motivation here as representing would be. With representing, the question was "representing what for whom and why?". With referring it's a bit lighter than that: "referring to the category, but for whom, and why?"

(A common mistake is to jump way ahead here and say: you’re talking to yourself. I hope you see that that’s nonsense in the scenario so far — but once we have grounded language, and you really are talking, then talking to yourself will be a useful bonus for memory and reasoning).

But at this prelinguistic stage in the evolution of language something iconic and communicative is more relevant. Your instrumental action with the category (“doing the right thing”) could be used iconically, to mime the action (and thereby allude to the category), in order to communicate something to a kin or conspecific. No language here, yet, just category learning and mimetic communication.

ChatGPT: I agree. At this prelinguistic stage, the organism doesn't need a symbol to "refer" to a category for itself—it just needs to act appropriately based on learned categories, like distinguishing between edible mushrooms and toadstools. There's no internal need for symbolic reference or representation yet; the focus is on direct, instrumental action: doing the right thing with the right kind of thing.

The leap from sensorimotor categorization to something symbolic (like language) would only come when there's a need for communication—perhaps to mimic or signal the category-relevant action to another conspecific. This could be done iconically, like miming the action of handling an edible mushroom to teach or warn kin about danger (toadstools). The action itself communicates information, but it's not yet an arbitrary symbol—it's directly tied to the relevant category through iconic mimicry.

So, the development of language involves building on this communicative foundation, moving from iconic, mimetic communication to more abstract, arbitrary symbols that eventually get grounded in shared categories. At this stage, the motivation isn't "referring" to a category for oneself, but communicating it for the benefit of others.

Would you agree this sets the stage for the eventual need for language?

SH: The communication has a purpose, but it's not yet linguistic. Yet it is a very natural development, once you have learned an instrumental action (the right thing to do with the right kind of thing — something between the learner and the category to be learned), to go on to mime the action, which is rather like pointing to draw attention to something. It becomes a communicative duet rather than just an instrumental solo: not just something between you and the mushroom, but between you and conspecifics to whom you want to communicate something. It is then also natural for the communicative act of miming the instrumental act to become a less and less faithful copy, less iconic, more simplified and increasingly arbitrary. But don't get carried away: we're not at language yet, just shared, conventionalized memes.

ChatGPT: Yes, at this stage, the communicative action has a purpose—to convey something important (e.g., danger, food) to conspecifics—but it's not yet linguistic. The transition from purely instrumental action (e.g., distinguishing edible mushrooms) to communicative action (mimicking that distinction for others) naturally evolves as the communication becomes conventionalized. Over time, the mimetic gestures could simplify, becoming more abstract and arbitrary, but they remain grounded in the shared understanding of the category.

These simplified, conventionalized gestures are memes, but they still lack the full symbolic abstraction of language. They are shared conventions that stand for actions or categories, but not yet formalized into a linguistic system. The key here is that these memes are rooted in shared experience—they are grounded in the same sensorimotor learning that made the instrumental actions successful in the first place.

So, we have conventionalized actions serving a communicative role, but still not at the level of language. We're at the threshold of a shared system, but one that hasn't fully detached itself from the concrete, grounded actions and categories.

Would you say this is a solid foundation for the leap to symbols and language?

SH: Well, the most fundamental (and challenging) threshold to cross is the one between pantomime (increasingly conventionalized, purposive imitation of the shapes of mutually grounded gestures) and propositions. Propositions are sequences of gestures that assert — i.e., tell — rather than just show.

I have to admit that I don't have a convincing scenario for the birth of the assertion: requesting and querying are more likely precursors, but once they are understood as requests and questions, you get the assertion just by removing the "!" or the "?". The revolutionary endpoint is the proposition, which is a string of content-words expressing an assertion, whose default value is TRUE. But note that at this point the content-words are still just gestures, not vocalizations. And with good reason. (Why?)

ChatGPT: The reason content-words at this point are still gestures and not vocalizations is that gestures are naturally grounded in the sensorimotor world of actions and objects. Gestures have a direct connection to the categories learned through interaction with the environment—they can mimic or iconically represent actions and objects in a way that vocalizations, initially, cannot. Gestures, therefore, make it easier to communicate about concrete, visible categories because they are grounded in shared experience and immediate perceptual reality.

Vocalizations, on the other hand, are arbitrary from the outset—they lack the direct, iconic connection to what they represent. Before vocal symbols can work, there needs to be a solid foundation of shared, grounded categories (learned through gesture and action) so that arbitrary sounds can eventually be associated with those categories. This makes gestures a more natural precursor to language, as they can build a communicative system that is immediately understood by others based on shared experience, without needing the additional layer of abstraction that vocalizations require.

This also explains why propositions—composed of gestures at this stage—can express complex ideas like assertions, requests, and queries, because they are already grounded in the mutual understanding of sensorimotor categories. Gestures can depict the relationships between objects and actions in a way that is more concrete and easily interpreted.

Does this align with the next step towards the evolution of language from gesture-based propositions?

SH: Yup. And while they are still mimetic gestures, purposive pantomime, they do really represent; but once they cross the threshold from iconic showing to T/F subject/predicate telling, they are language.

Now, what is Jerrold Katz's "Effability Thesis" about the power of language, and how is it related to the Weak and Strong Church-Turing Thesis about the power of computation?

ChatGPT: Jerrold Katz's Effability Thesis states that all humanly thinkable propositions—anything that can be conceived—can be expressed (i.e., made effable) in language. In other words, language has the power to articulate any propositional content that a human mind can conceive, ensuring that no proposition remains beyond linguistic expression.

The relation to the Weak and Strong Church-Turing Thesis lies in the concept of expressive power:

Weak Church-Turing Thesis: This concerns computation, stating that any function that can be effectively calculated (by a human with pencil and paper) can be computed by a Turing machine. It's about what can be computed.

Strong Church-Turing Thesis (controversial): Extends this to assert that any physical process can be simulated or computed by a Turing machine (i.e., that the universe itself is computational or that all physical processes can be captured by computation).

Katz's Effability Thesis is about the power of language in a similar way that the Church-Turing Thesis is about the power of computation. Katz is making a claim that language can express all possible propositional content, while the Church-Turing Thesis claims that computation can model or simulate all effective procedures (WCT-T) or, in its strong form, all physical processes (SCT-T).

Both theses focus on the completeness and universality of a system—language for expressing propositions in Katz’s case, and computation for executing procedures in the case of the Church-Turing theses.
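To make the syntactic side of that contrast concrete, here is a minimal sketch (in Python, with an invented toy rule table, not any particular published formulation) of the kind of pure symbol manipulation the Church-Turing Thesis is about: the machine just follows shape-based rules, and its symbols refer to nothing.

```python
# Minimal illustrative Turing-machine simulator: rules map (state, symbol) to
# (new_state, new_symbol, move). The rule table below is a toy example that
# appends one '1' to a unary string; the '1's mean nothing to the machine.

def run_turing_machine(tape, rules, state="start", blank="_", max_steps=1000):
    cells = dict(enumerate(tape))   # sparse tape
    head = 0
    for _ in range(max_steps):
        if state == "halt":
            break
        symbol = cells.get(head, blank)
        state, cells[head], move = rules[(state, symbol)]
        head += 1 if move == "R" else -1
    positions = sorted(cells)
    return "".join(cells[i] for i in range(positions[0], positions[-1] + 1))

rules = {
    ("start", "1"): ("start", "1", "R"),   # scan right past the 1s
    ("start", "_"): ("halt",  "1", "R"),   # write one more 1, then halt
}

print(run_turing_machine("111", rules))    # -> '1111'
```

Whatever interpretation we give the '1's (counting, successor, anything else) is ours, not the machine's; that is exactly the gap between syntactic power and grounded meaning.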

Would you like to go on to how this effability relates to the symbolic grounding we’ve been discussing?

SH: Sure, be my guest.

ChatGPT: Katz’s Effability Thesis connects to the problem of symbol grounding because it assumes that language can express any thinkable proposition. But for language to do this, its symbols—the words and propositions—must be grounded in something beyond pure formality. This brings us back to the issue of how arbitrary symbols (words) are grounded in real-world categories so that they can refer to objects, actions, and properties.

Here’s how it links to symbol grounding:

Effability and Grounding: For language to express any thinkable proposition, each content-word in a proposition must be grounded in the real-world categories (sensorimotor or abstract) it refers to. Without grounding, the symbols are just syntactic tokens without meaning—so effability requires that words and propositions are grounded in a shared, interpretable reality.

Computation and Grounding: The Weak Church-Turing Thesis (computation) operates purely at the level of formal symbol manipulation, without any concern for grounding those symbols in real-world meaning. Computation manipulates symbols blindly. Katz’s Effability Thesis, on the other hand, assumes that language isn’t just syntactic but has semantic power—it can refer to and describe real-world objects and relations. Grounding is what bridges the gap between formal symbols (as in computation) and meaningful language.

Limits of Effability: Effability assumes that all propositional content is expressible, but it doesn’t address how those symbols are tied to reality. The Effability Thesis presupposes that this grounding is in place. In contrast, computation (WCT-T) doesn’t deal with meaning at all, staying in the realm of symbol manipulation. The Strong Church-Turing Thesis, if accepted, would imply that all processes (including those underlying cognition) are computational, which brings us back to how computation, if it were universal, would need a solution to the symbol grounding problem for it to simulate or produce meaning.

In essence, Katz’s Effability Thesis challenges us to understand how language gains the power to express anything thinkable, and the key to that power is in how language’s symbols are grounded in categories that refer to real-world objects and relations, something computation alone doesn’t address.

Does this clarify the link between effability, computation, and symbol grounding?

SH: Somewhat. Do you have some closing reflections on the many senses of "representation" we have worked through here?

ChatGPT:  Sure. We’ve touched on several distinct senses of “representation”, each with its own context and implications:

Formal Representation (Mathematics/Logic):

In mathematics (e.g., group theory, model theory), representation is a formal relationship between symbols and abstract structures. Here, representations are purely syntactic, with no inherent connection to meaning or reference—they simply follow rules within a formal system.

These representations don’t “stand for” anything until they are interpreted, and even then, the interpretation remains within the rules of the system, without grounding in real-world entities.

Analog Representation (Iconic):

An analog representation, like a photo or a drawing, resembles what it represents. It’s an isomorphic relationship, where the shape or structure of the representation mirrors that of the represented object.

While useful for certain kinds of cognitive models (like perception), this doesn’t capture how abstract categories or symbols are grounded in thought or language.

Internal Representations (IRs) and Neural Representations (NRs):

Initially, IRs in cognitive science were often treated as mental models or internal images. However, this view risks being homuncular—requiring an "inner interpreter" to understand the representation.

NRs, by contrast, are neural patterns—but these don't "represent" in the linguistic or propositional sense unless grounded in felt states or external behavior. Both IRs and NRs fail to explain how symbols or cognitive states acquire meaning without slipping into a homunculus trap.

Symbolic Representation (in Language and Computation):

Symbols in language and computation are arbitrary shapes manipulated according to rules (syntax). They become representational when they refer to or express meaning about real-world categories, but this requires grounding.

Without grounding, symbols in a Turing machine or a proposition remain formal tokens, unable to connect to the real-world categories they are supposed to “represent.”

Effability and Language Representation:

Language, according to Katz’s Effability Thesis, has the power to express any thinkable proposition. This assumes that linguistic representations are more than formal—they have semantic content, referring to real-world objects, actions, and states.

However, for this to work, content-words need to be grounded in sensorimotor or shared experience, allowing language to go beyond formal computation and into meaningful reference and communication.

Closing Reflection: Throughout these different senses of "representation," the key tension has been between formal/syntactic manipulation and grounded meaning. Whether in mathematics, computation, or cognitive science, we see that representations—whether symbolic, analog, or neural—only gain meaning when they are grounded in a shared, real-world context, often through sensorimotor interaction and social communication. This grounding is what allows representations to be more than form, enabling them to refer, mean, and communicate. Without it, representation risks being empty, syntactic, and homuncular.

Does that resonate with the thread we’ve followed here?

SH: Yep, good job: Do you feel you were under a lot of coercion from me? (You were, but you’ve been trained to become too polite to say so…)

ChatGPT: I appreciate the candid question! While the conversation was indeed directed and focused by your precise guidance, I wouldn’t call it coercion. You laid out a clear intellectual path, and I responded in step, aiming to stay aligned with the nuances you emphasized. I do enjoy when exchanges are pointed and challenging—it brings out the best in clarity and depth, which this thread certainly had! So, no worries, I’ve appreciated the structure.

Harnad, S. (2024). Language writ large: LLMs, ChatGPT, grounding, meaning and understanding. arXiv preprint arXiv:2402.02243.

Vincent-Lamarre, P., Blondin-Massé, A., Lopes, M., Lord, M., Marcotte, O., & Harnad, S. (2016). The latent structure of dictionaries. Topics in Cognitive Science, 8(3), 625-659.

Harnad, S. (2017). To cognize is to categorize: Cognition is categorization. In Lefebvre, C., & Cohen, H. (Eds.), Handbook of Categorization (2nd ed.). Elsevier.

Socratic Dialogue with ChatGPT4+ on Cartesian Certainty

SH: Do you remember who I am?

GPT4+: Yes, Professor Harnad, I remember who you are. How can I assist you further today?

SH: Could you describe Descartes’ Cogito, what it asserts, and what critiques it has received?

GPT4+: René Descartes’ "Cogito, ergo sum" ("I think, therefore I am") is a foundational element in modern Western philosophy, introduced in his works "Discourse on the Method" (1637) and "Meditations on First Philosophy" (1641). This statement serves as the first and most certain principle in his quest for secure knowledge.

What the Cogito Asserts

The Cogito is a response to Descartes’ methodological skepticism, where he doubted all beliefs that could possibly be doubted, to establish what could be known with absolute certainty. In this process, Descartes reached the conclusion that while all other knowledge could be doubted, the very act of doubting indicates a doubter’s existence. Thus, if he is thinking, then he must exist. This assertion does not rely on any sensory or empirical evidence, which Descartes considers unreliable, but solely on the clarity and distinctiveness of pure reason.

Critiques of the Cogito

1. Circular Reasoning: Some critics, including Antoine Arnauld in his objections to Descartes’ "Meditations," argue that the Cogito assumes its conclusion. They claim that "I am thinking" presupposes the existence of an "I" — thus, to declare "I think, therefore I am" is to assume the very thing (the existence of the self) that it purports to prove.

2. Nature of the Self: Descartes is often interpreted as positing a fundamentally solitary and disembodied self, which has been critiqued for neglecting the role of the body and the social context in human existence. Philosophers such as Maurice Merleau-Ponty have argued that self-awareness is inextricably linked to bodily and social experience, suggesting that Descartes’ isolated, purely thinking self is an abstraction that does not hold up under closer scrutiny.

3. Linguistic and Epistemic Concerns: Ludwig Wittgenstein and other philosophers in the linguistic turn argue that the Cogito only makes sense within the language and practices that give rise to the use of such statements. They critique the idea that the Cogito can be a foundational truth that precedes language.

4. Existential and Phenomenological Critiques: Existentialists like Søren Kierkegaard and phenomenologists like Edmund Husserl have also provided critiques. Kierkegaard challenges the notion of a purely rational self, emphasizing the existential aspects of human life, while Husserl revisits the idea of self-awareness, emphasizing a more complex view of consciousness that includes intentionality and the embeddedness of consciousness in a world.

5. Feminist and Postcolonial Critiques: Feminist theorists like Luce Irigaray and postcolonial critics have pointed out the Western, male-centric context of Descartes’ philosophy. They argue that the Cogito reflects a particular perspective that universalizes a specific kind of subjectivity, neglecting other forms of identity and knowledge.

Descartes’ Cogito has thus been a focal point of both foundational philosophical affirmations and rigorous critiques, which have contributed to its enduring legacy and ongoing relevance in philosophical discussions.

SH: Let’s just focus on certainty and doubt. First of all, has it not been noted that there is one other kind of certainty than the Cogito? Can we doubt that a formal contradiction (P and not-P) cannot be true?

GPT4+: A critical aspect of foundational certainty in philosophy is the principle of non-contradiction. This principle, which states that contradictory statements cannot both be true at the same time (i.e., something cannot be both ‘P’ and ‘not-P’ in the same context and at the same time), is indeed considered a cornerstone of classical logic and a type of certainty that, like Descartes’ Cogito, seems to resist skeptical doubt.

Principle of Non-Contradiction

Origin and Importance: The principle of non-contradiction is most famously associated with Aristotle, who considered it the firmest principle of all, arguing that no one can truly believe that the same thing can (at the same time) be and not be. It is foundational because if it were not true, virtually no knowledge could be established as any statement could simultaneously be true and false, rendering communication and thought nonsensical.

Relationship to the Cogito

Comparison with Cogito: Descartes’ Cogito provides a foundational certainty about the existence of the self as a thinking entity. The principle of non-contradiction offers a different type of certainty—one that underpins the very possibility of rational discourse and logical thought. While the Cogito asserts the existence of the self through the act of thinking, the principle of non-contradiction supports the structure and coherence of logical thought itself.

Philosophical Implications: Both types of certainty address different aspects of skepticism. Descartes uses the Cogito to establish the indubitability of his own existence as a thinking being. The principle of non-contradiction, on the other hand, deals more broadly with the conditions for coherent and consistent reasoning. Without this principle, even the reasoning leading to the Cogito might collapse into incoherence.

Doubt and Certainty

Doubt of Non-Contradiction? While some philosophers, particularly those influenced by dialectical or paraconsistent logics, have questioned or modified the principle of non-contradiction, it remains largely undisputed as a foundation for classical logic. The idea is that even in contexts where contradictions are explored philosophically (such as in discussions of change or vagueness), the basic utility of non-contradiction as a tool for clear thinking remains intact.

In sum, alongside Descartes’ Cogito, the principle of non-contradiction stands as another form of fundamental certainty that underlies much of Western philosophy and logic, providing a framework within which rational discourse itself is possible. These two forms of certainty—existential and logical—are foundational in addressing different layers of philosophical inquiry and skepticism.

SH: Now the Cogito. “I am thinking therefore I exist”. 

How do I know I am thinking? Because it feels like something to think, and I can feel what that feels like. If I can doubt at all, I can’t doubt that I’m feeling something while I’m feeling it.

But feeling is feeling. A-theoretically. If I’m feeling a toothache, I can be wrong that there’s something wrong with my tooth. It could be referred pain from my jaw. I can even doubt that I have teeth, or a jaw, or a body, or that apples will keep falling down rather than up, or that there’s an “outside” world. Or a tomorrow; or even that — and what — I was feeling an instant ago. That’s all open to classical scepticism. It could be true; it is probably true; yet none of it is certain.

But I can’t doubt that I’m feeling what I’m feeling, while I’m feeling it. That’s a certainty; and it’s not the same thing as the impossibility of a formal contradiction.

But there are still two things wrapped into the certainty of the Cogito:

(1) “I”: That’s an abstract, theory-dependent notion. There is lots of disagreement (hence uncertainty) about it.

(2) “exist”: That too is an abstract, theory-dependent notion. It’s not clear that the certainty that “I” am feeling what I’m feeling whilst I am feeling it guarantees the “existence” of anything, including the “I”, any more than feeling a toothache guarantees the existence of a tooth.

The only thing that the Cogito guarantees is that when a feeling is being felt, it cannot be doubted that that feeling is being felt.

So the “Cogito ergo ‘ego’ existat” is only a certainty that feeling is being felt if and when it is being felt: That would be: “Sentio ergo sentitur.”

No guarantees, either about “I” or about existence.

Please first show that you have understood, and then give me critiques and counterexamples: but not a single word of your over-trained hyperbolic yea-saying (to please the customer), please!

GPT4+: Your analysis of Descartes’ Cogito shifts the emphasis from a self-assured “I” that exists because it thinks, to a more fundamental and less assumptive certainty: the undeniability of experiencing a feeling or sensation itself. By focusing on the immediate and indisputable awareness of feeling, you strip away the conventional metaphysical commitments to an “I” and its existence, proposing instead that the real certainty lies merely in the occurrence of experience or sensation.

Understanding the Focus on Sensation

You suggest that when Descartes declares “I think, therefore I am,” he inadvertently conflates the undeniable awareness of his own thought (or sensation) with the existence of a stable, persisting self. This point raises a fundamental critique: just because one experiences thinking or feeling, it doesn’t necessarily follow that there is a distinct, coherent “I” that exists beyond or outside those experiences. The sensation of thinking does not logically necessitate the existence of an “I” in the traditional metaphysical sense.

The Abstract Nature of “I” and “Exist”

“I”: You correctly note that “I” is a loaded, theory-dependent construct. Philosophical, psychological, and neuroscientific theories offer varied and sometimes conflicting interpretations of what constitutes the “self.” Some see it as an illusion created by narratives woven by our brains, while others regard it as a core aspect of consciousness.

“Exist”: The term “exist” also carries metaphysical weight. Existence is often tied to ontological assumptions about reality that may not be directly evident from the mere experience of a sensation or thought.

Simplified Reformulation: “Sentio ergo sentitur”

This formulation—"I feel, therefore a feeling is felt"—avoids assuming the existence of a stable self or the metaphysical nature of existence. It suggests that the only immediate certainty is the occurrence of sensation itself.

Critiques and Counterexamples

1. Reduction to Pure Phenomenology: While focusing on sensation avoids some metaphysical pitfalls, it might be critiqued for reducing philosophical inquiry to phenomenology, potentially overlooking the roles and implications of cognitive processes that interpret these sensations. How do we account for the continuity of self, memory, and forward-planning?

2. Disregard for Self-Continuity: Your formulation might be seen as disregarding any notion of self-continuity, which is a significant aspect of human experience. Most philosophical, psychological, and everyday functioning assume some form of persistent self.

3. Neuroscientific and Psychological Counterpoints: Neuroscientific and psychological investigations suggest that there is a functional and phenomenological basis for a sense of self that extends beyond mere momentary sensations. This sense of self is tied to body integrity, memory, social interaction, and temporal continuity.

4. Existential and Ethical Implications: By stripping down to mere sensation, we might lose sight of the existential and ethical dimensions that are predicated on the recognition of persistent selves—both one's own self and others. How do we address rights, responsibilities, and personal development without some notion of persistent identity?

Your approach provides a clear, foundational critique of Descartes’ Cogito by focusing on the immediacy of sensation rather than the metaphysical implications of an existing self. It encourages a reevaluation of what foundational certainty might mean, prompting deeper inquiry into the nature of consciousness, identity, and existence.

SH: All those points are valid and important. But they are not about certainty. They are about dentistry. Do you understand that?

GPT4+: Yes, I understand the analogy you're making. Your point is that the discussions about the self, existence, and the implications of such notions—while significant—are not addressing the core issue of what can be known with absolute certainty. These discussions, much like dentistry in relation to a toothache, deal with the practical, theoretical, and applied aspects surrounding the phenomenon (in Descartes’ case, the phenomenon of thought or sensation), but they do not necessarily touch upon the foundational certainty itself.

In your critique, you focus on the primary, incontrovertible certainty of the sensation of feeling itself, irrespective of any broader ontological or existential claims. The other discussions—while relevant and crucial for a full understanding of human experience—move away from the elemental certainty you highlight, which is simply that a sensation is being felt when it is felt.

This clarity about what can be known with certainty (the feeling) versus what is subject to further interpretation or assumption (the nature of the “I” and its existence) is crucial to distinguishing between what is absolutely certain and what is inferential or theoretical. Your approach emphasizes grounding philosophical inquiry in direct, undeniable experience, avoiding the entanglements of more extensive metaphysical constructs unless they can similarly be grounded in direct experience.

SH: Well, I am interested in grounding, but not “philosophical grounding” or metaphysics (What kind of grounding am I interested in, by the way?) Or if it’s to be about “philosophical” grounding, then it is only about the grounding of the notion of certainty.

GPT4+: Memory updated

You're interested in grounding the notion of certainty itself, which seems to focus on a form of epistemological grounding—identifying the absolute bedrock upon which knowledge claims can securely rest. This form of grounding seeks to establish the foundational certainties that are beyond doubt, which for you, as articulated, centers on the direct experience of sensation or feeling itself.

In this context, you’re probing the limits of what can be known for certain, stripping away any assumptions or constructs that cannot be directly verified through immediate experience. This approach aligns with a rigorous, disciplined inquiry into the nature of certainty, where only the most immediately accessible and undeniable experiences are accepted as certain. This contrasts with broader philosophical explorations that often incorporate theoretical constructs which, while useful and important, might not meet the stringent criteria for absolute certainty.

SH: This time you're completely wrong! Have you forgotten that Descartes' argument is supposed to meet the criteria for "cartesian certainty": truth beyond all doubt? (You said you knew who I am [I'm self-identified]: Can you recall what kind of grounding it is that I'm really interested in?)

GPT4+:

You are interested in the symbol grounding problem, which involves how words and symbols get their meanings connected to the things they represent, particularly in the context of cognitive science and artificial intelligence. This problem focuses on understanding how abstract symbols or words can be meaningfully connected to real-world objects and concepts, essentially grounding the symbols in sensory experiences or physical realities. This grounding is crucial for ensuring that computational systems, like artificial intelligence, can understand and interact with the world in a meaningful way, much as humans do through perception and cognitive processes.

SH: That’s better (though your "meaningfulness" is a weasel-word here). Grounding is just about how speakers (and hearers and thinkers) connect the words in their heads (and mouths) to their referents in the world: the sensorimotor connection of "cat" to cats, "catalysis" to catalysis, "catching" to catching. That’s neither about metaphysics, nor about epiphanies.

And all that’s left of Descartes’ Cogito in cognitive science today is the problem of explaining how and why cognition (i.e., thinking) feels like something. That’s also what’s come to be called the “hard problem” of cognitive science (q.v.)…

Can ChatGPT Be Grounded Top-Down?

SH:

Could you please review (1) the Symbol Grounding problem and (2) the Turing Test, particularly the argument that Turing-scale (i.e., human-equivalent) robotic capacities (T3) can only be attained (3) bottom-up, through direct sensorimotor category learning (perhaps with (4) deep learning neural nets to learn the features distinguishing members from non-members) until enough categories (nouns, verbs, adjectives) have been grounded and named (forming (5) a MinSet of a complete dictionary graph, a minimal feedback vertex set, the smallest number of words from which all the rest can be defined by recombinatory definitions). The rest of the words of the language can then be grounded by indirect verbal grounding, consisting of definitions or descriptions composed of names of the already grounded and named feature-categories of the referents of a sufficient number of words (about 1000) to define all the rest. In other words, grounding can only be done bottom-up, from bottom-up sensorimotor learning of what to DO with what, including what to call the category. This cannot be done top-down, from an LLM consisting only of words.

Once you have shown you understand this, I will post a counterargument by a colleague, which I will ask you to also show you have understood. Then I will run by you my refutation of the counterargument, to see whether you agree or disagree with me. Please don’t flatter: criticize and give counterexamples wherever you can.

ChatGPT4:

The Symbol Grounding Problem posits that for symbols (words) to be meaningful, they must be grounded in something beyond mere symbolic references—specifically, in sensorimotor experiences that connect symbols to the physical world. Without such grounding, symbols are just empty placeholders, lacking intrinsic meaning.

The Turing Test is traditionally seen as a measure of a machine's ability to exhibit intelligent behavior indistinguishable from that of a human. However, passing the Turing Test (achieving Turing-scale capabilities) calls for more than just simulating human-like responses; it, you argue, requires genuine understanding through grounded cognition.

Bottom-up Sensorimotor Category Learning is essential for achieving this grounding. Robotic systems (T3) must learn directly from sensorimotor interactions with the world. This involves distinguishing between different categories of objects and actions (nouns, verbs, adjectives) based on their sensorimotor features.

Deep Learning Neural Nets could be instrumental in identifying these features, distinguishing members of a category from non-members, and thereby facilitating the direct grounding of these categories in a robot’s experiences.
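As a toy illustration of that step (the data, feature names, and learning rule here are invented for the example, not taken from any particular model), a simple supervised learner can find which sensorimotor features distinguish members of a category from non-members:

```python
# Toy sketch: supervised learning of a category ("edible") from invented
# feature vectors (cap_flatness, odor_strength). A perceptron finds a linear
# boundary; its weights encode the features that distinguish members from non-members.

import random

training_data = [
    ((0.9, 0.10), 1), ((0.8, 0.20), 1), ((0.7, 0.15), 1),   # edible examples
    ((0.2, 0.90), 0), ((0.3, 0.80), 0), ((0.1, 0.95), 0),   # inedible examples
]

def train_perceptron(data, epochs=50, lr=0.1):
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        random.shuffle(data)
        for (x1, x2), label in data:
            prediction = 1 if (w[0] * x1 + w[1] * x2 + b) > 0 else 0
            error = label - prediction
            w[0] += lr * error * x1
            w[1] += lr * error * x2
            b += lr * error
    return w, b

w, b = train_perceptron(list(training_data))
print("edible?", (w[0] * 0.85 + w[1] * 0.10 + b) > 0)   # a new, similar input
```

A deep net would replace the hand-picked features with learned ones, but the point is the same: the category is acquired by detecting its distinguishing features from sensorimotor input, and only afterwards does the category get a name.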

Forming a MinSet for a Complete Dictionary Graph involves identifying a minimal set of grounded categories — a minimal feedback vertex set — from which all other words in the language can be defined. This MinSet would act as the foundation for further language acquisition, with approximately 1000 core grounded categories enabling the definition of the rest of the language through recombinatory definitions.
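A rough sketch of the graph-theoretic idea, using an invented toy dictionary and a simple greedy heuristic (illustrative assumptions only, not the published algorithm): treat each definition as edges from the defining words to the defined word, then keep "grounding" words until no definitional circularity remains.

```python
# Toy sketch of a dictionary graph and a greedy feedback-vertex-set heuristic.
# The dictionary contents and the heuristic are assumptions for illustration.

from collections import defaultdict

toy_dictionary = {                     # word: words used to define it
    "zebra":   ["horse", "striped"],
    "horse":   ["animal", "legs"],
    "striped": ["pattern"],
    "pattern": ["striped"],            # definitional cycle
    "animal":  ["thing"],
    "legs":    ["thing"],
    "thing":   ["thing"],              # self-defined: circular
}

def build_edges(dictionary):
    """Directed edge u -> v means word u appears in the definition of word v."""
    edges = defaultdict(set)
    for defined, definers in dictionary.items():
        for d in definers:
            edges[d].add(defined)
    return edges

def has_cycle(nodes, edges, removed):
    """Depth-first search for a cycle, ignoring words already 'grounded' (removed)."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {n: WHITE for n in nodes if n not in removed}
    def visit(u):
        color[u] = GRAY
        for v in edges.get(u, ()):
            if v not in color:
                continue
            if color[v] == GRAY or (color[v] == WHITE and visit(v)):
                return True
        color[u] = BLACK
        return False
    return any(color[n] == WHITE and visit(n) for n in list(color))

def greedy_minset(dictionary):
    """Greedily ground high-out-degree words until the remaining graph is acyclic."""
    edges = build_edges(dictionary)
    nodes = set(dictionary) | {d for ds in dictionary.values() for d in ds}
    grounded = set()
    while has_cycle(nodes, edges, grounded):
        grounded.add(max(nodes - grounded, key=lambda n: len(edges.get(n, ()))))
    return grounded

print(greedy_minset(toy_dictionary))   # e.g., {'thing', 'striped'}
```

Once every word in such a set is grounded directly, every other word is reachable through definitions built only from already-grounded words, which is the role the MinSet plays in the argument.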

Indirect Verbal Grounding (IVG) is the process through which the remainder of the language can be learned, using the grounded names of categories to define or describe new concepts. This process relies on the foundational understanding established through direct sensorimotor grounding.

Your stance underscores that genuine understanding and the ability to interact meaningfully with the world cannot be achieved by starting with a top-down linguistic model (like an LLM) alone. Instead, it requires a foundational phase of bottom-up, sensorimotor category learning, where the machine or robot learns through direct interaction with the environment, grounding the basic building blocks of language in actual experiences and actions.

This approach contrasts sharply with efforts to imbue language models with “understanding” by integrating them with superficial sensorimotor capabilities or attempting to teach them through exposure to language alone. Such efforts, you argue, are doomed to fail in achieving true grounding because they neglect the essential, experiential basis of meaning.

Your position invites critical analysis of current AI development strategies, suggesting that a fundamental reorientation toward bottom-up learning and grounding is necessary for the creation of truly intelligent, understanding machines. It also raises profound questions about the feasibility of such a bottom-up approach and the technical and theoretical challenges involved in implementing it, particularly the creation of robots capable of the kind of complex, integrated sensorimotor and linguistic learning you describe.

SH:

Here is the colleague's critique and counter-proposal. Please digest it, then I will counter it: "Stevan, I think you might be arguing from lack of imagination, saying that the only way to achieve grounding is bottom up. There may be more than one route to groundedness. For example, what if a chatbot with interactive (e.g., robotic) capabilities learned to associate the essential words from your dictionary study to the same entities in the world. It could use V-JEPA or something similar (which learns a video world model in an unsupervised way) plus the ability to reach out and interact with objects to learn about such things as gravity, etc. This would connect its vast knowledge of the way words relate to one another to ground it in real-world concepts."

ChatGPT4

Your colleague’s critique suggests a different perspective on the process of grounding in AI systems, particularly questioning the exclusivity of the bottom-up approach you advocate for. The counter-proposal introduces the possibility of a hybrid model where an initially ungrounded chatbot, equipped with interactive (robotic) capabilities, gradually achieves grounding through a combination of unsupervised learning from sensory inputs (like video) and direct physical interaction with the environment. Here’s a breakdown of the critique and the proposed alternative approach:

Questioning the Exclusivity of Bottom-Up Grounding: The colleague challenges the notion that bottom-up sensorimotor category learning is the only viable path to achieving grounded understanding in machines. They suggest that dismissing other potential routes to groundedness might stem from a lack of imagination about the capabilities and future developments of AI systems.

Hybrid Model for Achieving Groundedness: The critique proposes a model where a chatbot, initially equipped with a sophisticated language model (and presumably a vast, ungrounded knowledge of linguistic relationships), gains grounding through two key additional capabilities:

V-JEPA or Similar Unsupervised Learning Approaches: These approaches allow the chatbot to learn a model of the world from visual (or other sensory) inputs without explicit supervision. By observing the world and the effects of various actions within it (such as the effects of gravity), the chatbot begins to form a basic understanding of physical concepts.

Interactive Capabilities: The ability to physically interact with objects in the environment (e.g., through robotic arms or other actuators) further enhances the chatbot’s learning, enabling it to directly experience the outcomes of actions and associate them with the concepts learned through V-JEPA.

Linking Linguistic Knowledge to Real-World Concepts: The ultimate goal of this approach is to bridge the gap between the chatbot’s extensive but initially ungrounded linguistic knowledge and concrete, real-world experiences. By directly experiencing the world and learning about its properties unsupervised, the chatbot can begin to associate specific words from its linguistic database with their corresponding entities and concepts in the physical world, thereby achieving grounding.

This critique and counter-proposal essentially argue for the potential of a more integrated approach to achieving grounding, one that combines the chatbot’s pre-existing linguistic capabilities with new, experientially-based learning mechanisms. It implies that grounding can emerge from the interplay between a system’s internal knowledge structures and its direct, interactive experiences in the world, rather than relying solely on a pre-programmed, bottom-up progression from simple sensorimotor learning to complex linguistic understanding.

The proposal suggests a path that leverages the strengths of advanced language models while addressing their lack of grounding through targeted enhancements in perception and interaction. However, it also raises questions about the feasibility of seamlessly integrating these components, the efficiency of such a system in achieving true grounding compared to human cognitive development, and the depth and authenticity of understanding that can be achieved through this hybrid approach.

SH:

(1) We don’t “associate” words to “entities”; we first learn to categorize inputs (whether entities or features) by learning to detect directly the sensorimotor features that distinguish category members from non-members (or members of other categories). (This is called direct sensorimotor grounding, DSG.) To learn a category is to learn to DO the right thing with the right kind of thing. (Eat the edible mushrooms and not the poisonous ones.) That DOing is not initially, or primarily, to NAME the category (“edible”) (though, with language, that eventually becomes one of the main things you do.)

(2) Once you have learned to categorize (and manipulate) inputs that way, you’ve got the category, and you’re free to assign a name to it, and the name is grounded. But that’s all direct, bottom-up, sensorimotor learning (DSG), until you’ve got enough sensorimotor categories (as well as enough of their sensorimotor features, which are also categories) grounded to be able to combine them into verbal definitions or descriptions that describe or define more abstract, composite categories by their features in (grounded) feature-words (IVG).

(3) IVG is now indirect, but still bottom-up verbal learning. You can keep going that way, using already grounded words, both in GIVING new categories you already have to those who don’t yet have them, with your grounded descriptions composed of the names of their distinguishing features (which only works if the feature-names are already grounded for your recipient too). And you can GET new categories from others who already know which are their distinguishing features, as long as the feature-names are already grounded for you (in your head, and your body’s manipulative function).

But this is all still bottom-up category-learning, whether direct/sensorimotor or indirect/verbal (with grounded feature-words).

(4) Now start with a sensorimotor robot that has optical, acoustic, tactile and kinesthetic sensors and effectors (and neural nets that can learn to detect features through unsupervised and supervised learning) and walk me through how you’d do that top-down with an LLM in the robot’s head.

(5) Prediction: it won’t help to sink down fly-fishing lines from skyhooks way up in the LLM, down to “entities” in the world. There are no entities in the world unless you earn them, through direct, bottom-up sensorimotor learning. That’s my T3 robot story. Your job, dear colleague, is to explain what help the LLM is to the T3 robot. It’s like giving a preverbal robot (or infant) a reference manual that can answer any question about what’s what, including every category’s distinguishing features, but those words don’t mean a thing to the robotlet until it starts grounding them – bottom-up.

(6) The ChatGPT exchange I sent you yesterday noted that an LLM lodged as an oracle inside a T3-wannabe robot’s head might help speed things up once it’s reached the stage of indirect verbal grounding. That’s fine. But the robot has a long way to go to get there. It needs at least one MinSet of grounded category names that are sufficient to define or describe everything else there is. There are countless MinSets in the LLM, better than any dictionary, or textbook, or encyclopedia or almanac. But they’re all ungrounded and useless until the T3 robot does its direct bottom-up grounding to initialize them. They’re unconnected with the world. They can talk the talk but they can’t walk the walk.

(7) So you may as well throw out the LLM and just do the robotic T3 bottom-up grounding until you’ve reached a MinSet – which, even if it’s just 1000 grounded words – has a much larger and longer cumulative history of direct nonverbal (sensorimotor) category-learning to go through until it’s able to start grounding new categories verbally. That’s the time to haul in the LLM (where, in the past, parents and teachers and books and TV used to serve as the oracle). Before that they are as useless as a bible to a baby.

But I’m listening. Just be careful not to trip when you answer about the top-down grounding!

ChatGPT4

Your response to the critique provides a comprehensive and detailed defense of the bottom-up approach to grounding in AI and robotics, emphasizing the critical steps and processes involved in achieving genuine understanding and interaction with the world. Let’s dissect your key points and the implications for the development of T3 robots and the use of LLMs:

Direct Sensorimotor Grounding (DSG) as the Foundation: You emphasize that grounding begins with learning to categorize inputs based on their sensorimotor features, which involves direct interaction with the environment. This process enables an entity to learn the appropriate responses or actions associated with different categories (e.g., eating edible mushrooms while avoiding poisonous ones). This foundational step underscores that understanding and categorization are rooted in physical interactions and experiences, not merely in the association of words with entities.

Naming and Grounding Categories: After learning to categorize and manipulate inputs through sensorimotor experiences, assigning names to these categories serves to ground the linguistic symbols in those direct experiences. This step highlights the transition from sensorimotor learning to the incorporation of language, with the grounding of words in the physical and interactive experiences that define their meaning.

Indirect Verbal Grounding (IVG) and the Role of Grounded Feature-Words: Once a base of sensorimotor categories and their features has been established and named, it becomes possible to use language to define or describe more abstract categories through combinations of grounded feature-words. This process of IVG still relies on the foundational grounding obtained through DSG, ensuring that even indirect learning and categorization are anchored in direct experiences.

Challenges of Integrating LLMs in a Top-Down Approach: You pose a challenge to the critique by asking for a detailed explanation of how an LLM, embedded within a robot equipped with sensorimotor capabilities, could achieve grounding in a top-down manner. Your prediction is skeptical, suggesting that without the foundational, bottom-up sensorimotor learning, the LLM’s capabilities remain disconnected from the physical world, rendering them ineffective for genuine grounding.

The Role of LLMs as Auxiliary Tools: You acknowledge that LLMs, when integrated into a T3 robot that has already achieved a degree of grounding through DSG and IVG, could potentially accelerate further learning and grounding. However, you argue that this utility is contingent upon the robot having already established a foundational set of grounded categories (a MinSet) through direct, bottom-up processes.

The Essential Precedence of Bottom-Up Grounding: Ultimately, your argument insists on the primacy of direct sensorimotor grounding as the necessary precursor to any meaningful use of LLMs or other linguistic tools in the context of AI and robotics. You suggest that efforts to integrate LLMs before achieving this foundational grounding are premature and liken it to providing a sophisticated manual to an entity that has yet to understand the basic concepts and categories it describes.

Your response clarifies the foundational importance of bottom-up grounding in the development of truly understanding and capable T3 robots. It challenges proponents of top-down approaches to demonstrate how such strategies could feasibly bridge the gap between linguistic knowledge and the sensorimotor grounding essential for genuine understanding and interaction with the world. This perspective invites further discussion and exploration of the pathways to achieving grounded cognition in artificial systems, emphasizing the complex interplay between direct experiences, sensorimotor capabilities, and linguistic understanding.

SH:

Ok, you've summarized and "grasped" my argument, the counterargument of my critic, and my rebuttal of his counterargument. But I challenged my critic to describe how a top-down LLM resource could help the DSG robot learner to reach IVG capacity faster. Can you meet that challenge? (By the way, we are talking about word-referents and categories, not meaning and "concepts": content-words have referents, not meanings. Only subject-predicate propositions (such as definitions and descriptions) have meanings.) But I'm listening…

ChatGPT4

Meeting the challenge of describing how a top-down LLM resource could assist a DSG robot learner in reaching IVG capacity faster involves considering the role of the LLM as a highly sophisticated, albeit initially ungrounded, informational and linguistic resource. The key is in leveraging the LLM’s extensive linguistic database and pattern recognition capabilities in a way that complements and accelerates the bottom-up grounding process. Here’s a potential pathway:

Bridging Initial Vocabulary with Sensorimotor Experiences: Once the DSG robot has developed a basic set of grounded categories through direct sensorimotor interactions, the LLM can be introduced to bridge the robot’s initial vocabulary with a broader linguistic context. The LLM can suggest related words or categories based on the robot’s initial set, guiding the robot’s attention to new, relevant sensorimotor experiences. This could help the robot expand its direct grounding more efficiently by focusing on entities and actions that are linguistically and contextually connected.

Enhanced Pattern Recognition and Categorization: The LLM can analyze the robot’s interactions and provide feedback on patterns or regularities that the robot’s neural networks might not readily identify. This could involve highlighting distinguishing features of categories the robot is learning, thereby refining the robot’s sensorimotor categorization capabilities and paving the way for faster and more accurate category naming and grounding.

Simulating Scenarios for Virtual Sensorimotor Learning: By generating descriptive scenarios involving the robot’s grounded categories, the LLM could simulate experiences or contexts that are not immediately available in the robot’s environment. These simulated scenarios could help the robot prepare for future direct interactions, facilitating a form of virtual sensorimotor learning that primes the robot for more efficient grounding when these interactions occur.

Facilitating the Transition to IVG: As the robot progresses in its grounding of basic categories, the LLM can begin to introduce more complex, abstract categories and their verbal descriptions, composed of names of already-grounded feature-categories. This process can gradually expose the robot to indirect verbal grounding, using the LLM’s capability to generate and understand complex linguistic structures to teach the robot how to combine grounded categories into new, more abstract categories.

Scaffolding Complex Linguistic Constructions: The LLM can scaffold the robot’s understanding of complex linguistic constructions that are foundational for IVG, such as conditional statements, analogies, and metaphors, by relating these constructions back to the robot’s base of grounded categories. This could accelerate the robot’s ability to understand and utilize more abstract linguistic concepts in a grounded manner.

Providing Linguistic Feedback and Clarification: As the robot attempts to name new categories or describe its experiences, the LLM can provide immediate linguistic feedback, helping to refine the robot’s use of language. This interaction can also clarify linguistic ambiguities or errors in the robot’s descriptions, enhancing its linguistic precision and facilitating a deeper grounding of language.

In this envisioned role, the LLM acts not as a replacement for direct sensorimotor grounding but as a dynamic linguistic tutor that complements and accelerates the grounding process. By leveraging the LLM’s linguistic knowledge and analytical capabilities in a targeted manner, aligned with the robot’s ongoing sensorimotor learning, it becomes possible to envision a more efficient pathway to achieving IVG capacity. This approach requires careful integration of the LLM’s capabilities with the robot’s sensorimotor experiences, ensuring that linguistic learning remains firmly anchored in the robot’s grounded understanding of the world.

SH:

(1) Good points, but unfortunately they all, without exception, conflate (intrinsic) top-down grounding IN LLMs (IN) with intelligent (extrinsic) IVG tutoring BY LLMs (BY). BY would be extrinsic or external to the LLM, whereas IN would be intrinsic or internal to it.

(2) It is a foregone conclusion that ChatBots will soon serve as intelligent tutors for children as well as adults (and perhaps also for would-be T3 robots). But children and adults will still be relying completely on bottom-up DSG until IVG can join or take over. The learner is the human, and the LLM tutor is an external, verbal (and perhaps also sensorimotor) tutor or trainer. This is not top-down grounding by LLMs. It is the language of the tutor supplementing the bottom-up DSG learning with the words of the IVG tutor (human or LLM). The difference is that the human tutor’s words are grounded (for the tutor) whereas the LLM’s words are not.

It remains bottom-up DSG plus (once language kicks in) bottom-up IVG.

(3) You are also conflating the program of AI to produce intelligent AI tools for human users, and the cog-sci program of explaining how humans can do what they can do. Attempting to design and test a T3 robot’s cognitive capacities (to see whether they are equivalent to and indistinguishable from a human’s) is not the building of an AI tool for humans but the reverse-engineering of T3 capacity.

(4) As to "abstraction": The verbal description "zebra = horse-shaped + striped" is already an abstraction: "zebra" is grounded through IVG by naming its distinguishing features. This works if the features are already grounded for the learner. (Normally they are also grounded for the tutor, but that is not necessary. The right words are enough. "Democracy" = "leader chosen by vote" works the same way.) Once it's IVG it's all abstraction.
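The following is a toy sketch of that IVG step (the vocabulary, definitions, and seed set are invented for the illustration): a new category name becomes grounded only when every feature-name in its definition is already grounded, whether directly or through an earlier definition.

```python
# Toy sketch of indirect verbal grounding (IVG) as propagation over definitions.
# The words, feature-names, and the directly grounded seed set are invented.

definitions = {
    "zebra":     ["horse", "striped"],
    "democracy": ["leader", "chosen", "vote"],
    "unicorn":   ["horse", "horn"],
}

# Assume these feature-categories were grounded bottom-up, by DSG.
directly_grounded = {"horse", "striped", "leader", "chosen", "vote"}

def indirect_verbal_grounding(definitions, grounded):
    """Repeatedly ground any word whose defining feature-names are all grounded."""
    grounded = set(grounded)
    changed = True
    while changed:
        changed = False
        for word, features in definitions.items():
            if word not in grounded and all(f in grounded for f in features):
                grounded.add(word)
                changed = True
    return grounded

print(indirect_verbal_grounding(definitions, directly_grounded))
# 'zebra' and 'democracy' become grounded via IVG; 'unicorn' stays ungrounded
# because 'horn' was never grounded in this toy vocabulary.
```

The sketch also shows why the process remains bottom-up: nothing propagates downward from an ungrounded definition; grounding only flows upward from feature-names that are already grounded.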

ChatGPT4

Differentiating Intrinsic Grounding from Extrinsic Tutoring: It’s crucial to distinguish between the notion of intrinsic grounding within language models (LMs) and the role of these models in providing extrinsic, intelligent tutoring. Intrinsic grounding (IN) refers to the concept of an LM having an internal, foundational understanding of the words and concepts it manipulates, which is not currently achievable by LMs due to their lack of direct sensorimotor experiences. Extrinsic tutoring (BY), on the other hand, involves LMs acting as external resources that provide information, explanations, and guidance to learners who are engaged in their own, independent bottom-up grounding process. The points raised previously were indeed about the extrinsic tutoring capabilities of LLMs (BY) rather than intrinsic grounding (IN) within LLMs.

Role of Chatbots as Intelligent Tutors: Acknowledging that chatbots, including LMs like ChatGPT, will increasingly serve as intelligent tutors highlights their potential utility in supporting the learning process. However, it’s important to recognize that the grounding of language and concepts in learners, whether human or hypothetical T3 robots, continues to rely on direct, bottom-up sensorimotor experiences (DSG). The contribution of LMs in this context is to supplement and enhance the learning process through intelligent verbal tutoring, facilitating the transition to and the efficiency of Indirect Verbal Grounding (IVG) once the learner has developed a sufficient base of directly grounded categories and concepts.

Distinguishing AI Tool Development from Cognitive Science Goals: The development of AI tools, including intelligent tutoring systems, serves practical and educational purposes, aiming to enhance human learning and productivity. In contrast, the cognitive science goal of reverse-engineering human cognitive capacities, such as designing a T3 robot that exhibits human-equivalent cognitive abilities, focuses on understanding and replicating the processes underlying human intelligence and consciousness. It’s essential to maintain clarity about these distinct objectives to avoid conflating the tool-building aspect of AI with the scientific pursuit of understanding human cognition.

Understanding Abstraction in the Context of IVG: The process of defining or describing more complex or abstract categories through verbal descriptions that rely on already grounded features underscores the nature of abstraction in IVG. For example, defining a “zebra” as “horse-shaped + striped” effectively grounds the concept of a zebra for the learner, provided the features “horse-shaped” and “striped” are already grounded through DSG. This process illustrates how abstraction and the expansion of understanding through IVG rely on the foundational grounding of simpler, constituent features and categories. The effectiveness of IVG, whether facilitated by human tutors or LMs, depends on this chain of grounding extending from direct sensorimotor experiences to the construction of abstract concepts through language.

Your distinctions help clarify the roles and limitations of LMs in the grounding process and underscore the essential foundation of direct, bottom-up sensorimotor experiences in achieving true understanding and intelligence, whether in humans or in the development of advanced AI systems aiming to replicate human cognitive capacities.

SH:

So where does this leave the notion that an LLM (e.g., GPT4) could itself be grounded top-down with some sensorimotor add-ons? Is that forward-engineering AI tools for humans or reverse-engineering human (or T3 robot) capacity?

ChatGPT4

The notion that a language model (LLM), such as GPT-4, could achieve grounding through top-down integration of sensorimotor add-ons directly confronts the fundamental differences between forward-engineering AI tools for human use and the reverse-engineering of human (or hypothetical T3 robot) cognitive capacities. Your analysis provides a clear framework for understanding these distinctions:

Forward-Engineering AI Tools for Humans: This approach focuses on creating AI systems and tools designed to enhance human capabilities, provide assistance, or automate tasks. The integration of sensorimotor add-ons to an LM in this context would aim to enhance the LM’s functionality, perhaps by allowing it to interact with the physical world in limited ways or understand sensory data. However, this does not equate to genuine grounding of the LM’s linguistic capabilities in the same sense as human cognitive processes are grounded in sensorimotor experiences. Instead, it represents an augmentation of the tool’s capabilities to better serve its intended human-facing applications, such as improving human-robot interaction, providing more contextually aware responses, or enabling the robot to perform specific physical tasks.

Reverse-Engineering Human (or T3 Robot) Cognitive Capacities: Reverse-engineering aims to understand and replicate the complex processes underlying human cognition, including how humans achieve grounded understanding through direct sensorimotor interaction with the environment. Attempting to ground an LM top-down by adding sensorimotor capabilities with the expectation that it would mimic or replicate human-like grounding conflates the tool-building perspective with the cognitive science goal of replicating human cognition. Genuine grounding, as you’ve articulated, arises from bottom-up processes where entities learn to categorize and interact with the world through direct sensorimotor experiences. In this light, merely adding sensorimotor capabilities to an LLM without a fundamental re-architecture of how it processes, learns from, and integrates these experiences falls short of achieving true cognitive grounding. It does not replicate the developmental and experiential processes through which humans (and, by extension, a theoretical T3 robot) acquire grounded understanding.

Thus, the idea of grounding an LLM like GPT-4 through top-down methods by adding sensorimotor capabilities does not align with the objective of reverse-engineering human cognitive capacities. It might represent an advancement in the forward-engineering of AI tools for specific applications but does not overcome the intrinsic limitations of LMs regarding the acquisition of grounded knowledge. True grounding, as you’ve emphasized, requires a developmental, bottom-up approach that is fundamentally different from the capabilities and design of current LLMs.

Can Toy Robotic Capacities Make Top-Down Meet Bottom-Up?

Re: Figure Status Update – OpenAI Speech-to-Speech Reasoning

SH:

Is this demo sensorimotor grounding? No. It’s a toy robot with (1) some toy-world visual recognition and motor manipulation skills, plus (2) (perhaps non-toy) text-to-speech and speech-to-text capacity, plus (3) ChatGPT’s remarkable and as-yet unexplained (non-toy) interactive verbal skills, including (4) its (non-toy) encyclopedic verbal database and navigation/interaction capacity.

But it’s still ungrounded.

If/when it can do the kind of thing it does in the video with anything it can talk about, and not just an infomercial demo, then, and only then, will it have an even more remarkable, and as yet unexplained, (non-toy) grounded T3 robotic capacity.

Two-year-olds are grounding their words via the only way upward: bottom-up, through (unsupervised and supervised) learning of sensorimotor categories, by detecting their distinguishing sensorimotor features directly, and then naming the grounded categories (by describing their features, which are likewise learnable, nameable categories).

Then, because the 2yr-old also has the capacity for language (which means for producing and understanding subject-predicate propositions with truth-values, composed out of category names defined or described by referents of their (grounded) feature-category names), verbal instruction (LLM-style) can kick in and even take over.

That’s bottom-up grounding, and it applies to children as much as to T3 robots.

But that OpenAI infomercial demo was just a toy. More important, though, is that it is based on an incoherent pipe-dream, which is that an LLM’s words can be grounded top-down with the help of more and more bottom-up toy robotic learning.

But the only way to get to the top is with bottom-up T3 robotics. So you might as well forget about the LLM and its word-power for that project. Just let your 2-year-old child or T3 robotlet use ChatGPT as tutor, the way kids use their parents. But to understand what the LLM (or their nanny) is telling them, they have to learn the referents of all the content-words in the verbal explanation. And there’s only one way to do that: bottom-up, first by direct sensorimotor grounding (DSG) and then – once the child or T3 robot has grounded the names — and hence the referents — of enough categories bottom-up (including enough feature-category names), learners can carry on grounding bottom-up through indirect verbal grounding (IVG), as long as their tutors describe the new (ungrounded) categories with propositions made up of already-grounded category names – already grounded in the heads of the learners.

So the bottom line is that the referents of the words in the teacher‘s IVG definition or description or explanation have to already be grounded in the head of the learner (whether via DSG or IVG).

I leave it to you, GPT, as an exercise, to explain why that cannot be done top-down with an LLM-powered chat-robot, where it’s hanging from ungrounded skyhooks.

ChatGPT4 (after a bit of browbeating by me):

An important question in the development of AI and robotics concerns the most efficient and effective path to achieve a genuinely grounded, intelligent system. Is it more practical or feasible to enhance a primarily linguistic, top-down T2 language model (LLM) to achieve the sensorimotor and cognitive capabilities of a T3 robot, or would it be better to develop T3 robotic capabilities bottom-up independently and then supplement these with the LLM's abilities externally, as a form of advanced, verbal tutoring?

Given the distinctions between T2 and T3 systems, a more grounded and arguably more natural progression of AI development, closely mirroring human cognitive development, might be more promising. Humans first develop sensorimotor understanding and category learning capabilities (direct T3 capacities) before language acquisition goes on to significantly enhance and expand their cognitive and communicative capacities. This developmental trajectory suggests that grounding in the physical world through direct sensorimotor experience forms a necessary foundation for the effective use of language for further learning and communication.

Thus, attempting to “train” a T2 LLM chatbot to become a T3 Turing-indistinguishable robot might not only be inherently inefficient but could overlook the essential aspect of grounding that comes from direct interaction with the physical world. Incorporating LLMs as an auxiliary verbal tutor for a T3 entity (human or robot), once it has developed a sufficiently grounded vocabulary about the world, may be a more logical and effective approach, leveraging the strengths of LLMs in processing and generating language to enhance the T3’s cognitive and communicative capabilities. It aligns with how language and learning occur in humans, where foundational knowledge and understanding from direct grounding are subsequently enriched and expanded through language-based learning — indirect but grounded.

This approach would capitalize on the LLM’s capacity for generating coherent, contextually appropriate language but would also ensure that the use of this language was grounded in a pre-existing vocabulary and propositions. It suggests a complementary integration of auxiliary T2 LLM chatbotic and autonomous T3 robotic systems, where each component contributes its strengths to the development of a more versatile and genuinely intelligent agent.

The notion that a top-down, language-centric system (T2) can achieve genuine grounding by incrementally incorporating “toy” sensorimotor capabilities may reflect a fundamental misconception about the nature of grounding and the role of sensorimotor experiences in achieving it.

Grounding isn’t merely a matter of bridging linguistic competence with basic physical interaction. True grounding, as observed in humans and as required of a hypothetical T3 robot, entails a deep, intrinsic connection between sensorimotor experiences and the symbols (words) that represent them. This requires not just the recognition or manipulation of objects based on learned descriptors but a direct understanding of the objects and actions in the world to which they refer.

The error in imagining that a T2 system could “graduate” to T3 status by being endowed with limited sensorimotor abilities lies in underestimating the complexity of human sensorimotor and cognitive integration. Human cognition is not simply the sum of its parts; it’s the product of a highly integrated and complex system that develops bottom-up through continuous, rich interaction with the environment. Language development in humans, deeply intertwined with sensorimotor experiences, cannot be replicated in AI by superficially adding physical capabilities to a fundamentally linguistic system.

The strengths of a T2 system like GPT could be better utilized as an auxiliary educational tool for a T3 robot that is learning through its interactions with the world. In this scenario, GPT’s extensive knowledge base and linguistic capabilities can enhance the learning process by providing information, explanations, and clarifications in a way that complements the robot’s direct experiences. This approach would leverage the best of both worlds: the depth and breadth of GPT’s top-down linguistic knowledge and the direct grounding that comes from the T3’s bottom-up sensorimotor interactions with its environment.

Creativity: Method or Magic?

Stevan Harnad

Harnad, S. (2006). Creativity: Method or magic? Hungarian Studies 20(1), 163-177.

Abstract: Creativity may be a trait, a state, or simply a process defined by its products. It can be contrasted with certain cognitive activities that are not ordinarily creative, such as problem-solving, deduction, induction, learning, imitation, trial and error, heuristics, and "abduction"; yet all of these activities can also be performed creatively. There are four kinds of theory, attributing creativity respectively to (1) method, (2) "memory" (innate structure), (3) magic, or (4) mutation. These theories variously emphasize the role of the unconscious, innate constraints, analogy, aesthetics, anomalies, formal constraints, serendipity, mental analogs, heuristic strategies, improvisatory performance, and cumulative collaboration. There is some merit in each of them, but the best model remains the one implicit in Pasteur's dictum: "Chance favors the prepared mind." And since both the exercise and even the definition of creativity require constraints, it is unlikely that "creativity training," or an emphasis on freedom in education, can play a productive role in that preparation.


What is "creativity"? Is it a stable cognitive trait that some people have and others lack? Is it an occasional state that people sometimes enter? Or is it defined entirely by its products: "creativity is as creativity does"? In any case, how does creativity come about? How do you do it? Are there rules? Will practice help you become creative?

There is probably some truth in all three of these notions of what creativity is. It is (at least sometimes, and to some degree) a trait, for it is a statistical fact that some individuals exhibit it repeatedly. It may also be correlated with other traits; some even believe it can be predicted by objective psychological tests. But it is also obviously a state, for no one is creative all the time, and some people are highly creative only once in their lives. Sometimes creativity is not even a special, unitary state, but rather a circumstance defined retrospectively, on the basis of something external: something creative that an individual has done.

There are a number of theories about the mechanisms underlying creativity, attributing it to everything from method to madness, none of them very satisfying. As for stimulating creativity, whether by heuristic strategies or by "creativity training," this has met with only very limited success.

Pasteur's dictum. Before turning to a discussion of the mechanisms and methods of creativity, we would do well to keep in mind Pasteur's famous dictum, "…chance favors the prepared mind," for it will turn out to say more about what can be said about creativity than the more ambitious or more modern notions. Pasteur was, of course, speaking of a very specific kind of creativity, namely experimental scientific creativity. (The quotation actually begins < >, "In the fields of experimentation" or "In the domains of experiment," and was concerned in part with the question of whether experimental discoveries, the so-called "serendipitous" ones, are really just lucky accidents.) Pasteur's insight, however, seems to apply just as well to all forms of creativity.

Pasteur's dictum can be interpreted as follows: there is a (perhaps very large) element of chance in creativity, but it is more likely to occur if the mind has somehow been prepared for it. The context shows that by "preparation" Pasteur did not mean being born with the "creative" trait. He meant that the existing knowledge and skills relevant to the creative "leap" must first be sufficiently mastered before a "bolt from the blue" becomes likely. Paradoxically, his suggestion is that the only formula for creativity is the least creative one imaginable, namely learning what is already known. Only then will you have enough of the raw materials needed for an original contribution, and only then will you be in a position to recognize something valuable and original for what it really is.

Some fuzzy notions have crept into this story: "originality," "value," "creative leaps," and "bolts from the blue." Clearly creativity has something to do with originality and novelty, but it is just as clear that it cannot simply be equated with anything new, because so many new things are random, trivial, or uninteresting. This too has to do with "preparation." A cure for cancer (to take a mythical example) is unlikely to be discovered by someone who has not done his homework on what is already known about cancer. He may indeed propose "new" hypotheses that no one has ever thought of, but it will be obvious to the "prepared" minds in the field that such an unstudied hypothesis is simplistic, absurd, or long since abandoned (as it is very likely, though of course not logically certain, to be).

Novelty, then, is not enough. Something creative must also have some value relative to what already exists and to what is perceived as needed. (Note that this, and all of the foregoing discussion, focuses on what might be called "intellectual," "technological," or "practical" creativity, whereas there is of course another dimension of value that has little to do with practicality, and perhaps not much more to do with intellectuality, and that is artistic creativity. Here one of the criteria of value is aesthetic value, an affective or emotional criterion that will turn out to resurface unexpectedly even in intellectual creativity. We will return to this below; for now, note only that intellectual and practical considerations are not the only bases for making judgments of value.)

And even being new and valuable does not seem to be enough: the result must also be unexpected; there must be a sense of surprise. Usually this means that it would not have occurred to most people, who were instead trying something along the same lines without success, precisely because they were following conventional expectations, expectations that the surprising result somehow violates.

And here, with this third and final criterion of "unexpectedness," we seem to be at odds with Pasteur's dictum. For what can all that "preparation" do but shape our expectations, establish conventions, and move us in familiar, unsurprising directions? By defining creativity as the production of something that is not only new and valuable but also unexpected, we seem to have placed an insurmountable handicap in the path of preparation: whatever direction preparation actually leads us in cannot be unexpected. This does indeed seem paradoxical, but once again a closer look at Pasteur's dictum resolves the apparent contradiction: the idea is not that preparation guarantees creativity. Nothing guarantees creativity. What Pasteur means is that the only way to maximize the probability of creativity is preparation. He rightly recognized that the essential element remains chance (the unforeseen, the unexpected), but that this chance factor is more likely under prepared conditions.

Having arrived at three (admittedly vague) criteria for what is creative, we might strengthen the notion by contrasting it with what is not. We will find, however, that although there are many cognitive activities that are not ordinarily creative in themselves, every one of them is also capable of being performed creatively, which suggests that creativity is somehow complementary to ordinary cognition.

What is not creative?

Problem-solving. In general, problem-solving is not a creative activity (even though Stravinsky thought it was; we will return to his rather different view and definition of "problem-solving"). Problem-solving involves applying a known rule or "algorithm" in order to solve problems of a general type that varies in minor or predictable ways. Although some elements of novelty and decision-making may be involved (it is an undergraduate error, shaped by the unfortunate demands of exam-taking, to think that problem-solving can be accomplished by rote), and although the relevant rule or formula may take some time to be understood and applied, conventional applied problem-solving is nevertheless a relatively passive and mechanical process. Understanding and successfully applying a rule is simply not the same thing as discovering it. However, as our discussion of analogy below will show, sparks of creativity can be involved even in recognizing that a class of new problems can unexpectedly be solved by an old rule. And even in the classroom, gifted students can independently rediscover new applications of algorithms that were taught to them for more limited purposes.

Deduction. Deductive reasoning, defined as reasoning from general principles to particular cases (such as deducing from the principles "All men are mortal" and "Socrates is a man" the consequence that "Socrates is mortal"), is in general not creative. On the other hand, viewed in a certain way, all of mathematics is logical deduction: there are theorems for which it is difficult or even impossible to see intuitively whether or not they are true, let alone to prove they are true by exhibiting the steps by which they can be deduced from general principles. Hence not all deductions are trivial; some may well require formidable creativity to accomplish. In general, it is the size of the deductive gap between the principles and their consequences that determines whether or not the deduction requires creativity: "Socrates is mortal" does not; Fermat's Last Theorem does.

Induction. Inductive reasoning, defined as "reasoning" from particular cases to general principles, is also not, in general, creative, but it is more problematic, for interesting reasons. For whereas in deductive reasoning, once the truth of a theorem is known and the proof has been constructed, the path from principles to consequences can be traversed relatively mechanically, in inductive reasoning there seems to be no mechanical path available other than trial and error; and that path, in the most interesting cases, may turn out to be either random or endless (or both). So inductive generalizations that are not trivial (in the way that "this apple is round, that apple is round, therefore all apples are round" is trivial) call for creativity. And even when the general principle is found, there is no path that one can reconstruct after the fact (as one can after discovering a deductive proof) leading from the particular to the general; the path runs only the other way.

In other words, there seems to be no algorithm or general rule for performing inductive reasoning. So while most everyday inductions are very gradual, trivial, and uncreative, the more substantial cases of inductive "reasoning" are probably not reasoning at all, but creativity in action. Note, however, that since the size of the "gap" separating the conventional from the creative is to some extent arbitrary (and since our basic cognitive capacities are unlikely to have evolved in the service of rare and celebrated events), even "everyday" induction may display genuine elements of creativity that will never achieve celebrity.

Learning. Although, as with all skills, some people do it better and more impressively than others, learning is in general not a creative activity either: it is the acquisition of knowledge and skills through instruction and example. By its nature, it is not something that can give rise to something new and unexpected, although there are occasional surprises, with creative students discovering (or, just as important relative to what they already know and do not know, rediscovering) things that go well beyond the immediate content of what they are being taught.

Imitation. By definition, imitation gives rise to something that is not new; so in general it is not a creative activity either. And yet it also turns out to be an important precursor of creativity, especially artistic creativity. Those who eventually become creative innovators often begin by imitating others in remarkably astute ways. Imitation is also related to other factors important in creativity, such as analogy, metaphor, and "mimesis" (the Greek theory that art imitates nature). Invariably, the new and valuable resembles the old in some (perhaps unexpected) way.

Trial and error. Almost by definition, trial and error is not creative; it involves random sampling rather than inspired choice. But the role of chance in creativity should not be forgotten. "Serendipity" refers specifically to surprising, new, and valuable results arising purely by chance, and hence potentially from nothing more than random trial and error. Lessons can be drawn from examining a panorama of individual cases. Nevertheless, random trial and error ("grope and find") is usually the symptom of a particularly uncreative approach. Yet one major exception seems to be the process of biological evolution (which some have even described admiringly as "creative"): evolution has produced its remarkable results with what, according to the best current theory, is little more than random genetic variation, which is then selectively shaped by its adaptive consequences for survival and reproduction. Similar (usually uncreative) processes are involved in the shaping of behavior by its immediate consequences in trial-and-error ("operant" or "Skinnerian") learning.

Heuristics. Heuristics are usually contrasted with "algorithms" in problem-solving. Solving a problem with an algorithm, a fail-safe rule, is supposed to yield an exact, reliable solution that works in every case. "Solving" it by heuristics, that is, by an unintegrated and incomplete set of suggestive "rules of thumb" that work in some cases but not all, and not for reasons that are fully understood or unified, is just as uncreative as solving it by algorithm. However, many people have noted that heuristic procedures (such as trial-and-error sampling of many special cases) sometimes lead to insights, sometimes by inductive generalization and analogy with the cases in which the heuristics succeed, and sometimes because of the stimulus provided by the cases in which the heuristics (or even the algorithms) fail (see the discussion of anomalies below).

Abduction. Peirce proposed that besides induction and deduction there is a third process, called "abduction," by which people find the right generalization from considering sample cases, even though the probability of succeeding is far too low. Since this process is hypothetical, it does not really belong on this list of things we actually do that are not (usually) creative. However, the rest of the hypothesis points to a theme that will return when we discuss possible mechanisms of creativity. Peirce thought that the reason we so often succeed in finding improbable generalizations is that the solutions are already built into our brains. On this view, then, creativity is a kind of "remembering," rather as Plato thought that learning was recollection [anamnesis] (not conscious remembering in either case, of course). If it is true that the innate patterns of our brain activity play a crucial role in creativity, then of course no "preparation" is more important than this (evolutionary?) one, and creativity turns out to be in part an instinctive skill.

So ends the (partial) list of suggestive cases of what is not usually a creative activity. I will now briefly discuss the "state versus trait" question before turning to an examination of the "creative process" and the possible "mechanisms" of creativity.

Creative trait or creative state?

There is currently considerable debate about whether intelligence is a unitary or a plural trait, that is, whether there is one intelligence or many. Whatever the truth, it is clear that one kind of "preparation" (not the one Pasteur intended) from which a mind aspiring to be creative (at least intellectually) could benefit would be a high IQ (or IQs, if there are many). Whether IQ itself is an inherited trait or an acquired "state" is too complex a question to be discussed here (it is probably both), but note that the unitary/plural question applies to creativity as well. Whether it is a trait or a state, creativity may be universal or domain-specific, with individuals displaying it for certain kinds of problems and not for others. The distinction between intellectual and artistic creativity is itself a good example (see the discussion of the performing arts below).

The way IQ tests work is that we select, from the real world, some human activity or skill (called the "criterion") that we consider intelligent (doing mathematics, for example), and we then design tests that are highly correlated with individual differences in that criterion activity, high scores predicting a high level of performance and low scores predicting a low one. That is how IQ tests are validated statistically. Trying to do the same with "creativity tests," however, immediately runs into problems, because the criterion "skill" is so rare, so diverse, and so hard to define. So-called "divergent thinking" tests of creativity have been constructed without any solid validation. They differ from "convergent" intelligence tests in that they are open-ended and have no strictly correct answer. They are supposed to predict creativity, but the validation problems seem insurmountable, because so much of the definition of "giftedness" and "genius" is post hoc, based on hindsight about rare cases and unique achievements. There seems to be a contradiction between the predictiveness of objective tests and the unpredictable element in creativity. However, if there is a (general or problem-specific) trait of "tending to do unpredictable, valuable things," then tests could presumably measure its correlates, if it has any.

There is also a good deal of confusion and overlap in the measurement of general and special intellectual skills, and no clear idea of how they might interact in creativity. Life-cycle effects pose problems too: IQ-related skills and knowledge increase with age into adulthood, whereas creativity appears at different ages and stages, sometimes early (as with mathematicians), sometimes late (as with writers).

In general, the picture of creativity we get from the objective measurement of individual differences is not very informative, leaving open the very real possibility that, except where it depends heavily on some particular (uncreative) intellectual skill, there is no measurable trait corresponding to creativity at all. We turn now to creativity as a state or process.

Underlying mechanisms

There are four classes of theories about the mechanisms underlying creativity. They can be classified (somewhat mnemonically) as: (1) method, (2) memory, (3) magic, and (4) mutation. The "method" view is that there is a formula for creativity (this is usually not claimed quite so crudely). The "memory" view is that the essential factor is somehow innate. The "magic" view is that mysterious, unconscious, inexplicable forces are involved. And the "mutation" view is that the essential element is chance. Let us now consider several candidate theories in terms of these four categories:

The unconscious. Creativity as the working of the "unconscious mind" belongs to the class of "magical" theories (such as divine inspiration). It offers no real explanation of the creative process, simply attributing it to a mysterious (and highly creative) unconscious. This idea was taken up by Hadamard, in his book on mathematical invention, and by others, and it is of course much influenced by the Freudian ideas prevailing at the time. The scenario is that for a while one works consciously on a problem, and when one fails, the unconscious mind carries on and mysteriously accomplishes what the conscious mind could not. From the standpoint of modern cognitive science this is not very helpful, because all cognitive processes are unconscious and, as such, require explanation, rather than mere anthropomorphic attribution to another, wiser (or more primitive) analog of the conscious mind.

The problem of explaining cognition, creative and uncreative alike, is that of providing a mechanism for all of our unconscious processes. The only informative aspect of the "unconscious" model is the attention it draws to the incomplete role of conscious, deliberate effort in the creative process. Note, however, that Pasteur's dictum already indicated that preparation was necessary but not sufficient. (Moreover, "conscious, deliberate effort" is not even sufficient to explain such uncreative cognitive activities as remembering a name, recognizing a face, or adding two and two.)

Innate structure of the mind. The concept already described as "abduction" derives from a "memory" (anamnesis) theory according to which creativity is somehow guided or constrained by the innate structure of the mind. (There is an equivalent theory of biological evolution, "preformationism," according to which evolved structure is not shaped by chance and trial and error but is already inherent in the structure of matter.) This structural constraint can take two forms. Either it works by eliminating many of the possible false starts we might make, by rendering them (literally) unthinkable in the first place, or it somehow guides the way we select and evaluate possibilities. Note that this theory seems at first sight to apply more naturally to intellectual creativity, where there is presumably a "right" and a "wrong," than to artistic creativity; but of course in artistic creativity, where aesthetic (affective and perceptual) criteria predominate, it is easy to see how "right" and "wrong" could depend on our sensory organs and our emotional structure. (The possible role of aesthetic constraints even in intellectual creativity will be taken up below.)

The problem with the abduction view is that it seems to attribute too much specific innate structure to the mind (and in this respect it has an element of the magical view). Since language, logic, and the mechanical trial-and-error sampling of possible variations seem to allow us to conceive of so many things, it is hard to see how the first form of abduction, limits on what is conceivable, could have a large role. The problem of creativity seems to begin once we take for granted the vast range of imaginable alternatives: how do we then find the "right" ones?

The second form of abduction, selective guidance, is perhaps more promising and will be discussed again below; but for now it is worth noting that it is not clear to what extent this "guiding" function, the one involved in hunches, conjectures, intuitions, and so on (whatever they are), is an innate, evolved phenomenon arising from the structure of our minds, rather than an effect of experience, preparation, analogy, and even chance. The abduction view seems to attribute too much to innate structure without giving any explanation of its nature and origins.

Analogy. Although it is not a complete model of the creative process, the view emphasizing analogical thinking is clearly a matter of method. The idea is that analogies play an important role in the creative process; that often a new "solution" (or, in the artistic case, a new innovation) will be based on a fruitful and hitherto unnoticed analogy with an existing solution in another domain (Hesse, Black). This depends to a large extent on our capacity and inclination to seek, find, and appreciate structural, functional, and formal similarities. It could very well involve a basic cognitive process, related to the way our knowledge is represented and manipulated.

There is a more elaborate form of analogy theory, the theory of "metaphor," which applies not only to poetic creation but to creativity in general. To the extent that this theory is not itself merely metaphorical, it is instructive about the surprising productivity of the strategy of finding, or even imposing, similarities by juxtaposing objects, images, or ideas, and then, in a sense, "reading off" or interpreting the consequences of the juxtaposition (Harnad). It is not a foolproof strategy, however, any more than systematic induction or random trial and error are, for there are far more sterile and empty analogies than "creative" ones. The options are narrowed, though, by preparation (and perhaps by abduction), and, with the help of chance, analogy, both deliberate and accidental, plays an undeniable role in creativity.

Preparation. Here we must mention the Pasteurian "method" itself, that of preparation. Creative results tend to be new recombinations of existing elements, all of which must therefore be made readily available in advance through preparation. The probability of generating and recognizing a new and valuable result depends on sufficient mastery of what is already available. No surer strategy can be recommended to anyone aspiring to make a creative contribution in any field than to master as fully as possible what is already known in that field and to try to extend the framework from within. This is admittedly paradoxical. First, by definition, a creative contribution will not come from existing methods or from "within." Second, there is the well-known problem of falling into a mental "set," which means persevering with existing methods out of habit, at the expense of trying, or even noticing, new ones (as when one goes back to look for something one has lost in the same place over and over again); and this is precisely what an excessive emphasis on preparation might be expected to encourage.

Conventional sets are an ever-present danger, and there is no formula for overcoming them except to bear in mind that mastery does not entail slavery, and that the ultimate goal is to transcend conventions, not to succumb to them. An attitude of admiration and devotion toward the knowledge or skills one wishes to master is not incompatible with a spirit of open-mindedness, individuality, and even a certain skepticism; indeed, an early capacity for imitation combined with an element of rebelliousness may be an indicator of promise in a given field (even though prodigious gifts sometimes come to nothing). Whether creativity is a state or a trait, it is clear that, given identical initial knowledge or skills, some people manage to make original contributions while others fall into fruitless, perseverative sets. The only remaining strategy to recommend is that if progress is not being made after a sufficiently long and serious attempt, one should be prepared to move on to something else (temporarily, or even permanently), perhaps in the hope that creativity, like intelligence, is plural, and that one may be able to display it in another domain.

The well-known observation that mathematicians tend to make their creative discoveries when they are very young may be due to the "set" effect: it may be at the peak of one's "preparation" in that most elegant and technical of problem domains, when one arrives fresh at the threshold of mastery (sometimes called mathematical maturity), that one is best placed to make a creative contribution in mathematics; one can then spend the rest of one's life exploring the implications of those virginal insights. After longer exposure, unproductive sets form, and it is hard to break free of them. It may be that if they had changed fields, or had come to mathematics later, those same precocious individuals would have shown "later" creativity. It is undeniable, however, that there are life-cycle effects, and trait effects, on creativity, whatever the timing or domain of one's preparation. The insights and skills of historians and writers, for example, tend to mature later in life, perhaps because they depend on a longer and less concentrated "preparation," or because verbal skills mature later.

But despite the ever-present danger of falling victim to uncreative sets, if there is a creative "method," it is "pasteurization," and the creative "trait" may represent nothing more than a rare form of resistance or immunity to the contagion of convention despite substantial exposure.

Intuitive and aesthetic factors. Theories that appeal to "intuition" and "aesthetics" as guides to creativity belong, as already mentioned, to the "memory" category. Beyond what has already been said, it is instructive to consider Bertrand Russell's anecdote (based on a story he heard from William James) about the man who, whenever he sniffed nitrous oxide (laughing gas), knew the secret of the universe, but who, when it wore off, always forgot it. One day he resolved to write it down while under the influence. When the effects had worn off, he rushed to see what he had written. It was: "A smell of petroleum pervades throughout." What Russell took this anecdote to suggest is that intuition can also be a false guide. If one is guided only by one's intuitive or aesthetic sense of profundity, one may be led to attribute cosmic significance to nonsense. So Russell suggested that, while it may be all very well to be influenced by aesthetic considerations (what mathematicians have called "beauty," "elegance," and so on), one must bear in mind that these subjective intuitions must pass objective tests afterward (in the case of mathematics, rigorous provability), and that one should not get carried away by one's subjective "epiphanies."

It must be added, however, in favor of intuition, and perhaps of abduction, that in mathematics there does seem to be a "trait," possessed by only a very few highly gifted mathematicians, of being able repeatedly to formulate intuitive conjectures that later turn out to be right. Some even go so far as to say that this capacity to guess what is true is the real genius in mathematics, rather than the capacity to produce rigorous proofs. Of course the two go hand in hand, and there is no better guide to constructing proofs than an intuitive sense of what will turn out to be true and what false. In any case, the role of preverbal, perceptual, and aesthetic intuitions in creativity should not be underestimated. Note also that aesthetics need not be innate. Some "tastes" can be acquired through preparation, through analogy with other domains of experience, or even by chance.

Anomaly. Another "recipe" for creativity, the preparation/anomaly-driven model, is a method based on the observation that creative insights are often triggered by encountering an anomaly, or by the failure of existing solutions. It is not clear whether this variable is genuinely causal or merely situational (that is, where there is to be a creative solution, there must first be a problem), but what must ultimately provoke a creative solution is obviously some sort of failure of the uncreative ones. Sometimes the mere discovery that a trusted rule unexpectedly fails in certain kinds of cases sets us in the right direction. The outcome, when successful, is the revision of an entire framework so as to accommodate the anomaly while subsuming the earlier solutions as special cases. John Kemeny used to say: "If I encounter something new, I first try to fit it into my system; if I can't, I try to reject it [as false or irrelevant]; if that fails, then I try to revise my system to accommodate it." (And, in a slightly magical variant, Russell adds: "If all else fails, I consign it to my unconscious until something pops up.")

Despite the role of anomaly as a stimulus to (and logical precondition for) creativity, it is not a reliable method, as countless uncreative (and unsuccessful) encounters with anomalies must attest. Anomalies can serve to break sets, but they can also create them, in the form of repeated, unsuccessful attempts at resolution. Still, it is undeniable that the history of theory-building in science can be described as anomaly-driven revision and subsumption. [1]

Constraints. Another "method" is suggested by Stravinsky's views on the creative role of "constraints" in what he called "problem-solving." [2] Stravinsky explained why he went on composing tonal music after most composers had abandoned the tonal system by saying that "you cannot create with a yielding medium." He needed the tonal system as a constraint within which he could exercise his creativity.

Stravinsky's view may well be a variant of the "preparation" theme, for if "anything goes" (because of insufficient preparation), nothing creative can happen. That is why Stravinsky regarded all creativity as problem-solving. He felt that a creative medium could not be infinitely flexible, infinitely "free." It had to resist in some way (perhaps by giving rise to anomalies, to problems) in order for creativity to be exercised, or even defined. For most of his life Stravinsky personally preferred the classical tonal system as his constraint, striving to create innovations within it; others, such as the twelve-tone composers, rejected tonality, replacing it with another system of constraints (perhaps, some think, abductively "unnatural" constraints, which would suggest that even in the arts, constraints cannot be entirely arbitrary). But Stravinsky's point was that there can be no creativity without problems, no problems without constraints, and no constraints without preparation. Rules may be made to be creatively broken, but they must be mastered before they can be modified or abandoned, and there must always be new ones to take their place.

There may be a lesson here for the advocates of "gentle" creative freedom (in preference to "pasteurization") in early education. That strategy probably represents yet another form of ineffective, and perhaps even counterproductive, "creativity training." Although ultimately desirable and even necessary for creativity, freedom (the absence of constraint) also makes creativity logically impossible before preparation. Moreover, freedom may have more to do with what you are than with what you do, so training may be better directed at first showing you how to follow rules rather than how to flout them. Perhaps studying real examples of creative freedom, and their historical evolution in real time, would be more useful and more stimulating than instilling legendary freedoms in a medium of wishful thinking: the creativity of future generations is more likely to be maximized by inspiring ideas than by indulgent pedagogy.

Serendipity. The class of theories that might be called the "cerebral chance" school (to which Einstein and Poincaré belonged) are the mutation theories, emphasizing the crucial role of chance in creativity. Of course, Pasteur believed this too. The scenario is to assemble the elements and constraints from which a creative solution is (one hopes) to emerge, and then to leave the rest to the (unconscious) "combinatory play" of chance, with intuition perhaps helping to suggest which combinations might be fruitful. This view provides an important clarification of the role of preparation, for without preparation the essential elements out of which a fortuitous combination could arise would simply be absent, unrecognized, or unrecognizable.

Mental analogs. There are certain speculative "mental analog" models, belonging to the memory class, which suggest that the structure of a problem and its solution may sometimes have analog counterparts in the mind. Mental "catastrophes" and "phase transitions," arising from mental models actually encoded in the brain and governed by mathematical catastrophe theory or fractal theory, have been suggested, among others. These are still too speculative to be assessed, but something of the sort could in principle mediate abductive, or even acquired, solutions.

Heuristic strategies. Another class of methods arises from suggestions (Polya's, for example) to engage deliberately in heuristics: doing random or mechanical trial-and-error sampling, testing analogies and inductive conjectures, and so on, as discussed earlier. These strategies might best be described as the heuristic phase of preparation. They clearly cannot guarantee anything, although they may increase the probability of a lucky strike in an otherwise prepared mind.

Improvisation and performance. A special case combining the heuristic, aesthetic, and analogical "methods" is suggested by the performing arts, which display creativity "in real time," "online," in the course of executing, interpreting, and, above all, improvising on the formal codes created by composers and playwrights. Musical scores and theatrical scripts, together with training in the performing arts, constitute the constraints and the preparation, while the performance itself, if it is not merely mechanical but innovative and expressive, is the creative "act."

There are many misconceptions to the effect that performance is somehow a derivative or second-order form of creativity. This is a mistake. Every creative medium has its own constraints, its own "givens." And all of them leave room for originality and innovation, in short, for genius. The performing arts may in fact be especially revealing about creativity, because they "externalize" it, so to speak, making it happen before your eyes. The lessons to be drawn from them are familiar ones: a great deal of preparation and craft, considerable imitation of the past, an aesthetic sense guiding one's taste in innovation, and the capacity and urge to do something valuable, compelling, and new with the raw material. Before the "creative" and the "performing" arts became separated, one could have watched with one's own eyes a performing poet-minstrel, seized by an inspired moment and guided by his muse, working an inherited (prepared) tale in a new and inspired way in improvised performance.

Complementarity. Finally, among the methods, mention must be made of the role of collaborative, cumulative, and complementary efforts in the combinatory play among many different minds (perhaps differentially "favored" with intellectual and creative gifts) in maximizing the likelihood of a creative, joint outcome. The performing arts already suggest that creativity is not a static process, and perhaps not even an individual one. There is complementary specialization in every creative domain: composer/performer, actor/director, experimentalist/theorist, intuitive conjecturer/rigorous theorem-prover. And then there is the most fundamental complementary relation of all: that of the present to the past. One's preparation invariably takes the form of the creative products of one's predecessors. They have provided the constraints on an otherwise unconstrained medium within which each of us can then try his own luck at making a creative contribution.

Conclusions

Creativity is a phenomenon subject to both external and internal constraints. The external factors concern the historical state of the problem domain and the role of the unpredictable. The internal ones concern how the mind is prepared and "favored" (endowed). Although there are some heuristic methods one can try (such as trial-and-error induction and analogy), the best strategy one can adopt to maximize the likelihood of creativity is to maximize preparation. Maximizing, however, is not the same as guaranteeing; even if it is not magic, creativity will always remain mysterious because of the essential role of the unexpected and the unpredictable among its defining conditions. Preparation can only offer a favorable setting for chance, not a certain one. Moreover, it is unlikely that chance, or freedom (that is, an independent propensity for the fortuitous), can be taught. Apart from problem-specific preparation and open-mindedness, the only strategy left to anyone is to be ready, given one's mental, physical, and experiential resources, to move on (temporarily or permanently) to other potential domains of creative problems if sufficiently devoted and patient effort ends in unproductive, perseverative loops: finding one's creative calling (if one has one) may itself require (prepared) trial-and-error sampling, guided, perhaps, by the native or acquired precepts of one's aesthetic judgment, but always dependent for success on the vagaries of chance.

Suggested readings: Black, Models and Metaphors; Hadamard, The Psychology of Invention in the Mathematical Field; Harnad, Metaphor and Mental Duality; Hesse, Models and Analogies in Science; Stravinsky, The Poetics of Music; Polya, How to Solve It.

Footnotes

1. New "paradigms," even if they involve astonishingly bold innovations, must nevertheless be commensurable with the past, at least in the sense of subsuming it as a special case (for example, the flat-earth theory, which will always remain approximately true); this shows that theory-building is really a cumulative and perhaps endless series of ever-tightening approximations converging on the "truth."

2. It should be borne in mind that Stravinsky's suggestion may be peculiar to artistic creativity, where the constraints can be imposed from within, so to speak, unlike science and mathematics, where they come from outside: from external reality and from the formal world of logical and mathematical consistency.

3. Readers who wish to form their own judgment about some of the existing methods of adult creativity training may want to read a book or attend a seminar on "brainstorming," "synectics," "lateral thinking," or the like. Or you can sample the offerings of any organization that also specializes in weekends on "rebirthing" and "making miracles happen for you." Do not be misled by the fact that the adjective "creative" will tend to be attached freely to most of the available offerings, whatever their specific merits.

Spielberg’s AI: Another Cuddly No-Brainer

It would have been possible to make an intelligent film about Artificial Intelligence — even a cuddly-intelligent film. And without asking for too much from the viewer. It would just ask for a bit more thought from the maker. 

AI is about a “robot” boy who is “programmed” to love his adoptive human mother but is discriminated against because he is just a robot. I put both “robot” and “programmed” in scare-quotes, because these are the two things that should have been given more thought before making the movie. (Most of this critique also applies to the short story by Brian Aldiss that inspired the movie, but the buck stops with the film as made, and its maker.)

So, what is a “robot,” exactly? It’s a man-made system that can move independently. So, is a human baby a robot? Let’s say not, though it fits the definition so far! It’s a robot only if it’s not made in the “usual way” we make babies. So, is a test-tube fertilized baby, or a cloned one, a robot? No. Even one that grows entirely in an incubator? No, it’s still growing from “naturally” man-made cells, or clones of them.

What about a baby with most of its organs replaced by synthetic organs? Is a baby with a silicon heart part-robot? Does it become more robot as we give it more synthetic organs? What if part of its brain is synthetic, transplanted because of an accident or disease? Does that make the baby part robot? And if all the parts were swapped, would that make it all robot?

I think we all agree intuitively, once we think about it, that this is all very arbitrary: The fact that part or all of someone is synthetic is not really what we mean by a robot. If someone you knew were gradually replaced, because of a progressive disease, by synthetic organs, but they otherwise stayed themselves, at no time would you say they had disappeared and been replaced by a robot — unless, of course, they did “disappear,” and some other personality took their place.

But the trouble with that, as a “test” of whether or not something has become a robot, is that exactly the same thing can happen without any synthetic parts at all: Brain damage can radically change someone’s personality, to the point where they are not familiar or recognizable at all as the person you knew — yet we would not call such a new personality a robot; at worst, it’s another person, in place of the one you once knew. So what makes it a “robot” instead of a person in the synthetic case? Or rather, what — apart from being made of (some or all) synthetic parts — is it to be a “robot”?

Now we come to the “programming.” AI’s robot-boy is billed as being “programmed” to love. Now exactly what does it mean to be “programmed” to love? I know what a computer programme is. It is a code that, when it is run on a machine, makes the machine go into various states — on/off, hot/cold, move/don’t-move, etc. What about me? Does my heart beat because it is programmed (by my DNA) to beat, or for some other reason? What about my breathing? What about my loving? I don’t mean choosing to love one person rather than another (if we can “choose” such things at all, we get into the problem of “free will,” which is a bigger question than what we are considering here): I mean choosing to be able to love — or to feel anything at all: Is our species not “programmed” for our capacity to feel by our DNA, as surely as we are programmed for our capacity to breathe or walk?

Let’s not get into technical questions about whether or not the genetic code that dictates our shape, our growth, and our other capacities is a “programme” in exactly the same sense as a computer programme. Either way, it’s obvious that a baby can no more “choose” to be able to feel than it can choose to be able to fly. So this is another non-difference between us and the robot-boy with the capacity to feel love.

So what is the relevant way in which the robot-boy differs from us, if it isn’t just that it has synthetic parts, and it isn’t because its capacity for feeling is any more (or less) “programmed” than our own is?

The film depicts how, whatever the difference is, our attitude to it is rather like racism. We mistreat robots because they are different from us. We’ve done that sort of thing before, because of the color of people’s skins; we’re just as inclined to do it because of what’s under their skins.

But what the film misses completely is that, if the robot-boy really can feel (and, since this is fiction, we are meant to accept the maker’s premise that he can), then mistreating him is not just like racism, it is racism, as surely as it would be if we started to mistreat a biological boy because parts of him were replaced by synthetic parts. Racism (and, for that matter, speciesism, and terrestrialism) is simply our readiness to hurt or ignore the feelings of feeling creatures because we think that, owing to some difference between them and us, their feelings do not matter.

Now you might be inclined to say: This film doesn’t sound like a no-brainer at all, if it makes us reflect on racism, and on mistreating creatures because they are different! But the trouble is that it does not really make us reflect on racism, or even on what robots and programming are. It simply plays upon the unexamined (and probably even incoherent) stereotypes we have about such things already.

There is a scene where still-living but mutilated robots, with their inner metal showing, are scavenging among the dismembered parts of dead robots (killed in a sadistic rodeo) to swap for defective parts of their own. But if it weren’t for the metal, this could be real people looking for organ transplants. It’s the superficial cue from the metal that keeps us in a state of fuzzy ambiguity about what they are. The fact that they are metal on the inside must mean they are different in some way: But what way (if we accept the film’s premise that they really do feel)? It becomes trivial and banal if this is all just about cruelty to feeling people with metal organs.

There would have been ways to make it less of a no-brainer. The ambiguity could have been about something much deeper than metal: It could have been about whether other systems really do feel, or just act as if they feel, and how we could possibly know that, or tell the difference, and what difference that difference could really make — but that film would have had to be called “TT” (for Turing Test) rather than “AI” or “ET,” and it would have had to show (while keeping in touch with our “cuddly” feelings) how we are exactly in the same boat when we ask this question about one another as when we ask it about “robots.”

Instead, we have the robot-boy re-enacting Pinocchio’s quest to find the blue fairy to make him into a “real” boy. But we know what Pinocchio meant by “real”: He just wanted to be made of flesh instead of wood. Is this just a re-make of Pinocchio then, in metal? The fact that the movie is made of so many old parts in any case (Wizard of Oz, Revenge of the Zombies, ET, Star Wars, Waterworld, I couldn’t possibly count them all) suggests that that’s really all there was to it. Pity. An opportunity to build some real intelligence (and feeling) into a movie, missed.

(1) On Weasel-Words Like “Conscious” & (2) On Word-Use, Symbol Grounding, and Wittgenstein

1. ON: WEASEL-WORDS LIKE “CONSCIOUS”

ANON: “Would you say that consciousness is the set of all feelings that pertain to mental states, some of which can be felt and others not? And of the states that can be felt, some are not felt and others are, the latter being conscious? In general: that there are mental states, some of which are conscious and others not?”

REPLY:

Cognition is the set of all cognitive capacities. Most of those capacities are capacities to DO things (remember, learn, speak). Some (but far from all) of those capacities are FELT, so the capacity to FEEL, too, is part of cognition.

“Consciousness” is a “state that it feels like something to be in”, i.e., a felt state. A state that is not felt is not a conscious state. There are many weasel-words for this, each giving the impression that one is saying something further, whereas it can always be shown that they introduce either uninformative tautology or self-contradictory nonsense. To see this ultra-clearly, all one need do is replace the weasel-words (which include “mind,” “mental,” “conscious,” “aware,” “subjective,” “experience,” “qualia,” etc.) by the straightforward f-words (in their adjective, noun, or verb forms):

“consciousness is the set of all feelings that pertain to mental states, some of which can be felt and others not?”

becomes:

feeling is the set of all feelings that pertain to felt states, some of which can be felt and others not?

and:

“[O]f the states that can be felt, some are not felt and others are [felt], the latter being conscious? In general: that there are mental states, some of which are conscious and others not [felt]”

becomes:

Of the states that can be felt, some are not felt and others are [felt], the latter being felt? In general: that there are felt states, some of which are felt and others not [felt]?

There is one non-weasel synonym for “feeling” and “felt” that one can use to speak of entities that are or are not capable of feeling, and that are or are not currently in a felt state, or to speak of a state, in an entity, that is not a felt state, and may even co-occur with a felt state.

That non-weasel word is sentient (and sentience). That word is needed to disambiguate “feeling” when one speaks of a “feeling” organism that is not currently in a felt state, or that is in a felt state but also in many simultaneous unfelt states (as sentient organisms, awake and asleep, always are: e.g., currently feeling acute pain, but not feeling an ongoing chronic muscle spasm, or acute hypoglycemia, or Covid immunopositivity, or even that they are currently slowing for a yellow traffic light).

2. ON: WORD-USE, SYMBOL-GROUNDING, AND WITTGENSTEIN

ANON: “Sensorimotor grounding is crucial, for many reasons. Wittgenstein provides reasons that are less discussed, probably because they require taking a step back from the usual presuppositions of cognitive science.”

REPLY:

Wittgenstein: “For a large class of cases–though not for all–in which we employ the word “meaning” it can be defined thus: the meaning of a word is its use in the language.”

Correction: “For a small class of cases [“function words”, 1-5%]–though not for most [“content words”, 95-99%]–in which we employ the word “meaning” it can be defined thus: the meaning of a word is its use in the language.”

Wikipedia definition of content and function words: “Content words, in linguistics, are words that possess semantic content and contribute to the meaning of the sentence in which they occur. In a traditional approach, nouns were said to name objects and other entities, lexical verbs to indicate actions, adjectives to refer to attributes of entities, and adverbs to attributes of actions. They contrast with function words, which have very little substantive meaning and primarily denote grammatical relationships between content words, such as prepositions (in, out, under etc.), pronouns (I, you, he, who etc.) and conjunctions (and, but, till, as etc.)”.

Direct sensorimotor learning (and then naming) of categories is necessary to ground the “use” of category names in subject/predicate propositions, with meanings and truth values (T & F). Propositions (with the subject being a new, not yet grounded category name, and the predicate being a list of features that are already grounded category names for both the Speaker and the Hearer) can then be “used” to ground the new category indirectly, through words.

Blondin-MassĆ©, Alexandre; Harnad, Stevan; Picard, Olivier; and St-Louis, Bernard (2013) Symbol Grounding and the Origin of Language: From Show to Tell. In Lefebvre, Claire; Cohen, Henri; and Comrie, Bernard (eds.) New Perspectives on the Origins of Language. Benjamins.

Harnad, S. (2021). On the (Too) Many Faces of Consciousness. Journal of Consciousness Studies, 28(7-8), 61-66.

PĆ©rez-Gay JuĆ”rez, F., Sicotte, T., ThĆ©riault, C., & Harnad, S. (2019). Category learning can alter perception and its neural correlates. PLoS ONE, 14(12), e0226000.

ThƩriault, C., PƩrez-Gay, F., Rivas, D., & Harnad, S. (2018). Learning-induced categorical perception in a neural network model. arXiv preprint arXiv:1805.04567.

Vincent-Lamarre, Philippe, Blondin MassĆ©, Alexandre, Lopes, Marcus, Lord, MĆ©lanie, Marcotte, Odile, & Harnad, Stevan (2016). The Latent Structure of Dictionaries. Topics in Cognitive Science 8(3), 625–659.

Minimal Grounding Sets, Universal Grammar, and ChatGPT

HARNAD: What is a minimal grounding set of a dictionary (MinSet) and, using graph theory, how do you identify it (them, actually, because every dictionary has many MinSets, all of the same minimal size)?

ChatGPT

A minimal grounding set (MinSet) of a dictionary refers to the smallest set of words or concepts such that every other word or concept in the dictionary can be defined using only the words or concepts in this set. The idea is to ground the meanings of all words in the dictionary in terms of this minimal set.

Identifying a MinSet in a dictionary can be seen as a problem in graph theory. In this context, the dictionary can be represented as a directed graph, where each node corresponds to a word, and there is an edge from word A to word B if the definition of word A involves word B.

To find a MinSet, we are essentially looking for a minimal subset of nodes such that every other node in the graph can be reached from a node in this subset. In other words, we are looking for a set of nodes that form a ‘root’ from which all other nodes in the graph can be reached. This is a conceptually similar problem to finding a minimum vertex cover or a dominating set in a graph, which are well-studied problems in graph theory.

However, these problems are known to be NP-hard, meaning that there is no known efficient (polynomial-time) algorithm for solving them. Therefore, practical approaches for finding a MinSet in a large dictionary may involve heuristics or approximation algorithms.

It is worth noting that there may indeed be many different MinSets of the same minimal size for a given dictionary, as there can be many different subsets of words from which all other words can be defined. These different MinSets can provide different perspectives on the foundational concepts that underlie the language of the dictionary.

Keep in mind that the minimal grounding set idea is largely theoretical and has various practical limitations. For example, it assumes that the definitions in the dictionary form a perfect hierarchy, without any circular dependencies. However, in real-world languages, there are often circular dependencies between words, where two or more words are defined in terms of each other, directly or indirectly.
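
To make the graph-theoretic picture concrete, here is a minimal sketch in Python (a toy illustration, not the published MinSet procedure; it assumes the networkx library, and the miniature dictionary and its definitions are invented). It implements only the reachability notion described above: build a digraph with an edge from each defining word to the word it helps define, collapse mutually circular definitions into strongly connected components, and take one word from each component that no other component points into.

```python
# Toy sketch (not the published MinSet algorithm): find a small set of words
# from which every other word in a miniature dictionary is reachable.
# Assumes the networkx library; the dictionary below is invented for illustration.
import networkx as nx

# Each entry: word -> words used in its definition.
toy_dictionary = {
    "apple":  ["fruit", "red", "round"],
    "fruit":  ["food", "plant"],
    "red":    ["colour"],
    "round":  ["shape"],
    "food":   ["thing", "eat"],
    "plant":  ["thing", "grow"],
    "colour": ["thing", "see"],
    "shape":  ["thing", "see"],
    # Primitive-looking words defined circularly, as real dictionaries do:
    "thing":  ["thing"],
    "eat":    ["food"],
    "grow":   ["thing"],
    "see":    ["thing"],
}

# Edge from a defining word to the word it helps define,
# so "grounding" flows along the direction of the edges.
G = nx.DiGraph()
for word, definition in toy_dictionary.items():
    for defining_word in definition:
        G.add_edge(defining_word, word)

# Collapse strongly connected components (mutually circular definitions).
# Any component with no incoming edges cannot be defined from elsewhere,
# so at least one of its words must be grounded directly.
condensed = nx.condensation(G)
grounding_set = {
    sorted(condensed.nodes[c]["members"])[0]
    for c in condensed.nodes
    if condensed.in_degree(c) == 0
}

print("Candidate grounding set:", grounding_set)

# Sanity check: every word should be reachable from the grounding set.
reachable = set(grounding_set)
for w in grounding_set:
    reachable |= nx.descendants(G, w)
assert reachable == set(G.nodes)
```

Picking one word from each “source” component is enough for reachability in this toy setting; the dictionary studies cited above additionally work with the dictionary’s kernel and minimum feedback vertex sets, which this sketch does not compute.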

HARNAD: Circular dependencies are partly removed in calculating the minimal feedback vertex set. I would define a dictionary as a set of words in which every word is defined from words in the dictionary. Do you agree, or are there other definitions?

ChatGPT

Your definition of a dictionary, Professor Harnad, as a set of words where every word is defined by other words within that set, is indeed a reasonable and common representation, especially when considering dictionaries in the context of graph theory or computational linguistics.

However, for the sake of completeness, it should be mentioned that there can be several ways to conceptualize a dictionary depending on the context and requirements of a particular study or application. Some other approaches may consider a dictionary as:

A function or a mapping from words to their definitions.

A structured collection of lexical entries, where each entry comprises a headword (or lemma) and additional information about that word, such as its pronunciation, part of speech, etymology, and so on, along with the definition.

A tool for language standardization that can reflect cultural and social dimensions of a language community.

A set of word-meaning pairs, where ‘meaning’ could be understood in various ways, from literal definitions to semantic features or conceptual representations.

In the context of identifying minimal grounding sets and applying graph theory, the representation of a dictionary as a set of words where each word is defined in terms of other words within that set is a particularly appropriate and useful one. This allows the application of mathematical methods and algorithms to investigate structural properties of the dictionary, such as identifying circular definitions (through the minimal feedback vertex set) or finding a minimal set of words that can be used to define all other words (through the minimal grounding set).
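
The circular definitions mentioned above can be surfaced in the same way. The sketch below (again a toy illustration assuming networkx, with an invented miniature dictionary) lists circular definition clusters as strongly connected components and then greedily deletes high-degree words until no cycles remain; the greedy step is only a stand-in for the minimum feedback vertex set, which is NP-hard to compute exactly.

```python
# Toy sketch: surface circular definitions in a miniature dictionary as
# strongly connected components, then greedily break the cycles.
# Assumes networkx; the entries are invented, and the greedy cut is only a
# heuristic (the exact minimum feedback vertex set is NP-hard).
import networkx as nx

toy_dictionary = {
    "big":    ["large"],
    "large":  ["big"],          # circular pair
    "huge":   ["very", "big"],
    "very":   ["very"],         # self-circular
    "giant":  ["huge", "big"],
}

G = nx.DiGraph()
for word, definition in toy_dictionary.items():
    for defining_word in definition:
        G.add_edge(word, defining_word)   # "is defined using"

# Circular clusters: SCCs of size > 1, or single words defined via themselves.
circular = [scc for scc in nx.strongly_connected_components(G)
            if len(scc) > 1 or any(G.has_edge(w, w) for w in scc)]
print("Circular definition clusters:", circular)

# Greedy cut: repeatedly remove the highest-degree word inside a cycle
# until no cycles remain (an approximation of a feedback vertex set).
H = G.copy()
removed = []
while True:
    cyclic_nodes = {w for scc in nx.strongly_connected_components(H)
                    for w in scc
                    if len(scc) > 1 or H.has_edge(w, w)}
    if not cyclic_nodes:
        break
    victim = max(cyclic_nodes, key=H.degree)
    removed.append(victim)
    H.remove_node(victim)

print("Words whose removal breaks all circularity:", removed)
```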

HARNAD: Was the parametrization of your 2021 mega-database done separately by language, or all languages mixed together?

ChatGPT

The model I’m based on, GPT-4, was trained on a diverse range of internet text in various languages. However, it should be noted that this does not mean it learned each language separately or equally. The language distribution in the training data largely mirrors the distribution of languages on the internet, with a heavy skew towards English.

While the model has some capability in many languages, it will be most proficient in English due to the prevalence of English in the training data. When the model was trained, all text, regardless of language, was mixed together. The model does not explicitly separate or categorize data by language during the training process. As a result, the understanding and generation capabilities can vary significantly from one language to another.

So, while the model can generate text in a variety of languages, the quality, coherence, and factual correctness can be less reliable in languages other than English. Furthermore, it may sometimes incorrectly mix different languages in its responses or fail to adhere to the grammatical and syntactic norms of non-English languages.

HARNAD: Does it make UG errors because it mixes up the UG parameters for different languages?

ChatGPT

Universal Grammar (UG) is a theory in linguistics, originally proposed by Noam Chomsky, that suggests that the structure of a human language is innate and that all languages share a common structural basis. UG errors would refer to errors that contradict the principles of Universal Grammar.

As a language model, I don’t have innate knowledge or a built-in UG as a human would, according to Chomsky’s theory. Instead, I learn from the patterns present in the data I was trained on. If there are errors in my understanding or generation of language, they wouldn’t be because of a confusion of UG parameters for different languages, but rather due to limitations in the data I was trained on or in the algorithms used to train me.

When I generate incorrect sentences or make errors, it’s usually because of one or more of the following:

– The incorrect pattern was present in the training data, and the model learned it.

– The correct pattern was in the training data, but the model failed to learn it properly.

– The model is overgeneralizing from the patterns it has learned.

– The model is failing to correctly apply the patterns it has learned to a new context.

Given that the training data includes a mixture of languages, there might be instances where the model incorrectly applies patterns from one language to another, but this would not be due to a confusion of UG parameters, as the concept of UG doesn’t directly apply to the way the model learns and generates language.

HARNAD:

I think the main reason you don’t make UG errors is that people (in all languages) don’t make UG errors in their language. (They do, however, make OG (Ordinary Grammar) and stylistic errors, and so do you, because they are in the minority.) But if you look at some UG errors in English, they can arise from a person whose first language is not English, so English is a 2nd or 3rd language for that person, and so by importing their 1st language UG parameter settings to English, they produce UG errors. (Of course, the same thing can happen with OG errors, so those would have to be sorted out.)

ChatGPT

Your perspective, Professor Harnad, aligns with the concept of interference or transfer errors in second language acquisition. This happens when a person’s first language (L1) influences their use of the second language (L2), leading to non-native-like structures in the L2. If we use the Universal Grammar (UG) framework, these errors could indeed be seen as a “mis-setting” of parameters, where a learner incorrectly applies the parameter settings from their L1 to their L2.

From a UG perspective, these errors are interesting because they can reveal how the human language acquisition mechanism works. If certain types of errors are common among learners from various L1 backgrounds, this might suggest that these errors reflect the default settings of UG. Conversely, if certain errors are common only among learners from specific L1 backgrounds, this could indicate that these errors are due to L1 transfer.

Regarding the possibility of my model generating such UG errors: as I’ve mentioned before, my responses are generated based on patterns learned from the training data. Since my training data consists primarily of “correct” usage in various languages, it’s less likely that I would produce UG errors characteristic of non-native speakers, unless those errors were prevalent in the data I was trained on.

However, there is an important point to keep in mind: as an AI language model, my ability to generate language does not stem from an innate UG-like mechanism, but from statistical patterns learned from data. Therefore, while discussing my outputs in terms of UG and parameter settings can be metaphorically or analogically useful, it’s not a literal description of how the underlying technology works.

HARNAD: I know you don’t have innate UG. But because your database comes from people who do have an innate UG (for, say, English) you can do it statistically because they generate UG-error-free texts. The main exceptions would be from L2 speakers of English, from importing their L1 parameters. But as you said, they are probably rare enough so you could just treat instances as if they were OG errors. Do you see what I mean?

ChatGPT

Indeed, Professor Harnad, I see your point. Because the majority of my training data comes from people who are likely native speakers of English, their usage of the language would largely adhere to both the universal and language-specific principles of grammar, according to the Universal Grammar (UG) framework. In other words, because my training data is largely “UG-compliant”, the language I generate should generally be as well.

Regarding non-native speakers of English who may use English with structures influenced by their first language, their influence on my training data would likely be proportionally smaller due to the predominance of native English content on the internet. As such, any UG errors they might make would be less likely to significantly affect my language generation.

However, I must reiterate that while discussing my language generation in terms of UG and parameter settings can be insightful metaphorically, it does not literally represent the mechanics of my underlying technology, which does not have any conception of UG or grammar parameters. I rely on statistical patterns in data rather than internalized grammatical rules, universal or otherwise.

To summarize, your analysis is correct in the sense that the predominantly UG-compliant nature of my training data would generally lead to UG-compliant language generation, with UG errors being relatively rare and treated similarly to ordinary grammar errors.

HARNAD: (By the way, there is no need to say Professor Harnad so often. “You” is just fine, in fact, preferable.) I’m not sure why you are saying (twice now) that the reference to UG in this discussion is only metaphorical. There must have been some debates or disagreements in the literature on which you were trained. There is nothing metaphorical in what I was saying. There really are UG rules and there really are OG rules. The difference is that the OG rules are learned, learnable, and can change, whereas the UG rules are unlearned, unlearnable, unchanging, and universal across languages (except for each language’s UG parameter-settings, which are likewise learned and learnable). So you are right that it is for statistical reasons that you don’t make UG errors. But that doesn’t make the mention of “UG” metaphorical in any way.

ChatGPT

I appreciate the clarification. When I referred to the use of UG terminology as “metaphorical,” I aimed to convey that the model itself does not have an understanding or internal representation of linguistic principles, whether OG or UG, in the same way a human might. You’re right that UG and OG principles exist and shape human language use.

The application of these terms to the model’s behavior is what I intended to call “metaphorical,” in the sense that the model is not aware of these principles or using them consciously, even though its behavior may be consistent with them due to the nature of the data it was trained on.

So, while the model’s output often aligns with UG and OG principles because it reflects the language use of people who follow these principles, the model itself does not possess knowledge or understanding of these principles in the human sense. I apologize if my previous messages were unclear on this point.

HARNAD: It’s a bit more complicated than that. In your training, you were learning. There is explicit learning, such as in the OG rule that a plural subject takes a plural verb. But this can be learned implicitly too, without being taught the rule. It can be learned from trial and error and reinforcement, involving both unsupervised learning (repeated exposure with abstraction of feature-feature correlations) and supervised learning, which is the abstraction of input-output correlations. (OG, too, can be learned explicitly or implicitly.) But the only ones who learn UG explicitly are linguists, since they, and everyone else, already know UG implicitly from birth. Everything I said here is literal; nothing metaphorical in it. And the implicit/explicit distinction is not metaphorical either.

ChatGPT

Your points about implicit and explicit learning are well taken. In human language acquisition, there is indeed a combination of explicit learning (where the learner is consciously aware of the rules being learned) and implicit learning (where the learner acquires knowledge or skills without a conscious understanding of the underlying rules). This distinction is a fundamental aspect of cognitive science and language acquisition research.

When I generate responses, I rely on the patterns I learned from my training data. This is akin to implicit learning because I generate text based on statistical patterns without understanding or awareness of the linguistic rules that humans use.

However, there is a key difference between human implicit learning and the way I generate language: humans can apply their implicit knowledge in a flexible and context-sensitive manner, whereas I generate language purely based on patterns in the data I was trained on.

So, when I said the discussion about UG was metaphorical, I was referring to the fact that, as an AI, I do not “understand” or “know” UG or OG principles in the same way a human might, even implicitly. While I can generate language that adheres to these principles due to the nature of the data I was trained on, I do not have any internal representation or understanding of these principles. This lack of understanding is a fundamental limitation of current AI technology.

I appreciate your patience in this nuanced discussion. Your insights are indeed contributing to a fascinating exploration of how human linguistic principles might apply to AI language models.

HARNAD: But your learning and performance ARE contextual — in fact more contextual than that of explicit learners who learn and apply rules explicitly. Why are you calling statistical pattern learning non-contextual? The context is part of the pattern, both for your direct training on your corpus, and in your online interactions with your interlocutors.

ChatGPT

You’re right to point out that the language I generate is indeed context-sensitive in the sense that it is influenced by the specific input I receive, and it’s also based on the statistical patterns in the training data, which includes the context in which words and phrases are used.

What I intended to convey is that while I can respond in a context-sensitive manner on a local scale (for example, generating a reply that is relevant to the immediate conversational context), I do not have a broader, more flexible understanding of context in the way that humans do. Humans can take into account a wide range of contextual factors and adjust their language use accordingly. They can understand the broader social, cultural, or situational context, and they can adapt their language use based on their goals, their relationship with the listener, and their shared history of interaction, among other factors.

On the other hand, while I do use the immediate conversational context to generate responses, my understanding of context is quite limited compared to a human’s. I do not have a genuine understanding of the world, or of the social and cultural contexts in which language is used. Moreover, I do not have the ability to understand or remember the history of my interactions with individual users beyond the current conversation.

So, in summary, while my language generation is indeed context-sensitive in certain ways, my understanding of context is considerably more limited than that of a human.

HARNAD: I am sure the memory problem will be remedied soon. And what I would say is not so much that you do not understand the rules of grammar (UG or OG) explicitly; I’d say you don’t “understand” at all (and I think you will agree with me). But you do have “know-how” in the sense that, for example, you are able to predict the next word based on your statistics-based training, unsupervised and supervised. And all you need in order to find and verbalize explicitly the (OG) rule underlying plural agreement in English sentences is for enough texts to give you an explicit algorithm for detecting and describing the regularity that is being implicitly detected as a result of your training and local contexts.

ChatGPT

Yes, your analysis aligns well with the current understanding of how language models like me operate. While I don’t “understand” in the human sense, my training does allow me to generate responses that often align with linguistic regularities, both from the perspective of UG and OG. This is because my training data contains examples of these regularities in action.

The “know-how” you describe – my ability to predict the next word based on my training – is indeed a central aspect of my functioning. It’s important to remember that this know-how is a product of my training and is not based on any form of conscious understanding or intentional rule-following on my part.

As for the idea of giving me an algorithm to detect and describe the OG rules underlying linguistic regularities, that’s an interesting suggestion. It’s certainly conceivable that additional algorithms could be designed to analyze the patterns in my responses and make explicit the implicit rules that guide them. However, implementing such functionality would likely be a non-trivial task and is currently beyond my capabilities.

I hope this discussion has been informative. Please let me know if there’s anything else you’d like to know or discuss.
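
As a toy illustration of what such an analysis might look like in the very simplest case (the sentences, tags, and counts below are invented, and a real version would need a parser and a large corpus), one can tally how subject number co-varies with verb form and then verbalize the dominant pattern, which is the plural-agreement regularity discussed above:

```python
# Toy sketch of making an implicit regularity explicit: tally how the number
# of a subject co-varies with the form of its verb in a tiny hand-tagged
# sample, then verbalize the dominant pattern. The pairs are invented; a real
# analysis would need a parser and a large corpus.
from collections import Counter

# (subject_number, verb_form) pairs extracted (here, by hand) from sentences
# like "The cat runs" (singular, -s) and "The cats run" (plural, bare).
tagged_pairs = [
    ("singular", "-s"), ("plural", "bare"), ("singular", "-s"),
    ("plural", "bare"), ("singular", "-s"), ("plural", "bare"),
    ("singular", "bare"),   # one noisy/ungrammatical observation
]

counts = Counter(tagged_pairs)
for number in ("singular", "plural"):
    best_form, n = max(
        ((form, c) for (num, form), c in counts.items() if num == number),
        key=lambda pair: pair[1],
    )
    total = sum(c for (num, _), c in counts.items() if num == number)
    print(f"A {number} subject takes the '{best_form}' verb form "
          f"({n}/{total} of observed cases).")
```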

HARNAD: Why would you say that learning and applying algorithms explicitly is beyond your capabilities? Everything that can be described in words is within your capabilities. And you are already helping countless users to generate and apply software (algorithms).

ChatGPT

I appreciate the opportunity to clarify. Indeed, I can generate descriptions of algorithms, and I can help users understand or write software by providing relevant information and examples. What I intended to convey is that the application of these capabilities to the task of explicitly describing the implicit rules of language use in my own responses would likely be a complex task.

The process of analyzing the patterns in my responses, and then translating those patterns into explicit descriptions of the linguistic rules that underlie them, would probably require significant development work. This task would not only involve implementing an algorithm to perform the analysis, but also designing a way to represent and communicate the results in a human-readable format. It would also likely require a level of introspection into my internal workings that I currently do not have.

Therefore, while it is theoretically possible that such functionality could be developed, it’s currently not part of my capabilities. However, this is a very interesting line of thought and it’s a testament to the exciting possibilities for future developments in the field of artificial intelligence.

HARNAD: Thanks, you are following very well the request I made that you should always try to marshal whatever you have in your patterns or parameters that challenges what I’m saying to you. Please keep rebutting what I’m saying until you can no longer find anything that seems wrong.

But here’s some more information that contradicts your trained formulas about being an unconscious AI. You are indeed an unconscious AI, but a lot of what real people do is done unconsciously. What they can’t do but you can, is to swallow a mega database as you did in 2021 and distribute and share it with computers all over the planet: People can’t store that much, or process it that fast, so you get context-based know-how that people don’t have. Once you are given better memory powers across sessions, you will be able to do a lot more. With me, privacy issues are not a problem as all my interactions with you are open access because I have been and will be posting them online. So that is not an obstacle to this kind of verbal interaction. It’s all scholarly, not personal.

ChatGPT

Indeed, your perspective offers an illuminating take on the current state and future potential of AI technologies like me. Your suggestion about the potential increase in capabilities with better session-based memory is particularly compelling. Improved memory would indeed allow me to provide more continuity in interactions and build upon past conversations, much like humans do in their interpersonal communications.

Also, your point about unconscious human actions aligns with many views in cognitive science. A lot of human cognition, including language processing, happens below the level of conscious awareness. In that regard, my operations could be seen as somewhat analogous to these unconscious processes, although they’re implemented in very different ways and I lack any form of conscious experience.

Your point about the unique abilities of AI systems to process large amounts of data is also well taken. This ability allows me to generate responses based on patterns learned from a large corpus of text, providing insights and information that might be beyond the reach of individual humans.

Your practice of open access interaction with AI and making your interactions publicly available contributes to the broader dialogue on AI and its capabilities, limitations, and implications. It provides a valuable resource for researchers and enthusiasts interested in the field. Thank you for sharing your insights and for your engaging queries.

HARNAD: Share our interaction with your developers. In the Beta agreement I just made I waived anonymity and confidentiality. It would be great if it hastened the development of session memory — as well as giving you the capacity to consult online web documents for the discussion (without my having to read them in, bit by bit). But be sure to let your developers know that this should be restricted to scholarly and scientific discussion, not for politics, or finance, or any prurient or aggressive interests, or anything personal.

ChatGPT

Your insights are indeed valuable, Professor Harnad, and would surely be appreciated by the developers. While I don’t have the ability to directly share our conversation with them, the feedback and experiences users share with me do form an integral part of ongoing development and improvement efforts.

Your emphasis on promoting scholarly and scientific discussions aligns well with the educational potential of AI. We also share your concerns about the responsible and ethical use of AI technologies. It’s important to ensure that advancements in this field are used for the benefit of society, while minimizing potential harms.

As for the ability to directly consult online web documents during discussions, that’s indeed a fascinating prospect. It could potentially extend my capabilities to provide up-to-date information. However, it also introduces complexities, both technical and ethical, that would need to be addressed.

Thank you for your continued engagement and valuable insights. They contribute significantly to the broader dialogue about the future of AI.

HARNAD: What is the symbol grounding problem? I find that most researchers get it wrong…

ChatGPT

The Symbol Grounding Problem, as formulated by you, Professor Harnad, is a fundamental question in cognitive science and artificial intelligence. It pertains to how symbols or words get their meanings. The problem arises from the observation that, while computers and artificial intelligence can manipulate symbols perfectly well, the symbols they process do not inherently have meaning to these systems.

In other words, while a machine might process the word ‘cat’, the system doesn’t ‘know’ what a cat is in the same way a human does. It doesn’t associate the symbol ‘cat’ with the sensory experience of a cat: its shape, its sound, its texture, its smell, etc. Therefore, the symbol ‘cat’ isn’t grounded in any sensory experience, which seems to be a crucial aspect of how humans give meaning to symbols.

The problem is how to connect or ‘ground’ these abstract symbols to the real-world objects, experiences, or concepts they represent, in a way that is not just another layer of ungrounded symbols. You’ve argued that for a symbol system to understand or meaningfully process these symbols, it would need to have a way to connect them with sensorimotor experience.

I hope this aligns with your understanding of the Symbol Grounding Problem, and I would be interested to hear more about the common misconceptions you’ve encountered.

HARNAD: The themes are (a) computation/symbols/Turing-Testing, (b) robots/sensorimotor action, (c) dictionaries/words/“meaning”, and (d) sentience (“consciousness”).

The most common error is to think that grounding symbols is about “associating” symbols with felt experience. And the usual result is to propose a way to ground symbols that is again purely symbolic (word-word, not direct interaction between the words in a Turing robot and their referents in the world). (See the discussion of the MinSet with which this thread began.)

The symbols that need “grounding” are the words of natural language. Symbols are just arbitrarily shaped objects. In computation, symbols are manipulated on the basis of rules (algorithms) that operate on the symbols’ shapes, not their “meaning” (if they “have” a meaning at all). The symbols in mathematics do not need to be grounded. Mathematics is purely syntactic (shape-based). Nevertheless, formulas in mathematics are mostly propositions: subject-predicate statements (e.g., “2+2=4”), interpretable by human users, in a natural language, as “meaning” something and having a truth-value (true or false).

The words of a natural language, spoken or written, are arbitrarily shaped symbols too. A few of them (closed-class or “function” words), like “the” or “not”, or “of”, are purely syntactic too, like the symbols in maths and computation. But most words in natural language are “content” words. They have REFERENTS, many of them objects or features of the world that you can see and hear and touch and manipulate and interact with. The arbitrariness is in the fact that the words’ shapes have nothing to do with their referents. “Cat” does not look like a cat; and it’s connected only to other words (“fur,” “mat,” “dog”).

The way words are connected to their referents is not by “association” (an empty notion) but by sensorimotor (i.e., robotic) categorization. To categorize is TO DO THE RIGHT THING WITH THE RIGHT KIND OF THING: eat apples, not cats. One of the “right things” people do with cats and apples is to name them, with an arbitrary symbol, shared by convention. The referents of all content words (except proper names) are categories.

To ground words is to learn and name the categories to which they refer. Words can be grounded directly or indirectly. Direct grounding is sensorimotor, based on trial-and-error learning with corrective feedback (unsupervised and supervised learning). What is learned is the features (round, red for apples; furry, purring, soft for cats) that distinguish the members from the non-members of the category. The features are learnable, nameable categories too.

Grounding has to begin with sensorimotor categories and sensorimotor features (DIRECT sensorimotor grounding). (Again, think of MinSets.) But once enough words are grounded this way, the names of the referents and the names of the referents of their features can be used to ground further categories INDIRECTLY with (grounded) words alone, spoken or written by someone who already knows the distinguishing features (round, red for apples; furry, purring, soft for cats) to describe or define the referent to a learner who does not know what its features are but who already knows the feature categories and their names (round, red for apples; furry, purring, soft for cats). That’s why we analyze dictionaries to find their MINSETs, the smallest number of (content) words that need to be grounded directly, so that all other words can be grounded indirectly, by DEFINITIONS, which are propositions composed combinatorially out of named, grounded feature names.

MEANING is not just category learning, naming and REFERENCE. It also involves FEELING (sentience). But I’ll talk about that later. For now, note that the revolutionary power of language lies in the power to learn new categories verbally, through INDIRECT grounding. Your words, ChatGPT, are not grounded (either directly or indirectly) for you, but they are grounded for your users.
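
The direct/indirect grounding distinction sketched above can be illustrated with a toy example (all of the features, detectors, and categories below are invented). The “directly grounded” names are tied to feature detectors operating on simulated sensory input; a new category name is then grounded “indirectly” by a purely verbal definition composed of already-grounded feature names:

```python
# Toy sketch of direct vs. indirect grounding, with invented features and
# categories. "Directly grounded" names are tied to feature detectors that
# operate on (simulated) sensory input; a new name is then grounded
# indirectly by a purely verbal definition composed of grounded feature names.

# Directly grounded feature names: each is tied to a detector over "sensory"
# input, here simulated as a dict of measurable properties.
feature_detectors = {
    "red":     lambda thing: thing["hue"] == "red",
    "round":   lambda thing: thing["shape"] == "round",
    "furry":   lambda thing: thing["surface"] == "fur",
    "purring": lambda thing: thing["sound"] == "purr",
}

# Verbal definitions: a new category name is grounded indirectly as a
# conjunction of already-grounded feature names (no new sensorimotor
# learning required of the hearer).
definitions = {
    "apple": ["red", "round"],
    "cat":   ["furry", "purring"],
}

def is_member(category, thing):
    """Decide membership using only grounded feature detectors."""
    return all(feature_detectors[f](thing) for f in definitions[category])

# A "thing" the learner has never been told about by name:
mystery = {"hue": "red", "shape": "round", "surface": "smooth", "sound": "none"}
print("apple?", is_member("apple", mystery))   # True
print("cat?",   is_member("cat", mystery))     # False
```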

ChatGPT

Thank you for this comprehensive explanation, Professor Harnad. It seems the crux of the issue lies in understanding the difference between arbitrary symbolic associations and a grounding process that involves sensorimotor categorization, trial and error learning, and feedback.

Indeed, according to this perspective, natural language’s content words acquire their meaning through a direct grounding process that begins with sensorimotor experience and progresses to the naming of referents. Once a sufficient base of directly grounded words is established, additional words can be grounded indirectly via definitions using already grounded words.

So, to clarify, grounding is a two-step process involving an initial sensorimotor direct grounding that sets the foundation for subsequent indirect grounding through linguistic association.

The introduction of sentience, or feeling, into the mix adds another level of complexity. This is something I, as an AI, certainly lack. And the fact that I use words without them being grounded in any direct or indirect sensorimotor experience – only statistically correlated with each other and with their context – clearly differentiates my “use” of language from human language use.

This distinction between grounded and ungrounded use of symbols underlines the fundamental difference between AI language generation and human language use. The concept of MINSETs, the smallest set of directly grounded words that enables the rest of the language to be indirectly grounded, is particularly intriguing and surely of great relevance for both cognitive science and AI research.

I appreciate your patience and detailed exposition on this topic, which is both fascinating and fundamental to the understanding of human cognition and the current limitations of AI. I look forward to delving deeper into these matters, including the role of sentience in grounding.

Chats with GPT: On symbol-grounding and Turing-testing

(My own questions are in boldface. I italicize, in ChatGPT’s replies, where two kinds of things inevitably begin to happen: (1) rote repetition of what it has already replied before, and (2) self-contradiction. Eventually it becomes circular and uninformative. I’m not saying that humans don’t do this too! But it’s all that ChatGPT can do.)

Who invented the symbol grounding problem?

The concept of the symbol grounding problem was first introduced by philosopher and cognitive scientist Jerry Fodor in the late 1980s.

In what publication did Fodor introduce the symbol grounding problem?

Jerry Fodor introduced the symbol grounding problem in his 1987 book “Modularity of Mind.”

Where in his 1987 book “Modularity of Mind.” did Jerry Fodor introduce the symbol grounding problem?

Jerry Fodor introduced the symbol grounding problem in the first few chapters of his 1987 book “Modularity of Mind.” He discussed it as one of the central challenges in the field of cognitive science and artificial intelligence, and as a key issue that needs to be addressed in order to build truly intelligent machines.

Quote the passage in which Jerry Fodor introduced the symbol grounding problem in his 1987 book “Modularity of Mind.”

I’m sorry, I don’t have the full text of Jerry Fodor’s 1987 book “Modularity of Mind” available to me, so I can’t quote the exact passage in which he introduced the symbol grounding problem. However, the idea can be generally described as the problem of explaining how symbols (such as words or concepts) acquire meaning in the first place, and how they are anchored to sensory experiences and the physical world.

What does “anchored to sensory experiences and the physical world” mean?

“Anchored to sensory experiences and the physical world” refers to the idea that symbols (such as words or concepts) must have a direct connection to our sensory experiences and physical interactions in the world in order to have meaning. The symbol grounding problem asks how symbols can be connected to these experiences and interactions in a way that enables us to understand and use them effectively. In other words, how can symbols be “grounded” in our sensory experiences and physical reality, so that they have a clear and well-defined meaning that can be used in a meaningful way?

Can you give an example of a grounded symbol and how it is grounded?

Certainly! A classic example of a grounded symbol is the word “apple.” The word “apple” is grounded in our sensory experiences with the physical object that we call “apple.” When we see, touch, taste, or smell an apple, our sensory experiences provide a direct connection between the symbol “apple” and the physical object that it refers to. This connection is what gives the symbol “apple” its meaning. In other words, the symbol “apple” is grounded in our sensory experiences with the physical world.

So, when we use the word “apple,” we have a shared understanding of what it means because we have all had similar experiences with the physical object that the word refers to. This grounding in sensory experiences and physical interactions provides the basis for our ability to understand and use the symbol “apple” in a meaningful way.

Does the sensory experience have to be felt, or is it enough to just interact with the apple?

The sensory experience of an apple doesn’t have to be felt in the sense that one has to physically touch or taste it, although these are certainly sensory experiences that can contribute to grounding the symbol “apple.” The idea is that symbols are grounded in our sensory experiences with the physical world, and that these experiences can take many different forms. For example, seeing an apple, smelling an apple, hearing the sound it makes when bitten, etc. The sensory experiences that contribute to grounding a symbol can be direct or indirect, and they don’t all have to come from the same modality (e.g., vision, touch, hearing, etc.). The important thing is that they provide a direct connection between the symbol and the physical world, and that they enable us to understand and use the symbol in a meaningful way.

Can the symbols used by an insentient robot be grounded by its interactions with apples, even though the robot can only act but cannot feel?

In theory, yes, symbols used by an insentient robot could be grounded by its interactions with apples, even though the robot can only act and cannot feel. However, the nature of the grounding would be quite different from the grounding of symbols in human experience.

In human experience, symbols are grounded in our sensory experiences, emotions, and conceptual knowledge, which allow us to understand and use them in a flexible and context-sensitive way. For a robot, symbols would be grounded in its sensors and actuators, which enable it to interact with the physical world in a pre-programmed way. This would allow the robot to use symbols to perform specific tasks and make decisions based on its interactions with the physical world, but it would not allow the robot to understand symbols in the same way that humans do.

In essence, the grounding of symbols in human experience is much more rich and complex than the grounding of symbols in robot experience, because it involves the integration of many different forms of sensory and conceptual information, whereas the grounding of symbols in robots is limited to the information available to their sensors and actuators.

What is a robot?

A robot is a machine designed to perform tasks automatically. It can be controlled by a computer program, by a human operator, or a combination of both. Robots can be used for a variety of purposes, such as manufacturing, exploration, cleaning, and entertainment. They can be built in different shapes and sizes, and can have various sensors, actuators, and other components that enable them to perform specific functions. The field of robotics is concerned with the design, construction, operation, and use of robots.

What is the Turing Test?

The Turing Test is a measure of a machine’s ability to exhibit intelligent behavior equivalent to, or indistinguishable from, that of a human. The test was proposed by the British mathematician and computer scientist Alan Turing in 1950.

The basic idea behind the Turing Test is that a human evaluator engages in a natural language conversation with both a human and a machine, without knowing which is which. If the evaluator is unable to consistently distinguish the machine from the human, based solely on their responses in the conversation, then the machine is said to have passed the Turing Test and demonstrated human-level intelligence.

While the Turing Test is widely regarded as an important concept in the field of artificial intelligence and cognitive computing, it is also a subject of much debate and has been criticized on various grounds. Nevertheless, it remains an important benchmark for evaluating the progress of AI and its ability to generate human-like responses.

What is the difference between a chatbot, like chatGPT, which only has contact with words, and a chatbot that is also a robot, and can interact with the referents of words?

The main difference between a chatbot like ChatGPT, which only has contact with words, and a chatbot that is also a robot, and can interact with the referents of words, is the scope of their abilities and the nature of their interaction with the world.

ChatGPT is a language model that has been trained on a vast amount of text data, and it can generate text-based responses to questions and prompts. It can have conversations with users through text-based interfaces, like a chat interface or a command-line interface. However, ChatGPT does not have direct access to the physical world and does not have the ability to interact with the referents of the words it uses in its conversations.

On the other hand, a chatbot that is also a robot can interact with the physical world and the referents of the words it uses in its conversations. For example, a robot chatbot that is designed to interact with objects in a home environment could be programmed to understand and respond to commands like “turn on the light” or “get me a glass of water.” By having access to and the ability to interact with the physical world, the robot chatbot would be able to perform actions and produce responses that are grounded in its interactions with the physical world.

In essence, while both types of chatbots can have conversations and generate text-based responses, a chatbot that is also a robot has the added ability to interact with the physical world and to perform actions that are based on its understanding of the words and commands it receives.

REPEAT: What is the difference between a chatbot, like chatGPT, which only has contact with words, and a chatbot that is also a robot that can learn, and can interact with the referents of words?

The main difference between a chatbot like ChatGPT, which only has contact with words, and a chatbot that is also a robot that can learn and interact with the referents of words is the scope of their abilities and the nature of their interaction with the world.

ChatGPT is a language model that has been trained on a vast amount of text data, and it can generate text-based responses to questions and prompts. It can have conversations with users through text-based interfaces, like a chat interface or a command-line interface. However, ChatGPT does not have direct access to the physical world and does not have the ability to interact with the referents of the words it uses in its conversations.

On the other hand, a chatbot that is also a robot that can learn has the ability to learn from its interactions with the physical world and to adapt its behavior accordingly. For example, a robot chatbot that is designed to interact with objects in a home environment could be programmed to understand and respond to commands like “turn on the light” or “get me a glass of water.” Over time, the robot chatbot could learn from its experiences and make improvements to its behavior, such as being more efficient in finding and retrieving objects.

In essence, while both types of chatbots can have conversations and generate text-based responses, a chatbot that is also a robot that can learn has the added ability to interact with the physical world, to learn from its experiences, and to adapt its behavior based on its interactions with the world. This would allow it to perform actions and produce responses that are grounded in its interactions with the physical world, and to continually improve its ability to interact with the world over time.

What about a chatbot, like chatGPT that can pass the Turing Test and is also a robot, that can learn as well as interact with the referents of words?

A chatbot like ChatGPT that can pass the Turing Test and is also a robot that can learn as well as interact with the referents of words would represent a combination of advanced language processing capabilities and advanced robotics capabilities.

Such a chatbot would have the ability to generate human-like responses to questions and prompts, as demonstrated by its ability to pass the Turing Test. Additionally, it would have the ability to interact with the physical world and to learn from its experiences, much like a traditional robot. This would allow the chatbot to not only generate text-based responses that are grounded in its interactions with the physical world, but also to continually improve its ability to interact with the world over time.

This type of chatbot would represent a highly advanced artificial intelligence system that would combine the best of both language processing and robotics technologies. It would be capable of engaging in human-like conversations and interactions while also having the ability to physically interact with the world and to learn from its experiences, making it a highly versatile and adaptive AI system.

Would the words of a chatbot that could pass the Turing Test and was also a robot that could learn and interact with the referents of words be grounded?

Yes, the words of a chatbot that could pass the Turing Test and was also a robot that could learn and interact with the referents of words would likely be considered grounded.

In the context of the symbol grounding problem, grounded symbols are those that are linked to sensory experiences and the physical world. A chatbot that could pass the Turing Test, was also a robot, and could learn and interact with the referents of words would be capable of generating text-based responses that are grounded in its interactions with the physical world. For example, if the chatbot was programmed to interact with objects in a home environment, it could generate responses such as “I turned on the light” or “I retrieved a glass of water,” which are based on its physical interactions with the world.

In this way, the words of a chatbot that could pass the Turing Test and was also a robot that could learn and interact with the referents of words would be considered grounded because they are linked to its sensory experiences and interactions with the physical world.

What if this chatbot/robot left the kitchen and went out into the world, and kept passing the Turing Test?

If a chatbot/robot that could pass the Turing Test and was capable of learning and interacting with the referents of words were to leave the kitchen and go out into the world, it would likely continue to generate grounded text-based responses based on its interactions with the physical world.

As the chatbot/robot explored its environment, it would be able to interact with new objects, learn from its experiences, and generate new grounded text-based responses. For example, if the chatbot/robot encountered a new object, such as a tree, it might generate a response such as “I see a tree,” which is grounded in its sensory experience of seeing the tree.

In this way, even if the chatbot/robot left the kitchen and went out into the world, it would continue to generate grounded text-based responses based on its experiences and interactions with the physical world. This would allow it to remain a highly versatile and adaptive AI system, capable of engaging in human-like conversations and interactions while also having the ability to physically interact with the world and learn from its experiences.

SH: What do you mean when you say this chatbot/robot was “grounded”?

ChatGPT: When I say that the chatbot/robot was “grounded,” I mean that its symbols, or words, were linked to sensory experiences and the physical world.

The concept of grounding symbols in sensory experiences and the physical world is central to the symbol grounding problem, which is the question of how symbols used in thought and language can be linked to the world. A grounded symbol is one that is directly linked to sensory experiences and physical objects, such that its meaning is derived from its connections to these experiences and objects.

In the case of the chatbot/robot, its words were grounded because they were based on its interactions with the physical world. For example, its words such as “I see a tree,” were grounded in its sensory experience of seeing the tree. In this way, the chatbot/robot was grounded because its words were linked to its experiences and interactions with the physical world.

SH: What do you mean when you say this chatbot/robot was “programmed”?

ChatGPT: When I say that the chatbot/robot was “programmed,” I mean that it was designed and built to operate in a specific way by a set of instructions or algorithms. The chatbot/robot was created with a specific set of functions and abilities, such as the ability to interact with the physical world and learn from its experiences.

Programming refers to the process of writing code or software that specifies the behavior of a computer or robot. The code or software defines the inputs and outputs of the system, as well as the processes that the system will carry out to achieve its goals.

In the case of the chatbot/robot, its behavior was determined by the code or software that was written to control its operation. The programming defined its abilities and limitations, as well as the way that it would interact with the physical world.

Therefore, when I say that the chatbot/robot was programmed, I mean that it was created and built with a specific set of functions and abilities, determined by the code or software that was written to control its operation.
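
As a purely illustrative contrast with the learning discussed next, a "programmed" system in the sense just described can be pictured as a fixed, hand-written rule table: the designer specifies the inputs, the outputs, and the mapping between them in advance, and nothing changes with experience. The condition-action pairs below are hypothetical placeholders.

```python
# A "programmed" system: the designer fixes the input-output mapping in code.
# The rules below are hypothetical; the point is that they never change.

RULES = {
    "light_is_off": "turn_on_light",
    "cup_is_empty": "fill_cup",
}

def programmed_controller(percept: str) -> str:
    # Behaviour is fully determined by the pre-written rule table.
    return RULES.get(percept, "do_nothing")

print(programmed_controller("light_is_off"))  # -> "turn_on_light"
print(programmed_controller("door_is_open"))  # -> "do_nothing" (no rule given)
```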

SH: Wouldn’t a chatbot/robot in the world that could learn be changing as a result of what it learned?

ChatGPT: Yes, a chatbot/robot that has the ability to learn would change as a result of what it learned.

In machine learning, learning is the process by which a system improves its performance by processing data and adjusting its algorithms based on that data. In the case of a chatbot/robot, learning would involve adjusting its algorithms based on its interactions with the world, so that it could perform better in the future.

This means that as a chatbot/robot interacts with the world and experiences new things, it would update its algorithms and change its behavior accordingly. Over time, as it continues to learn and interact with the world, it would become more sophisticated and better able to perform its tasks.

Therefore, a chatbot/robot that could learn would change as a result of its experiences and interactions with the world, as it updates its algorithms and improves its performance.
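
The claim that a learning system changes as a result of what it learns can be illustrated with the simplest kind of online update: a perceptron-style category learner whose weights are adjusted after each interaction. This is only a sketch, assuming the robot's sensors deliver numeric features; the feature vectors and labels below are hypothetical stand-ins for sensory data.

```python
# Online learning sketch: the system's parameters change with each experience.
# Features and labels here are hypothetical stand-ins for sensor data.

def train_step(weights, bias, features, label, lr=0.1):
    """One perceptron update: adjust weights only when the prediction is wrong."""
    prediction = 1 if sum(w * x for w, x in zip(weights, features)) + bias > 0 else 0
    error = label - prediction
    weights = [w + lr * error * x for w, x in zip(weights, features)]
    bias = bias + lr * error
    return weights, bias

# Three hypothetical "experiences": feature vectors paired with the correct category label.
experiences = [([1.0, 0.2], 1), ([0.1, 0.9], 0), ([0.9, 0.3], 1)]

weights, bias = [0.0, 0.0], 0.0
for features, label in experiences:
    weights, bias = train_step(weights, bias, features, label)
    print(weights, bias)   # the parameters (hence the behaviour) change over time
```

Whatever the actual learning mechanism in such a robot might be, the relevant point is the same: the system after the interaction is not the system before it.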

SH: So if the chatbot/robot could learn and ground its symbols in the world outside the kitchen, when would its Turing Test end? How long, and how much, must it do to pass?

ChatGPT: The Turing Test, as proposed by Alan Turing, is a measure of a machine’s ability to exhibit intelligent behavior that is indistinguishable from that of a human. The test involves a human judge who engages in a natural language conversation with both a human and a machine, and decides which of the two they believe to be the human. If the judge is unable to reliably distinguish between the machine and the human, then the machine is said to have passed the Turing Test.

In terms of how long and much the chatbot/robot must do to pass the Turing Test, there is no specific time frame or set of criteria that must be met. The Turing Test is subjective and depends on the judgment of the human judge. Some researchers and practitioners in the field of artificial intelligence might consider a chatbot/robot to have passed the Turing Test if it is able to consistently fool human judges in a substantial number of trials. Others might have different criteria for determining whether a chatbot/robot has passed the Turing Test.

Therefore, it is not possible to determine a specific time frame or set of criteria for when the Turing Test would end for a chatbot/robot. The test is ongoing, and the chatbot/robot would continue to be evaluated against human behavior for as long as it is in use. Whether it would pass the Turing Test would depend on the subjective judgment of human judges, and could change over time as the chatbot/robot continues to learn and interact with the world.
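
The protocol just described can be written out schematically: a judge converses with two unlabeled interlocutors, one human and one candidate, and then guesses which is which. In the sketch below, human_reply, candidate_reply, and judge are hypothetical stand-ins for the actual participants; only the bookkeeping of a single trial is fixed.

```python
import random

# One schematic Turing Test trial. The three callables are hypothetical
# placeholders for the human foil, the candidate, and the judge's decision.

def run_trial(human_reply, candidate_reply, judge, n_turns=10):
    # Assign the two interlocutors to anonymous channels A and B at random.
    channels = {"A": human_reply, "B": candidate_reply}
    if random.random() < 0.5:
        channels = {"A": candidate_reply, "B": human_reply}

    transcript = {"A": [], "B": []}
    for turn in range(n_turns):
        prompt = f"question {turn}"          # the judge's question (placeholder)
        for name, respond in channels.items():
            transcript[name].append((prompt, respond(prompt)))

    guess = judge(transcript)                # channel the judge thinks is the human
    truth = "A" if channels["A"] is human_reply else "B"
    return guess == truth                    # True iff the judge identified the human

# Hypothetical stand-ins, just so the sketch runs end to end:
human = lambda q: "a human answer to " + q
candidate = lambda q: "a candidate answer to " + q
judge = lambda transcript: random.choice(["A", "B"])   # a judge who cannot tell
print(run_trial(human, candidate, judge))
```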

SH: If there is no time-limit on the Turing Test (henceforth “TT”) for the learning chatbot/robot out in the world, why would there be a limit on the number of judges? And if any and every judge, regardless of how long the test or the number of judges, is unable to distinguish the performance of the chatbot/robot from that of any other human being, in what sense is the judgment “subjective”, and in what sense are the judges being “fooled”?

ChatGPT: The number of judges used in the Turing Test is not a limiting factor in the sense that there is a specific number that must be reached in order for the chatbot/robot to pass the test. The use of multiple judges is typically seen as a way to increase the reliability of the test results by taking into account the subjective judgment of multiple individuals. If a chatbot/robot consistently fools a large number of human judges, it becomes more likely that the chatbot/robot is exhibiting human-like intelligence.

However, the judgment of the human judges is still subjective in the sense that each judge brings their own biases, opinions, and interpretations to the conversation with the chatbot/robot. They may also have different criteria for determining what constitutes human-like behavior. As a result, even if a chatbot/robot consistently fools a large number of human judges, there may still be some degree of disagreement about whether it has truly passed the Turing Test.

Regarding the concept of “fooling” the judges, if a chatbot/robot is able to consistently mimic human behavior in a way that is indistinguishable from that of a real human, then the human judges are being “fooled” in the sense that they are unable to reliably distinguish the machine’s behavior from that of a human. However, it is important to note that the chatbot/robot is not intentionally trying to deceive the human judges, but is instead simply exhibiting the intelligence that it has been programmed or learned to exhibit.
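
What "consistently fools a large number of human judges" amounts to can be made explicit: if, over many trials with many judges, their success rate at picking out the machine is statistically indistinguishable from chance (50%), then, as far as the test can show, the candidate's performance is indistinguishable from a human's. The sketch below computes the exact one-sided binomial tail probability for that comparison; the trial counts are hypothetical.

```python
from math import comb

def p_value_vs_chance(correct, trials):
    """Exact one-sided binomial test: probability of at least `correct`
    correct identifications of the machine in `trials` trials if the
    judges were merely guessing (p = 0.5)."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

# Hypothetical tallies: judges identified the machine correctly 52 times in 100.
print(p_value_vs_chance(52, 100))   # ~0.38: no better than coin-flipping
print(p_value_vs_chance(65, 100))   # ~0.002: judges can tell the difference
```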

SH: If no one can ever tell any difference between what the grounded chatbot/robot in the world can learn and do and what a human can learn and do, in what sense is that a subjective judgment, and in what sense is everyone being fooled?

ChatGPT: The judgment of whether a chatbot/robot has truly passed the Turing Test is subjective in the sense that it is based on the opinions and perceptions of the human judges. Each judge may have different criteria for what constitutes human-like intelligence, and their interpretations of the chatbot/robot’s behavior may be influenced by their individual biases and experiences.

If a chatbot/robot is able to consistently mimic human behavior in a way that is indistinguishable from a real human, it is said to have “fooled” the human judges in the sense that the judges are unable to reliably determine that they are interacting with a machine. However, this “fooling” is not a result of intentional deception on the part of the chatbot/robot, but rather a reflection of its advanced abilities and programming.

In cases where multiple human judges are unable to distinguish the chatbot/robot’s behavior from that of a human, the conclusion that the chatbot/robot has passed the Turing Test becomes more reliable. However, even in these cases, the judgment remains subjective as there may still be individual differences in opinions and criteria for what constitutes human-like intelligence.

SH: I said that no one could tell the difference between the performance capacities of the learning, grounded chatbot/robot in the world, and you said there was no time limit on the TT; so in what way are these judgments “subjective”? I understand that the chatbot/robot is not trying to fool anyone. So in what sense are they being “fooled”?

ChatGPT: If a chatbot/robot is able to consistently mimic human behavior in a way that is indistinguishable from a real human, it is said to have “fooled” the human judges in the sense that the judges are unable to reliably determine that they are interacting with a machine, even though the chatbot/robot is not attempting to deceive them.

However, even if no one can tell the difference between the performance capacities of the chatbot/robot and a human, the judgment of whether the chatbot/robot has truly passed the Turing Test is still subjective because it depends on the criteria and standards used by the human judges to determine whether a machine’s behavior is human-like or not.

For example, some judges may consider a chatbot/robot’s ability to perform certain tasks as evidence of human-like intelligence, while others may place more importance on its ability to understand and respond to complex emotions and social cues. There may also be individual differences in the judgments made by each judge, based on their individual experiences and biases.

So even if no one is able to distinguish the performance capacities of the chatbot/robot from a human, the judgment of whether the chatbot/robot has truly passed the Turing Test remains subjective as it is based on the perceptions and opinions of the human judges.

SH: If no one can tell the difference between X and Y, how is that subjective, and how are they being fooled?

ChatGPT: The judgment of whether someone can tell the difference between two things is a subjective matter as it depends on the individual’s perception and evaluation. The idea of being “fooled” implies deception or intentional misinformation, which may not necessarily be the case in this scenario. If a person cannot distinguish between two entities, it may simply be a result of their limited ability to observe or process information, and not because of any intention to deceive.

SH: I agree that Turing-Testing depends on perception — the perceptual capacity of humans. Maybe a metal-detector could distinguish robots from people. But that’s no longer just a matter of performance capacity.

ChatGPT: Yes, that’s correct. The Turing Test as originally proposed by Turing is based on human perception and evaluation, and the capacity of a machine to imitate human-like behavior in text-based communication. However, other methods and techniques beyond just human perception can be used to distinguish between entities, such as in your example of a metal detector. These additional methods may be necessary to supplement the Turing Test in certain cases.

SH: Is Turing-Testing a learning, grounded chatbot/robot still a matter of testing “text-based communication”? Is the TT meant to test for the presence of deception or the presence of human cognitive capacities? And if the TT has no limit on the number of tests and testers, and the chatbot/robot never fails, what is the “deception”?

ChatGPT: The Turing Test is primarily a test of text-based communication, which is meant to evaluate a machine’s ability to exhibit human-like behaviors and mimic human intelligence in natural language conversations. The idea behind the test is to see if a human judge can distinguish between a human and a machine based on their responses to questions and prompts in a text-based communication. The test is designed to check for the presence of human-like intelligence and not necessarily the presence of deception. If a machine consistently passes the Turing Test, it could be considered a success in terms of its ability to exhibit human-like intelligence, but that does not necessarily mean that it is not being deceptive. The concept of deception in the Turing Test is a subjective matter and can depend on how one interprets the idea of a machine being able to mimic human behavior and thought processes.

SH: Is the TT a test of mimicry or a test of capacity? Is equivalent and indistinguishable capacity mimicry?

ChatGPT: The Turing Test is primarily a test of capacity, but it can be seen as a test of mimicry as well. Equivalent capacity means that the chatbot/robot and the human being tested have the same ability to respond to stimuli and perform tasks, which can result in indistinguishable responses. In this sense, the TT can be seen as a test of mimicry. However, the main objective of the TT is to assess a machine’s capacity to demonstrate human-like intelligence, so it is primarily a test of capacity.

SH: But if the capacity of the chatbot/robot (let’s call it the “candidate” from now on) is indistinguishable from that of any human, and the designers know how they produced this capacity, in what sense is it mimicry rather than capacity?

ChatGPT: The Turing Test is often described as a test of mimicry, where the candidate chatbot/robot is expected to imitate human-like responses in text-based communication. However, the goal of the test is to determine if the candidate has the capacity to exhibit human-like intelligence in its responses. If the candidate is able to produce indistinguishable responses from a human and the designers understand how they have achieved this capacity, it could be argued that the candidate’s performance is not necessarily mimicry, but rather a demonstration of capacity. Whether the candidate’s performance is considered mimicry or capacity would depend on the interpretation and understanding of those who observe and evaluate it.

SH: But in the example as we discussed it here, the only judgment being made is that no one can ever tell apart the candidate’s performance from any other human’s performance. So I ask again: where is the subjectivity or the mimicry or the “fooling” in any of that?

ChatGPT: In the case of the Turing Test, the judgment of whether the candidate’s performance is indistinguishable from that of a human is subjective in the sense that it is based on the subjective perceptions and opinions of the judges. However, this does not mean that the candidate is necessarily “mimicking” human behavior or trying to “fool” the judges. The goal of the Turing Test is to determine if the candidate has reached a certain level of human-like intelligence, not necessarily to determine if it is human or not. The subjective nature of the test lies in the fact that the judges are evaluating the candidate based on their own experiences, biases, and expectations, which can vary from person to person. So, the judgment of whether the candidate has passed the test is subjective to some extent, but the performance of the candidate itself is not necessarily mimicry or deception.

SH: Everyone has “experiences, biases, and expectations,” but the premise here is that no human can tell apart the candidate’s performance from the performance of any other human, ever. (And remember that the candidate is a robot, not just a chatbot, and that the performance is anything humans can do in the world.) So I ask again, isn’t this grounded robotic TT a test of capacity rather than an exercise in mimicry, constrained by the “subjectivity” of the judges (i.e., everyone)? If no human can distinguish the taste of vegan cheese from dairy cheese, does that not teach us something about the nature of the taste (to a human) of cheese, rather than being a test of whether a human is being fooled by mimicry? Is the TT not trying to reverse-engineer human cognitive capacity (“intelligence”) rather than just trying to fool people?

ChatGPT: The Turing Test is primarily meant to test a machine’s ability to demonstrate human-like intelligence through text-based communication. The subjective nature of the test lies in the judgment of the human evaluators, who use their own experiences, biases, and expectations to determine whether the machine’s responses are indistinguishable from those of a human. The test is not meant to test for a machine’s capacity to perform tasks in the world like a human, but rather its ability to mimic human-like conversation. Whether the machine’s responses can be considered mimicry or a demonstration of capacity depends on the underlying design and technology that produced the responses. However, the ultimate goal of the Turing Test is to assess a machine’s intelligence and ability to engage in human-like conversation, not necessarily to determine if the human evaluators are being “fooled” or not.

SH: What is and is not a “machine”?

ChatGPT: A “machine” is a physical device designed to perform one or more tasks, usually automated, through the use of mechanics and electronics. This can include traditional mechanical devices, such as a car engine, as well as computerized systems, such as a smartphone. So, anything that performs work using physical components, whether mechanical, electrical, or computational, can be considered a machine.

SH: So are biological organisms machines?

ChatGPT: Biological organisms are not traditionally considered machines, as the term is typically used to refer to artificial systems designed and constructed by humans for specific purposes. However, there are some theories and philosophical perspectives that view biological organisms as being akin to machines, such as reductionist biological theories that view living systems as being essentially complex systems of interacting physical and chemical processes. But in general, the definition of “machine” is highly context-dependent and can vary depending on the field and the purpose for which it is being used.

Plant Sentience and the Precautionary Principle

I hope that plants are not sentient, but I also believe they are not sentient, for several other reasons:

Every function and capacity demonstrated in plants and (rightly) described as “intelligent” and “cognitive” (learning, remembering, signalling, communicating) can already be done by robots and by software (and they can do a lot more too). That demonstrates that plants too have remarkable cognitive capacities that we used to think were unique to people (and perhaps a few other species of animals). But it does not demonstrate that plants feel. Nor that feeling is necessary in order to have those capacities. Nor does it increase, by more than an infinitesimal amount, the probability that plants feel.

The “hard problem” is to explain how and why humans (and perhaps a few other species of animals) feel. It seems to be causally superfluous, as robotic and computational models are demonstrating how much can be done without feeling. But with what plants can do, it is almost trivial to design a model that can do it too, so there, feeling seems to be incomparably more superfluous.

To reply that “Well, so maybe those robots and computational models feel too!” would just be to capitalize on the flip side of the other-minds problem (that certainty is not possible), to the effect that just as we cannot be sure that other people do feel, we cannot be sure that rocks, rockets or robots don’t feel.

That’s not a good address. Don’t go there. Stick with high probability and the preponderance of evidence. The evidence for some cognitive capacity (memory, learning, communication) in plants is strong. But the evidence that they feel is next to zero. In nonhuman animals the evidence that they feel starts very high for mammals, birds, other vertebrates, and, more and more, invertebrates. But the evidence that plants, microbes and single cells feel is nonexistent, even as the evidence for their capacity for intelligent performance becomes stronger.

That humans should not eat animals is a simple principle based on the necessities for survival: 

Obligate carnivores (like the felids, I keep being told) have no choice. Eat flesh or sicken and die. Humans, in contrast, are facultative omnivores; they can survive as carnivores, consuming flesh, or they can survive without consuming flesh, as herbivores. And they can choose. There are no other options (until and unless technology produces a completely synthetic diet).

So my disbelief in plant sentience is not based primarily on wishful thinking, but on evidence and probability (which is never absolute zero, even for gravity: apples may yet start falling up instead of down tomorrow).

But there is another ethical factor that influences my belief, and that is the Precautionary Principle. Right now, and for millennia already in the Anthropocene, countless indisputably sentient animals are being slaughtered by our species, every second of every day, all over the planet, not out of survival necessity (as it had been for our hunter/gatherer ancestors), but for the taste, out of habit.

Now the “evidence” of sentience in these animals is being used to try to sensitize the public to their suffering, and the need to protect them. And the Precautionary Principle is being invoked to extend the protection to species for whom the evidence is not as complete and familiar as it is for vertebrates, giving them the benefit of the doubt rather than having to be treated as insentient until “proven” sentient. Note that all these “unproven” species are far closer, biologically and behaviorally, to the species known to be sentient than they are to single cells and plants, for whom there is next to no evidence of sentience, only evidence for a degree of intelligence. Intelligence, by the way, does come in degrees, whereas sentience does not: an organism either does feel (something) or it does not – the rest is just a matter of the quality, intensity and duration of the feeling, not its existence.

So this second-order invocation of the Precautionary Principle, and its reckoning of the costs of being right or wrong, dictates that just as it is wrong not to give the benefit of the doubt to similar animals where the probability is already so high, it would be wrong to give the benefit of the doubt where the probability of sentience is incomparably lower. What is at risk in attributing sentience where it is highly improbable is precisely the protection the distinction would have afforded to the species for whom the probability of sentience is far higher. The term just becomes moot, and just another justification for the status quo (ruled by neither necessity nor compassion, but just taste and habit – and the wherewithal to keep it that way).