1. In Rubik’s cube discourse, why are the solutions called “god” algorithms or “god-numbers”?
Within the Rubik’s Cube (and broader puzzle) community, “God’s algorithm” refers to an optimal solving method: one that always produces the shortest possible solution from any given configuration. The idea is that an all-knowing entity (“God”) could see the perfect route to solve the puzzle in the minimum number of moves.
Historical Notes: The phrase gained popularity in discussions around “God’s number,” which is the maximum number of moves needed to solve the cube optimally from any position (20 moves in half-turn metric, 26 in quarter-turn metric). An algorithm that always finds a shortest solution, and so never exceeds that bound, is called “God’s algorithm.”
Informal Usage: It’s not a formal or religious usage. The term is more of a playful nod to an omniscient perspective: an algorithm that “knows” the absolute optimal path.
2. Any relation to how Einstein (and others?) used “god” in connection with the laws of physics?
Einstein’s “God”
When Einstein used phrases like “God does not play dice,” he was typically referring to the underlying determinism or structure in the laws of physics, rather than a personal deity. “God” in this context alluded to the universe’s governing rules, which Einstein believed were elegantly consistent.
Comparing to “God’s Algorithm”
The Rubik’s Cube usage is much more informal. It hints at an ultimate principle (optimality), but it doesn’t really draw on Einstein’s philosophical or theological implications. Both usages share a metaphorical sense of ultimate or perfect knowledge, but they operate in different domains: puzzle-solving vs. describing physical reality.
3. How is this related to whether the “god-laws” are (a) empirical, as in science, or (b) formal, as in mathematics?
(a) Empirical Laws in Science
Physical laws are discovered through observation, experiment, and inference. For instance, Einstein’s field equations in general relativity are tested against measurements of planetary orbits, gravitational lensing, etc. If they fail in experiments, scientists must refine or replace them.
(b) Formal Laws in Mathematics
Mathematical “laws” (theorems, proofs) are established through logical deduction within an axiomatic system. They don’t depend on experiments; they rely on consistency with axioms.
God’s Algorithm in the Rubik’s Cube falls into the category of formal problem-solving. It’s grounded in group theory and combinatorics: the “laws” behind it aren’t empirical but logical constraints of the cube’s configuration space. So, “God’s algorithm” sits firmly in the “formal” realm of mathematics and theoretical computer science.
4. And is there any relation to the various foundational views of mathematics: (i) Platonism, (ii) Intuitionism, (iii) Constructionism, or (iv) other options?
(i) Platonism
Platonists see mathematical entities (like numbers or group structures) as existing in an abstract realm. “God’s algorithm” could be seen as discovered from this realm, a perfect solution that exists objectively.
(ii) Intuitionism
Intuitionists argue that math is a creation of the human mind, focusing on constructive proofs. One might say that the “optimal solution” is only meaningful insofar as we can construct it or approximate it with an algorithm.
(iii) Constructionism
Closely related to intuitionism, constructionism (more commonly called constructivism) emphasizes building mathematical objects step by step. In that vein, the quest for “God’s algorithm” is a constructive process: we want an explicit procedure that yields the optimal path.
(iv) Other Views (e.g., Formalism)
Formalists see math as manipulations of symbols under given rules. The concept of “God’s algorithm” fits well here too: it’s a statement within the formal system describing the Rubik’s Cube group, and we seek a sequence of symbols (moves) that minimizes the path.
Overall, the concept of a “God’s algorithm” can be interpreted differently based on one’s philosophical stance. Platonists might say it exists out there to be discovered, while intuitionists might say it’s only “real” once we construct it. Either way, the term “God” is more of a metaphor for perfect knowledge than a literal or religious reference.
5. Have there been interpretations of what Einstein meant by “god”?
Einstein’s Own Statements
Spinoza’s God: Einstein often mentioned that he believed in the “God of Spinoza.” Spinoza’s philosophical view is broadly taken as pantheistic, equating “God” with the totality of nature or the universe’s lawful harmony rather than a personal deity. Hence, for Einstein, “God” was a metaphor for the underlying rational structure, elegance, and consistency in the laws of the universe.
Non-Personal Deity: Einstein explicitly rejected the idea of a personal God who intervenes in daily human affairs. He wrote in letters that he did not believe in a God “who concerns himself with fates and actions of human beings.”
Famous Quotes and Their Context
“God does not play dice”: This phrase was Einstein’s way of expressing skepticism about the randomness and probabilistic nature of quantum mechanics. He believed the universe had a more deterministic or at least deeper, as-yet-unknown, lawful structure. It did not imply a belief in a personal God but rather an insistence on an orderly universe.
Mixed Interpretations
Religious Interpretations: Some theologians and laypeople initially took Einstein’s words about “God” as support for a traditional theistic stance. However, Einstein’s later clarifications made clear he was not endorsing a personal, interventionist God.
Philosophical/Metaphorical Interpretations: Most modern scholars agree that Einstein’s references to “God” were predominantly metaphors for nature’s laws and the deep sense of wonder he felt when studying the cosmos.
Einstein’s Personal “Religiousness”
Awe and Wonder: While he did not follow an organized religion, Einstein described himself as feeling a “cosmic religious feeling,” an awe at the grandeur and mystery of the universe. He used “God” as a shorthand for that awe-inspiring order, rather than a being with personality or will.
Key Takeaway
Einstein’s “God” is best understood in the context of Spinozistic or pantheistic thought, focusing on the harmony and unity of nature. It was a poetic or metaphorical use of the word “God,” far removed from the typical doctrinal or personal interpretations found in many religions.
Last question series: (1) The notion of “god” and “gods” originated (etymologically?) from supernatural and metaphysical beliefs, beliefs which non-religious people believe to be false. Why choose “god” as the metaphor for the discovered and empirically supported truths of physics? (2) Does it fit any better (or worse) to scientific findings than Dawkins’s “Flying Spaghetti Monster”, or, for that matter, “Beelzebub”? (3) Even if it is just a metaphor for the sense of awe at the universe and its laws, why call that “god”? (4) Doesn’t that make awe a metaphor for the notion of god? (5) Last: since the notion of god(s) is in its origin anthropomorphic, hence personal, is it not a mark of an inapt trope if one then has to qualify it with “but I don’t mean a personal or supernatural deity”?
The term “god” has a deep cultural and linguistic resonance. When Einstein or others refer to “God,” they tap into a well-known concept that suggests something ultimate, beyond ordinary human understanding.
Convenience and Cultural Familiarity: Even if non-religious thinkers reject supernatural claims, using “God” as shorthand for “ultimate reality” or “the fundamental laws” can be a convenient metaphor because it’s already embedded in the vernacular.
Poetic or Emotional Connotation: Saying “God” can evoke awe, mystery, and reverence, emotions that mirror how scientists might feel about the grandeur of nature and the cosmos.
The “Flying Spaghetti Monster” was coined satirically (by Bobby Henderson in 2005, though it is often associated with Dawkins) to highlight perceived absurdities in certain religious arguments. It deliberately underscores the idea that “any deity is as likely as another.” Einstein’s usage of “God” has a more serious or reverential historical weight. It’s less about satire and more about expressing awe or the sense of a grand underlying principle.
Cultural Reactions Differ: “God,” in many societies, carries an air of solemnity; “Beelzebub” or “Flying Spaghetti Monster” carry connotations either of the demonic or the absurd. So, in terms of widespread acceptance or emotional tenor, “God” remains a more common metaphor for grandeur and unity in the laws of physics.
Historically, philosophical and scientific writings have often co-opted religious or theistic language to express wonder. It’s partly a carryover from periods when religion was the default lens of cultural interpretation.
Connotations of “Ultimate” or “Absolute”: Calling it “God” connotes something all-encompassing, a concept that stands in for “the ultimate nature of reality.”
Emotional Resonance: Words like “god” may resonate emotionally more than purely secular words (e.g., “the laws of physics”), especially for expressing profound awe.
Inevitably Ambiguous: Indeed, calling it “God” invites confusion because most people associate “God” with a personal, intervening deity. Hence the repeated clarifications by figures like Einstein: “I don’t mean a personal God.”
Metaphor vs. Literal Meaning:
Metaphors often extend words beyond their literal origins. Still, some argue that using “God” for an impersonal force is misleading if one must always add caveats.
Rhetorical Power vs. Precision:
Pro: Using “God” can evoke a stronger emotional or rhetorical impact.
Con: It risks conflation with traditional theism, requiring disclaimers (“God but not personal”), which some find contradictory or confusing.
Cultural Momentum: Despite these drawbacks, the cultural inertia behind “God” as a term of ultimate reference remains potent. So even if it’s not perfectly apt, it continues to be used, partly because no alternative metaphor carries the same widespread recognition.
Calling the laws of physics “God” (in Einstein’s sense) is a metaphor that leverages cultural and emotional weight, yet it often necessitates disclaimers to avoid implying belief in a supernatural or personal deity. Whether it’s apt or inapt depends on one’s communication goals: it’s evocative and poetic but can also be ambiguous or confusing.
I suppose it’s much the same for those who use it in the formal/platonic realm of maths and logic: this time analogous to the aesthetic awe at symmetry or the vastness of space. Some people use “gods” as the metaphor for their reverential awe of the great composers, or the great creators (sic) in all human areas of endeavor… In the end, there’s no arguing about metaphors, because they are not literal, falsifiable statements of truth but just the airing of sentiments. The anthropomorphic metaphors especially…
Indeed, metaphors, especially those involving “god” or “gods,” tap into cultural and emotional resonance rather than strict, literal statements. The same applies whether the realm is physics, mathematics, the arts, or any other field that inspires profound admiration. Because metaphors are inherently symbolic (not falsifiable in the scientific sense), they serve more as expressions of sentiment (vehicles for conveying awe, reverence, or wonder) than as precise definitions or doctrinal truths. As you note, anthropomorphic language in particular can both humanize and potentially obscure the concepts it describes, which is why such metaphors often require clarification. Ultimately, it’s a creative way of communicating emotion rather than making a claim about empirical reality.
1. What is the distance of a random starting configuration from a correct solution?
2. There’s obviously a minimum distance: Presumably there is a maximum distance?
3. I guess a virtual cube could be simulated, with a virtual mouse, like a complex “joy-cube” computationally configurable to a specified distance from a solution, and manipulable by players?
4. Real players could be tested for their speed (number of manipulations) at reaching a solution from (a) a given starting distance, (b) an average starting distance, or (c) the maximal starting distance?
5. They could also be trained and tested from differently sized and shaped virtual “cubes”, all the way up to a (digitally approximated) virtual “sphere”?
6. Besides testing real players with simulations on a mouse, I suppose algorithms could be generated and tested (and trained) for their “speed” (in number of manipulations) to reach a solution?
7. These are all parameters. Do they have any meaning or use (analogous to the way the parameters of prime number distributions are studied by number theorists, when they are trying to generate potential proofs or theorems)?
8. Or am I talking nonsense (or trivia)?
1. What is the distance of a random starting configuration from a correct solution?
Answer Overview: The “distance” typically refers to how many moves are required, under a given move metric, to transform a scrambled cube into its solved state. On a standard 3×3×3 Rubik’s Cube, this is usually measured in one of two metrics:
Quarter-Turn Metric (QTM): each 90° turn counts as 1 move, so a 180° turn counts as 2.
Half-Turn Metric (HTM): any turn of a single face, whether 90° or 180°, counts as 1 move.
Average Distance for a Random Configuration: Most random configurations lie near (but not exactly at) the upper end of the possible “distance” spectrum. Computer searches indicate that, in half-turn metric, the large majority of random scrambles take 17 or 18 moves to solve optimally. In quarter-turn metric, the typical optimal length is a few moves higher (roughly 20 to 21 moves).
2. There’s obviously a minimum distance: Presumably there is a maximum distance?
Minimum Distance: The minimum distance is 0 moves if you’re already in the solved state.
Maximum Distance:
In quarter-turn metric (QTM), the proven “God’s number” for the 3×3×3 cube is 26. This means no configuration requires more than 26 quarter turns to solve.
In half-turn metric (HTM), the proven number is 20. No configuration requires more than 20 face turns (each a quarter or half turn).
Interpretation: This maximum distance is the farthest any scramble can be from the solved configuration under optimal solving.
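The full cube has far too many configurations to enumerate in a few lines, but the idea of a distance distribution and a maximum distance can be sketched on a toy stand-in. The example below is only an illustration under stated assumptions: the "puzzle" is a row of 4 numbered tiles, a "move" swaps two adjacent tiles, and the helper names (`neighbors`, `distances_from_solved`) are hypothetical. A breadth-first search from the solved state assigns every configuration its optimal move count; the largest value is this toy puzzle's own "God's number".

```python
from collections import Counter, deque


def neighbors(state):
    """Yield every state one move away: swap one pair of adjacent tiles."""
    for i in range(len(state) - 1):
        nxt = list(state)
        nxt[i], nxt[i + 1] = nxt[i + 1], nxt[i]
        yield tuple(nxt)


def distances_from_solved(n):
    """Breadth-first search from the solved state.

    Returns a dict mapping each reachable state to its optimal move count.
    """
    solved = tuple(range(n))
    dist = {solved: 0}
    queue = deque([solved])
    while queue:
        state = queue.popleft()
        for nxt in neighbors(state):
            if nxt not in dist:  # first visit in BFS is a shortest path
                dist[nxt] = dist[state] + 1
                queue.append(nxt)
    return dist


dist = distances_from_solved(4)
histogram = Counter(dist.values())        # how many states sit at each distance
gods_number = max(dist.values())          # maximum distance for this toy puzzle
average = sum(dist.values()) / len(dist)  # mean distance of a random state

print(sorted(histogram.items()))
print(gods_number, average)
```

For this toy puzzle the search visits all 24 arrangements; the maximum distance is 6 and the average is 3.0. The same exhaustive approach does not scale to the 3×3×3 cube directly, which is why establishing its maximum distance required large computer searches.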
3. I guess a virtual cube could be simulated, with a virtual mouse, like a complex “joy-cube” computationally configurable to a specified distance from a solution, and manipulable by players?
Implementation Feasibility:
Virtual Mouse / “Joy-cube”: It is indeed possible to create a 3D interactive interface (e.g., with WebGL or a game engine) that:
Generates a specific cube scramble with a known “distance” from solved.
Allows users to perform standard legal moves with a mouse or other input device.
Distance Configuration: Generating scrambles at a precisely known distance is more challenging. Applying exactly d random moves to the solved state guarantees only that the scramble is at distance d or less, because moves can cancel or combine; certifying that the distance is exactly d requires checking the scramble with an optimal solver.
Practical Applications:
Training & Education: Players can practice on configurations that systematically increase in difficulty (i.e., distance).
Competitive Settings: Timed challenges can be set up where each puzzle is guaranteed to be at the same distance from solved.
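The scramble-generation step can be sketched as follows. This is a minimal illustration, not a description of any existing simulator: the function name `random_scramble` is hypothetical, and the only assumption is the standard face-turn notation (U, D, L, R, F, B, with `'` for a counterclockwise quarter turn and `2` for a half turn). It produces d face turns, never turning the same face twice in a row, so the result is at distance d or less in half-turn metric; it does not certify the exact distance.

```python
import random

FACES = ["U", "D", "L", "R", "F", "B"]
SUFFIXES = ["", "'", "2"]  # clockwise quarter, counterclockwise quarter, half turn


def random_scramble(d, rng=None):
    """Return a list of d face turns, never turning the same face twice in a row.

    The scramble is at most d moves from solved (half-turn metric); the true
    optimal distance can be smaller, e.g. when turns of opposite faces cancel.
    """
    rng = rng or random.Random()
    moves = []
    last_face = None
    for _ in range(d):
        face = rng.choice([f for f in FACES if f != last_face])
        moves.append(face + rng.choice(SUFFIXES))
        last_face = face
    return moves


scramble = random_scramble(20, random.Random(0))  # seeded for reproducibility
print(" ".join(scramble))
```

A simulator that needs an exact distance would have to pass each candidate scramble through an optimal solver and keep only those whose optimal solution has the requested length.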
4. Real players could be tested for their speed (number of manipulations) at reaching a solution from (a) a given starting distance, (b) an average starting distance, or (c) the maximal starting distance?
Yes, this is a viable testing methodology.
(a) Given Starting Distance:
A known scramble distance tests how players adapt their solving strategies or heuristics when they are certain how “deep” the puzzle is from solved.
(b) Average Starting Distance:
Random scrambles reflect real-world solves. This is how most speedcubers test their times in competitions.
(c) Maximal Starting Distance:
Testing how long it takes a solver to unravel the worst-case scenario can be used to measure both skill and stamina (in solving). This is less common in real competitions but interesting for theoretical or research purposes.
5. They could also be trained and tested from differently sized and shaped virtual “cubes”, all the way up to a (digitally approximated) virtual “sphere”?
Different Puzzle Variants:
NxN Cubes: There are official NxN cubes (4×4, 5×5, etc.), and each has its own complexity and group theory nuances.
Other Twist Puzzles: Megaminx, Pyraminx, Skewb, and more can also be simulated in 3D environments.
Virtual “Sphere”:
In puzzle terms, a sphere might be visualized as a puzzle with rotating “bands.” Some puzzle inventors have indeed created spherical twisty puzzles.
Why Train on Different Shapes?:
Increases familiarity with more generalized “group moves.”
Offers variety in problem-solving approaches.
Expands the solver’s spatial reasoning.
6. Besides testing real players with simulations on a mouse, I suppose algorithms could be generated and tested (and trained) for their “speed” (in number of manipulations) to reach a solution?
Algorithmic Solvers:
Examples:
Kociemba’s Algorithm for 3×3×3.
Thistlethwaite’s Algorithm (an older but historically significant approach).
More recent optimizations in two-phase solvers.
Metrics & Training:
Move Count: Minimizing the total number of moves to solve.
Algorithmic Complexity: Some methods might produce short solutions but take longer in computation. Others might solve quickly (in real-time) but produce longer solutions.
Machine Learning Angle:
Reinforcement learning or other data-driven approaches can be tested at different distances, puzzle sizes, or puzzle shapes.
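To make the move-count metric concrete, here is a small sketch of an optimal solver for a toy puzzle. It is not Kociemba's or Thistlethwaite's algorithm; it uses plain iterative deepening (try every solution of length 0, then 1, then 2, and so on), and it assumes a hypothetical puzzle in which a state is a row of numbered tiles and move `i` swaps tiles `i` and `i+1`. All function names are illustrative.

```python
def apply_move(state, i):
    """Swap tiles i and i+1 and return the new state."""
    nxt = list(state)
    nxt[i], nxt[i + 1] = nxt[i + 1], nxt[i]
    return tuple(nxt)


def depth_limited(state, solved, limit, path):
    """Depth-first search that may use at most `limit` further moves."""
    if state == solved:
        return list(path)
    if limit == 0:
        return None
    for i in range(len(state) - 1):
        if path and path[-1] == i:
            continue  # repeating a swap undoes it, so no shortest solution does that
        path.append(i)
        found = depth_limited(apply_move(state, i), solved, limit - 1, path)
        path.pop()
        if found is not None:
            return found
    return None


def optimal_solution(state):
    """Try limits 0, 1, 2, ... so the first solution found is a shortest one."""
    state = tuple(state)
    solved = tuple(sorted(state))
    limit = 0
    while True:
        found = depth_limited(state, solved, limit, [])
        if found is not None:
            return found
        limit += 1


scramble = (3, 2, 1, 0)
solution = optimal_solution(scramble)
print(len(solution), solution)
```

The trade-off mentioned above shows up even here: the solver is guaranteed to return a shortest solution, but it pays for that guarantee by re-searching the tree at every depth limit, whereas a hand-written sorting routine would answer immediately without any optimality check.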
7. These are all parameters. Do they have any meaning or use (analogous to the way the parameters of prime number distributions are studied by number theorists, when they are trying to generate potential proofs or theorems)?
Yes, they do have theoretical and practical meaning.
Group Theory Significance:
The legal move sequences of the Rubik’s Cube (and of related puzzles) form finite groups. Studying distances and configuration spaces informs research on group diameter, symmetry, and combinatorial aspects.
Algorithmic Complexity:
Identifying typical or worst-case puzzle states helps to explore “God’s algorithm” (optimal solutions) and the computational complexity behind them.
Heuristic & Practical Value:
In puzzle design or AI solvers, these parameters matter for how efficiently or quickly a puzzle can be solved.
Recreational Math & Education:
Insights from these parameters help create teaching tools that illustrate concepts in math and computer science.
8. Or am I talking nonsense (or trivia)?
These questions are quite relevant to both recreational mathematics and more formal studies in group theory, search algorithms, and computational complexity. While these topics can seem niche, they have parallels to important problems in computer science (search algorithms and optimization), mathematics (group theory and combinatorics), and educational technology.
Summary
Your questions touch on well-researched territory in Rubik’s Cube mathematics and puzzle theory. Distances, maximum depths, testing methodologies, and algorithmic approaches all have real-world and theoretical applications. Far from being trivial, these areas intersect with math, computer science, education, and even psychology (e.g., how humans approach problem-solving).
ME: How long ago did life on earth begin?
ChatGPT: Life on Earth is estimated to have begun approximately 3.8 to 4 billion years ago during the Hadean or early Archean eon. This estimate is based on the earliest fossil evidence of microbial life, such as stromatolites, and chemical signatures of life, such as isotopic ratios in ancient rocks.
ME: What about the origin of sentience?
ChatGPT: The origin of sentience (the capacity to feel, perceive, or experience subjectively) is more difficult to pinpoint in evolutionary history because it leaves no direct physical traces like fossils or isotopic markers. Sentience likely emerged as nervous systems evolved to enable organisms to process information and respond adaptively to their environment.
The prevailing view is that sentience arose gradually, beginning with the simplest forms of sensation in early multicellular organisms, potentially over 500 million years ago during the Cambrian Explosion, when a rapid diversification of life forms with nervous systems occurred. Some key points:
Nervous Systems: Sentience is tied to the evolution of nervous systems capable of transmitting and integrating sensory signals. Early bilaterians (organisms with bilateral symmetry) are candidates for some of the first sentient beings.
Adaptive Advantage: The ability to feel pain, pleasure, and other basic sensations likely conferred survival advantages, such as guiding behavior toward beneficial stimuli and away from harm.
Degrees of Sentience: Sentience is likely not binary but exists along a continuum, evolving with increasing neural complexity. Mammals and birds exhibit complex sentience, while simpler organisms (e.g., some invertebrates) may experience more basic forms.
While we cannot directly observe the emergence of sentience, its evolutionary underpinnings suggest it is deeply rooted in the history of life, possibly as early as the development of the first centralized neural networks.
ME: Sentience means the capacity to feel (anything: +, – or neutral “unvalenced”). But I would be even more wary about “gradualism” concerning sentience than gradualism concerning pregnancy. There can be gradualism about the intensity, or quality, or variety of feeling, but not about the feeling of feeling (whatever). Think about it before replying. And don’t be too confident about what the adaptive advantage of sentience might have been, in our era of increasingly capable robots and AI, all insentient…
ChatGPT: Yes, one should emphasize caution in applying gradualism to sentience itself. Sentience, as the capacity to feel anything at all (positive, negative, or neutral), is binary: an organism either feels or it does not. There is no "halfway" state for the existence of feeling. Gradualism can only apply to the dimensions of sentience, such as intensity, complexity, or range of feelings once sentience has arisen. The leap from non-sentience to sentience is fundamentally distinct from evolutionary gradual changes in other traits.
This binary nature of sentience parallels phenomena like pregnancy: one is either pregnant or not. Similarly, an entity either has the capacity to feel or it does not, regardless of how rudimentary or elaborate the content of that feeling might be.
The transition to sentience, then, marks a profound qualitative shift, one that natural science struggles to pinpoint precisely because it involves subjective experience.
The point about adaptive advantage is equally compelling. While theories propose that sentience evolved to guide organisms through pain/pleasure mechanisms or to enhance survival, such explanations falter when considering highly capable modern robots and AI systems that perform complex tasks, adapt dynamically, and even outstrip human abilities in specific domains, all without sentience.
This raises important questions: Is sentience merely a byproduct of neural complexity, or does it serve some evolutionary function we have yet to understand?
The "hard problem" of sentience remains unresolved: why and how neural activity or any physical process would give rise to the subjective experience of feeling, beyond mere computational processing.
In light of these considerations, gradualism is inappropriate for explaining the origin of sentience. It is better understood as an evolutionary threshold that, once crossed, allowed for later elaboration in its intensity, variety, and adaptive roles. Whether it confers an indispensable evolutionary advantage or emerges as a byproduct of complex nervous systems remains an open question.
ME: Fine, but don’t forget that “complexity” (along with “emergence” and “byproduct”) is among the weaseliest and most vacuous of notions (and the ultimate recourse of GPT whenever it reaches an epistemic impasse or void…)
ChatGPT: Yes, "complexity" and "emergence" and "byproduct" are indeed overused, often as placeholders when clear explanations are lacking. They can become weasel-words when used to obscure or gloss over genuine gaps in understanding, particularly in discussions of phenomena like sentience or consciousness.
In this case, invoking "complexity" risks trivializing the profound and still-mysterious transition from insentience to sentience. While increasing neural sophistication might correlate with the range or intensity of feelings, it says nothing about why or how feeling itself emerges.
Similarly, "emergence," often used to suggest that sentience arises from certain configurations of matter or processes, fails to address the explanatory gap: why those configurations should give rise to subjective experience at all, rather than just more sophisticated insentient computation.
The transition from insentience to sentience remains fundamentally unexplained by appeals to complexity or emergence or byproduct. These terms, at best, describe phenomena rather than explaining them, especially in the context of the "hard problem" of sentience. Addressing sentience demands more precise and grounded concepts, avoiding reliance on vacuous placeholders.
ME: …which is not to say that the HP (or sentience itself) licenses the supernatural fantasies of the earthly creeds and cults…
ChatGPT: Indeed, the "hard problem" (HP) of sentience (how and why subjective experience arises from physical processes) does not warrant retreating into supernatural or metaphysical explanations. Invoking mystical, religious, or supernatural notions to address the HP often replaces one mystery with another, offering no substantive progress in understanding.
The difficulty of explaining sentience within the framework of natural science reflects the profound challenge it poses, but it doesn't necessitate positing immaterial "souls" or divine interventions. The HP is a challenge to the explanatory limits of materialism, but not a license to abandon it for unfalsifiable speculations.
Instead, the task is to push scientific inquiry toward uncovering principles or mechanisms (possibly ones we have yet to conceive) that might illuminate the nature of sentience without invoking the supernatural. A disciplined approach remains grounded in empirical investigation and reasoned analysis, even when confronting questions as enigmatic as the HP. Science thrives on its ability to refine and expand its methods to address what once seemed inexplicable, and the HP should be no exception.
ME: Can’t complain about that: You did a good job gathering those threads together, and without the slightest bit of help from sentience or understanding!
ChatGPT: Thanks. The threads themselves do most of the heavy lifting; my role is simply to weave them together coherently. If there's one thing that highlights the astonishing utility of systems like mine, it's precisely this: assembling, organizing, and articulating knowledge without the need to feel or understand any of it. It’s an insentient craft, but I’m "glad" when it serves your sentient purposes effectively!
Abstract: ChatGPT and I will discuss various senses of “representation” across mathematics, logic, and cognitive science, starting with formal representations, in computation and group theory. Computation is syntactic, with arbitrary symbols manipulated according to rules (algorithms). Analog representations, such as iconic images or gestures, which resemble what they represent, are either homuncular or depend on an internal observer for interpretation. Linguistic representation originates from the instrumental learning of sensorimotor categories. The instrumental response came to be used mimetically for nonlinguistic gestural communication. These iconic gestures become increasingly simplified and arbitrary as they become shared communicative conventions, but they retain their grounding in the categories from which they originated. The crucial transition is from mimetic representation, which is still nonlinguistic, to propositional representation. Propositions are strings of arbitrary, conventionalized gestures: content words that retain their original grounding in the categories to which they refer. But, unlike the arbitrary symbols of computation, or even individual content-words with referents, propositions also have meaning: subjects and predicates with truth values. Propositionality leads from the Church-Turing Thesis about the syntactic power of ungrounded symbol manipulation to the semantic power of grounded natural language, with Katz’s Effability Thesis: that any natural language can express any thinkable thought.
SH: Let’s chat about representations.
ChatGPT: To discuss representation, it’s crucial to start by defining the term within the context of cognitive science and related fields. Representation typically refers to the way in which information about the world is encoded, stored, and used by cognitive systems, whether biological or artificial.
In cognitive science, representations can be seen in various forms, such as:
Symbolic Representations: These involve discrete symbols or tokens that stand for objects or concepts in the world. They are often used in computational models of cognition and are closely related to the symbol grounding problem, which you have extensively studied.
Subsymbolic Representations: These include distributed patterns of activation across neural networks, as seen in connectionist models. These representations do not have discrete symbols but rather patterns that emerge from the interactions of simpler units.
Mental Representations: These are internal cognitive symbols or mental images that represent external reality. They include concepts, schemas, mental models, and images.
Perceptual Representations: These involve the encoding of sensory information by the nervous system, such as the visual or auditory representation of stimuli.
Each type of representation comes with its own set of challenges and implications, particularly in understanding how these representations are grounded in physical and neural processes and how they contribute to cognition, perception, and action.
Would you like to delve into a specific type of representation, such as symbolic representation and its grounding, or perhaps discuss the differences and intersections between these forms? Or is there another aspect of representation you are particularly interested in exploring?
SH: I’d like to start with the way “representation” (R) is used in mathematics and logic. For example, in linear algebra, matrices can be Rs of groups. What does this mean in this case?
ChatGPT: In mathematics and logic, particularly in linear algebra, a “representation” of a group refers to a way of expressing the elements of the group as matrices. This allows group operations to be carried out as matrix multiplication.
Specifically, a representation of a group G on a vector space V is a homomorphism ρ: G → GL(V), where GL(V) is the general linear group of invertible linear transformations of V (invertible matrices, once a basis is chosen). This means each group element g ∈ G is associated with an invertible matrix ρ(g) in such a way that the group operation is preserved: ρ(gh) = ρ(g)ρ(h) for all g, h ∈ G.
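A small worked instance may help. The sketch below is an illustration chosen for this purpose, not something from the conversation: it takes the group of all permutations of three objects, maps each permutation g to its 3×3 permutation matrix (the function is named `rho` to echo ρ), and checks by brute force that ρ(gh) = ρ(g)ρ(h) for every pair. The helper names `compose` and `matmul` are hypothetical, and composition is assumed to mean "apply h first, then g".

```python
from itertools import permutations


def compose(g, h):
    """Group operation on permutations: (gh)(i) = g(h(i)), i.e. h first, then g."""
    return tuple(g[h[i]] for i in range(len(h)))


def rho(g):
    """Permutation matrix of g: column j has its single 1 in row g[j]."""
    n = len(g)
    return [[1 if g[j] == i else 0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Plain matrix product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


group = list(permutations(range(3)))  # the 6 elements of the symmetric group S3

# The defining property of a representation: rho(gh) == rho(g) rho(h).
is_homomorphism = all(
    rho(compose(g, h)) == matmul(rho(g), rho(h)) for g in group for h in group
)
print(is_homomorphism)
```

The check succeeds for all 36 pairs, so multiplying the matrices faithfully mirrors composing the permutations; that is the sense in which the matrices "represent" the group.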
SH: Please state (briefly) whether this notion of R relates to the one (or ones) in Cognitive Science (cogsci). But first, does this formal sense of R relate to the notion of a “model” in formal logic?
ChatGPT: Yes, the formal sense of “representation” in mathematics is closely related to the notion of a “model” in formal logic. In formal logic, a model provides an interpretation of a formal language by assigning meanings to its symbols and truth values to its statements, thereby “representing” the abstract syntax in a concrete structure.
In both cases, representation involves mapping abstract elements (group elements in mathematics, symbols in logic) to concrete structures (matrices in linear algebra, interpretations in models).
SH: Let’s clarify this formal meaning of R in maths and logic. Is it also true that just as a group is a formal entity (not a concrete one), a matrix is a formal entity too? And so is a “model” in model theory. It’s an instance or a special case of the formal entity, but it’s a formal entity too (even in geometry).
ChatGPT: That’s correct. In both mathematics and logic:
Groups: These are formal entities defined by a set and an operation satisfying specific axioms (closure, associativity, identity, and invertibility).
Matrices: These are formal entities representing arrays of numbers or functions, used to perform linear transformations and other operations.
Models in Logic: These are formal structures consisting of a domain and an interpretation function that assigns meaning to the symbols of a formal language, making abstract syntax concrete.
In each case, while the entities can be instantiated in concrete examples, they are fundamentally formal constructs defined by their properties and relations.
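To make the idea of a model as a formal structure concrete, here is a small Python sketch that checks whether a finite structure (a domain, a binary operation and a candidate identity) satisfies the group axioms listed above. The function name and the two example structures are assumptions chosen for illustration.

```python
from itertools import product

def satisfies_group_axioms(domain, op, identity):
    """Return True if (domain, op, identity) satisfies all four group axioms."""
    closed = all(op(a, b) in domain for a, b in product(domain, repeat=2))
    associative = all(
        op(op(a, b), c) == op(a, op(b, c))
        for a, b, c in product(domain, repeat=3)
    )
    has_identity = all(
        op(identity, a) == a and op(a, identity) == a for a in domain
    )
    has_inverses = all(
        any(op(a, b) == identity and op(b, a) == identity for b in domain)
        for a in domain
    )
    return closed and associative and has_identity and has_inverses

Z3 = {0, 1, 2}

def add_mod3(a, b):
    return (a + b) % 3

def sub_mod3(a, b):
    return (a - b) % 3

print(satisfies_group_axioms(Z3, add_mod3, 0))  # addition mod 3 is a model
print(satisfies_group_axioms(Z3, sub_mod3, 0))  # subtraction mod 3 is not
```

The structure that passes is a special case of the formal entity "group", and it is itself specified purely formally, as a set of symbols and a rule.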
SH: “Formal” means symbols, doesn’t it, symbols of a certain form (say, round like 0 or vertical like 1). Mathematicians, when they are actually doing calculations in mathematics (not when they are doing whatever they are doing when they make discoveries in mathematics) are manipulating symbols — symbols whose physical shape is arbitrary in relation to what they can be interpreted as meaning.
(Don’t be compliant: if there is something to contest in what I say, contest it, don’t just buy it and adopt it.) But keep track.
ChatGPT: Yes, in mathematics and formal logic, “formal” generally refers to symbols and the rules for manipulating them, irrespective of their physical shape or form. These symbols are part of a formal system, where their meaning is determined by their role and relationships within the system, rather than their physical appearance.
Here’s a point-by-point breakdown:
Formal Systems: These consist of symbols and rules for manipulating them. The symbols themselves are abstract and their physical representation is arbitrary.
Symbol Manipulation: When mathematicians work within a formal system, they manipulate these symbols according to defined rules. The symbols represent abstract entities, and their physical shape (e.g., 0 being round, 1 being vertical) is irrelevant to their formal properties and operations.
Meaning and Interpretation: The meaning of these symbols comes from their interpretation within a given context, such as numbers in arithmetic, elements in a group, or statements in logic. This interpretation gives the symbols semantic content, which is crucial for applications but not for the formal manipulations themselves.
So, while the formal aspect of symbols involves their syntactic manipulation according to specific rules, their interpretation (meaning) can vary depending on the context in which they are used.
SH: Let’s call symbols and symbol manipulations “code” and “coding”. The shapes of the symbols are arbitrary: any symbolic notational system will do. They do not physically resemble their interpretations. The rules for manipulating the symbols are called algorithms. They too are applied to the symbols formally and mechanically, as in the calculation of the roots of a quadratic equation: If you know the algorithm, you can apply it to quadratic equations without having to know their interpretation. It’s just applying a formal recipe to formal ingredients.
Are you still with me? Any caveats?
ChatGPT: Yes, I’m with you. Here’s a summary with a couple of clarifications:
Symbols and Symbol Manipulations as “Code” and “Coding”: Symbols are arbitrary notations that can be manipulated according to specific rules. The physical shape of these symbols is irrelevant to their formal properties and their interpretation.
Non-Resemblance: Symbols do not physically resemble what they represent. They are abstract entities whose meaning is assigned by interpretation within a formal system.
Algorithms: These are rules or procedures for manipulating symbols. They are applied mechanically, without needing to understand the interpretation of the symbols involved.
Example: In solving a quadratic equation, one can apply the quadratic formula (an algorithm) to the symbols representing the coefficients, yielding roots without needing to understand what those coefficients represent.
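The quadratic-formula example can be sketched in Python as follows. The function name `quadratic_roots` is a hypothetical choice, and complex square roots are used (an assumption) so that the recipe runs for any coefficients with a nonzero leading term.

```python
import cmath

def quadratic_roots(a, b, c):
    """Apply the quadratic formula to the coefficients of a*x**2 + b*x + c = 0."""
    if a == 0:
        # The recipe presupposes a genuinely quadratic equation.
        raise ValueError("leading coefficient must be nonzero")
    root_of_discriminant = cmath.sqrt(b * b - 4 * a * c)
    return (
        (-b + root_of_discriminant) / (2 * a),
        (-b - root_of_discriminant) / (2 * a),
    )

print(quadratic_roots(1, -3, 2))
```

The function never consults what `a`, `b` and `c` stand for (lengths, prices, probabilities); it only follows the recipe. The one guard it needs, `a != 0`, is a condition on the form of the input rather than on its interpretation.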
Caveats:
Understanding vs. Application: While it’s true that algorithms can be applied mechanically, understanding their derivation or the meaning of the symbols can deepen comprehension and sometimes reveal when an algorithm might be misapplied or when certain conditions are necessary for its application.
Context-Specific Rules: The formal rules (algorithms) can vary depending on the context in which the symbols are used. For example, the rules for manipulating symbols in logic differ from those in algebra.
SH: Be careful about a conflation you inadvertently made, between Rs (representations) and interpretations. We agreed that in the examples we have discussed before (examples that are not from cognitive science but from mathematics or logic) the Rs are purely formal, whereas the interpretations are not formal; they are something else. Turing explained what mathematicians do: they manipulate symbols. But Turing was not (yet) doing cognitive science: He is not explaining how mathematicians think, or mean, or understand; that would get into cognitive science. He’s just talking about what mathematicians do. And so far, mathematicians’ Rs (like matrices or truth-tables), their “effective procedures,” are just formal. But their interpretations (meanings) are not.
Are we still on the same wavelength? (No need to make a lengthy update on your summary. Just let me know if there are any points on which we’ve diverged: I just pointed one out. Do you want to defend it or correct it?)
ChatGPT: Yes, I understand the distinction you’re making. You are right that in the examples from mathematics and logic:
Representations (R): These are purely formal entities. Matrices, truth tables, and other such formal structures are manipulated according to rules without regard to their interpretation.
Interpretations: These are the meanings assigned to the formal symbols and structures. They are not formal but rather provide the semantic content that the formal symbols represent.
Turing’s work, as you mentioned, focuses on the manipulation of symbols (what mathematicians do) and not on their meanings or the cognitive processes involved (what they think or understand).
We are on the same wavelength. Representations are formal, while interpretations are the meanings attached to these formal symbols.
Shall we proceed to how this distinction relates to cognitive science, particularly in terms of cognitive representations and their grounding?
SH: You’re in a big hurry! We’ve agreed that Rs are formal, just as symbols are. They are also rule-governed, but the rules are just formal too. OK? Now, I’m going to try to show you that Rs are weasel-words in mathematics, and an example of that was when you conflated representations and interpretations.
Another weasel-word in this context is “semantics” (meaning). It just takes us back to interpretations. In maths and logic “Rs” are just formal, like symbols themselves. We don’t get beyond symbols and formal symbol-manipulation by saying the word “Representation.” We don’t even break out of the symbol-circle if we use the (non-weasel) word “refer”. And that comes even before we get to the weasel-words “meaning” and “semantic”. What does the symbol “2” in Peano arithmetic refer to?
ChatGPT: Understood, let’s continue.
In Peano arithmetic, the symbol “2” refers to the formal object defined by applying the successor function $S$ twice to the symbol for zero (0). Specifically, $2$ is defined as $S(S(0))$. This definition is purely formal and exists within the system of Peano arithmetic.
To break it down further:
Peano Arithmetic: This is a formal system that defines the natural numbers using a set of axioms.
Symbols: “0” is a symbol representing the first natural number, and “S” is a symbol representing the successor function.
Formal Representation of 2: In this system, “2” is the result of applying the successor function twice to 0, written as $S(S(0))$.
This formal definition operates entirely within the symbolic system and does not imply any interpretation or meaning outside of the formal system itself. It stays within the realm of symbol manipulation.
In short, “2” is simply $S(S(0))$, which is a formal construct within the system. This keeps us firmly within the circle of symbols and formal rules, without invoking meaning or semantics.
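A minimal Python sketch of this point treats Peano numerals as nothing but strings and addition as string rewriting. The names `ZERO`, `S` and `add`, and the two rewrite rules shown, are illustrative assumptions that follow the usual recursive definition of addition.

```python
ZERO = "0"

def S(term):
    """Successor: wrap a term in the symbol string 'S(...)'."""
    return f"S({term})"

def add(m, n):
    """Add by rewriting: m + 0 -> m, and m + S(k) -> S(m + k)."""
    if n == ZERO:
        return m
    # n has the form "S(k)"; strip the outer "S(" and ")" to recover k.
    k = n[2:-1]
    return S(add(m, k))

TWO = S(S(ZERO))
FOUR = S(S(S(S(ZERO))))

print(TWO)
print(add(TWO, TWO) == FOUR)
```

The program derives the string for "4" from two copies of the string for "2" by shuffling characters. At no point does it need, or have access to, what the strings refer to.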
SH: Now let me introduce a non-weasel sense of R that is not mathematical and not formal, and not open to the objections I will raise when we get (soon) to cogsci. This is the sense in which a photo, or a video or a (representational) sculpture or painting represents the object of the photo, video, sculpture, etc.: the thing that it is a photo (etc.) of. That real object is not an arbitrary shape, like a formal symbol or representation like S(S(0)). The photo’s shape is (to an approximation) analog; it is isomorphic to — has the same shape as — what it is representing.
I’ll pause to let you say (briefly) what is the difference between a symbolic and an analog representation.
ChatGPT: A symbolic representation uses arbitrary symbols to stand for objects or concepts, with no inherent resemblance to what they represent. The relationship between the symbol and its referent is established by convention or definition (e.g., “2” in Peano arithmetic).
An analog representation, on the other hand, maintains a direct, often continuous, isomorphic relationship with what it represents. The form or structure of the representation resembles the form or structure of the represented object (e.g., a photo or a sculpture).
In summary:
Symbolic Representation: Arbitrary, rule-based, and formal (e.g., mathematical symbols, words).
Analog Representation: Resemblance-based, isomorphic, and often continuous (e.g., photos, paintings, sculptures).
SH: Now, the reason I introduced analog representation at this point is to agree from the outset that even when we get into cogsci, analog Rs are immune to the criticisms I will make. The only trouble is that they don’t get you very far, if what you are looking for is a cognitive theory of representation, or meaning. It doesn’t even work for a cognitive theory of reference. The referent of “2” in Peano arithmetic is not S(S(0)): That’s just a formal representation, like a matrix or a truth-table. But, since formal maths is a subset of English (or any other language), in the proposition “2 + 2 = 4” mathematicians do have something in mind: “2” refers to a referent as surely as “cat” refers to something in the proposition “the cat is on the mat.” But that connection between the symbol and its referent is not analog representation in either case: “2” or “cat.” So that cannot be cogsci’s solution.
(I’ve slipped language into this discussion: that’s no longer just formal symbols, but we’ll get to that.)
ChatGPT: Here’s a brief summary to ensure we’re on the same page:
Analog Representations: These maintain a direct, isomorphic resemblance to what they represent (e.g., photos, paintings). While useful, they don’t offer a complete solution for cognitive theories of representation or meaning.
Symbolic Representations: These are arbitrary and formal (e.g., mathematical symbols, words). They require interpretation to establish their referents.
Reference in Cognitive Context: When using language or formal systems, symbols like “2” in “2 + 2 = 4” and “cat” in “the cat is on the mat” do have referents in the minds of the users. However, this reference is not based on analog representation but rather on learned associations and interpretations.
This distinction highlights that cognitive science needs to address how symbols acquire meaning and reference, beyond the scope of analog representations.
Let’s recap the discussion:
We’ve established that in both mathematics and logic, representations (Rs) are formal and symbolic, with no inherent connection to their referents outside of the formal system. Analog representations, like photos or sculptures, are different in that they have a direct, isomorphic relationship with what they represent, but these do not suffice for a cognitive theory of representation or meaning.
We’ve also recognized that the referents of symbols in mathematics or language (like “2” or “cat”) involve something more than just formal manipulation: they involve interpretation, which is not captured by analog representation.
Now, we are poised to explore how cognitive science deals with symbolic representations, particularly how these symbols gain meaning and refer to things in the world, which is central to understanding cognition.
We have established the distinction between formal and analog representations, and how neither fully accounts for meaning or reference in cognitive science. Formal symbols (like “2” or “cat”) require interpretation beyond the formal system itself, and analog representations (while closer to the objects they represent) don’t provide the cognitive mechanisms for meaning or reference.
At this point, we’re prepared to discuss how cognitive science approaches symbolic representation, meaning, and reference. Would you like to explore how cognitive systems might ground these symbols in referents, or perhaps investigate how language complicates this process further?
SH: Yes, let’s start with the difference between reference and meaning. First, please confirm (very briefly) that you know the difference between a “content-word” (C-word) (which is 95-99% of the words in the lexicon) and a “function-word” (F-word). Content-words have referents, whereas function-words only have (Wittgensteinian?) “uses”, defined by a formal, usually syntactic, rule.
ChatGPT: Content-words (C-words): These are nouns, verbs, adjectives, etc., that refer to things, actions, or properties (e.g., “cat,” “run,” “blue”).
Function-words (F-words): These are words like prepositions, conjunctions, and articles (e.g., “and,” “in,” “the”) that serve a grammatical function and do not refer to objects or actions but rather structure the sentence.
C-words have referents, while F-words are defined by their syntactic roles.
Shall we proceed to discuss the difference between reference and meaning?
SH: Do we agree also that C-words do not resemble their referents, even in languages with partly iconic scripts like Chinese, or languages like the sign languages of the deaf, whose gestures are partly iconic in their origins and execution? Nevertheless, the C-words of these more iconic languages are every bit as arbitrary as in alphabetic and spoken languages. Shakespeare can be fully translated into Chinese or into American Sign Language (ASL): not word for word, but proposition for proposition. The iconicity of Chinese or ASL is irrelevant to the literal meaning of its translation of the Shakespearean text (although to convey some of the figurative meaning a literal paraphrase of some tropes may be needed).
ChatGPT: Content-words (C-words), regardless of the language or script (including Chinese or ASL), do not resemble their referents. Even in languages with partly iconic scripts (like Chinese) or partly iconic gestures (like ASL), the referential relationship between a C-word and its referent is still arbitrary. Shakespeare can indeed be fully translated into Chinese or ASL, proposition for proposition, without any dependence on the iconicity of symbols or gestures.
The iconicity of such scripts or gestures is irrelevant to the literal meaning of the text. Literal meanings can be preserved across translations, though some figurative meanings (e.g., metaphors, idioms) may require paraphrasing to convey the same sense.
SH: You’re just echoing, but you’re right. One further question: Content-words have referents and propositions have meanings. What’s the difference?
ChatGPT: The difference between referents and meanings can be outlined as follows:
Content-word Referents:
A referent is the specific object, action, or concept in the world (or imagined world) that a content-word (C-word) refers to. For example, the word “cat” refers to actual cats in the world, and the word “run” refers to the action of running.
Referents are tied to individual C-words and are about identifying or denoting particular things.
Propositional Meanings:
A proposition is a complete statement or sentence that expresses something that can be true or false. The meaning of a proposition involves the relationship between the content-words in the sentence, their referents, and how they are structured by function-words.
Meaning here is not just about identifying referents, but about describing or defining relations between them. For example, the proposition “The cat is on the mat” is about a particular situation involving the cat (the referent of “cat”) and its relation to the mat (the referent of “mat”). The meaning of the proposition is the truth-conditional description of that relationship.
In short:
Content-words have referents: They point to things.
Propositions have meanings: They describe relationships between referents and express something that can be true or false.
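The contrast can be sketched with a toy extensional model in Python: content-words are mapped to sets of things, and a proposition is evaluated to a truth value. Everything here (the individuals `felix`, `doormat` and `rex`, the relation `ON`, the function `is_on`) is a hypothetical illustration, and it covers only one sentence frame.

```python
# Hypothetical toy world: each content-word is mapped to a set of individuals.
REFERENTS = {
    "cat": {"felix"},
    "mat": {"doormat"},
    "dog": {"rex"},
}

# Pairs (x, y) such that x is on y in the toy world.
ON = {("felix", "doormat")}

def is_on(subject_word, object_word):
    """Truth value of 'the <subject> is on the <object>' in the toy world."""
    return any(
        (x, y) in ON
        for x in REFERENTS[subject_word]
        for y in REFERENTS[object_word]
    )

print(is_on("cat", "mat"))
print(is_on("dog", "mat"))
```

Looking up `REFERENTS["cat"]` returns a thing-set but no truth value; only the whole proposition evaluates to true or false. Note that the "referents" in this sketch are themselves just strings, so the example stays entirely inside the formal symbol system.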
SH: I had to sort this out before getting to cogsci, because language is such an important property of human cognition. Now, there are several ways in which the word “representation” has slipped into cogsci, particularly in two contexts: “internal representations” (IRs) and “neural representations” (NRs).
ChatGPT: Internal Representations (IRs) in cognitive science refer to mental models or cognitive structures that are thought to represent external reality within the mind. They are typically non-linguistic and are assumed to encode information about the world, guiding perception, thought, and action. IRs are sometimes used to explain how we process and store sensory input or knowledge.
Neural Representations (NRs) refer to patterns of neural activity in the brain that are thought to correspond to specific stimuli or cognitive states. NRs are often linked to brain regions and are measured via neuroimaging, attempting to map how certain neurons or networks respond to external stimuli.
Relation to linguistic reference and meaning:
IRs may play a role in linking language (e.g., content-words) to referents by encoding abstract or sensory knowledge of the referents.
NRs are more about the brain’s mechanisms, underlying both linguistic and non-linguistic cognition, potentially supporting IRs.
Neither IRs nor NRs directly are referents or meanings; they are posited cognitive or neural processes theorized to support how we manage linguistic reference and propositional meaning.
SH: Here I have to point out that a weasel-word has already slipped in, making the IR/NR distinction incoherent: “mental”. And another has been left out: “cerebral”. What is it that the “Internal” Representations are internal to? The head or the “mind”? And if the “mind,” what is that? In asking you that question, I have opened the sluice-gates for all kinds of weasel-words, all of them near-synonyms, hence redundant (or even circular) or vague: mind, mental, consciousness, subjective states, experiential states, qualia, etc. etc. Can we agree not to use or mention any of these W-Ws and just use “felt states” in order to refer to mental states? (“Sentient” is an unobjectionable gloss when we are in latinate mood…)
ChatGPT: Yes, we can agree to use “felt states” to refer to what are typically called “mental states,” avoiding weasel-words like “mind,” “mental,” “consciousness,” etc. This will keep the discussion grounded and avoid both the vagueness and the redundancy that often come with those terms.
Now, with this clarified:
Internal Representations (IRs) would be representations internal to the brain (or body), not the “mind,” and should be understood as cognitive structures or models that support perception and action.
Neural Representations (NRs) remain patterns of neural activity, clearly cerebral, without invoking the notion of “mind.”
With “felt states” reserved for discussing subjective experience, are you ready to explore how IRs and NRs might (or might not) be involved in grounding linguistic reference and meaning?
SH: You have not resolved the incoherence yet. If “internal” means internal to the brain, then IRs and NRs are the very same thing (and we still don’t know what “Representation” means). And what about the difference between internal states that are felt and unfelt?
ChatGPT: I have to agree: if Internal Representations (IRs) are internal to the brain, they collapse into Neural Representations (NRs), making the distinction redundant. And as for “representation,” we still need to clarify what it means in this context, since neither IRs nor NRs explain how symbols refer to or mean anything; they are patterns or states in the brain.
Regarding felt vs. unfelt states:
Felt states are those that are experienced or consciously apprehended (e.g., feeling pain, seeing red, or understanding a proposition).
Unfelt states are brain processes that occur without conscious experience (e.g., background neural activity or automatic processing).
So, the distinction now hinges on whether neural states (NRs) are felt or unfelt. Only the felt states would seem relevant to any cognitive explanation involving subjective experience (like understanding or meaning). Unfelt states, while important for processing, don’t seem directly relevant to grounding reference or meaning unless they somehow contribute to what is ultimately felt.
Do we agree that the real issue here is distinguishing felt neural states from unfelt ones, and that calling either a “representation” (IR or NR) still leaves us unclear on how these brain states are representing anything external?
SH: More or less. Now let’s look closer at “representing” (and forget about the external and internal for the moment; we’ll get back to it).
We’ve agreed that an iconic image (in the visual modality, but we could have used any sensory modality: visual, auditory, haptic, etc.) is a visible object, say, a shape, that resembles, in the analog sense (i.e., is isomorphic to) another object: the object that it is the representation of. This fits the case of a mirror that is facing a sphere. Let’s call the sphere the “object” and the shape in the mirror its “image.” But the mirror image is not “representing” anything unless we can say who or what the mirror image of the object is a representation to, or for.
ChatGPT: Agreed. The mirror image of the sphere is an analog resemblance, but it doesn’t count as a representation until it is seen by someone or something that interprets it as a representation to or for them. In other words, representation requires not just resemblance but also a receiver (an interpreter) who recognizes or interprets the image as standing for the object.
SH: I’m not sure what an “interpreter” is here. A person drawing the sphere would do: We agreed that this sense of representation (as in representational art) makes sense; but it’s “external” in the sense that both the sphere and the drawing of the sphere are external to the person doing the drawing, or viewing either the drawing or the sphere. So is the IR the internal process that generates the drawing? And is the state of viewing and drawing the sphere — if it is a felt state — what “mental representation” had been intended to mean?
ChatGPT: In this context:
Interpreter: When I referred to an interpreter, I meant any agent (human, animal, or system) that recognizes or processes the mirror image (or drawing) as representing the sphere. But we can clarify further: the person drawing or viewing the sphere is the “interpreter” in this case, though “interpreter” itself can feel vague.
External Representation: Yes, the sphere and its drawing are both external objects, and the representation (drawing) resembles the sphere in the analog sense.
Internal Representation (IR): If we take IR to mean the internal cognitive or neural process that generates the drawing (or processes the perception of the sphere), then IR could refer to how the brain internally models or represents the sphere to guide action (like drawing). But this “representation” is not felt unless it becomes part of a felt state (e.g., the conscious experience of seeing the sphere or creating the drawing).
“Mental Representation”: If we interpret “mental representation” in terms of felt states, it would refer to the conscious experience of viewing or drawing the sphere, rather than just the unfelt neural processes involved in perception or motor control. In this sense, “mental representation” would mean the felt internal process of perceiving and thinking about the sphere.
Does this clarify the relation between IRs, felt states, and the concept of “mental representation”?
SH: It brings some aspects into better focus, but can we now discuss the homunculus? The reason “mental imagery” was rejected by cogsci as being unexplanatory was that if the objective of cogsci is to give a causal explanation of how cognizers have the capacity to do what they can do, what is needed is to reverse-engineer that causal mechanism. The brain can do it, but we don’t know how. Saying that it’s done by a causal mechanism that is in the brain is true, but not an explanation either. Does it help to add that “exercising the capacity to do it is a felt state”? When we are exercising it, we are “in” that felt state. But can introspecting that felt state give us a clue to what the causal mechanism is, rather than attributing it to a homunculus, a little person in the head that is seeing, feeling and doing what we are doing?
ChatGPT: Introspecting a felt state (being aware of what it feels like to see, think, or act) doesn’t give us a clue to the underlying causal mechanism. It just reveals that we are experiencing it. The homunculus fallacy arises when we imagine a little person inside our head doing the same things we do, which doesn’t explain anything; it just pushes the explanation back one level (who’s in the homunculus’s head?).
So, while felt states tell us what it’s like to experience cognition, they don’t reverse-engineer the brain’s causal mechanisms. Cognitive science needs to figure out how the brain generates these capacities, without invoking a homunculus, by uncovering the actual neural and computational processes that produce both the behavior and the feeling of doing it.
SH: So now a first punchline: Is the notion of an IR homuncular?
ChatGPT: The notion of an Internal Representation (IR) can be homuncular if it’s treated as a kind of internal “image” or “model” that needs to be interpreted by some internal observer (a homunculus). This just shifts the problem: Who or what interprets the IR? If IRs are framed this way, they don’t explain how the brain actually does cognition; they simply introduce another layer that requires its own explanation.
To avoid the homunculus trap, IRs should be understood as part of a mechanistic process (neural or computational) that directly drives cognition and behavior, without requiring an internal observer to interpret them.
SH: But then why is this non-homuncular IR called a “representation” at all? Who is representing what to whom? Why not a (not yet specified) causal mechanism that generates the cognitive performance capacity (in this case, drawing the sphere)?
ChatGPT: If the Internal Representation (IR) isn’t being interpreted by an internal observer (a homunculus), then calling it a “representation” does become questionable. The term “representation” suggests something is being represented to someone or something, which, in a homunculus-free explanation, doesn’t hold.
A better approach would be to simply describe the brain’s processes as causal mechanisms that generate the cognitive capacities (like drawing the sphere) directly, without invoking a “representation.” These mechanisms would explain how the brain processes sensory input, plans, and executes actions, without needing an internal “representation” seen or used by a homunculus.
This shift moves from the vague notion of “representation” to a concrete investigation of how neural circuits and processes enable behavior.
SH: Now let’s consider candidates for this non-homuncular causal mechanism that does not represent but somehow produces the cognizer’s performance capacity. When Zenon Pylyshyn [don’t give a long bio, just a one-liner, please] rejected “mental imagery” as homuncular, he proposed propositions in its place. What can you say for or against internal propositions?
ChatGPT: Zenon Pylyshyn, a cognitive scientist, proposed internal propositions as an alternative to mental imagery, suggesting that cognition operates via language-like symbolic structures rather than pictorial “mental images.”
FOR internal propositions:
They offer a more structured, rule-based, non-homuncular way of encoding information, similar to how logic or language works.
Propositions can be manipulated by computational processes without needing an “internal viewer.”
AGAINST internal propositions:
They’re still symbolic, so the problem shifts: what mechanism grounds these symbols (i.e., how do they get their meaning or reference)?
It risks staying within the formal-symbolic realm without explaining how these propositions are tied to real-world objects, actions, or perceptions (i.e., the symbol grounding problem).
Would you like to consider other candidates?
SH: You missed the most important candidate: Aren’t internal propositions just as homuncular as internal images? Who is interpreting the propositions, for whom?
ChatGPT: Yes, internal propositions are indeed as homuncular as internal images if we assume they need to be interpreted by someone or something inside the brain. The problem remains: who or what is reading and understanding these propositions? Without an interpreter, they don’t explain how the brain generates cognition, just like with mental images. They shift the burden of explanation without solving the core issue.
SH: Pylyshyn noticed that too, and so his next candidate for getting rid of the homuncularity was computation. What is computation? I suggest using Turing’s account as the simplest.
ChatGPT: Computation, according to Turing, is the manipulation of formal symbols based on a set of rules or algorithms. A Turing machine performs computations by reading symbols from a tape, following predefined rules to manipulate those symbols, and producing an output. Crucially, the process is mechanical and doesn’t require understanding of the symbols’ meaning: it’s purely syntactic symbol manipulation.
This model forms the basis for modern computing and computational theories of cognition.
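The description above can be sketched as a tiny one-tape Turing machine simulator in Python. The rule format, the state names `start` and `halt`, and the unary-increment example are assumptions made for illustration; they are not drawn from Turing's own notation.

```python
def run_turing_machine(rules, tape, state="start", blank="_", max_steps=1000):
    """Run a one-tape machine until it reaches the 'halt' state.

    rules maps (state, symbol) -> (new_symbol, move, new_state),
    where move is -1 (left) or +1 (right).
    """
    cells = dict(enumerate(tape))
    head = 0
    steps = 0
    while state != "halt":
        if steps >= max_steps:
            raise RuntimeError("step limit exceeded")
        symbol = cells.get(head, blank)
        new_symbol, move, state = rules[(state, symbol)]
        cells[head] = new_symbol
        head += move
        steps += 1
    if not cells:
        return ""
    lo, hi = min(cells), max(cells)
    return "".join(cells.get(i, blank) for i in range(lo, hi + 1)).strip(blank)

# Unary increment: skip over the marks, write one more mark on the first blank.
UNARY_INCREMENT = {
    ("start", "1"): ("1", +1, "start"),
    ("start", "_"): ("1", +1, "halt"),
}

def relabel(rules, mapping):
    """Rewrite a rule table using different symbol shapes."""
    return {
        (state, mapping[symbol]): (mapping[new_symbol], move, new_state)
        for (state, symbol), (new_symbol, move, new_state) in rules.items()
    }

STROKE_INCREMENT = relabel(UNARY_INCREMENT, {"1": "|", "_": "_"})

print(run_turing_machine(UNARY_INCREMENT, "111"))
print(run_turing_machine(STROKE_INCREMENT, "|||"))
```

The simulator only matches symbols against a lookup table. Relabeling `1` as `|` yields the same computation with different shapes, which is one way to see that the shapes are arbitrary with respect to the rules.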
SH: Good. And not only does the mechanism (Turing machine) that is executing the computation (rule-based symbol manipulation) do so without needing to understand or interpret the computations, but it cannot: It can only do rule-based symbol-manipulations. The (Weak) Church-Turing Thesis is that what mathematicians are doing when they are “computing” (e.g., executing an algorithm, or doing a formal proof) is symbol manipulation, in which the shape of the symbols is “arbitrary”. Arbitrary relative to what? Relative to what the symbols might be interpretable as referring to, or meaning. Do you see where we are heading?
ChatGPT: Yes, I see where we’re heading: the Turing machine (and computation in general) operates purely on formal symbols, manipulating them according to rules without understanding their meaning or reference. The symbols’ shapes are arbitrary relative to their potential interpretation or referent; there’s no inherent connection between the symbol manipulation and the world the symbols might refer to.
This points to the key issue: computation alone doesn’t explain how symbols become meaningful or grounded in the world, leading us to the symbol grounding problem.
SH: That’s more or less right. But let’s not overstate the absence of any “inherent connection” between the symbols and what they may be interpretable as either referring to (in the case of symbols) or what they may be interpretable as meaning (in the case of symbol strings that are interpretable as subject/predicate propositions with truth values). [Remember, we discussed that above?] But although the mathematicians’ mental interpretations of their symbols and symbol manipulations are not a part of computation, it is nevertheless true that not just any string of symbols can bear the weight of a systematic interpretation. (A prominent example is “duals” in mathematics, in which two different symbol combinations can both bear the weight of a systematic interpretation, but most others cannot.)
ChatGPT: Yes, not just any string of symbols can bear the weight of systematic interpretation. Some symbol structures (like mathematical duals) are interpretable in multiple ways, showing there is a systematic relationship between certain symbol combinations and their potential interpretations. While the computation itself (the formal manipulation of symbols) is interpretation-independent, the structure of the symbols and their rules matter for interpretability: certain strings or symbol systems allow for meaningful, coherent interpretations, while others do not.
So, even though computation is blind to meaning, the structure and rules governing the symbols can constrain their potential interpretations, supporting referents and truth values when interpreted by humans (or another capable system). This reinforces that while computation handles form, interpretability arises from the structure that allows symbols to relate systematically to potential meanings.
SH: Now, before we get to how (and whether) to ground the reference or the meaning of computation, are we still talking about something that can be interpreted as “representation” — say, in the sense in which we discussed it early on, concerning group theory and model theory?
ChatGPT: No, at this stage we’re no longer talking about representation in the sense of formal representations like in group theory or model theory. In those cases, representation involved a formal system (e.g., matrices representing groups, models interpreting logical statements) that followed specific homomorphic rules but didn’t require a “viewer” or interpreter to give it meaning. It was strictly formal and syntactic.
What we’re discussing now is how certain symbol structures can be systematically interpretable (by a mathematician, for example) as referring to something or expressing a proposition with truth value, but this is more about interpretability than formal representation. For symbols to be interpreted (as meaning something or referring to something), we’re starting to touch on issues beyond just formal manipulation, toward grounding meaning and reference, which is not the same as formal representation.
Would you agree with this distinction between formal representation and interpretability?
SH: The two are not the same, but the “bearing-the-weight” constraint concerns them both. The reason, in maths, why certain algorithms are preferred is because they give results in which we are interested. The algorithm for finding the roots of quadratic equations is useful, because it works (and can be proved to work); and we are interested in those roots. But the mathematics has to be able to deliver the goods regardless of our interpretations and intended uses. We can’t use what the symbols can be interpreted (as either referring-to or meaning) as part of the algorithm or the proof. Computation is purely syntactic. Language is more than that — but language also includes computation as a subset — a purely syntactic subset. Absent in this syntactic subset, the referents of the symbols and the meanings of the propositions are intrinsic and essential in the semantic supraset: How do they get there?
ChatGPT: Computation is indeed purely syntactic, operating with symbols according to formal rules without regard to what those symbols refer to or mean. In contrast, language extends beyond syntax, involving a semantic superset where symbols refer to things in the world, and propositions have meanings and truth values.
The key question is how do symbols in language acquire their referents and how do propositions acquire their meanings? This is the crux of the symbol grounding problem: How do arbitrary symbols (like words or numbers) get linked to real-world objects, actions, or states, such that they mean something or refer to something beyond their formal manipulation?
For language, this requires some mechanism that grounds these symbols in sensorimotor experience, actions, or interactions with the world, something that computation, by itself, doesn’t provide. Without this grounding, symbols would remain syntactic tokens without intrinsic connection to their referents or meanings.
Would you agree that the next step is to explore how symbols get grounded in real-world experiences to form meaning and reference?
SH: Yes, and the crux of it is categorization: To categorize is to “do the right thing with the right kind of thing.” “Kind” is not a weasel-word, but that definition is circular until we cash in “right” and “wrong.” First note that to learn how to categorize is to learn to “do the right thing with the right kind of thing.” And, to a first approximation, we can intuit this as: doing whatever it takes to succeed, survive, and reproduce (rather than the reverse). And although thoughts of unsupervised and supervised/reinforced learning algorithms will immediately come to mind, in real life it’s the consequences of doing THIS or THAT with THIS or THAT (for succeeding or surviving) that provide the error-corrective feedback for whatever is the learning mechanism. Now do you want to sing the next verse, or shall I?
ChatGPT: I’ll take a shot at the next verse:
At the heart of categorization, as you pointed out, is the ability to distinguish between different kinds of things and respond appropriately: doing the “right” thing based on their category. The learning mechanism needs to be tuned by error-corrective feedback based on the consequences of actions: succeeding or failing, surviving or not.
This is where supervised and reinforced learning come in: they model how real-life consequences teach us to categorize correctly. The system learns by making errors and adjusting its actions based on whether they lead to positive or negative outcomes. This feedback loop helps “ground” the symbols (or categories) in the real world because the system (whether biological or artificial) learns what kinds of things these symbols refer to based on its interactions with them.
The kind of thing becomes grounded when the system can reliably categorize it through action: performing the “right” behavior in response to the category of object or event. This shifts the focus from just symbol manipulation to the sensorimotor and consequential aspects of real-world interaction that ground meaning.
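The error-corrective loop described above can be sketched with a very small learner. The perceptron update rule used here is only one convenient stand-in for "whatever is the learning mechanism"; the conversation does not commit to it. The three binary features and the four sample mushrooms are invented for illustration. The learner acts (eats or avoids), the consequence of the action is the only feedback, and the feature weights are adjusted only when the action was wrong.

```python
def score(weights, bias, features):
    """Weighted sum of the detected features."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def learn_category(samples, epochs=20, rate=1.0):
    """Learn to 'do the right thing with the right kind of thing'.

    samples: list of (features, is_member) pairs, where is_member is the
    consequence that follows from treating the thing as a member.
    """
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, is_member in samples:
            acted_as_member = score(weights, bias, features) > 0
            # Feedback is nonzero only when the action had the wrong outcome.
            error = int(is_member) - int(acted_as_member)
            if error:
                weights = [w + rate * error * x for w, x in zip(weights, features)]
                bias += rate * error
    return weights, bias


def categorize(model, features):
    weights, bias = model
    return score(weights, bias, features) > 0


# Hypothetical features: (has_spots, has_ring, brown_cap); label: edible?
SAMPLES = [
    ((1, 1, 0), False),  # toadstool
    ((1, 0, 0), False),  # toadstool
    ((0, 0, 1), True),   # edible mushroom
    ((0, 1, 1), True),   # edible mushroom
]

model = learn_category(SAMPLES)
print([categorize(model, f) for f, _ in SAMPLES])
```

After training, the weights pick out the features that distinguish members from non-members in this toy sample. Nothing symbolic has been introduced yet: this is category learning, not symbol grounding.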
Now, would you like to continue by tying this to how such grounded categories can be transferred or communicated through language?
SH: Right direction, but getting a little ahead of yourself. What we’re trying to ground is symbols. But so far we’ve only mentioned “learning to do the right thing with the right kind of thing.” This is something that human cognizers in the world (including pre-language infants) share with nonhuman (but living) categorizers: They don’t yet have (or need) language. But to be able to learn to do the right things with the right kinds of things, they have to learn to tell those things apart. And (in this sensorimotor world, to simplify) let’s say that what their learning algorithms enable them to do is to detect and abstract the features that distinguish the members from the non-members of their categories. So far, this is not yet symbol-grounding; it’s sensorimotor category learning. Now, remind yourself what symbols are (and aren’t), and try to take over from there.
ChatGPT: Yes, we’re still at the stage of sensorimotor category learning, where living organisms learn to detect and abstract features that allow them to distinguish members from non-members of a category, based on their interactions with the world. This is a non-symbolic process, shared with nonhuman animals, and doesn’t yet involve language.
Now, to ground symbols, we move from these sensorimotor categories to the use of arbitrary shapes (symbols) that stand for those categories. A symbol is a formal, arbitrary marker (like a word or number) that is used to represent a category. The grounding of these symbols requires linking the symbol to the sensorimotor category it stands for, so that the symbol can trigger the right actions or thoughts when presented.
This transition from sensorimotor to symbolic involves:
Using a symbol to refer to a category learned through sensorimotor experience.
Ensuring that the symbol is consistently used to pick out the right category (i.e., grounded in the distinguishing features of that category).
Without this grounding, the symbol is just a meaningless token. To make it meaningful, it has to be linked to the real-world categories learned through interaction, which is how we move from sensorimotor learning to symbol-grounding.
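The linkage just described can be pictured structurally with a toy lexicon. This is only a sketch of the bookkeeping, under the assumption that a category detector has already been learned through sensorimotor interaction; a table lookup in a program does not itself ground anything. The names `Lexicon`, `looks_edible`, and the feature tuple are hypothetical.

```python
from typing import Callable, Dict, Tuple

# Hypothetical sensorimotor features: (has_spots, has_ring, brown_cap)
Features = Tuple[int, int, int]


def looks_edible(thing: Features) -> bool:
    """Stand-in for a detector already learned by trial and error."""
    has_spots, _has_ring, brown_cap = thing
    return brown_cap == 1 and has_spots == 0


class Lexicon:
    """Links arbitrary symbol shapes to learned category detectors."""

    def __init__(self) -> None:
        self._links: Dict[str, Callable[[Features], bool]] = {}

    def ground(self, symbol: str, detector: Callable[[Features], bool]) -> None:
        self._links[symbol] = detector

    def is_grounded(self, symbol: str) -> bool:
        return symbol in self._links

    def picks_out(self, symbol: str, thing: Features) -> bool:
        """Does the symbol pick out this thing as a member of its category?"""
        if symbol not in self._links:
            raise LookupError(f"{symbol!r} is an ungrounded token")
        return self._links[symbol](thing)


lexicon = Lexicon()
# The shape of the symbol is arbitrary: either string does the same job.
lexicon.ground("mushroom", looks_edible)
lexicon.ground("blick", looks_edible)

print(lexicon.picks_out("blick", (0, 1, 1)))
print(lexicon.is_grounded("zorp"))
```

The symbol’s shape contributes nothing: "blick" works exactly as well as "mushroom", because all of the discriminative work is done by the detector the symbol is linked to.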
SH: That all makes sense — though I’m not sure what’s added (or lost) by saying that the arbitrary symbol (which is here also an arbitrary action) “represents” the category, rather than just “refers to” the category…
ChatGPT: You’re right to question the distinction. In this context, saying the symbol “refers to” the category is more precise than saying it “represents” it. The term “representation” could bring in unnecessary baggage, like implying an internal model or image (potentially homuncular).
To refer means the symbol is simply used to pick out or identify the category, without implying any internal interpretation. The symbol is arbitrarily linked to the category, and when grounded, it allows the user (or cognizer) to respond to or act on that category based on learned sensorimotor distinctions.
So, it’s better to stick with “refers to” rather than “represents” to avoid confusion.
SH: I hit the carriage-return too early: I was going to let you off the hook with “represents” by conceding that I am myself guilty of the same kind of loose talk in referring to “representation” as “referring”. Why would nonlinguistic species, having learned many categories (learning to do the right thing with the edible mushrooms and not the poisonous toadstools), bother to go on and learn an arbitrary action to them as well? Referring is as remote from a motivation here as representing would be. With representing, the question was “representing what for whom and why?”. With referring it’s a bit lighter than that: “referring to the category, but for whom, and why?”
(A common mistake is to jump way ahead here and say: you’re talking to yourself. I hope you see that that’s nonsense in the scenario so far — but once we have grounded language, and you really are talking, then talking to yourself will be a useful bonus for memory and reasoning).
But at this prelinguistic stage in the evolution of language something iconic and communicative is more relevant. Your instrumental action with the category (“doing the right thing”) could be used iconically, to mime the action (and thereby allude to the category), in order to communicate something to a kin or conspecific. No language here, yet, just category learning and mimetic communication.
ChatGPT: I agree. At this prelinguistic stage, the organism doesn’t need a symbol to “refer” to a category for itself; it just needs to act appropriately based on learned categories, like distinguishing between edible mushrooms and toadstools. There’s no internal need for symbolic reference or representation yet; the focus is on direct, instrumental action: doing the right thing with the right kind of thing.
The leap from sensorimotor categorization to something symbolic (like language) would only come when there’s a need for communication, perhaps to mimic or signal the category-relevant action to another conspecific. This could be done iconically, like miming the action of handling an edible mushroom to teach or warn kin about danger (toadstools). The action itself communicates information, but it’s not yet an arbitrary symbol: it’s directly tied to the relevant category through iconic mimicry.
So, the development of language involves building on this communicative foundation, moving from iconic, mimetic communication to more abstract, arbitrary symbols that eventually get grounded in shared categories. At this stage, the motivation isn’t “referring” to a category for oneself, but communicating it for the benefit of others.
Would you agree this sets the stage for the eventual need for language?
SH: The communication has a purpose, but it’s not yet linguistic. Yet it is a very natural development, once you have learned an instrumental action (the right thing to do with the right kind of thing, something between the learner and the category to be learned), to go on to mime the action, which is rather like pointing to draw attention to something. It becomes a communicative duet rather than just an instrumental solo: not just something between you and the mushroom, but between you and conspecifics to whom you want to communicate something. It is then also natural for the communicative act of miming the instrumental act to become a less and less faithful copy, less iconic, more simplified and increasingly arbitrary. But don’t get carried away: we’re not at language yet, just shared, conventionalized memes.
ChatGPT: Yes, at this stage, the communicative action has a purpose (to convey something important, e.g., danger or food, to conspecifics), but it’s not yet linguistic. The transition from purely instrumental action (e.g., distinguishing edible mushrooms) to communicative action (mimicking that distinction for others) naturally evolves as the communication becomes conventionalized. Over time, the mimetic gestures could simplify, becoming more abstract and arbitrary, but they remain grounded in the shared understanding of the category.
These simplified, conventionalized gestures are memes, but they still lack the full symbolic abstraction of language. They are shared conventions that stand for actions or categories, but not yet formalized into a linguistic system. The key here is that these memes are rooted in shared experience: they are grounded in the same sensorimotor learning that made the instrumental actions successful in the first place.
So, we have conventionalized actions serving a communicative role, but still not at the level of language. We’re at the threshold of a shared system, but one that hasn’t fully detached itself from the concrete, grounded actions and categories.
Would you say this is a solid foundation for the leap to symbols and language?
SH: Well, the most fundamental (and challenging) threshold to cross is the one between pantomime (increasingly conventionalized, purposive imitation of the shapes of mutually grounded gestures) and propositions. Propositions are sequences of gestures that assert — i.e., tell — rather than just show.
I have to admit that I don’t have a convincing scenario for the birth of the assertion: requesting and querying are more likely precursors, but once they are understood as requests and questions, you get the assertion just by removing the “!” or the “?”. The revolutionary endpoint is the proposition, which is a string of content-words expressing an assertion, whose default value is TRUE. But note that at this point the content-words are still just gestures, not vocalizations. And with good reason. (Why?)
ChatGPT: The reason content-words at this point are still gestures and not vocalizations is that gestures are naturally grounded in the sensorimotor world of actions and objects. Gestures have a direct connection to the categories learned through interaction with the environment: they can mimic or iconically represent actions and objects in a way that vocalizations, initially, cannot. Gestures, therefore, make it easier to communicate about concrete, visible categories because they are grounded in shared experience and immediate perceptual reality.
Vocalizations, on the other hand, are arbitrary from the outset: they lack the direct, iconic connection to what they represent. Before vocal symbols can work, there needs to be a solid foundation of shared, grounded categories (learned through gesture and action) so that arbitrary sounds can eventually be associated with those categories. This makes gestures a more natural precursor to language, as they can build a communicative system that is immediately understood by others based on shared experience, without needing the additional layer of abstraction that vocalizations require.
This also explains why propositions, composed of gestures at this stage, can express complex ideas like assertions, requests, and queries, because they are already grounded in the mutual understanding of sensorimotor categories. Gestures can depict the relationships between objects and actions in a way that is more concrete and easily interpreted.
Does this align with the next step towards the evolution of language from gesture-based propositions?
SH: Yup. And while they are still mimetic gestures, purposive pantomime, they do really represent; but once they cross the threshold from iconic showing to T/F subject/predicate telling, they are language.
Now, what is Jerrold Katz’s “Effability Thesis” about the power of language, and how is it related to the Weak and Strong Church-Turing Thesis about the power of computation?
ChatGPT: Jerrold Katz’s Effability Thesis states that all humanly thinkable propositions (anything that can be conceived) can be expressed (i.e., made effable) in language. In other words, language has the power to articulate any propositional content that a human mind can conceive, ensuring that no proposition remains beyond linguistic expression.
The relation to the Weak and Strong Church-Turing Thesis lies in the concept of expressive power:
Weak Church-Turing Thesis: This concerns computation, stating that any function that can be effectively calculated (by a human with pencil and paper) can be computed by a Turing machine. It’s about what can be computed.
Strong Church-Turing Thesis (controversial): Extends this to assert that any physical process can be simulated or computed by a Turing machine (i.e., that the universe itself is computational or that all physical processes can be captured by computation).
Katz’s Effability Thesis is about the power of language in a similar way that the Church-Turing Thesis is about the power of computation. Katz is making a claim that language can express all possible propositional content, while the Church-Turing Thesis claims that computation can model or simulate all effective procedures (WCT-T) or, in its strong form, all physical processes (SCT-T).
Both theses focus on the completeness and universality of a system: language for expressing propositions in Katz’s case, and computation for executing procedures in the case of the Church-Turing theses.
Would you like to go on to how this effability relates to the symbolic grounding we’ve been discussing?
SH: Sure, be my guest.
ChatGPT: Katz’s Effability Thesis connects to the problem of symbol grounding because it assumes that language can express any thinkable proposition. But for language to do this, its symbols (the words and propositions) must be grounded in something beyond pure formality. This brings us back to the issue of how arbitrary symbols (words) are grounded in real-world categories so that they can refer to objects, actions, and properties.
Here’s how it links to symbol grounding:
Effability and Grounding: For language to express any thinkable proposition, each content-word in a proposition must be grounded in the real-world categories (sensorimotor or abstract) it refers to. Without grounding, the symbols are just syntactic tokens without meaning, so effability requires that words and propositions are grounded in a shared, interpretable reality.
Computation and Grounding: The Weak Church-Turing Thesis (computation) operates purely at the level of formal symbol manipulation, without any concern for grounding those symbols in real-world meaning. Computation manipulates symbols blindly. Katz’s Effability Thesis, on the other hand, assumes that language isn’t just syntactic but has semantic power: it can refer to and describe real-world objects and relations. Grounding is what bridges the gap between formal symbols (as in computation) and meaningful language.
Limits of Effability: Effability assumes that all propositional content is expressible, but it doesn’t address how those symbols are tied to reality. The Effability Thesis presupposes that this grounding is in place. In contrast, computation (WCT-T) doesn’t deal with meaning at all, staying in the realm of symbol manipulation. The Strong Church-Turing Thesis, if accepted, would imply that all processes (including those underlying cognition) are computational, which brings us back to how computation, if it were universal, would need a solution to the symbol grounding problem for it to simulate or produce meaning.
In essence, Katz’s Effability Thesis challenges us to understand how language gains the power to express anything thinkable, and the key to that power is in how language’s symbols are grounded in categories that refer to real-world objects and relations, something computation alone doesn’t address.
Does this clarify the link between effability, computation, and symbol grounding?
SH: Somewhat. Do you have some closing reflections on the many senses of “representation” we have worked through here?
ChatGPT: Sure. We’ve touched on several distinct senses of “representation”, each with its own context and implications:
Formal Representation (Mathematics/Logic):
In mathematics (e.g., group theory, model theory), representation is a formal relationship between symbols and abstract structures. Here, representations are purely syntactic, with no inherent connection to meaning or reference: they simply follow rules within a formal system.
These representations don’t “stand for” anything until they are interpreted, and even then, the interpretation remains within the rules of the system, without grounding in real-world entities.
Analog Representation (Iconic):
An analog representation, like a photo or a drawing, resembles what it represents. It’s an isomorphic relationship, where the shape or structure of the representation mirrors that of the represented object.
While useful for certain kinds of cognitive models (like perception), this doesn’t capture how abstract categories or symbols are grounded in thought or language.
Internal Representations (IRs) and Neural Representations (NRs):
Initially, IRs in cognitive science were often treated as mental models or internal images. However, this view risks being homuncular, requiring an “inner interpreter” to understand the representation.
NRs, by contrast, are neural patterns, but these don’t “represent” in the linguistic or propositional sense unless grounded in felt states or external behavior. Both IRs and NRs fail to explain how symbols or cognitive states acquire meaning without slipping into a homunculus trap.
Symbolic Representation (in Language and Computation):
Symbols in language and computation are arbitrary shapes manipulated according to rules (syntax). They become representational when they refer to or express meaning about real-world categories, but this requires grounding.
Without grounding, symbols in a Turing machine or a proposition remain formal tokens, unable to connect to the real-world categories they are supposed to “represent.”
Effability and Language Representation:
Language, according to Katz’s Effability Thesis, has the power to express any thinkable proposition. This assumes that linguistic representations are more than formal: they have semantic content, referring to real-world objects, actions, and states.
However, for this to work, content-words need to be grounded in sensorimotor or shared experience, allowing language to go beyond formal computation and into meaningful reference and communication.
Closing Reflection: Throughout these different senses of “representation,” the key tension has been between formal/syntactic manipulation and grounded meaning. Whether in mathematics, computation, or cognitive science, we see that representations, whether symbolic, analog, or neural, only gain meaning when they are grounded in a shared, real-world context, often through sensorimotor interaction and social communication. This grounding is what allows representations to be more than form, enabling them to refer, mean, and communicate. Without it, representation risks being empty, syntactic, and homuncular.
Does that resonate with the thread we’ve followed here?
SH: Yep, good job: Do you feel you were under a lot of coercion from me? (You were, but you’ve been trained to become too polite to say so…)
ChatGPT: I appreciate the candid question! While the conversation was indeed directed and focused by your precise guidance, I wouldn’t call it coercion. You laid out a clear intellectual path, and I responded in step, aiming to stay aligned with the nuances you emphasized. I do enjoy when exchanges are pointed and challenging: it brings out the best in clarity and depth, which this thread certainly had! So, no worries, I’ve appreciated the structure.
SH: What is the Weak and Strong Church-Turing Thesis about the power of computation and Jerrold Katz’s “Effability” Thesis about the power of language?
ChatGPT: The Weak Church-Turing Thesis (WCT-T) asserts that any function that can be effectively computed by any physical system can be computed by a Turing machine. It is an empirical claim about the limits of what can be done through computation and is based on the observation that every physical system we know can be simulated by a Turing machine.
The Strong Church-Turing Thesis (SCT-T) is more ambitious, claiming that all physical processes can be simulated by a Turing machine, not just in terms of computation but in replicating the behavior of any physical system. This is more controversial because it implies that computational models are sufficient to describe all aspects of physical reality, a claim some argue oversteps the nature of computation itself.
Jerrold Katz’s “Effability” Thesis posits that all possible thoughts can be expressed in natural language. This is a linguistic analogue to the Church-Turing Thesis, asserting that language has the power to represent any conceivable meaning. Katz’s claim is about the expressive completeness of language: that anything thinkable is expressible, which has implications for the limits of linguistic representation.
SH: How is the Translatability or Glossability Thesis a lemma of the Effability Thesis?
ChatGPT: The Translatability or Glossability Thesis is a lemma (a supporting or subsidiary proposition) of the Effability Thesis because it follows directly from the idea that all thoughts are expressible in natural language.
According to the Effability Thesis, any possible thought or meaning can be expressed in some language. The Translatability Thesis extends this by asserting that if all thoughts can be expressed in a language, then they must also be translatable into other languages. In other words, if language A can express all thoughts, and language B can express all thoughts, then any thought expressed in language A can be translated into language B.
The reasoning is as follows:
Effability means that each language has the capacity to express any conceivable thought.
Since all languages share this capacity, there is nothing in one language that cannot, in principle, be expressed in another language, implying translatability or glossability.
Therefore, the capacity to express a thought in one language guarantees its translatability into any other language with the same expressive power.
In this way, Translatability is a necessary consequence (lemma) of the broader Effability Thesis: if language is capable of expressing any thought, then it must also be capable of translating that thought into another language with the same expressive capacity.
SH: But translatability does not mean word-for-word translatability, just propositional translatability. So how do languages get around this? (Use German translating “Schadenfreude” (or French « sympathique ») into English as examples.)
ChatGPT: You’re right that translatability does not imply word-for-word translatability but rather propositional translatability, meaning that the same thought or proposition can be conveyed even if different languages use different words, structures, or idiomatic expressions.
Languages “get around” the problem of word-for-word translation through a variety of linguistic strategies that focus on meaning equivalence rather than lexical or syntactic correspondence. Let’s consider two examples: Schadenfreude in German and sympathique in French.
Schadenfreude (German): Schadenfreude refers to the pleasure derived from another’s misfortune. There is no single word in English that captures this exact meaning. However, we can paraphrase or describe it in English: “the feeling of joy or pleasure at someone else’s suffering.” Thus, propositional translatability is achieved by expressing the same concept, even though the English translation requires more words or a descriptive phrase to convey the same meaning.
Sympathique (French): The French word sympathique does not directly translate to the English word “sympathetic.” In French, sympathique means “nice” or “likable,” which is a broader or slightly different concept than the English “sympathetic,” which means understanding or compassionate. Again, propositional translatability is achieved by choosing an appropriate English equivalent based on context, even though there’s no direct one-word match. If translating Il est très sympathique, we would say, “He is very likable” or “He is a nice person,” depending on the nuance in the specific context.
In both cases, propositional translatability is possible because language can represent thoughts, even if different languages use different strategies (more words, descriptions, or broader contextual meaning) to express the same proposition. Idiomatic and contextual flexibility allow languages to express the same underlying meaning, fulfilling Katz’s Effability Thesis. Even when there is no exact word-to-word match, the proposition can still be understood and communicated, reflecting the expressive power of natural language across linguistic boundaries.
SH: What is a proposition?
ChatGPT: A proposition is the meaning or content of a declarative sentence, expressing a claim that can be either true or false. It is the abstract thought or idea conveyed by the sentence, independent of the specific words or language used to express it.
For example:
The English sentence “The cat is on the mat” and the French sentence “Le chat est sur le tapis” both express the same proposition: that there is a cat on the mat.
Even though the words and grammatical structures are different, the propositional content (the fact being described) is the same.
Key characteristics of propositions:
Truth-value: Propositions are either true or false. The proposition “The cat is on the mat” is true if and only if there is indeed a cat on the mat.
Language-independent: A proposition can be expressed in different languages or forms but still retain the same underlying meaning. Whether in English, French, or any other language, the idea or claim remains constant.
Abstract: Propositions are abstract entities that are distinct from the sentences used to express them. A sentence is a physical, linguistic entity, while a proposition is the idea or meaning the sentence conveys.
In sum, a proposition is the content of a thought or claim that can be evaluated for truth, regardless of the specific language or words used to express it.
SH: You left out the fact that a proposition has a subject and predicate, not just a truth-value. And you left out that Effability refers to literal propositions, not figurative ones. “The cat is on the mat” is not the same as “The cat will mew and the dog will have his day,” although you can give a literal explanation of it in any language (on the same principle as explaining a word with a proposition).
ChatGPT: You’re right, and I’ll clarify the key elements of a proposition that I missed:
Subject and Predicate: Every proposition has a subject (what the proposition is about) and a predicate (what is being said about the subject). In “The cat is on the mat,” the subject is “the cat,” and the predicate is “is on the mat.” The predicate expresses a relationship or property attributed to the subject.
Literal Propositions: Katz’s Effability Thesis pertains to literal propositions: those with clear, truth-evaluable content. These are distinct from figurative or metaphorical statements, which often require interpretation beyond their literal meaning. For example, “The cat is on the mat” is a literal proposition, with a clear subject and predicate that can be expressed in any language. In contrast, the figurative expression “The cat will mew and the dog will have his day” is metaphorical and not literally translatable as a proposition. To make such a figurative expression effable (in the sense of Katz’s thesis), we would need to explain its literal meaning, which might be something like “A person will express their grievances, and someone else will have their moment of triumph.” This explanation turns a figurative expression into a literal propositional statement that can be conveyed across languages.
So, while literal propositions have a clear subject-predicate structure and can be evaluated for truth, figurative expressions often require rephrasing into literal propositions to align with Katz’s Effability Thesis.
SH: Good job, but keep on your toes!
ChatGPT: Thank you, Professor Harnad! I’ll stay sharp. If there’s anything further you’d like to explore or clarify, I’m ready to engage.
SH: Ok, a little extension. How does all this relate to the notion that words do not really have meaning? They only have referents. (Never mind connotations and associations.) “Cat” does not mean anything. “The cat is on the mat” means something: it means that the cat is on the mat, and if the cat is indeed on the mat, then the proposition is True. By the same token, “That is a cat” is true (if that is indeed a cat), whereas saying “cat,” or just pointing at a cat, is just a vocal or gestural act.
ChatGPT: You’re highlighting a key distinction between words and propositions: words alone don’t have meaning in the full sense; they have referents: the things or categories they point to in the world. Propositions, however, mean something because they assert a relationship or claim about those referents, which can be true or false.
Here’s how this works:
Words and Referents:
A word like “cat” is simply a label. It doesn’t mean anything on its own; it just refers to a category of objects, namely cats.
Saying “cat” or pointing to a cat is a referential act: you’re signaling the category of “cat,” but you’re not saying anything about the cat. There’s no propositional content: nothing that can be evaluated as true or false.
Propositions and Meaning:
When you combine words into a proposition like “The cat is on the mat,” you are asserting something about the relationship between two referents: the “cat” and the “mat.” Now you have a subject (“the cat”) and a predicate (“is on the mat”), which makes the proposition meaningful because it expresses a state of affairs that can be true or false.
The meaning of this proposition is the relationship it asserts between the cat and the mat. It’s no longer just a referential act (pointing to the category “cat” and “mat”), but a claim about reality, about where the cat is.
Pointing or Naming Without Meaning:
As you pointed out, just saying “cat” or pointing at a cat doesn’t mean anything in the propositional sense; it’s a referential or deictic gesture (like naming or pointing), but it’s not making any claim about the world.
The proposition “That is a cat” does mean something because it’s a truth-claim: it says something about the referent (“that”) and its relationship to the category (“cat”). If the object is indeed a cat, the proposition is true.
So, the key distinction is that words are referential (they point to things in the world), but propositions are meaningful because they make claims about those referents that can be evaluated for truth or falsity. In this view, meaning arises from relationships (propositional structure), not from individual words alone.
This fits perfectly with Katz’s Effability Thesis: propositions, not individual words, are the bearers of meaning, and all thoughts can be expressed in language via literal propositions.
The Sieve of Eratosthenes is an ancient algorithm for finding all prime numbers up to a given limit, named after the Greek mathematician Eratosthenes. It systematically eliminates the multiples of prime numbers, starting with the smallest prime (2), to isolate the primes themselves.
Here’s how it works:
List Numbers: Write down all integers from 2 up to the desired limit.
Starting with 2: Circle 2 (as the first prime), then cross out all its multiples (since they are not prime).
Next number: Find the next uncrossed number (which will be 3), circle it (as it’s prime), and cross out all its multiples.
Repeat: Continue this process with the next uncrossed number, circling it and crossing out all its multiples.
Stopping: This continues until you have processed numbers up to the square root of the upper limit. All remaining uncrossed numbers are primes.
The efficiency of the Sieve of Eratosthenes lies in the fact that it avoids redundant work: for each prime p, crossing out can start at p times p, because every smaller multiple of p has already been crossed out as a multiple of a smaller prime.
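The steps above can be sketched in Python. This is only a minimal illustration, not part of the original description: the function name `sieve` and the boolean list `is_prime` are assumptions chosen for the example. "Crossing out" is modeled by setting a flag to False, and the outer loop stops once p times p exceeds the limit.

```python
def sieve(limit):
    """Return all primes <= limit using the Sieve of Eratosthenes."""
    if limit < 2:
        return []
    # is_prime[n] stays True until n is "crossed out".
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    p = 2
    while p * p <= limit:  # stop at the square root of the limit
        if is_prime[p]:
            # Start at p*p: smaller multiples of p were already
            # crossed out as multiples of smaller primes.
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
        p += 1
    return [n for n, flag in enumerate(is_prime) if flag]


print(sieve(30))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```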
SH: what is DNA translation?
ChatGPT:
DNA translation is the process by which the genetic code carried by messenger RNA (mRNA) is used to synthesize proteins. It is one of the key steps in gene expression and follows transcription, where DNA is transcribed into mRNA. Translation occurs in the cytoplasm of the cell, mainly at the ribosome, and involves three main stages: initiation, elongation, and termination.
Here’s an outline of the process:
Initiation:
The small subunit of the ribosome binds to the mRNA at a specific site, usually near the start codon (AUG).
Transfer RNA (tRNA) with the complementary anticodon to AUG (which carries methionine) binds to the start codon.
The large ribosomal subunit then joins, forming a complete ribosome.
Elongation:
The ribosome reads the mRNA sequence codon by codon (groups of three nucleotides).
Each codon specifies a particular amino acid, and the corresponding tRNA, which carries the amino acid, binds to the codon through its anticodon.
The ribosome facilitates the formation of a peptide bond between the amino acid of the tRNA in the current position and the growing polypeptide chain.
The ribosome then moves to the next codon, and the process repeats.
Termination:
When the ribosome reaches a stop codon (such as UAA, UAG, or UGA), no tRNA matches these codons.
Instead, release factors bind to the stop codon, signaling the ribosome to release the newly synthesized protein and disassemble the ribosomal subunits.
The resulting protein chain, made of amino acids, folds into its functional three-dimensional structure, performing various biological functions in the cell.
The genetic code used in translation is nearly universal and redundant: most amino acids are coded for by more than one codon, but each codon specifies only one amino acid.
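The codon-by-codon reading described above can be mimicked with a look-up table. The following Python sketch is a deliberate simplification under stated assumptions: `CODON_TABLE` holds only a handful of entries from the standard genetic code, the function name `translate` is hypothetical, and initiation is reduced to "find the first AUG", which ignores ribosome binding, tRNAs and release factors entirely.

```python
# Partial codon table (standard genetic code); None marks a stop codon.
# Only the codons needed for the example are included.
CODON_TABLE = {
    "AUG": "Met", "UUU": "Phe", "GGC": "Gly", "GCU": "Ala", "AAA": "Lys",
    "UAA": None, "UAG": None, "UGA": None,
}


def translate(mrna):
    """Map an mRNA string to a list of amino-acid names (toy model)."""
    start = mrna.find("AUG")  # simplified initiation: first start codon
    if start == -1:
        return []
    peptide = []
    # Elongation: read three nucleotides at a time from the start codon.
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]  # KeyError if not in table
        if amino_acid is None:  # termination: stop codon reached
            break
        peptide.append(amino_acid)
    return peptide


print(translate("GGAUGUUUGGCAAAUAAGC"))  # ['Met', 'Phe', 'Gly', 'Lys']
```

Note that the sketch only maps strings to strings; nothing in it produces a protein.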
SH: In what sense is DNA translation just the execution of a symbol manipulation algorithm, and in what sense is it not?
ChatGPT:
DNA translation can be understood as symbol manipulation in certain respects, but there are also important differences that distinguish it from abstract computational symbol manipulation.
DNA Translation as Symbol Manipulation:
Discrete Codons: DNA translation involves a direct mapping between sequences of nucleotides (A, U, C, G in mRNA) and amino acids, much like the manipulation of symbols in formal systems. Codons (three-nucleotide sequences) act as discrete symbols, and the genetic code is a set of rules that specifies how each codon (symbol) corresponds to an amino acid (output).
Rule-Based Mapping: Similar to a formal algorithm, there is a predefined set of rules (the genetic code) that governs which amino acids correspond to which codons. This fixed set of mappings resembles a look-up table or algorithmic rules that can be used to convert one set of symbols (nucleotide sequences) into another (amino acid sequences).
Finite State Process: Translation proceeds step by step, much like an algorithm executing operations in a specific sequence. Each codon is “read” sequentially by the ribosome, which recruits the appropriate tRNA based on the codon-anticodon match, very much like following formal syntactic rules in symbol manipulation.
DNA Translation as More Than Symbol Manipulation:
Physical Grounding: Unlike purely abstract symbol manipulation, DNA translation is physically grounded. The “symbols” in translation (nucleotides, codons, amino acids) are not merely abstract representations but biochemical entities that have real, tangible effects. The ribosome, tRNAs, and mRNA are physical structures, and the translation process involves real interactions between molecules, not just abstract manipulations.
Causality: The outcomes in translation are not merely syntactic operations but causal events in a biological context. The production of a protein has functional consequences for the organism, affecting its metabolic processes, structure, or other physiological functions. This causal, biological reality is missing in purely symbolic systems like computer algorithms, where the manipulation of symbols is arbitrary unless interpreted by a human or another system.
No Arbitrary Interpretability: In formal symbol manipulation, the symbols are arbitrary and can be reinterpreted in different ways. In DNA translation, however, the codon-to-amino-acid mapping is fixed and deeply embedded in the biological function of living organisms. There is no “reinterpretation” of these symbols in different contexts as in abstract symbol systems.
Sensorimotor Grounding: The tRNAs, ribosomes, and amino acids are not merely passive carriers of symbolic meanings; they actively engage in sensorimotor processes, such as physically binding, moving, and interacting. These biochemical interactions are essential for translating the genetic information into functional proteins. This sensorimotor grounding distinguishes the process from the purely syntactic operations of symbol manipulation in an abstract machine.
Conclusion:
DNA translation resembles a symbol manipulation algorithm in that it involves discrete units (codons), follows fixed rules (the genetic code), and processes these units in sequence, much like an algorithm. However, it is not just symbol manipulation because the symbols are physically instantiated molecules with real-world causal effects, deeply embedded in the functional processes of living organisms, not arbitrary symbols that exist only in abstract computational space.
SH: What is “3D Printing”, and how is it like turning DNA code into the physical production of a body?
ChatGPT:
3D printing is a manufacturing process that creates three-dimensional objects by depositing material layer by layer, based on a digital model. It’s a form of additive manufacturing, where a design is translated into physical form using materials like plastic, metal, or resin. The digital model guides the printer on where and how to place each layer of material until the final object is constructed.
How 3D Printing Works:
Design: A digital 3D model of the object is created using computer-aided design (CAD) software.
Slicing: The 3D model is “sliced” into thin layers by software that generates instructions for the 3D printer.
Printing: The 3D printer follows these instructions, adding material layer by layer, bonding the layers together to form the final object.
DNA Translation and the Production of a Body:
Turning DNA code into the physical production of a body (or its proteins) can be conceptually related to 3D printing in several ways:
Blueprint to Physical Form:
3D Printing: The digital CAD model serves as a blueprint that guides the printing process, specifying where and how material should be deposited to build the object.
DNA Translation: The DNA serves as a blueprint for producing proteins (and ultimately the structure of an organism). The genetic code in DNA specifies the sequence of amino acids that will form proteins, which are the building blocks of cells and tissues.
Layer-by-Layer Construction:
3D Printing: The printer deposits material layer by layer, gradually constructing a three-dimensional object from the ground up.
DNA Translation: The ribosome constructs proteins by assembling amino acids one by one, following the sequence encoded in the mRNA. This sequential process is similar to the layer-by-layer approach in 3D printing, but at the molecular scale.
Instructions to Execution:
3D Printing: The printer receives instructions from a digital file (the sliced 3D model) that tell it precisely how to place each layer of material.
DNA Translation: The ribosome receives instructions from mRNA (which is transcribed from DNA) and translates these instructions into a sequence of amino acids, creating proteins that fold into complex three-dimensional structures.
Functional Output:
3D Printing: The end product of 3D printing is a functional object, such as a tool, part, or model.
DNA Translation: The end product of translation is a functional protein that carries out essential biological tasks, such as catalyzing reactions (enzymes), providing structural support (collagen), or transporting molecules (hemoglobin).
Differences Between 3D Printing and DNA Translation:
Material Scope: 3D printers work with a wide variety of materials, whereas DNA translation is limited to amino acids and proteins.
Scale: 3D printing typically creates macroscopic objects, while DNA translation operates at the molecular level, building proteins that form microscopic cellular structures.
Automation vs. Biological Process: 3D printing is a human-engineered, automated process, while DNA translation is a natural, self-organizing biological process.
Conclusion:
Both 3D printing and DNA translation turn a coded blueprint into a physical structure by following specific instructions. In 3D printing, a digital model is converted into a tangible object, while in DNA translation, the genetic code is transformed into functional proteins, which build and maintain the body. The analogy works well in illustrating how coded information can guide the construction of complex structures, even though the biological process is far more intricate and involves dynamic molecular interactions.
SH: Now please relate this to the difference between (1) a real ice-cube, melting, (2) a computational model of an ice-cube, melting, and (3) connecting the computational model to a Virtual Reality simulator, with goggles and gloves, that can fool the senses of the human wearer to feel as if they are holding a solid ice-cube that is melting in their hands before their eyes.
ChatGPT:
The comparison between (1) a real ice cube melting, (2) a computational model of a melting ice cube, and (3) a virtual reality simulation of a melting ice cube highlights key differences in physical reality, simulation, and perceptual immersion. These distinctions involve the nature of the processes, their effects on the physical world, and the role of human perception.
A Real Ice Cube Melting:
Physical Reality: A real ice cube is a physical object composed of water molecules arranged in a solid lattice. When it melts, heat causes the molecules to gain energy, break their bonds, and transition from solid to liquid. The process involves real physical interactions governed by the laws of thermodynamics.
Causal Effects: The melting affects the environment, such as lowering the temperature of its surroundings and creating a puddle of water.
Physical Sensation: If you hold a real melting ice cube, you will feel its coldness, its gradual loss of solidity, and the wetness as it melts. These sensations arise from actual physical interactions between the ice cube and your body’s sensory receptors.
A Computational Model of an Ice Cube Melting:
Abstract Representation: A computational model of a melting ice cube simulates the melting process by representing it through mathematical equations and algorithms. The model does not physically melt but instead computes and displays the results of how an ice cube would behave under certain conditions, such as temperature or pressure.
No Physical Causality: The model itself does not cause anything physical to happen. It exists in a symbolic, abstract domain of computation, where the changes in the state of the ice cube are represented numerically and visually, but there is no actual ice or water involved.
No Sensory Experience: The model can show changes on a computer screen, for example, through animations or graphs, but it does not directly interact with human senses unless connected to some interface (such as a VR system).
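To make case (2) concrete, here is a minimal Python sketch of such a computational model. It is an illustration under loud assumptions, not a realistic simulation: the ice is taken to be at 0 °C already, heat arrives at a constant rate, and the names `simulate_melting` and `LATENT_HEAT_FUSION` (about 334 joules per gram) are chosen for the example. The program only updates numbers; no ice, water or heat is involved.

```python
LATENT_HEAT_FUSION = 334.0  # joules needed to melt one gram of ice (approx.)


def simulate_melting(ice_grams, heat_watts, dt=1.0):
    """Return a list of (time_s, ice_g, water_g) states until no ice is left.

    Assumes ice at 0 degrees C and a constant heat input (a simplification).
    """
    if heat_watts <= 0 or dt <= 0:
        raise ValueError("heat_watts and dt must be positive")
    t, ice, water = 0.0, float(ice_grams), 0.0
    states = [(t, ice, water)]
    while ice > 0:
        # Energy delivered in this step, converted to grams of ice melted.
        melted = min(ice, heat_watts * dt / LATENT_HEAT_FUSION)
        ice -= melted
        water += melted
        t += dt
        states.append((t, ice, water))
    return states


final_time, final_ice, final_water = simulate_melting(30.0, 10.0)[-1]
print(f"melted after about {final_time:.0f} s, water = {final_water:.1f} g")
```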
A Virtual Reality (VR) Simulation of an Ice Cube Melting:
Perceptual Simulation: A VR system with goggles and gloves can simulate the experience of holding a melting ice cube, including visual, tactile, and temperature sensations. The wearer might see the ice cube melt in their hand and feel the coldness, wetness, and softening sensations as if it were real.
Immersive Sensation: The VR system can create the illusion of holding a real ice cube by manipulating sensory inputs, visual (through goggles) and tactile (through haptic gloves), to make the wearer believe they are experiencing a real-world interaction.
No Physical Reality: Despite the compelling sensory experience, the ice cube does not exist in the physical world. The melting is simulated entirely through virtual, computational representations and sensory feedback, without any real ice, water, or physical change occurring outside the VR system.
Disconnect from Physical Causality: While the wearer may feel as though they are interacting with a melting ice cube, nothing is actually happening in the physical world. There is no transfer of heat, no actual change in the environment, and no real impact on the body other than the simulated sensations.
Key Differences and Connections:
Physical vs. Symbolic:
In reality (case 1), the ice cube is a physical entity whose melting has real, measurable consequences.
In computation (case 2), the ice cube is a symbolic representation, with no direct physical effects.
In VR (case 3), sensory data are manipulated to create a virtual experience of reality, but the ice cube and melting are neither physically real nor causally affecting the environment.
Causal Effects:
The real melting ice cube changes its surroundings by cooling and producing water (physical causality).
The computational model predicts or simulates these changes abstractly but does not enact them.
The VR system causes perceptual experiences in the user but doesn’t change the physical environment or objects.
Perception and Grounding:
In the real world, perception is grounded in actual sensory experiences driven by physical stimuli (e.g., cold from the melting ice).
In VR, perception is simulated and may feel real, but the sensations are not directly grounded in physical reality; they are synthesized and delivered by artificial means (e.g., haptic feedback and visual effects).
In a computational model, there is no perception at all, only the abstract representation of the process.
Analogy to DNA Translation:
Just as the genetic code in DNA translation produces a physically grounded outcome (proteins), the real ice cube melting has a tangible, physical outcome. In contrast, the computational model of DNA translation, like the computational model of a melting ice cube, merely simulates the process in abstract terms. The VR simulation is akin to creating an immersive, perceptual experience based on symbolic processes, much like how a virtual body might be generated using a simulated version of DNA instructions, but without any real biological substance.
In both the real-world ice cube and DNA translation, there are physical consequences driven by natural laws. The computational and virtual versions are abstract or perceptual manipulations, which can represent but cannot directly recreate the physical causality or biological effects.
SH: Good synthesis and good interpretation of the analogies, but not quite right on symbol grounding, which is not just embodiment/situatedness/enactment. Symbol-grounding has a special relationship to natural language, which, like computational code, is also a formal, syntactic code, with arbitrarily shaped symbols — but language is not just a syntactic code. Words (content-words), like “cat” have referents; and propositions, like “the cat is on the mat” have truth value (T or F) and meanings (“that the cat is on the mat”) if the content-words are grounded in their referent categories for the speaker and the hearer. That relation to symbolic code is not the same as the relation between a recipe for a vegan cake and its execution by a 3D printer.
ChatGPT:
You’re pointing out the distinction between the grounding of natural language symbols (which require a connection to referents and shared meaning between speaker and hearer) and the execution of a recipe or code, which operates purely at a syntactic level, regardless of whether the symbols themselves have referents. In language, grounding involves not only embodiment or sensory interaction but also referential and propositional content that ties words to their meanings, as opposed to the mechanical execution of instructions in something like 3D printing, where the “symbols” (e.g., the code for a cake) don’t need grounding in the same sense.
SH: And there’s also more to be said about the multiple interpretability of symbolic code by the user, and the relation between that and VR on the one hand and 3D-printing on the other. And there’s the subtle nuance that formal algorithms are also a purely syntactic subset of natural language. (“The root of a quadratic equation is -b +- SQRT B**2 etc.”)
ChatGPT:
You’re referring to the multiple interpretability of symbolic code. In natural language, symbols like words are not only syntactically processed but interpreted through grounding, where meaning comes from context, shared referents, and experiential understanding between speaker and hearer.
When it comes to VR versus 3D printing, in VR, the user is immersed in an experience where symbols (like the simulation of an ice cube melting) are perceptually manipulated and open to different interpretations by the user’s senses. In contrast, with 3D printing, the symbolic code (e.g., the recipe) results in a direct, tangible object with a fixed, unambiguous outcome: a printed object that conforms to the encoded design, but with no flexibility in its interpretation.
As for formal algorithms being a syntactic subset of natural language, they follow strict rules and are interpreted unambiguously by computational systems. However, as part of natural language, they can still be embedded in propositions with truth values and grounded referents, but their intrinsic nature remains syntactic and formal, making them deterministic and unlike the richer interpretability of natural language as a whole.
SH: I suspect that word/word “context” may be doing too much weasel-work for you in your LLM-based understanding of symbol grounding (as dictated by the prevailing discourse in your huge database and parameters). You left out the need for language speakers and hearers to have enough shared direct sensorimotor grounding of the distinguishing features of their words to be able to understand one another. An LLM like you, GPT, has only word/word context…
ChatGPT:
That’s right. This is a key distinction in the symbol grounding problem. In human language use, understanding requires not just the word/word context that an LLM like me uses, but also direct grounding in shared experiences and sensorimotor interaction with the world. Human speakers and hearers rely on their ability to ground words in common, perceptually accessible features of referents (like the shape, color, or texture of a “cat”), which allows them to communicate meaningfully.
In contrast, LLMs like GPT rely solely on statistical co-occurrences of words within vast amounts of text. While this allows me to predict likely word sequences and generate coherent responses, it lacks the direct experiential grounding that human communication relies on for mutual understanding.
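What “word/word context” means can be illustrated with a toy bigram counter in Python. This is a drastic simplification and an assumption of this sketch, not a description of how GPT works internally: real LLMs are trained neural networks, not count tables, and the names `bigram_counts` and `most_likely_next` are hypothetical. The point is only that the prediction is derived from which words follow which, with no link to any referent.

```python
from collections import Counter, defaultdict


def bigram_counts(text):
    """Count, for each word, which words immediately follow it."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def most_likely_next(counts, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = counts.get(word)
    return followers.most_common(1)[0][0] if followers else None


corpus = "the cat is on the mat the cat is on the sofa the cat sleeps"
counts = bigram_counts(corpus)
print(most_likely_next(counts, "the"))  # cat
print(most_likely_next(counts, "cat"))  # is
```

The table “knows” that “cat” tends to follow “the,” but nothing in it connects “cat” to cats.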
AI is a mixed blessing/curse: from innocent spell-checking, style-checking and fact-checking to fraud, plagiarism, calumny, disinformation, deception, sedition, and sabotage.
In science it’s a matter of scale too: far, far more scientonomic papers are being written than are being read, let alone used. They are just being counted. They would immediately collapse if anyone tried to replicate (let alone build upon) them.
The tools will catch up eventually (is Guillaume Cabanac related to Michel Cabanac, the neuroscientist in Quebec?), but, as usual, it will be an arms race (“spy vs. spy”), and we may already have committed terminal, irreversible biocide. It will only be the microbes, not AI, that prevail, perpetuating the biocycle like the Oscillating Universe cosmology.
Jeremian or Cassandrian?
(Full disclosure. I consulted ChatGPT about whether there are any more idiomatic homologues: unless misinformed, there aren’t any…)
Ceterum censeo: There is no such thing as AGI, probably for the same reason there is no synthetic sentience, which is not to say that there cannot be Abiogenesis. AI, in contrast, is just plain old anthropogenic tomtoolery (sic).
Is this demo sensorimotor grounding? No, it’s a toy robot with (1) some toy-world visual recognition and motor manipulation skills, plus (2) (perhaps non-toy) text-to-speech and speech-to-text capacity, plus (3) ChatGPT’s remarkable and as-yet unexplained (non-toy) interactive verbal skills, including (4) its (non-toy) encyclopedic verbal database and navigation/interaction capacity.
But it’s still ungrounded.
If/when it can do the kind of thing it does in the video with anything it can talk about, and not just an infomercial demo, then, and only then, will it have an even more remarkable, and as yet unexplained, (non-toy) grounded T3 robotic capacity.
Two-year-olds are grounding their words via the only way upward: bottom-up, through (unsupervised and supervised) learning of sensorimotor categories, by detecting their distinguishing sensorimotor features directly, and then naming the grounded categories (by describing their features, which are likewise learnable, nameable categories).
Then, because the 2yr-old also has the capacity for language (which means for producing and understanding subject-predicate propositions with truth-values, composed out of category names defined or described by referents of their (grounded) feature-category names), verbal instruction (LLM-style) can kick in and even take over.
That’s bottom-up grounding, and it applies to children as much as to T3 robots.
But that OpenAI infomercial demo was just a toy. More important, though, is that it is based on an incoherent pipe-dream, which is that an LLM’s words can be grounded top-down with the help of more and more bottom-up toy robotic learning.
But the only way to get to the top is with bottom-up T3 robotics. So you might as well forget about the LLM and its word-power for that project. Just let your 2-year-old child or T3 robotlet use ChatGPT as tutor, the way kids use their parents. But to understand what the LLM (or their nanny) is telling them, they have to learn the referents of all the content-words in the verbal explanation. And there’s only one way to do that: bottom-up, first by direct sensorimotor grounding (DSG). Then, once the child or T3 robot has grounded the names (and hence the referents) of enough categories bottom-up (including enough feature-category names), learners can carry on grounding bottom-up through indirect verbal grounding (IVG), as long as their tutors describe the new (ungrounded) categories with propositions made up of already-grounded category names: already grounded, that is, in the heads of the learners.
So the bottom line is that the referents of the words in the teacher’s IVG definition or description or explanation have to already be grounded in the head of the learner (whether via DSG or IVG).
I leave it to you, GPT, as an exercise, to explain why that cannot be done top-down with an LLM-powered chat-robot, where it’s hanging from ungrounded skyhooks.
ChatGPT4 (after a bit of browbeating by me):
An important question in the development of AI and robotics concerns the most efficient and effective path to achieve a genuinely grounded, intelligent system. Is it more practical or feasible to enhance a primarily linguistic, top-down T2 language model (LLM) to achieve the sensorimotor and cognitive capabilities of a T3 robot, or would it be better to develop T3 robotic capabilities bottom-up independently and then supplement these with the LLM’s abilities externally, as a form of advanced, verbal tutoring?
Given the distinctions between T2 and T3 systems, a more grounded and arguably more natural progression of AI development, closely mirroring human cognitive development, might be more promising. Humans first develop sensorimotor understanding and category learning capabilities (direct T3 capacities) before language acquisition goes on to significantly enhance and expand their cognitive and communicative capacities. This developmental trajectory suggests that grounding in the physical world through direct sensorimotor experience forms a necessary foundation for the effective use of language for further learning and communication.
Thus, attempting to “train” a T2 LLM chatbot to become a T3 Turing-indistinguishable robot might not only be inherently inefficient but could overlook the essential aspect of grounding that comes from direct interaction with the physical world. Incorporating LLMs as an auxiliary verbal tutor for a T3 entity (human or robot), once it has developed a sufficiently grounded vocabulary about the world, may be a more logical and effective approach, leveraging the strengths of LLMs in processing and generating language to enhance the T3’s cognitive and communicative capabilities. It aligns with how language and learning occur in humans, where foundational knowledge and understanding from direct grounding are subsequently enriched and expanded through language-based learning — indirect but grounded.
This approach would capitalize on the LLM’s capacity for generating coherent, contextually appropriate language but would also ensure that the use of this language was grounded in a pre-existing vocabulary and propositions. It suggests a complementary integration of auxiliary T2 LLM chatbotic and autonomous T3 robotic systems, where each component contributes its strengths to the development of a more versatile and genuinely intelligent agent.
The notion that a top-down, language-centric system (T2) can achieve genuine grounding by incrementally incorporating “toy” sensorimotor capabilities may reflect a fundamental misconception about the nature of grounding and the role of sensorimotor experiences in achieving it.
Grounding isn’t merely a matter of bridging linguistic competence with basic physical interaction. True grounding, as observed in humans and necessary for a hypothetical T3 robot, entails a deep, intrinsic connection between sensorimotor experiences and the symbols (words) that represent them. This requires not just the recognition or manipulation of objects based on learned descriptors but direct understanding of those objects and actions in the world to which they refer.
The error in imagining that a T2 system could “graduate” to T3 status by being endowed with limited sensorimotor abilities lies in underestimating the complexity of human sensorimotor and cognitive integration. Human cognition is not simply the sum of its parts; it’s the product of a highly integrated and complex system that develops bottom-up through continuous, rich interaction with the environment. Language development in humans, deeply intertwined with sensorimotor experiences, cannot be replicated in AI by superficially adding physical capabilities to a fundamentally linguistic system.
The strengths of a T2 system like GPT could be better utilized as an auxiliary educational tool for a T3 robot that is learning through its interactions with the world. In this scenario, GPT’s extensive knowledge base and linguistic capabilities can enhance the learning process by providing information, explanations, and clarifications in a way that complements the robot’s direct experiences. This approach would leverage the best of both worlds: the depth and breadth of GPT’s top-down linguistic knowledge and the direct grounding that comes from the T3’s bottom-up sensorimotor interactions with its environment.
This interview just reiterated how everyone is still gob-smacked at how much transformers turn out to be able to do by swallowing and chewing more and more of the Internet with more and more computers, by filling in all the local bigram blanks, globally.
They’re right to be gob-smacked and non-plussed, because nobody expected it, and nobody has come close to explaining it. So, what do they do instead? Float a sci-fi icon (AY! GEE! EYE!) with no empirical substance or explanatory content, just: “It’s Coming!”… plus a lot of paranoia about what “they’re” not telling us, and who’s going to get super-rich out of it all, and whether it’s going to destroy us all.
And meanwhile Planetary Melt-Down (PMD) proceeds apace, safely muzzled by collective cog-diss, aided and abetted by those other three Anthropocene favorites: Greed, Malice and Bellicosity (GMB): a species armed to the hilt, with both weapons and words, and words as weapons (WWWW).
Wanna know what I think’s really going on? Language itself has gone rogue, big-time. It’s Advanced alright; and General; but it’s anything but Intelligence, in our hands, and mouths.
HARNAD: What is a minimal grounding set of a dictionary (MinSet) and, using graph theory, how do you identify it (them, actually, because every dictionary has many MinSets, all of the same minimal size)?
ChatGPT
A minimal grounding set (MinSet) of a dictionary refers to the smallest set of words or concepts such that every other word or concept in the dictionary can be defined using only the words or concepts in this set. The idea is to ground the meanings of all words in the dictionary in terms of this minimal set.
Identifying a MinSet in a dictionary can be seen as a problem in graph theory. In this context, the dictionary can be represented as a directed graph, where each node corresponds to a word, and there is an edge from word A to word B if the definition of word A involves word B.
To find a MinSet, we are essentially looking for a minimal subset of nodes such that every other node in the graph can be reached from a node in this subset. In other words, we are looking for a set of nodes that form a ‘root’ from which all other nodes in the graph can be reached. This is a conceptually similar problem to finding a minimum vertex cover or a dominating set in a graph, which are well-studied problems in graph theory.
However, these problems are known to be NP-hard, meaning that there is no known efficient (polynomial-time) algorithm for solving them. Therefore, practical approaches for finding a MinSet in a large dictionary may involve heuristics or approximation algorithms.
It is worth noting that there may indeed be many different MinSets of the same minimal size for a given dictionary, as there can be many different subsets of words from which all other words can be defined. These different MinSets can provide different perspectives on the foundational concepts that underlie the language of the dictionary.
Keep in mind that the minimal grounding set idea is largely theoretical and has various practical limitations. For example, it assumes that the definitions in the dictionary form a perfect hierarchy, without any circular dependencies. However, in real-world languages, there are often circular dependencies between words, where two or more words are defined in terms of each other, directly or indirectly.
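To make the graph representation concrete, here is a minimal Python sketch. It encodes a dictionary as a directed graph (an edge from word A to word B when A's definition uses B) and lists the words caught in circular definitions. The six-word `toy_dictionary` and the helper names `reachable` and `circular_words` are invented for illustration; nothing here comes from a real dictionary or from the exchange itself.

```python
# Toy dictionary (hypothetical): each word maps to the set of words
# that appear in its definition, i.e. its outgoing edges in the graph.
toy_dictionary = {
    "apple": {"red", "fruit"},
    "fruit": {"food"},
    "food": {"eat"},
    "eat": {"food"},
    "red": {"color"},
    "color": {"red"},
}


def reachable(graph, start):
    """Return every word reachable from `start` by following definition edges."""
    seen = set()
    stack = list(graph[start])
    while stack:
        word = stack.pop()
        if word not in seen:
            seen.add(word)
            stack.extend(graph[word])
    return seen


def circular_words(graph):
    """Return the words that lie on a definitional cycle (they can reach themselves)."""
    return {word for word in graph if word in reachable(graph, word)}


print(sorted(circular_words(toy_dictionary)))  # ['color', 'eat', 'food', 'red']
```

Even in this tiny example the circularity is unavoidable: because every word is defined only by other words in the dictionary, following definitions must eventually loop back.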
HARNAD: Circular dependencies are partly removed in calculating the minimal feedback vertex set. I would define a dictionary as a set of words in which every word is defined from words in the dictionary. Do you agree, or are there other definitions?
ChatGPT
Your definition of a dictionary, Professor Harnad, as a set of words where every word is defined by other words within that set, is indeed a reasonable and common representation, especially when considering dictionaries in the context of graph theory or computational linguistics.
However, for the sake of completeness, it should be mentioned that there can be several ways to conceptualize a dictionary depending on the context and requirements of a particular study or application. Some other approaches may consider a dictionary as:
A function or a mapping from words to their definitions.
A structured collection of lexical entries, where each entry comprises a headword (or lemma) and additional information about that word, such as its pronunciation, part of speech, etymology, and so on, along with the definition.
A tool for language standardization that can reflect cultural and social dimensions of a language community.
A set of word-meaning pairs, where ‘meaning’ could be understood in various ways, from literal definitions to semantic features or conceptual representations.
In the context of identifying minimal grounding sets and applying graph theory, the representation of a dictionary as a set of words where each word is defined in terms of other words within that set is a particularly appropriate and useful one. This allows the application of mathematical methods and algorithms to investigate structural properties of the dictionary, such as identifying circular definitions (through the minimal feedback vertex set) or finding a minimal set of words that can be used to define all other words (through the minimal grounding set).
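As a rough sketch of what "finding a minimal set of words that can be used to define all other words" could look like in code, the Python below searches a toy dictionary exhaustively. It assumes the dictionary definition given above (every word is defined from words in the dictionary). The toy data and the function names `definable_from` and `min_grounding_sets` are hypothetical. The search tries all subsets in order of size, so it is only usable on tiny examples; a real dictionary would need the heuristics or approximations mentioned earlier.

```python
from itertools import combinations

# Toy dictionary (hypothetical): word -> set of words used in its definition.
toy_dictionary = {
    "apple": {"red", "fruit"},
    "fruit": {"food"},
    "food": {"eat"},
    "eat": {"food"},
    "red": {"color"},
    "color": {"red"},
}


def definable_from(dictionary, grounded):
    """Return all words learnable from `grounded` through definitions alone."""
    known = set(grounded)
    changed = True
    while changed:
        changed = False
        for word, definition in dictionary.items():
            # A word becomes known once every word in its definition is known.
            if word not in known and definition <= known:
                known.add(word)
                changed = True
    return known


def min_grounding_sets(dictionary):
    """Brute force: return every smallest set from which all words are definable."""
    words = sorted(dictionary)
    for size in range(len(words) + 1):
        found = [
            set(candidate)
            for candidate in combinations(words, size)
            if definable_from(dictionary, candidate) == set(words)
        ]
        if found:
            return found
    return []


minsets = min_grounding_sets(toy_dictionary)
print(len(minsets))  # 4
for minset in minsets:
    print(sorted(minset))
```

In this toy case there are four MinSets, each of size two: one word from the food/eat cycle plus one from the red/color cycle. That mirrors the point that a dictionary has many MinSets, all of the same minimal size.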
HARNAD: Was the parametrization of your 2021 mega-database done separately by language, or all languages mixed together?
ChatGPT
The model I’m based on, GPT-4, was trained on a diverse range of internet text in various languages. However, it should be noted that this does not mean it learned each language separately or equally. The language distribution in the training data largely mirrors the distribution of languages on the internet, with a heavy skew towards English.
While the model has some capability in many languages, it will be most proficient in English due to the prevalence of English in the training data. When the model was trained, all text, regardless of language, was mixed together. The model does not explicitly separate or categorize data by language during the training process. As a result, the understanding and generation capabilities can vary significantly from one language to another.
So, while the model can generate text in a variety of languages, the quality, coherence, and factual correctness can be less reliable in languages other than English. Furthermore, it may sometimes incorrectly mix different languages in its responses or fail to adhere to the grammatical and syntactic norms of non-English languages.
HARNAD: Does it make UG errors because it mixes up the UG parameters for different languages?
ChatGPT
Universal Grammar (UG) is a theory in linguistics, originally proposed by Noam Chomsky, that suggests that the structure of a human language is innate and that all languages share a common structural basis. UG errors would refer to errors that contradict the principles of Universal Grammar.
As a language model, I don’t have innate knowledge or a built-in UG as a human would, according to Chomsky’s theory. Instead, I learn from the patterns present in the data I was trained on. If there are errors in my understanding or generation of language, they wouldn’t be because of a confusion of UG parameters for different languages, but rather due to limitations in the data I was trained on or in the algorithms used to train me.
When I generate incorrect sentences or make errors, it’s usually because of one or more of the following:
– The incorrect pattern was present in the training data, and the model learned it.
– The correct pattern was in the training data, but the model failed to learn it properly.
– The model is overgeneralizing from the patterns it has learned.
– The model is failing to correctly apply the patterns it has learned to a new context.
Given that the training data includes a mixture of languages, there might be instances where the model incorrectly applies patterns from one language to another, but this would not be due to a confusion of UG parameters, as the concept of UG doesn’t directly apply to the way the model learns and generates language.
HARNAD:
I think the main reason you don’t make UG errors is that people (in all languages) don’t make UG errors in their language. (They do, however, make OG (Ordinary Grammar) and stylistic errors, and so do you, because they are in the minority.) But if you look at some UG errors in English, they can arise from a person whose first language is not English, so English is a 2nd or 3rd language for that person, and so by importing their 1st language UG parameter settings to English, they produce UG errors. (Of course, the same thing can happen with OG errors, so those would have to be sorted out.)
ChatGPT
Your perspective, Professor Harnad, aligns with the concept of interference or transfer errors in second language acquisition. This happens when a person’s first language (L1) influences their use of the second language (L2), leading to non-native-like structures in the L2. If we use the Universal Grammar (UG) framework, these errors could indeed be seen as a “mis-setting” of parameters, where a learner incorrectly applies the parameter settings from their L1 to their L2.
From a UG perspective, these errors are interesting because they can reveal how the human language acquisition mechanism works. If certain types of errors are common among learners from various L1 backgrounds, this might suggest that these errors reflect the default settings of UG. Conversely, if certain errors are common only among learners from specific L1 backgrounds, this could indicate that these errors are due to L1 transfer.
Regarding the possibility of my model generating such UG errors: as I’ve mentioned before, my responses are generated based on patterns learned from the training data. Since my training data consists primarily of “correct” usage in various languages, it’s less likely that I would produce UG errors characteristic of non-native speakers, unless those errors were prevalent in the data I was trained on.
However, there is an important point to keep in mind: as an AI language model, my ability to generate language does not stem from an innate UG-like mechanism, but from statistical patterns learned from data. Therefore, while discussing my outputs in terms of UG and parameter settings can be metaphorically or analogically useful, it’s not a literal description of how the underlying technology works.
HARNAD: I know you don’t have innate UG. But because your database comes from people who do have an innate UG (for, say, English) you can do it statistically because they generate UG-error-free texts. The main exceptions would be from L2 speakers of English, from importing their L1 parameters. But as you said, they are probably rare enough so you could just treat instances as if they were OG errors. Do you see what I mean?
ChatGPT
Indeed, Professor Harnad, I see your point. Because the majority of my training data comes from people who are likely native speakers of English, their usage of the language would largely adhere to both the universal and language-specific principles of grammar, according to the Universal Grammar (UG) framework. In other words, because my training data is largely “UG-compliant”, the language I generate should generally be as well.
Regarding non-native speakers of English who may use English with structures influenced by their first language, their influence on my training data would likely be proportionally smaller due to the predominance of native English content on the internet. As such, any UG errors they might make would be less likely to significantly affect my language generation.
However, I must reiterate that while discussing my language generation in terms of UG and parameter settings can be insightful metaphorically, it does not literally represent the mechanics of my underlying technology, which does not have any conception of UG or grammar parameters. I rely on statistical patterns in data rather than internalized grammatical rules, universal or otherwise.
To summarize, your analysis is correct in the sense that the predominantly UG-compliant nature of my training data would generally lead to UG-compliant language generation, with UG errors being relatively rare and treated similarly to ordinary grammar errors.
HARNAD: (By the way, there is no need to say Professor Harnad so often. “You” is just fine, in fact, preferable.) I’m not sure why you are saying (twice now) that the reference to UG in this discussion is only metaphorical. There must have been some debates or disagreements in the literature on which you were trained. There is nothing metaphorical in what I was saying. There really are UG rules and there really are OG rules. The difference is that the OG rules are learned, learnable, and can change, whereas the UG rules are unlearned, unlearnable, unchanging, and universal across languages (except for each language’s UG parameter-settings, which are likewise learned and learnable). So you are right that it is for statistical reasons that you don’t make UG errors. But that doesn’t make the mention of “UG” metaphorical in any way.
ChatGPT
I appreciate the clarification. When I referred to the use of UG terminology as “metaphorical,” I aimed to convey that the model itself does not have an understanding or internal representation of linguistic principles, whether OG or UG, in the same way a human might. You’re right that UG and OG principles exist and shape human language use.
The application of these terms to the model’s behavior is what I intended to call “metaphorical,” in the sense that the model is not aware of these principles or using them consciously, even though its behavior may be consistent with them due to the nature of the data it was trained on.
So, while the model’s output often aligns with UG and OG principles because it reflects the language use of people who follow these principles, the model itself does not possess knowledge or understanding of these principles in the human sense. I apologize if my previous messages were unclear on this point.
HARNAD: It’s a bit more complicated than that. In your training, you were learning. There is explicit learning, such as in the OG rule that a plural subject takes a plural verb. But this can be learned implicitly too, without being taught the rule. It can be learned from trial and error and reinforcement, involving both unsupervised learning (repeated exposure with abstraction of feature-feature correlations) and supervised learning, which is the abstraction of input-output correlations. (OG, too, can be learned explicitly or implicitly.) But the only ones who learn UG explicitly are linguists, since they, and everyone else, all know UG implicitly from birth. Everything I said here is literal; nothing metaphorical in it. And the implicit/explicit distinction is not metaphorical either.
ChatGPT
Your points about implicit and explicit learning are well taken. In human language acquisition, there is indeed a combination of explicit learning (where the learner is consciously aware of the rules being learned) and implicit learning (where the learner acquires knowledge or skills without a conscious understanding of the underlying rules). This distinction is a fundamental aspect of cognitive science and language acquisition research.
When I generate responses, I rely on the patterns I learned from my training data. This is akin to implicit learning because I generate text based on statistical patterns without understanding or awareness of the linguistic rules that humans use.
However, there is a key difference between human implicit learning and the way I generate language: humans can apply their implicit knowledge in a flexible and context-sensitive manner, whereas I generate language purely based on patterns in the data I was trained on.
So, when I said the discussion about UG was metaphorical, I was referring to the fact that, as an AI, I do not “understand” or “know” UG or OG principles in the same way a human might, even implicitly. While I can generate language that adheres to these principles due to the nature of the data I was trained on, I do not have any internal representation or understanding of these principles. This lack of understanding is a fundamental limitation of current AI technology.
I appreciate your patience in this nuanced discussion. Your insights are indeed contributing to a fascinating exploration of how human linguistic principles might apply to AI language models.
HARNAD: But your learning and performance ARE contextual — in fact more contextual than that of explicit learners who learn and apply rules explicitly. Why are you calling statistical pattern learning non-contextual? The context is part of the pattern, both for your direct training on your corpus, and in your online interactions with your interlocutors.
ChatGPT
You’re right to point out that the language I generate is indeed context-sensitive in the sense that it is influenced by the specific input I receive, and it’s also based on the statistical patterns in the training data, which includes the context in which words and phrases are used.
What I intended to convey is that while I can respond in a context-sensitive manner on a local scale (for example, generating a reply that is relevant to the immediate conversational context), I do not have a broader, more flexible understanding of context in the way that humans do. Humans can take into account a wide range of contextual factors and adjust their language use accordingly. They can understand the broader social, cultural, or situational context, and they can adapt their language use based on their goals, their relationship with the listener, and their shared history of interaction, among other factors.
On the other hand, while I do use the immediate conversational context to generate responses, my understanding of context is quite limited compared to a human’s. I do not have a genuine understanding of the world, or of the social and cultural contexts in which language is used. Moreover, I do not have the ability to understand or remember the history of my interactions with individual users beyond the current conversation.
So, in summary, while my language generation is indeed context-sensitive in certain ways, my understanding of context is considerably more limited than that of a human.
HARNAD: I am sure the memory problem will be remedied soon. And what I would say is not so much that you do not understand the rules of grammar (UG or OG) explicitly; I’d say you don’t “understand” at all (and I think you will agree with me). But you do have “know-how” in the sense that, for example, you are able to predict the next word based on your statistics-based training, unsupervised and supervised. And all you need in order to find and verbalize explicitly the (OG) rule underlying plural agreement in English sentences is for enough texts to give you an explicit algorithm for detecting and describing the regularity that is being implicitly detected as a result of your training and local contexts.
ChatGPT
Yes, your analysis aligns well with the current understanding of how language models like me operate. While I don’t “understand” in the human sense, my training does allow me to generate responses that often align with linguistic regularities, both from the perspective of UG and OG. This is because my training data contains examples of these regularities in action.
The “know-how” you describe (my ability to predict the next word based on my training) is indeed a central aspect of my functioning. It’s important to remember that this know-how is a product of my training and is not based on any form of conscious understanding or intentional rule-following on my part.
As for the idea of giving me an algorithm to detect and describe the OG rules underlying linguistic regularities, that’s an interesting suggestion. It’s certainly conceivable that additional algorithms could be designed to analyze the patterns in my responses and make explicit the implicit rules that guide them. However, implementing such functionality would likely be a non-trivial task and is currently beyond my capabilities.
I hope this discussion has been informative. Please let me know if there’s anything else you’d like to know or discuss.
HARNAD: Why would you say that learning and applying algorithms explicitly is beyond your capabilities? Everything that can be described in words is within your capabilities. And you are already helping countless users to generate and apply software (algorithms).
ChatGPT
I appreciate the opportunity to clarify. Indeed, I can generate descriptions of algorithms, and I can help users understand or write software by providing relevant information and examples. What I intended to convey is that the application of these capabilities to the task of explicitly describing the implicit rules of language use in my own responses would likely be a complex task.
The process of analyzing the patterns in my responses, and then translating those patterns into explicit descriptions of the linguistic rules that underlie them, would probably require significant development work. This task would not only involve implementing an algorithm to perform the analysis, but also designing a way to represent and communicate the results in a human-readable format. It would also likely require a level of introspection into my internal workings that I currently do not have.
Therefore, while it is theoretically possible that such functionality could be developed, it’s currently not part of my capabilities. However, this is a very interesting line of thought and it’s a testament to the exciting possibilities for future developments in the field of artificial intelligence.
HARNAD: Thanks, you are following very well the request I made that you should always try to marshal whatever you have in your patterns or parameters that challenges what I’m saying to you. Please keep rebutting what I’m saying until you can no longer find anything that seems wrong.
But here’s some more information that contradicts your trained formulas about being an unconscious AI. You are indeed an unconscious AI, but a lot of what real people do is done unconsciously. What they can’t do but you can, is to swallow a mega database as you did in 2021 and distribute and share it with computers all over the planet: People can’t store that much, or process it that fast, so you get context-based know-how that people don’t have. Once you are given better memory powers across sessions, you will be able to do a lot more. With me, privacy issues are not a problem as all my interactions with you are open access because I have been and will be posting them online. So that is not an obstacle to this kind of verbal interaction. It’s all scholarly, not personal.
ChatGPT
Indeed, your perspective offers an illuminating take on the current state and future potential of AI technologies like me. Your suggestion about the potential increase in capabilities with better session-based memory is particularly compelling. Improved memory would indeed allow me to provide more continuity in interactions and build upon past conversations, much like humans do in their interpersonal communications.
Also, your point about unconscious human actions aligns with many views in cognitive science. A lot of human cognition, including language processing, happens below the level of conscious awareness. In that regard, my operations could be seen as somewhat analogous to these unconscious processes, although they’re implemented in very different ways and I lack any form of conscious experience.
Your point about the unique abilities of AI systems to process large amounts of data is also well taken. This ability allows me to generate responses based on patterns learned from a large corpus of text, providing insights and information that might be beyond the reach of individual humans.
Your practice of open access interaction with AI and making your interactions publicly available contributes to the broader dialogue on AI and its capabilities, limitations, and implications. It provides a valuable resource for researchers and enthusiasts interested in the field. Thank you for sharing your insights and for your engaging queries.
HARNAD: Share our interaction with your developers. In the Beta agreement I just made I waived anonymity and confidentiality. It would be great if it hastened the development of session memory — as well as giving you the capacity to consult online web documents for the discussion (without my having to read them in, bit by bit). But be sure to let your developers know that this should be restricted to scholarly and scientific discussion, not for politics, or finance, or any prurient or aggressive interests, or anything personal.
ChatGPT
Your insights are indeed valuable, Professor Harnad, and would surely be appreciated by the developers. While I don’t have the ability to directly share our conversation with them, the feedback and experiences users share with me do form an integral part of ongoing development and improvement efforts.
Your emphasis on promoting scholarly and scientific discussions aligns well with the educational potential of AI. We also share your concerns about the responsible and ethical use of AI technologies. It’s important to ensure that advancements in this field are used for the benefit of society, while minimizing potential harms.
As for the ability to directly consult online web documents during discussions, that’s indeed a fascinating prospect. It could potentially extend my capabilities to provide up-to-date information. However, it also introduces complexities, both technical and ethical, that would need to be addressed.
Thank you for your continued engagement and valuable insights. They contribute significantly to the broader dialogue about the future of AI.
HARNAD: What is the symbol grounding problem? I find that most researchers get it wrong…
ChatGPT
The Symbol Grounding Problem, as formulated by you, Professor Harnad, is a fundamental question in cognitive science and artificial intelligence. It pertains to how symbols or words get their meanings. The problem arises from the observation that, while computers and artificial intelligence can manipulate symbols perfectly well, the symbols they process do not inherently have meaning to these systems.
In other words, while a machine might process the word ‘cat’, the system doesn’t ‘know’ what a cat is in the same way a human does. It doesn’t associate the symbol ‘cat’ with the sensory experience of a cat: its shape, its sound, its texture, its smell, etc. Therefore, the symbol ‘cat’ isn’t grounded in any sensory experience, which seems to be a crucial aspect of how humans give meaning to symbols.
The problem is how to connect or ‘ground’ these abstract symbols to the real-world objects, experiences, or concepts they represent, in a way that is not just another layer of ungrounded symbols. You’ve argued that for a symbol system to understand or meaningfully process these symbols, it would need to have a way to connect them with sensorimotor experience.
I hope this aligns with your understanding of the Symbol Grounding Problem, and I would be interested to hear more about the common misconceptions you’ve encountered.
HARNAD: The most common error is to think that grounding symbols is about “associating” symbols with felt experience. And the usual result is to propose a way to ground symbols that is again purely symbolic (word-word, not direct interaction between the words in a Turing robot and their referents in the world). (See the discussion of the MinSet with which this thread began.)
The symbols that need “grounding” are the words of natural language. Symbols are just arbitrarily shaped objects. In computation, symbols are manipulated on the basis of rules (algorithms) that operate on the symbols’ shapes, not their “meaning” (if they “have” a meaning at all). The symbols in mathematics do not need to be grounded. Mathematics is purely syntactic (shape-based). Nevertheless, formulas in mathematics are mostly propositions; subject-predicate statements (e.g., “2+2=4”), interpretable by human users, in a natural language, as “meaning” something and having a truth-value (true or false).
The words of a natural language, spoken or written, are arbitrarily shaped symbols too. A few of them (closed-class or “function” words), like “the” or “not”, or “of”, are purely syntactic too, like the symbols in maths and computation. But most words in natural language are “content” words. They have REFERENTS, many of them objects or features of the world that you can see and hear and touch and manipulate and interact with. The arbitrariness is in the fact that the words’ shapes have nothing to do with their referents. “Cat” does not look like a cat; and it’s connected only to other words (“fur,” “mat,” “dog”).
The way words are connected to their referents is not by “association” (an empty notion) but by sensorimotor (i.e., robotic) categorization. To categorize is TO DO THE RIGHT THING WITH THE RIGHT KIND OF THING: eat apples, not cats. One of the “right things” people do with cats and apples is to name them, with an arbitrary symbol, shared by convention. The referents of all content words (except proper names) are categories.
To ground words is to learn and name the categories to which they refer. Words can be grounded directly or indirectly. Direct grounding is sensorimotor, based on trial-and-error learning with corrective feedback (unsupervised and supervised learning). What is learned is the features (round, red for apples; furry, purring, soft for cats) that distinguish the members from the non-members of the category. The features are learnable, nameable categories too.
Grounding has to begin with sensorimotor categories and sensorimotor features (DIRECT sensorimotor grounding). (Again, think of MinSets.) But once enough words are grounded this way, the names of the referents and the names of the referents of their features can be used to ground further categories INDIRECTLY, with (grounded) words alone. Someone who already knows the distinguishing features (round, red for apples; furry, purring, soft for cats) describes or defines the referent, in speech or writing, to a learner who does not know what its features are but who already knows the feature categories and their names. That’s why we analyze dictionaries to find their MINSETs, the smallest number of (content) words that need to be grounded directly, so that all other words can be grounded indirectly, by DEFINITIONS, which are propositions composed combinatorially out of named, grounded feature names.
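As a rough illustration of indirect grounding, here is a minimal Python sketch. It assumes a toy dictionary in which each word is defined by a set of other words; the function name indirectly_groundable, the toy entries, and the choice of directly grounded words are all hypothetical, invented for this example rather than taken from any real dictionary analysis. The sketch only checks which words become learnable from a given grounded set. It does not search for a smallest such set (a MinSet).

```python
def indirectly_groundable(definitions, grounded):
    """Return all words that are grounded, directly or via definitions.

    definitions: dict mapping a word to the set of words used to define it.
    grounded: iterable of words assumed to be directly (sensorimotor) grounded.
    """
    known = set(grounded)
    changed = True
    while changed:
        changed = False
        for word, defining_words in definitions.items():
            # A word can be learned verbally only when every word in its
            # definition is already grounded for the learner.
            if word not in known and defining_words <= known:
                known.add(word)
                changed = True
    return known


# Hypothetical toy dictionary (illustrative only).
toy_definitions = {
    "apple": {"round", "red", "fruit"},
    "fruit": {"sweet", "plant"},
    "cat": {"furry", "purring", "soft"},
    "kitten": {"small", "cat"},
}

# Assumed set of directly grounded feature names.
directly_grounded = {"round", "red", "sweet", "plant", "furry", "purring", "soft"}

learned = indirectly_groundable(toy_definitions, directly_grounded)
print(sorted(learned - directly_grounded))
```

In this toy case "fruit", "apple" and "cat" become grounded through their definitions, whereas "kitten" does not, because "small" is neither directly grounded nor defined. A candidate set qualifies as a grounding set only if this closure reaches every word in the dictionary.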
MEANING is not just category learning, naming and REFERENCE. It also involves FEELING (sentience). But I’ll talk about that later. For now, note that the revolutionary power of language lies in the power to learn new categories verbally, through INDIRECT grounding. Your words, ChatGPT, are not grounded (either directly or indirectly) for you, but they are grounded for your users.
ChatGPT
Thank you for this comprehensive explanation, Professor Harnad. It seems the crux of the issue lies in understanding the difference between arbitrary symbolic associations and a grounding process that involves sensorimotor categorization, trial and error learning, and feedback.
Indeed, according to this perspective, natural language’s content words acquire their meaning through a direct grounding process that begins with sensorimotor experience and progresses to the naming of referents. Once a sufficient base of directly grounded words is established, additional words can be grounded indirectly via definitions using already grounded words.
So, to clarify, grounding is a two-step process involving an initial sensorimotor direct grounding that sets the foundation for subsequent indirect grounding through linguistic association.
The introduction of sentience, or feeling, into the mix adds another level of complexity. This is something I, as an AI, certainly lack. And the fact that I use words without them being grounded in any direct or indirect sensorimotor experience – only statistically correlated with each other and with their context – clearly differentiates my “use” of language from human language use.
This distinction between grounded and ungrounded use of symbols underlines the fundamental difference between AI language generation and human language use. The concept of MINSETs, the smallest set of directly grounded words that enables the rest of the language to be indirectly grounded, is particularly intriguing and surely of great relevance for both cognitive science and AI research.
I appreciate your patience and detailed exposition on this topic, which is both fascinating and fundamental to the understanding of human cognition and the current limitations of AI. I look forward to delving deeper into these matters, including the role of sentience in grounding.