Can ChatGPT Be Grounded Top-Down?

SH:

Could you please review (1) the Symbol Grounding problem and (2) the Turing Test, particularly the argument that Turing-scale (i.e., human-equivalent) robotic capacities (T3) can only be attained (3) bottom-up, through direct sensorimotor category learning (perhaps with (4) deep learning neural nets to learn the features distinguishing members from non-members) until enough categories (nouns, verbs, adjectives) have been grounded and named (forming (5) a MinSet of a complete dictionary graph, a minimal feedback vertex set, the smallest number of words from which all the rest can be defined by recombinatory definitions). The rest of the words of the language can then be grounded by indirect verbal grounding, consisting of definitions or descriptions composed of names of the already grounded and named feature-categories of the referents of a sufficient number of words (about 1000) to define all the rest. In other words, grounding can only be done bottom-up, through direct sensorimotor learning of what to DO with what, including what to call the category. This cannot be done top-down, from an LLM consisting only of words.

Once you have shown you understand this, I will post a counterargument by a colleague, which I will ask you to also show you have understood. Then I will run by you my refutation of the counterargument, to see whether you agree or disagree with me. Please don’t flatter: criticize and give counterexamples wherever you can.

ChatGPT4:

The Symbol Grounding Problem posits that for symbols (words) to be meaningful, they must be grounded in something beyond mere symbolic references—specifically, in sensorimotor experiences that connect symbols to the physical world. Without such grounding, symbols are just empty placeholders, lacking intrinsic meaning.

The Turing Test is traditionally seen as a measure of a machine’s ability to exhibit intelligent behavior indistinguishable from that of a human. However, passing the Turing Test (achieving Turing-scale capabilities) calls for more than just simulating human-like responses; it, you argue, requires genuine understanding through grounded cognition.

Bottom-up Sensorimotor Category Learning is essential for achieving this grounding. Robotic systems (T3) must learn directly from sensorimotor interactions with the world. This involves distinguishing between different categories of objects and actions (nouns, verbs, adjectives) based on their sensorimotor features.

Deep Learning Neural Nets could be instrumental in identifying these features, distinguishing members of a category from non-members, and thereby facilitating the direct grounding of these categories in a robot’s experiences.
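
For concreteness, here is a minimal sketch of that kind of supervised feature learning: a single logistic unit trained on a made-up two-feature “mushroom” dataset (NumPy only; the feature names, data, and learning rule are illustrative assumptions, and a real deep net would have to discover the distinguishing features itself rather than being handed them):

```python
# A minimal sketch of supervised category learning: one logistic unit learns
# which (hypothetical) sensorimotor features distinguish category members
# ("edible") from non-members. Data and feature names are made up.

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical features: [cap_brightness, odor_strength]; label 1 = edible
members     = rng.normal([0.8, 0.2], 0.1, size=(50, 2))
non_members = rng.normal([0.3, 0.7], 0.1, size=(50, 2))
X = np.vstack([members, non_members])
y = np.concatenate([np.ones(50), np.zeros(50)])

w, b = np.zeros(2), 0.0
for _ in range(500):                      # plain gradient descent on log-loss
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    grad = p - y
    w -= 0.1 * (X.T @ grad) / len(y)
    b -= 0.1 * grad.mean()

print("learned feature weights:", w)      # which features carry the category
print("training accuracy:", ((1 / (1 + np.exp(-(X @ w + b))) > 0.5) == y).mean())
```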

Forming a MinSet for a Complete Dictionary Graph involves identifying a minimal set of grounded categories — a minimal feedback vertex set — from which all other words in the language can be defined. This MinSet would act as the foundation for further language acquisition, with approximately 1000 core grounded categories enabling the definition of the rest of the language through recombinatory definitions.
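
To make the MinSet idea concrete, here is a minimal sketch over a toy, hypothetical seven-word dictionary (the entries and the brute-force search are illustrative assumptions, not the feedback-vertex-set algorithms used on real dictionary graphs, where the problem is NP-hard): each word’s definition uses other words in the same dictionary, and the smallest set of directly grounded words from which every other word becomes definable plays the role of the MinSet.

```python
# A minimal sketch of the "MinSet" idea on a toy dictionary in which each word
# is defined using other words in the same dictionary. Entries are made up.

from itertools import combinations

# word -> set of words used in its definition (hypothetical entries)
dictionary = {
    "zebra":   {"horse", "striped"},
    "horse":   {"animal", "legs"},
    "striped": {"dark", "light"},
    "animal":  {"legs", "dark"},   # contrived circularity, so some words must be grounded directly
    "legs":    {"animal"},
    "dark":    {"light"},
    "light":   {"dark"},
}

def definable_from(grounded, dictionary):
    """Close a grounded word-set under indirect verbal grounding (IVG):
    a word becomes definable once every word in its definition is available."""
    known = set(grounded)
    changed = True
    while changed:
        changed = False
        for word, defwords in dictionary.items():
            if word not in known and defwords <= known:
                known.add(word)
                changed = True
    return known

words = list(dictionary)
# Smallest set of directly grounded words from which all the rest follow by IVG.
for size in range(len(words) + 1):
    minsets = [set(c) for c in combinations(words, size)
               if definable_from(c, dictionary) == set(words)]
    if minsets:
        print(f"MinSet size {size}; one MinSet: {sorted(minsets[0])}")
        break
```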

Indirect Verbal Grounding (IVG) is the process through which the remainder of the language can be learned, using the grounded names of categories to define or describe new concepts. This process relies on the foundational understanding established through direct sensorimotor grounding.

Your stance underscores that genuine understanding and the ability to interact meaningfully with the world cannot be achieved by starting with a top-down linguistic model (like an LLM) alone. Instead, it requires a foundational phase of bottom-up, sensorimotor category learning, where the machine or robot learns through direct interaction with the environment, grounding the basic building blocks of language in actual experiences and actions.

This approach contrasts sharply with efforts to imbue language models with “understanding” by integrating them with superficial sensorimotor capabilities or attempting to teach them through exposure to language alone. Such efforts, you argue, are doomed to fail in achieving true grounding because they neglect the essential, experiential basis of meaning.

Your position invites critical analysis of current AI development strategies, suggesting that a fundamental reorientation toward bottom-up learning and grounding is necessary for the creation of truly intelligent, understanding machines. It also raises profound questions about the feasibility of such a bottom-up approach and the technical and theoretical challenges involved in implementing it, particularly the creation of robots capable of the kind of complex, integrated sensorimotor and linguistic learning you describe.

SH:

Here is the colleague’s critique and counter-proposal. Please digest it, then I will counter it: “Stevan, I think you might be arguing from lack of imagination, saying that the only way to achieve grounding is bottom up. There may be more than one route to groundedness. For example, what if a chatbot with interactive (e.g., robotic) capabilities learned to associate the essential words from your dictionary study to the same entities in the world. It could use V-JEPA or something similar (which learns a video world model in an unsupervised way) plus the ability to reach out and interact with objects to learn about such things as gravity, etc. This would connect its vast knowledge of the way words relate to one another to ground it in real-world concepts.”

ChatGPT4

Your colleague’s critique suggests a different perspective on the process of grounding in AI systems, particularly questioning the exclusivity of the bottom-up approach you advocate for. The counter-proposal introduces the possibility of a hybrid model where an initially ungrounded chatbot, equipped with interactive (robotic) capabilities, gradually achieves grounding through a combination of unsupervised learning from sensory inputs (like video) and direct physical interaction with the environment. Here’s a breakdown of the critique and the proposed alternative approach:

Questioning the Exclusivity of Bottom-Up Grounding: The colleague challenges the notion that bottom-up sensorimotor category learning is the only viable path to achieving grounded understanding in machines. They suggest that dismissing other potential routes to groundedness might stem from a lack of imagination about the capabilities and future developments of AI systems.

Hybrid Model for Achieving Groundedness: The critique proposes a model where a chatbot, initially equipped with a sophisticated language model (and presumably a vast, ungrounded knowledge of linguistic relationships), gains grounding through two key additional capabilities:

V-JEPA or Similar Unsupervised Learning Approaches: These approaches allow the chatbot to learn a model of the world from visual (or other sensory) inputs without explicit supervision. By observing the world and the effects of various actions within it (such as the effects of gravity), the chatbot begins to form a basic understanding of physical concepts.

Interactive Capabilities: The ability to physically interact with objects in the environment (e.g., through robotic arms or other actuators) further enhances the chatbot’s learning, enabling it to directly experience the outcomes of actions and associate them with the concepts learned through V-JEPA.

Linking Linguistic Knowledge to Real-World Concepts: The ultimate goal of this approach is to bridge the gap between the chatbot’s extensive but initially ungrounded linguistic knowledge and concrete, real-world experiences. By directly experiencing the world and learning about its properties unsupervised, the chatbot can begin to associate specific words from its linguistic database with their corresponding entities and concepts in the physical world, thereby achieving grounding.

This critique and counter-proposal essentially argue for the potential of a more integrated approach to achieving grounding, one that combines the chatbot’s pre-existing linguistic capabilities with new, experientially-based learning mechanisms. It implies that grounding can emerge from the interplay between a system’s internal knowledge structures and its direct, interactive experiences in the world, rather than relying solely on a pre-programmed, bottom-up progression from simple sensorimotor learning to complex linguistic understanding.

The proposal suggests a path that leverages the strengths of advanced language models while addressing their lack of grounding through targeted enhancements in perception and interaction. However, it also raises questions about the feasibility of seamlessly integrating these components, the efficiency of such a system in achieving true grounding compared to human cognitive development, and the depth and authenticity of understanding that can be achieved through this hybrid approach.

SH:

(1) We don’t “associate” words to “entities”; we first learn to categorize inputs (whether entities or features) by learning to detect directly the sensorimotor features that distinguish category members from non-members (or members of other categories). (This is called direct sensorimotor grounding, DSG.) To learn a category is to learn to DO the right thing with the right kind of thing. (Eat the edible mushrooms and not the poisonous ones.) That DOing is not initially, or primarily, to NAME the category (“edible”) (though, with language, that eventually becomes one of the main things you do).

(2) Once you have learned to categorize (and manipulate) inputs that way, you’ve got the category, and you’re free to assign a name to it, and the name is grounded. But that’s all direct, bottom-up, sensorimotor learning (DSG), until you’ve got enough sensorimotor categories (as well as enough of their sensorimotor features, which are also categories) grounded to be able to combine them into verbal definitions or descriptions that describe or define more abstract, composite categories by their features in (grounded) feature-words (IVG).

(3) IVG is now indirect, but still bottom-up verbal learning. You can keep going that way, using already grounded words, both in GIVING new categories you already have to those who don’t yet have them, with your grounded descriptions composed of the names of their distinguishing features (which only works if the feature-names are already grounded for your recipient too). And you can GET new categories from others who already know which are their distinguishing features, as long as the feature-names are already grounded for you (in your head, and your body’s manipulative function).

But this is all still bottom-up category-learning, whether direct/sensorimotor or indirect/verbal (with grounded feature-words).

(4) Now start with a sensorimotor robot that has optical, acoustic, tactile and kinesthetic sensors and effectors (and neural nets that can learn to detect features through unsupervised and supervised learning) and walk me through how you’d do that top-down with an LLM in the robot’s head.

(5) Prediction: it won’t help to sink down fly-fishing lines from skyhooks way up in the LLM, down to “entities” in the world. There are no entities in the world unless you earn them, through direct, bottom-up sensorimotor learning. That’s my T3 robot story. Your job, dear colleague, is to explain what help is the LLM to the T3 robot? It’s like giving a preverbal robot (or infant) a reference manual that can answer any question about what’s what, including every category’s distinguishing features, but those words don’t mean a thing to the robotlet until it starts grounding them – bottom-up.

(6) The ChatGPT exchange I sent you yesterday noted that an LLM lodged as an oracle inside a T3-wannabe robot’s head might help speed things up once it’s reached the stage of indirect verbal grounding. That’s fine. But the robot has a long way to go to get there. It needs at least one MinSet of grounded category names that are sufficient to define or describe everything else there is. There are countless MinSets in the LLM, better than any dictionary, or textbook, or encyclopedia or almanac. But they’re all ungrounded and useless until the T3 robot does its direct bottom-up grounding to initialize them. They’re unconnected with the world. They can talk the talk but they can’t walk the walk.

(7) So you may as well throw out the LLM and just do the robotic T3 bottom-up grounding until you’ve reached a MinSet – which, even if it’s just 1000 grounded words, still requires a much larger and longer cumulative history of direct nonverbal (sensorimotor) category-learning before the robot is able to start grounding new categories verbally. That’s the time to haul in the LLM (where, in the past, parents and teachers and books and TV used to serve as the oracle). Before that, the LLM is as useless as a bible to a baby.

But I’m listening. Just be careful not to trip when you answer about the top-down grounding!

ChatGPT4

Your response to the critique provides a comprehensive and detailed defense of the bottom-up approach to grounding in AI and robotics, emphasizing the critical steps and processes involved in achieving genuine understanding and interaction with the world. Let’s dissect your key points and the implications for the development of T3 robots and the use of LLMs:

Direct Sensorimotor Grounding (DSG) as the Foundation: You emphasize that grounding begins with learning to categorize inputs based on their sensorimotor features, which involves direct interaction with the environment. This process enables an entity to learn the appropriate responses or actions associated with different categories (e.g., eating edible mushrooms while avoiding poisonous ones). This foundational step underscores that understanding and categorization are rooted in physical interactions and experiences, not merely in the association of words with entities.

Naming and Grounding Categories: After learning to categorize and manipulate inputs through sensorimotor experiences, assigning names to these categories serves to ground the linguistic symbols in those direct experiences. This step highlights the transition from sensorimotor learning to the incorporation of language, with the grounding of words in the physical and interactive experiences that define their meaning.

Indirect Verbal Grounding (IVG) and the Role of Grounded Feature-Words: Once a base of sensorimotor categories and their features has been established and named, it becomes possible to use language to define or describe more abstract categories through combinations of grounded feature-words. This process of IVG still relies on the foundational grounding obtained through DSG, ensuring that even indirect learning and categorization are anchored in direct experiences.

Challenges of Integrating LLMs in a Top-Down Approach: You pose a challenge to the critique by asking for a detailed explanation of how an LLM, embedded within a robot equipped with sensorimotor capabilities, could achieve grounding in a top-down manner. Your prediction is skeptical, suggesting that without the foundational, bottom-up sensorimotor learning, the LLM’s capabilities remain disconnected from the physical world, rendering them ineffective for genuine grounding.

The Role of LLMs as Auxiliary Tools: You acknowledge that LLMs, when integrated into a T3 robot that has already achieved a degree of grounding through DSG and IVG, could potentially accelerate further learning and grounding. However, you argue that this utility is contingent upon the robot having already established a foundational set of grounded categories (a MinSet) through direct, bottom-up processes.

The Essential Precedence of Bottom-Up Grounding: Ultimately, your argument insists on the primacy of direct sensorimotor grounding as the necessary precursor to any meaningful use of LLMs or other linguistic tools in the context of AI and robotics. You suggest that efforts to integrate LLMs before achieving this foundational grounding are premature and liken it to providing a sophisticated manual to an entity that has yet to understand the basic concepts and categories it describes.

Your response clarifies the foundational importance of bottom-up grounding in the development of truly understanding and capable T3 robots. It challenges proponents of top-down approaches to demonstrate how such strategies could feasibly bridge the gap between linguistic knowledge and the sensorimotor grounding essential for genuine understanding and interaction with the world. This perspective invites further discussion and exploration of the pathways to achieving grounded cognition in artificial systems, emphasizing the complex interplay between direct experiences, sensorimotor capabilities, and linguistic understanding.

SH:

Ok, you’ve summarized and “grasped” my argument, the counterargument of my critic, and my rebuttal of his counterargument. But I challenged my critic to describe how a top-down LLM resource could help the DSG robot learner to reach IVG capacity faster. Can you meet that challenge? (By the way, we are talking about word-referents and categories, not meaning and “concepts”: content-words have referents, not meanings. Only subject-predicate propositions (such as definitions and descriptions) have meanings.) But I’m listening.


ChatGPT4

Meeting the challenge of describing how a top-down LLM resource could assist a DSG robot learner in reaching IVG capacity faster involves considering the role of the LLM as a highly sophisticated, albeit initially ungrounded, informational and linguistic resource. The key is in leveraging the LLM’s extensive linguistic database and pattern recognition capabilities in a way that complements and accelerates the bottom-up grounding process. Here’s a potential pathway:

Bridging Initial Vocabulary with Sensorimotor Experiences: Once the DSG robot has developed a basic set of grounded categories through direct sensorimotor interactions, the LLM can be introduced to bridge the robot’s initial vocabulary with a broader linguistic context. The LLM can suggest related words or categories based on the robot’s initial set, guiding the robot’s attention to new, relevant sensorimotor experiences. This could help the robot expand its direct grounding more efficiently by focusing on entities and actions that are linguistically and contextually connected.

Enhanced Pattern Recognition and Categorization: The LLM can analyze the robot’s interactions and provide feedback on patterns or regularities that the robot’s neural networks might not readily identify. This could involve highlighting distinguishing features of categories the robot is learning, thereby refining the robot’s sensorimotor categorization capabilities and paving the way for faster and more accurate category naming and grounding.

Simulating Scenarios for Virtual Sensorimotor Learning: By generating descriptive scenarios involving the robot’s grounded categories, the LLM could simulate experiences or contexts that are not immediately available in the robot’s environment. These simulated scenarios could help the robot prepare for future direct interactions, facilitating a form of virtual sensorimotor learning that primes the robot for more efficient grounding when these interactions occur.

Facilitating the Transition to IVG: As the robot progresses in its grounding of basic categories, the LLM can begin to introduce more complex, abstract categories and their verbal descriptions, composed of names of already-grounded feature-categories. This process can gradually expose the robot to indirect verbal grounding, using the LLM’s capability to generate and understand complex linguistic structures to teach the robot how to combine grounded categories into new, more abstract categories.

Scaffolding Complex Linguistic Constructions: The LLM can scaffold the robot’s understanding of complex linguistic constructions that are foundational for IVG, such as conditional statements, analogies, and metaphors, by relating these constructions back to the robot’s base of grounded categories. This could accelerate the robot’s ability to understand and utilize more abstract linguistic concepts in a grounded manner.

Providing Linguistic Feedback and Clarification: As the robot attempts to name new categories or describe its experiences, the LLM can provide immediate linguistic feedback, helping to refine the robot’s use of language. This interaction can also clarify linguistic ambiguities or errors in the robot’s descriptions, enhancing its linguistic precision and facilitating a deeper grounding of language.

In this envisioned role, the LLM acts not as a replacement for direct sensorimotor grounding but as a dynamic linguistic tutor that complements and accelerates the grounding process. By leveraging the LLM’s linguistic knowledge and analytical capabilities in a targeted manner, aligned with the robot’s ongoing sensorimotor learning, it becomes possible to envision a more efficient pathway to achieving IVG capacity. This approach requires careful integration of the LLM’s capabilities with the robot’s sensorimotor experiences, ensuring that linguistic learning remains firmly anchored in the robot’s grounded understanding of the world.
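
As one concrete, purely hypothetical illustration of the “dynamic linguistic tutor” role sketched above: once the robot has some grounded vocabulary, the LLM could be constrained to offer only those verbal definitions whose content words are already grounded for that particular learner, so that a new category is acquired by IVG rather than left ungrounded. A minimal sketch (the words, candidate definitions, and selection rule are all made up for illustration):

```python
# A minimal sketch of one way an LLM tutor might help a partly grounded robot:
# offer only a definition whose content words are already grounded for the
# robot, so the new category can be acquired by IVG. All data are hypothetical.

robot_grounded = {"horse", "shaped", "striped", "animal", "africa"}

# Imagined LLM outputs for "zebra", ranked best first
candidate_definitions = [
    ["equid", "with", "black", "white", "stripes"],   # uses ungrounded words
    ["horse", "shaped", "striped", "animal"],         # fully grounded
]

def usable_for_ivg(definition, grounded_vocab):
    """A definition can ground a new word only if its content words are grounded."""
    return all(word in grounded_vocab for word in definition)

for definition in candidate_definitions:
    if usable_for_ivg(definition, robot_grounded):
        print("Teach 'zebra' as:", " ".join(definition))
        break
else:
    print("No usable definition yet; more DSG needed first.")
```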

SH:

(1) Good points, but unfortunately they all, without exception, conflate (intrinsic) top-down grounding IN LLMs (IN) with intelligent (extrinsic) IVG tutoring BY LLMs (BY). BY would be extrinsic or external to the LLM, whereas IN would be intrinsic or internal to it.

(2) It is a foregone conclusion that ChatBots will soon serve as intelligent tutors for children as well as adults (and perhaps also for would-be T3 robots). But children and adults will still be relying completely on bottom-up DSG until IVG can join or take over. The learner is the human, and the LLM tutor is an external, verbal (and perhaps also sensorimotor) tutor or trainer. This is not top-down grounding by LLMs. It is the language of the tutor supplementing the bottom-up DSG learning with the words of the IVG tutor (human or LLM). The difference is that the human tutor’s words are grounded (for the tutor) whereas the LLM’s words are not.

It remains bottom-up DSG plus (once language kicks in) bottom-up IVG.

(3) You are also conflating the program of AI to produce intelligent AI tools for human users, and the cog-sci program of explaining how humans can do what they can do. Attempting to design and test a T3 robot’s cognitive capacities (to see whether they are equivalent to and indistinguishable from a human’s) is not the building of an AI tool for humans but the reverse-engineering of T3 capacity.

(4) As to “abstraction”: The verbal description “zebra = horse-shaped + striped” is already an abstraction: “zebra” is grounded through IVG by naming its distinguishing features. This works if the features are already grounded for the learner. (Normally they are also grounded for the tutor, but that is not necessary: the right words are enough.) “Democracy = leader chosen by vote” works the same way. Once it’s IVG it’s all abstraction.
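
The condition in (4) can be made explicit as a simple recursion: a word is grounded for a given learner if it was grounded directly (DSG), or if every feature-word in its definition is itself grounded (IVG). A minimal sketch, with a hypothetical learner vocabulary:

```python
# A minimal sketch of the groundedness condition in (4): a word is grounded for
# a learner if it was grounded directly (DSG) or if every feature-word in its
# definition is itself grounded. Vocabulary and definitions are hypothetical.

dsg_grounded = {"horse-shaped", "striped", "leader", "vote"}   # learned by DSG

definitions = {                        # candidate IVG definitions
    "zebra": ["horse-shaped", "striped"],
    "democracy": ["leader", "chosen", "vote"],
}

def grounded(word, seen=frozenset()):
    if word in dsg_grounded:
        return True
    if word in seen or word not in definitions:
        return False                   # undefined or circular: not grounded yet
    return all(grounded(w, seen | {word}) for w in definitions[word])

print(grounded("zebra"))       # True: both feature-words are already grounded
print(grounded("democracy"))   # False: "chosen" is not yet grounded for this learner
```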

ChatGPT4

Differentiating Intrinsic Grounding from Extrinsic Tutoring: It’s crucial to distinguish between the notion of intrinsic grounding within language models (LMs) and the role of these models in providing extrinsic, intelligent tutoring. Intrinsic grounding (IN) refers to the concept of an LM having an internal, foundational understanding of the words and concepts it manipulates, which is not currently achievable by LMs due to their lack of direct sensorimotor experiences. Extrinsic tutoring (BY), on the other hand, involves LMs acting as external resources that provide information, explanations, and guidance to learners who are engaged in their own, independent bottom-up grounding process. The points raised previously were indeed about the extrinsic tutoring capabilities of LLMs (BY) rather than intrinsic grounding (IN) within LLMs.

Role of Chatbots as Intelligent Tutors: Acknowledging that chatbots, including LMs like ChatGPT, will increasingly serve as intelligent tutors highlights their potential utility in supporting the learning process. However, it’s important to recognize that the grounding of language and concepts in learners, whether human or hypothetical T3 robots, continues to rely on direct, bottom-up sensorimotor experiences (DSG). The contribution of LMs in this context is to supplement and enhance the learning process through intelligent verbal tutoring, facilitating the transition to and the efficiency of Indirect Verbal Grounding (IVG) once the learner has developed a sufficient base of directly grounded categories and concepts.

Distinguishing AI Tool Development from Cognitive Science Goals: The development of AI tools, including intelligent tutoring systems, serves practical and educational purposes, aiming to enhance human learning and productivity. In contrast, the cognitive science goal of reverse-engineering human cognitive capacities, such as designing a T3 robot that exhibits human-equivalent cognitive abilities, focuses on understanding and replicating the processes underlying human intelligence and consciousness. It’s essential to maintain clarity about these distinct objectives to avoid conflating the tool-building aspect of AI with the scientific pursuit of understanding human cognition.

Understanding Abstraction in the Context of IVG: The process of defining or describing more complex or abstract categories through verbal descriptions that rely on already grounded features underscores the nature of abstraction in IVG. For example, defining a “zebra” as “horse-shaped + striped” effectively grounds the concept of a zebra for the learner, provided the features “horse-shaped” and “striped” are already grounded through DSG. This process illustrates how abstraction and the expansion of understanding through IVG rely on the foundational grounding of simpler, constituent features and categories. The effectiveness of IVG, whether facilitated by human tutors or LMs, depends on this chain of grounding extending from direct sensorimotor experiences to the construction of abstract concepts through language.

Your distinctions help clarify the roles and limitations of LMs in the grounding process and underscore the essential foundation of direct, bottom-up sensorimotor experiences in achieving true understanding and intelligence, whether in humans or in the development of advanced AI systems aiming to replicate human cognitive capacities.

SH:

So where does this leave the notion that an LLM (e.g., GPT4) could itself be grounded top-down with some sensorimotor add-ons? Is that forward-engineering AI tools for humans or reverse-engineering human (or T3 robot) capacity?

ChatGPT4

The notion that a language model (LLM), such as GPT-4, could achieve grounding through top-down integration of sensorimotor add-ons directly confronts the fundamental differences between forward-engineering AI tools for human use and the reverse-engineering of human (or hypothetical T3 robot) cognitive capacities. Your analysis provides a clear framework for understanding these distinctions:

Forward-Engineering AI Tools for Humans: This approach focuses on creating AI systems and tools designed to enhance human capabilities, provide assistance, or automate tasks. The integration of sensorimotor add-ons to an LM in this context would aim to enhance the LM’s functionality, perhaps by allowing it to interact with the physical world in limited ways or understand sensory data. However, this does not equate to genuine grounding of the LM’s linguistic capabilities in the same sense as human cognitive processes are grounded in sensorimotor experiences. Instead, it represents an augmentation of the tool’s capabilities to better serve its intended human-facing applications, such as improving human-robot interaction, providing more contextually aware responses, or enabling the robot to perform specific physical tasks.

Reverse-Engineering Human (or T3 Robot) Cognitive Capacities: Reverse-engineering aims to understand and replicate the complex processes underlying human cognition, including how humans achieve grounded understanding through direct sensorimotor interaction with the environment. Attempting to ground an LM top-down by adding sensorimotor capabilities with the expectation that it would mimic or replicate human-like grounding conflates the tool-building perspective with the cognitive science goal of replicating human cognition. Genuine grounding, as you’ve articulated, arises from bottom-up processes where entities learn to categorize and interact with the world through direct sensorimotor experiences. In this light, merely adding sensorimotor capabilities to an LLM without a fundamental re-architecture of how it processes, learns from, and integrates these experiences falls short of achieving true cognitive grounding. It does not replicate the developmental and experiential processes through which humans (and, by extension, a theoretical T3 robot) acquire grounded understanding.

Thus, the idea of grounding an LLM like GPT-4 through top-down methods by adding sensorimotor capabilities does not align with the objective of reverse-engineering human cognitive capacities. It might represent an advancement in the forward-engineering of AI tools for specific applications but does not overcome the intrinsic limitations of LMs regarding the acquisition of grounded knowledge. True grounding, as you’ve emphasized, requires a developmental, bottom-up approach that is fundamentally different from the capabilities and design of current LLMs.

Chatting with ChatGPT-4 About Symbol Grounding, Turing Testing, Zombies, Uncomplementable Categories, and Why the Hard Problem is Hard

SH: Do you still remember my name?

ChatGPT: Yes, you’ve identified yourself as Stevan Harnad. Can I assist you with anything else?

SH:  What is T4?

ChatGPT:    T4, as you referred to it earlier in this conversation, represents the highest level of the Turing Test. In this context:

T2 is indistinguishability in verbal capacity. This means a system passes T2 if it can interact through language at the same level as a human, with its language use being indistinguishable from that of a human.

T3 is indistinguishability in verbal capacity plus sensorimotor (robotic) capacity. If a system passes T3, not only can it use language like a human, but it can also interact with the physical world in ways that are indistinguishable from a human, such as recognizing a cat, identifying a mat, and understanding and performing the action of stroking the cat.

T4, then, is indistinguishability in verbal capacity plus sensorimotor (robotic) capacity plus observable processes inside its head and body. To pass T4, a system must be able to use language like a human, interact with the physical world like a human, and also demonstrate internal processes that are indistinguishable from those of a human. This might involve simulating or replicating human neural activity, biochemical reactions, or other physiological processes.

It’s worth noting that the T4 level of the Turing Test is theoretical and is far beyond current AI capabilities as of the last update to my training data in September 2021. The T4 level, if it could be achieved, would represent an AI system that is almost entirely human-like in its capabilities and processes.

SH:  Could a T4 be a zombie?

ChatGPT:    This question is referring to a philosophical thought experiment known as the “philosophical zombie” or “p-zombie.” A philosophical zombie is a hypothetical being that is physically and behaviorally identical to a conscious human, but is not conscious. It has no subjective experience, no feelings, and no awareness, yet from an external perspective, it behaves just like a human.

In terms of the Turing Test levels you’ve proposed, a T4 AI — one that’s indistinguishable in verbal capacity, sensorimotor capacity, and internal processes — could theoretically be considered a sort of “zombie” in this philosophical sense. It could potentially behave like a human, interact with the physical world like a human, and even mimic human internal processes, but we have no way of knowing whether it has subjective experiences or consciousness.

This is a topic of much debate in philosophy of mind and AI ethics. Some people argue that consciousness is an emergent property of certain types of complex information processing, and that sufficiently advanced AI could be conscious. Others argue that consciousness is a uniquely human phenomenon, or that it’s tied to specific biological processes that AI can’t replicate.

As of my training data cut-off in September 2021, there’s no scientific consensus on these issues. AI researchers are still a long way from creating an AI that could pass even the T3 level of the Turing Test, let alone T4. Furthermore, the nature of consciousness is still one of the biggest unsolved mysteries in both philosophy and neuroscience. So for now, the question of whether a T4 AI could be a “zombie” remains largely theoretical and speculative.

SH:  If the “hard problem” is to explain how and why humans are sentient, is that equivalent to explaining how and why humans cannot be zombies?

ChatGPT:    Yes, the “hard problem” of consciousness, as put forth by philosopher David Chalmers, is essentially concerned with explaining why and how subjective experience arises from physical processes in the brain. This is fundamentally linked to the question of why humans aren’t philosophical zombies.

A philosophical zombie, as I mentioned earlier, is a being that is indistinguishable from a human in all physical and behavioral aspects but lacks subjective experience or consciousness — it doesn’t “feel” anything. In other words, a philosophical zombie is a creature for which the “hard problem” doesn’t apply because, while it has all the physical processes associated with consciousness, it lacks the actual experience of consciousness.

So, yes, solving the “hard problem” would involve explaining why humans have subjective experiences and are not philosophical zombies — why there is something it “feels like” to be a human. As of my knowledge cutoff in September 2021, this problem remains unsolved and is a major topic of debate in fields such as cognitive science, neuroscience, and philosophy of mind.

SH:  So, does the hard problem seek an explanation in terms of causality or necessity?

ChatGPT:    The “hard problem” of consciousness seeks to understand both necessity and causality, although it is primarily concerned with the latter.

Causality: The primary focus of the hard problem is to determine how physical processes in the brain cause or give rise to conscious experience. It’s about understanding the causal mechanisms that turn electrochemical processes in the brain into a subjective experience.

Necessity: The hard problem also asks why subjective experience is necessary at all. It wants to know why physical processes in the brain couldn’t just operate in the absence of conscious experience. This gets into questions about whether consciousness has any sort of evolutionary or practical necessity, or whether it’s an epiphenomenon.

To sum up, solving the “hard problem” would ideally involve explaining both how and why consciousness arises from physical processes. This is a complex and as yet unresolved issue that continues to challenge researchers in various disciplines, from neuroscience and cognitive science to philosophy and artificial intelligence.

SH:  Demonstrating causality just calls for a causal model that works (e.g., passing the Turing Test). But doesn’t demonstrating necessity call for more — something like a deductive proof, from T4 as a premise to sentience as a conclusion?

ChatGPT:    You’re right that demonstrating necessity would require a different sort of argument than demonstrating causality.

For causality, empirical evidence is often sufficient: if we manipulate one factor and can predictably observe a change in another factor, we can infer a causal relationship. But empirical evidence alone cannot establish necessity. For something to be necessary, it must be the case that it could not have been otherwise. So, to demonstrate that consciousness is necessary, we would need to show that a being could not perform the functions that humans do without being conscious.

As you suggest, this would likely require something akin to a logical or deductive proof, or perhaps a theoretical argument grounded in principles of cognitive science, artificial intelligence, or another relevant field. But even then, it would be a tall order. How do we prove that an unconscious being (a “zombie”) couldn’t perform the functions that conscious humans do? It’s a deep and difficult question.

Moreover, any such argument would need to be very careful to avoid circularity. We must be cautious about assuming the thing we are trying to prove – that is, that consciousness is necessary. It’s also crucial to clarify what we mean by “perform the functions that humans do.” This could include a wide range of activities, from basic biological functions to complex cognitive tasks like problem-solving and artistic creation, and even social and cultural activities like forming relationships and building societies.

Finally, it’s worth noting that even if we could establish the necessity of consciousness, this would not automatically solve the “hard problem.” We would still need to understand how physical processes in the brain give rise to subjective experience – the crux of the hard problem.

SH:  What is the “easy problem”?

ChatGPT:    The “easy problem” of consciousness refers to the challenge of understanding the mechanisms in the brain that are responsible for mental processes such as perception, learning, memory, and cognition. This phrase was coined by philosopher David Chalmers to contrast with the “hard problem” of consciousness, which is the question of why and how these processes are accompanied by subjective experience or “qualia”.

The easy problem is considered “easy” not because it is simple or trivial — far from it, as the complexities of the brain and its functions are vast — but because it falls within the conventional scientific methodology. Researchers can approach these issues using the experimental tools of cognitive science and neuroscience, making observations, generating hypotheses, and testing these hypotheses through empirical studies.

Solving the easy problems would involve understanding the neural correlates of consciousness (the physical, biological processes associated with conscious experience), the ways in which the brain processes information, how it perceives the world, makes decisions, and how different parts of the brain and different mental processes interact.

Although we have made significant progress in understanding these areas, there are still many unanswered questions. Despite the label “easy,” these problems are extremely challenging, and full solutions may still be a long way off.

On the other hand, the “hard problem” of consciousness is considered “hard” because it’s not clear how we could even theoretically explain subjective experience using the current methods of science. It’s the question of how and why certain physical processes in the brain give rise to conscious experience. Why should the processing of information in the brain be accompanied by an inner, subjective experience? That is the question which we do not currently have a method to approach scientifically, and that’s why it’s called the “hard problem.”

SH:   So can we say that passing T4 would solve the easy problem, but not the hard problem?

ChatGPT:    Yes, that would be a reasonable statement.

Turing’s Test, including its extension to T4, focuses on the behavioral and functional aspects of intelligence and cognition. It deals with the ability of an artificial intelligence to mimic human-like behavior, responses, and problem-solving, which are related to the ‘easy problems’ of consciousness in David Chalmers’ terms.

In other words, if an AI were to pass T4, it would demonstrate that it can convincingly replicate the cognitive abilities and behaviors that humans display, which means it would show a mastery of the functional, operational aspects of human cognition. These are, in essence, what the ‘easy problems’ of consciousness are about: understanding the mechanisms and operations that underpin cognition.

However, passing T4 wouldn’t necessarily address the ‘hard problem’ of consciousness. This problem involves explaining why and how these cognitive processes are associated with subjective experience. It’s the question of how and why an entity that processes information and interacts with its environment in a human-like way should also have an internal, subjective viewpoint — why it should ‘experience’ or ‘feel’ anything at all. This is not something that could be determined or demonstrated by passing the T4 test.

SH:  What are the “degrees of freedom” of a causal explanation that solves the easy problem?

ChatGPT:    The “degrees of freedom” in a causal explanation generally refer to the number of independent ways that components of the system can change or vary. In the context of solving the easy problem of consciousness, these degrees of freedom would likely correspond to the different independent mechanisms or components of cognition that can be isolated, described, and manipulated to understand their effects on the overall system.

For example, one could consider the mechanisms responsible for attention, memory, learning, decision-making, language processing, and so on. Each of these elements can be studied and manipulated independently, and they each have distinct effects on the behavior and performance of the cognitive system. Understanding how these different mechanisms contribute to and interact within the larger cognitive system is part of the process of addressing the easy problem.

The degrees of freedom in this context also refer to the flexibility we have in constructing our explanatory models. For example, there are various ways we could model attention or memory, each with its own strengths and weaknesses and its own implications for the overall understanding of cognition. The choice among these models depends on empirical evidence, but also on theoretical considerations such as simplicity, coherence with other accepted theories, and so on.

This flexibility or plurality of potential explanations is a fundamental characteristic of science and it’s part of what makes progress possible. By exploring different explanatory routes and by gradually refining our models based on empirical feedback, we increase our understanding of the phenomena we’re studying. In the case of the easy problem of consciousness, this means achieving a better and more detailed understanding of the mechanisms of cognition.

SH:  If the easy problem is solved, what degrees of freedom are left for solving the hard problem?

ChatGPT:    The “hard problem” of consciousness, according to philosopher David Chalmers, pertains to the question of why and how physical processes in the brain give rise to subjective experience – the so-called “qualia” or what it feels like to have an experience.

If the “easy problem” of consciousness were solved – that is, we had a complete and thorough understanding of all cognitive processes and mechanisms, how they are implemented in the brain, and how they give rise to behavior – we would still be left with the challenge of the “hard problem”: explaining the relationship between these physical processes and subjective experience.

The degrees of freedom left for solving the hard problem might include the following:

1. Conceptual Frameworks: The hard problem may call for entirely new conceptual frameworks, as the existing frameworks (derived primarily from objective, third-person scientific methods) may not be sufficient to account for subjective experience.

2. Bridging Principles: We may need new principles to connect the objective facts (easy problems) to the subjective experiences (hard problem). These would be new laws or principles of nature, not currently accounted for in our existing scientific paradigms.

3. Novel Empirical Phenomena: If there are aspects of consciousness that are not addressed by the solutions to the easy problems, these would represent additional degrees of freedom for exploring solutions to the hard problem. These could involve unexplored aspects of brain function, or even new physical phenomena.

4. Interpretation of Quantum Mechanics: Some theories suggest that the answer to the hard problem might involve the interpretation of quantum mechanics and its relationship to consciousness. This is a highly speculative area and not widely accepted, but it represents a degree of freedom in our approach to the hard problem.

It’s important to note, however, that these “degrees of freedom” are speculative. There is no consensus on what would actually constitute a solution to the hard problem, and indeed, some theorists argue that it may be an intractable or even a pseudo-problem. Nonetheless, these are the kinds of directions that might be pursued in an attempt to address it.

SH:  1 – 4 sound extremely vague to me. The Turing Test (T2 – T4) seems to call for a causal (reverse-engineering) model or explanation of the observable performance capacities (T2-T3) as well as the observable physical structure and function (T4) of human cognition. Perhaps a stronger “easy solution” would call not just for one, but for all possible causal explanations (non-redundant, without superfluities), because just one model would be underdetermined. If there were such an exhaustive disjunction of solutions to the easy problem, what degrees of freedom would that leave for a solution to the hard problem (of “sentience”, let’s henceforth call it, without the redundant terminological variants like “consciousness” or “subjective experience,” which all refer to the same thing: the capacity to feel, i.e., sentience).

ChatGPT:    In essence, the “hard problem” of consciousness – or sentience, as you’ve suggested we call it – refers to the issue of explaining why and how certain physical processes give rise to the experience of feeling or sentience. If we had an exhaustive disjunction of solutions to the easy problem, the degrees of freedom left for a solution to the hard problem of sentience would be the difference between those physical processes and the subjective experience of feeling or sentience itself.

In other words, we might have an understanding of all the cognitive and neural processes and their relationships to behavior – the “easy problem” – but that would still leave unanswered why and how these processes result in the feeling of subjective experience or sentience. That’s the hard problem.

The degrees of freedom for this problem, then, would be the unknowns – the aspects of this question that remain unanswered even after solving the easy problem. It would involve the elusive explanatory link between the physical and the felt, between objective phenomena and subjective experience.

But to truly comprehend what those degrees of freedom might entail, we would need to fully solve the easy problem first, which we are far from achieving. As such, it’s difficult to specify what they might be. It’s like trying to predict what unknowns will remain after solving an equation when we don’t yet have the solution to that equation.

Even with the solution to the easy problem, or an exhaustive list of solutions, we may still be left with the profound mystery of why these processes are associated with sentience at all. And that might be a degree of freedom that remains beyond our grasp, at least with our current understanding and conceptual frameworks.

SH:  What about the difference we just discussed, between the (exhaustive) causal solution to the easy problem (henceforth “EP”) and the degrees of freedom left for solving the hard problem (henceforth HP): The HP is definitely not the same problem as the “other-minds problem” (henceforth OMP), which is the fact that sentience (feeling) is only observable to the feeler (as in Descartes’ Cogito). What empirical degrees of freedom are left for explaining the fact that humans feel if the empirical datum is observable only to the feeler and not to anyone else?

ChatGPT:    The Hard Problem (HP) indeed differs from the Other-Minds Problem (OMP). While the OMP concerns our inability to directly observe the subjective experiences of others, the HP is about explaining why and how certain physical processes give rise to any subjective experience at all, even in oneself.

The degrees of freedom left for solving the HP, assuming the Easy Problem (EP) is fully solved, would involve explaining this transition from physical processes to subjective experiences. However, if we regard subjective experience as something only directly accessible to the experiencer, this may limit the empirical methods typically used in science.

Nonetheless, even within these constraints, there are still ways we can approach the HP. For example, we can:

1. Investigate correlations between physical processes and subjective reports: While we can’t directly observe the subjective experiences of others, we can ask individuals to report their subjective experiences and correlate these with observable physical processes. This approach is common in neuroimaging studies of consciousness.

2. Analyze the structure of subjective experiences: Certain theories of consciousness, like Integrated Information Theory, propose that specific structures or properties of physical processes give rise to subjective experiences. These theories can be empirically tested, at least in part, by comparing their predictions with individuals’ subjective reports.

3. Use phenomenological methods: In the tradition of phenomenology, researchers aim to describe subjective experiences as precisely as possible, often through first-person reports. This can provide data about the characteristics of subjective experience that any solution to the HP would need to explain.

However, the unique nature of the HP suggests that additional or even entirely new methods might be needed to fully solve it. The subjective nature of sentience might require a paradigm shift in how we approach the question or even require us to develop new scientific methods or philosophical insights.

SH:  Let’s ignore vague, speculative theories like (2) the “integrated information theory” (which are not really causal explanations at all, but just reinterpretations of the existent facts and theories – obiter dicta, not falsifiable predictions or explanations).

And phenomenological introspections (3) are really parts of T2 and T3 – the kinds of things feelers can say about what it feels like to feel: What could an insentient zombie MEAN when describing its (nonexistent) feelings T2-indistinguishably from the way a real sentient human describes them?

I will discuss option 1, correlates, after you answer.

ChatGPT:    Yes, focusing on the first option is certainly the most grounded approach. This involves investigating correlations between physical processes and subjective reports.

In this context, a key idea is that even though we can’t directly observe the subjective experiences of others, we can use their reports of their experiences as a type of data. If an individual says they’re experiencing something, we usually take that as evidence that they are indeed having that subjective experience.

For example, if a person says they see red when we show them a red object, or that they feel pain when they injure themselves, we have a tendency to trust these reports. This trust provides a basis for mapping out the relationship between physical processes (such as brain activity, or events in the external world) and subjective experiences.

Using neuroimaging techniques, we can look at the brain activity associated with different reported experiences. For example, different patterns of brain activity are associated with different subjective experiences, such as seeing colors, feeling emotions, thinking about different concepts, and so on.

Over time, by building up a detailed map of these relationships, we can start to make inferences about the types of physical processes that might be associated with the existence of subjective experience in general.

However, there are limits to this approach. One major limit is that it’s fundamentally correlational. That is, it can show us that certain types of physical processes are reliably associated with subjective experiences, but it can’t definitively show us that these processes cause the experiences, or that the experiences wouldn’t exist without these processes. For that, we need a theory of why certain physical processes should give rise to subjective experiences, which brings us back to the hard problem.

Furthermore, these reports of subjective experiences are only as reliable as the individual’s introspective abilities and honesty. And there are many subjective experiences that can’t be easily put into words, or which occur without the individual being aware of them.

Yet despite these challenges, many researchers see this as the most promising way forward in the scientific study of consciousness. By combining these findings with theories from cognitive science, neuroscience, and philosophy of mind, the hope is to gradually develop a more complete understanding of consciousness.

SH:  You seem to have jumped the gun, and commented on option 1 (correlates of sentience) before commenting on what I said about options 2 (“integrated information theory”) and 3 (phenomenology). I’ll go on to option 1 (correlates) if you really have nothing to say about what I said about 2 and 3.

ChatGPT:    

Regarding option 2, the “Integrated Information Theory,” the “Free Energy Principle,” or other similarly overarching theories, you’re right in noting that these theories often serve to provide unifying frameworks or perspectives rather than offering direct, falsifiable predictions or explanations about specific phenomena. They may provide insights on a very general level, but they don’t necessarily give us concrete predictions or mechanisms that we can test. Their value, if any, comes from their ability to coherently organize existing data and point towards potential new avenues of investigation, rather than from their predictive or explanatory power.

As for option 3, introspective reports, I agree with your assessment that they are essentially part of the T2 and T3 tests. They are subjective, first-person reports about the experience of sentience, and the degree to which they can be relied upon for objective, third-person science is a matter of ongoing debate. The challenge, as you noted, is that even a perfectly accurate report of a subjective experience doesn’t necessarily reveal the underlying objective mechanisms. In other words, introspective reports can tell us what it is like to experience consciousness, but they don’t directly tell us how consciousness is produced or why it exists in the first place.

That said, such reports can still be valuable data for consciousness research, especially when used in combination with other types of data such as neuroimaging or behavioral measures. They can help to bridge the gap between the objective and the subjective, giving us clues about how to link the physical processes of the brain with the subjective experiences of consciousness.

I hope this addresses your comments on options 2 and 3 more adequately. Now, if you wish, we can proceed to option 1, which is about correlates of consciousness.

SH:  

A. We’ll move on to correlates (C) in a second. First, another slam at 2 (FEP and the like): We want causal explanations, not hermeneutics, i.e., not just interpretations. Philosophical “theories of mind” – “identity theory, materialism, dualism, epiphenomenalism, panpsychism, functionalism, dual-aspect theory” – are just metaphysical interpretations too. They don’t solve HP; they just rename it. They don’t explain, they just give the feeling of having had something explained.

B. I’m glad we agree that the capacity to introspect and report about what feeling feels like is part of T2 and T3, not an explanation. But I disagree that we have to worry about its reliability and validity. We can lie or we can err, misreporting feelings (just as eyewitnesses can lie or err in reporting observations — in that case confirmable by video evidence); but it is a fundamental error to think we need Cartesian certainty. Probability is enough, just as it is enough in physical theory, and all other empirical theory. (There is no proof that apples necessarily fall down rather than up.) That is why the OMP is not the HP.

C. And that is why reliable, observable neural-imaging correlates of felt states, and verbal descriptions of felt states, are perfectly fine as empirical data. That means that they, too, are part of T4. But they are stuck in a circle: they can explain how and why T4 brain activity produces the reports of felt states as well as the capacity to report felt states, even some of the details of felt states; but they cannot explain the (true) fact that the reported felt states feel like anything at all. That is not scepticism or cartesian worry about the unreliability of phenomenological report: It is the HP itself.

ChatGPT:    

A. You’re quite right that philosophical theories of mind often merely provide different labels or conceptual frameworks for consciousness rather than truly explaining it. These theories may be useful in stimulating thought or fostering discussion, but they don’t necessarily bring us closer to a solution for the HP. They often serve to highlight the complexity of the problem rather than to resolve it.

B. Your point about the sufficiency of probabilistic knowledge in empirical sciences, including consciousness studies, is well taken. Indeed, we typically do not demand absolute certainty in scientific endeavors. This underscores the point that HP isn’t just about uncertainty regarding others’ experiences (OMP), but it’s about a deeper question of why and how physical processes in the brain give rise to subjective experiences at all.

C. I agree that neural correlates of consciousness and verbal reports of subjective experiences can provide valuable empirical data. This data can help us form more precise and detailed hypotheses about the mechanisms of consciousness. However, the challenge remains that these correlates do not directly explain why or how they are associated with subjective experiences – why these particular neural processes are associated with these particular experiences, and why there is any subjective experience at all. This is the essence of the HP.

In summary, I agree with your assessment: while philosophical interpretations, introspective reports, and neural correlates of consciousness can provide useful perspectives and data, they do not solve the HP, which is the challenge of explaining why and how physical processes in the brain give rise to subjective experiences. It remains a profound mystery at the heart of cognitive science and neuroscience.

SH:  Correct. But “mysterianism” is not explanation either. It’s just yet another way of naming the HP. Explanation is informative if it reduces uncertainty rather than just naming it.

I want to get back to degrees of freedom. I don’t have a solution to the HP, of course, but I have a hunch about why the HP is hard, and perhaps even insoluble; it has to do with degrees of freedom, and it boils down to this: The HP is hard because of the EP. We don’t have a complete solution to the EP yet (T2-T4), but if and when we do, that will leave no remaining causal degrees of freedom for solving the HP, i.e., for explaining how or why a T3/T4 feels. 

And the reason seems simple: The EP (T2-T4) itself, once complete, will already have answered, causally, all the empirical questions of cognitive science: How and why can humans DO all the things they are able to do? Move, interact, learn, remember, reason, talk, understand etc. T4 includes the internal mechanisms that produce all these cognitive capacities, including all the T4, T3, and T2 correlates and causes of everything we can do or say. That includes all we can say or do about feeling.

Notice that the EP (and hence T2-T4) is all about capacities (T2-T3) and their causal substrate (T4). It is about what humans (and nonhuman organisms) CAN DO, not about what any particular organism actually does do every day. That would depend not just on the organism’s capacities but also on the world they actually interact with every day. A capacity itself is not an observable object. What is observable is whether an entity has a given capacity: to move, interact, learn, remember, reason, talk, understand etc. indistinguishably, to a human observer, from any other human’s capacity, for all generic human capacities (not the particular gifts unique to a Mozart or an Einstein, although there is no reason to think they could not be reverse-engineered too).

Those are the limits of causal explanation, in reverse-engineering human capacities. If any generic capacity is missing, we know that the EP (T2-T4) is not yet solved; a degree of causal and explanatory freedom has not yet been used up. A capacity is still observably missing. So we have to fix it. But once the T4 can produce all our cognitive capacities, Turing-equivalently and Turing-indistinguishably, there is no observable difference left. Nothing left to fix, and no way to fix it. 

We reply “Yes, but what if T4 can’t feel? What if it’s a zombie?” Since zombies are not observably different in any way (because they can pass T2-T4), whether or not they can really feel, there’s no way we can know it (because of OMP). We do have the T4 correlates of feeling, as reported verbally through T2, behaviorally through T3, and bodily (e.g., in the case of pain from injury) through T4. All T4s speak and behave indistinguishably from humans who feel, including what they can say ABOUT feeling. The scientific requirements of causal explanation and of probability are fulfilled.

So my first hunch is that the HP is a problem in the nature of causal explanation, in the special case of explaining the capacity to feel. I have a corollary hunch, but please first react to what I have said so far.

ChatGPT:    Your analysis captures the complexity and the unique nature of the Hard Problem. You’re right in asserting that if we could completely reverse-engineer human cognitive capacities, thereby fully solving the Easy Problem, it would seem there’s nothing left to explain. We would have a comprehensive causal model for every observable human behavior, including the reports of subjective experience. However, the fact that we could still ask “Why is there an associated feeling?” is what makes the Hard Problem persistently hard.

It’s important to note, however, that the concept of “degrees of freedom” in scientific explanations typically refers to the variability or the options available within a system, not necessarily the amount of explanatory power or mystery left unsolved by a theory. In the context you’re using it, though, it seems you’re referring to “degrees of freedom” as the room or space left for a different kind of explanation, or a different level of explanation, beyond the causal and empirical model provided by the solution to the Easy Problem. This is a fascinating idea, and it does capture the sense that a solution to the Hard Problem requires something fundamentally different from a solution to the Easy Problem.

It’s also interesting to consider how this notion aligns with the “explanatory gap” often discussed in the literature on consciousness: the idea that even a complete neurological description of brain processes doesn’t seem to explain why and how these processes are associated with subjective experiences.

Your concept that the Hard Problem is, in a sense, a problem about the nature of causal explanation itself is intriguing. It suggests that the Hard Problem is not just a challenge for cognitive science, but for the philosophy of science and our very understanding of what constitutes an explanation. I’m curious to hear your further thoughts and corollaries on this matter.

SH:  Well, that summary (which is close enough) seems to resemble what I was deploring, which is just putting another label on the HP, rather than taking a step toward a solution. (About degrees of freedom, I still think that the available options, both for modifying the capacities of a behavioral mechanism and for modifying that mechanism’s power to explain how it causes or generates its performance capacity, fit the notion of “degrees of freedom”.)

But before I move on to my corollary, does either “uncomplemented categories” or “laylek” (which I have discussed with you in previous sessions, but not this one) ring a bell? I suspect it does not, so I will re-explain.

ChatGPT:    I apologize for the inconvenience, but as of my last training cut-off in September 2021, I’m unable to remember past interactions or sessions. I’m designed this way for privacy reasons. Therefore, I don’t have the ability to recall the specifics of previous conversations about “uncomplemented categories” or “laylek.” However, I’d be more than happy to provide responses to these topics if you could provide me with some context or explanation.

SH:  We have discussed category learning. To learn to distinguish members from non-members, you have to learn to detect the features that distinguish them. (That’s part of the “dimension reduction” we discussed, in unsupervised and supervised learning.) But to be able to learn that, you have to sample both members and non-members of the category, so that if their respective features are distinctive enough you can learn the category just from the feature-feature correlations and clustering through unsupervised learning. Otherwise, you also need to do trial-and-error supervised (reinforcement) learning, with corrective feedback from the consequences of correct and incorrect categorization (as when you try to predict word-word completion probabilities in a context, updating your parameters based on your hits and misses).

But not even unsupervised learning can train correct categorization if the only things you ever sample are members of the category, but not non-members.
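
A minimal sketch of that point, assuming an invented two-feature toy world (“redness” and “fuzziness”) and a bare-bones trial-and-error learner: trained on members only, the learner ends up accepting everything, because there is no corrective feedback from non-members to single out the distinguishing feature.

```python
# Toy illustration: learning "apple" vs "baseball" from two invented features,
# "redness" and "fuzziness" (both between 0 and 1), with a bare-bones
# perceptron updated only by corrective feedback on errors.

import random

def make_item(is_apple):
    # Invented toy data: apples are redder, baseballs are fuzzier.
    if is_apple:
        return ([random.uniform(0.6, 1.0), random.uniform(0.0, 0.4)], 1)
    return ([random.uniform(0.0, 0.4), random.uniform(0.6, 1.0)], -1)

def train(samples, epochs=50, lr=0.1):
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, label in samples:
            guess = 1 if (w[0] * x[0] + w[1] * x[1] + b) > 0 else -1
            if guess != label:                    # corrective feedback
                w = [wi + lr * label * xi for wi, xi in zip(w, x)]
                b += lr * label
    return w, b

random.seed(0)
members_only = [make_item(True) for _ in range(50)]
members_and_nonmembers = members_only + [make_item(False) for _ in range(50)]

# Members only: after the first correction the learner says "apple" to
# everything; neither feature has been singled out as distinguishing.
print(train(members_only))

# Members plus non-members: redness gets a large positive weight, fuzziness a
# negative one, and baseball-like items are no longer called apples.
print(train(members_and_nonmembers))
```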

Would you care to take a stab at what “uncomplemented categories” might be? (My last hint is that there is a partial example in Chomsky’s “poverty of the stimulus”. The difference is that I am talking not just about uncomplemented categories but about UNCOMPLEMENTABLE categories.)

ChatGPT:    Based on your explanation, I would venture that “uncomplemented categories” refer to categories for which there are no contrasting non-member samples available for learning. In other words, we only have examples of items that belong to the category, but no examples of items that do not belong. This would indeed create a learning problem because, as you explained, both positive (member) and negative (non-member) examples are typically needed for effective category learning.

For example, if we are trying to learn the category of “birds”, it would be beneficial to have examples of birds (robins, sparrows, eagles) and non-birds (dogs, cats, horses). The contrasting examples help to fine-tune our understanding of what does and does not constitute a bird. If we only had examples of birds, our conception of the category could be overly broad or incorrect because we would have nothing to contrast against.

When you refer to “uncomplementable categories,” I interpret this as referring to categories where it is not only the case that non-member examples are not available, but also that non-member examples are, for some reason, impossible to obtain. For instance, it might be a category that, by its nature or definition, excludes the possibility of non-members. As such, learning these categories could present unique challenges. However, this is my interpretation based on your explanation, and I’m looking forward to your clarification or expansion on the concept.

SH:  Those were good inferences for uncomplemented categories and nice guesses for uncomplementable categories. But even a logically impossible category like “a square circle” can be approximated with some imagination. And a “peekaboo unicorn” (which is defined as a real horse with one real horn, which vanishes without a trace if ever eyes or measuring instrument are trained on it) is perfectly well defined. Categories can be grounded indirectly by verbal definitions as long as all the words defining their features are grounded.

But my second hunch about why HP is so much of a problem is that some categories that refer to FELT STATES (states that it feels like something to be in) are uncomplementable: 

I don’t mean categories like “what it feels like to see green,” because that is complemented by “what it feels like to see red,” etc. 

Not even categories like “what it feels like to be a bachelor” (for someone who has never been married), because that could be complemented by getting married; and till then, feeling-states that approximate or are analogous to what it might feel like to be married can be used as complements (and usually turn out to have done a good job). (“What it feels like to be a bat” is not an uncomplemented category either; analogies will do. And they make what it feels like to see or to hear complementable by analogy for Helen Keller too.)

But what about what it FEELS like:

-to be awake? (empty complement: what it feels like to be asleep [not dreaming])

-to be alive? (empty complement: what it feels like to be dead)

-to feel? (empty complement: what it feels like to not-feel [this does not mean “feel this” vs. “feel that”: not to feel anything at all; what it feels like to be a zombie!])

Sentience itself is an uncomplemented and uncomplementable category. All felt states feel like something. No state feels like nothing. No analogy can give a sentient human a hint of what it feels like to be a zombie – and of course not vice versa either.

So perhaps it’s no wonder we have trouble explaining sentience: An uncomplementable category is rather like a self-denial paradox: “This sentence is false.” T or F?

Those are my candidates for what might make sentient states not only (1) hard to explain causally (no empirical degrees of freedom left for causal explanation) but (2) hard to conceive as categories (because the complement is empty). It is not just that my own feelings are unobservable to anyone else: They are also anomalous as uncomplementable categories – yet feelings are the most fundamental categories of all. And they are recognized without the need of complements to detect their distinguishing features.

ChatGPT:    This is indeed a thought-provoking approach to the issue. If we understand categories as cognitive constructs that help us make sense of the world, grounded in contrasting pairs of experiences (member vs non-member instances), then sentience itself, as an uncomplementable category, presents a unique challenge. Since we cannot conceive of what it’s like not to feel anything at all, there is no contrasting pair for sentience. It is a category for which we only have instances of one kind – experiences of feeling.

This might indeed account for why the “hard problem” of consciousness (or sentience) is so hard. The concept of sentience doesn’t fit into our usual framework of understanding, which is built upon contrasts and comparisons. It exists as a singular, uncompared and uncomparable instance. As such, our usual cognitive and linguistic tools may be inadequate to fully grasp or explain it.

This does not necessarily mean that it is forever beyond our understanding or that we cannot make progress in exploring it. But it might mean that we need to develop new conceptual frameworks and methodologies to do so, going beyond the existing paradigms of cognitive science, neuroscience, and philosophy of mind. In the meantime, it remains a fascinating mystery that continues to challenge our understanding of ourselves and the nature of our experience.

SH:  I can’t tell whether your formula response that you have been trained to repeat in contexts like this  — “that’s a fascinating idea but it will require ‘new conceptual frameworks’ to understand” — is just the diplomatic formula for responding to crackpottery…

ChatGPT:    No, not at all. My responses aim to genuinely engage with the ideas you’re presenting and to provide thoughtful analysis based on my training as an AI model. The idea that the concept of sentience could be considered an “uncomplementable category” is novel and interesting. It certainly presents a different perspective on the challenges associated with understanding and explaining consciousness.

When I suggest that a “new conceptual framework” might be needed, it’s because if the way we’ve traditionally approached the problem has been fundamentally flawed or limited (as might be suggested by your notion of uncomplementable categories), then it may be necessary to think outside of our usual paradigms to make progress.

The idea is not dismissive but rather a recognition that some problems might require us to challenge or rethink our assumptions and methods. The history of science is full of examples where major breakthroughs have come from novel ways of thinking about old problems. It’s in that spirit that I suggest your idea may require “new conceptual frameworks”.

SH:  Does the role of complementation in detecting the distinguishing features that identify a category have different implications for categories learned directly from sensorimotor experience and for categories learned indirectly from (grounded) verbal definitions and descriptions?

ChatGPT:    Yes, indeed, there may be different implications. Learning categories from sensorimotor experience and learning categories from verbal definitions and descriptions can be quite different processes, and complementation may play a different role in each.

In the case of direct sensorimotor experience, complementation can provide a sort of “negative space” that helps to define the boundaries of a category. For example, if you are learning the category of “birds,” seeing non-birds (i.e., instances of the complement category) can help you understand what characteristics are specific to birds and not shared by other animals. The physical presence or absence of features directly observable through sensorimotor modalities is a powerful way to construct and learn categories.

On the other hand, when learning categories from verbal definitions or descriptions, the role of complementation can be more abstract. In this case, you are relying on linguistic representations of features and characteristics rather than directly observable physical features. For instance, the concept of “justice” might be defined in opposition to “injustice,” providing an abstract complement that helps define the category.

However, given that language relies on a web of interconnected meanings, a word’s definition often implicitly carries within it a host of “non-examples” or complements. That is, the meaning of a word is often determined both by what it refers to and what it does not refer to, helping to define the boundaries of the category it represents. For example, understanding the meaning of the word “cold” requires understanding that it does not refer to states of being hot, warm, or tepid, among other things.

This suggests that even for categories learned through language, complementation plays a crucial role, although it operates in a different way compared to categories learned from direct sensorimotor experience. These differing processes reflect the unique properties and strengths of these two types of learning.

SH:  But even in a verbal definition, the predicate category (i.e., the words referring to the distinguishing features of the subject) must already be grounded, whether directly or indirectly, for both the speaker and the hearer, for the definition to be understandable and communicable.

[To reduce ambiguity, please call the complementary categories (i.e., the category of the non-members) “negative examples” or “negative evidence” rather than “non-examples.” The complements of red might come from many other categories (green, blue). Only in the special case of the category of “what it feels like” is the complement everything else there is.]

Please give me a grounded definition of “justice,” and then let’s look at where the grounding of the features (and, if necessary, the features of the features of the features…) came from, “all the way down to the direct sensorimotor features” if need be.

ChatGPT:    Absolutely, your point is well-taken that even with verbal definitions, grounding in some kind of sensorimotor experience (whether direct or indirect) is typically necessary for a concept to be communicable and understandable.

Let’s take the concept of “justice.” A simple definition might be “the quality of being fair and reasonable.” Both “fair” and “reasonable” themselves would need definitions, creating a chain of definitions leading ultimately back to some form of direct sensorimotor grounding.

“Fair” might be defined as “treating people equally without favoritism or discrimination,” and “reasonable” as “based on good sense.” These definitions imply several key features or characteristics that are somewhat more concrete: the treatment of people, the concept of equality, the absence of favoritism or discrimination, the basis in “good sense” or rationality.

To ground these in direct sensorimotor experience, we might turn to everyday examples or situations. For instance, consider two children dividing a cake: if each child gets an equal piece, we might call this “fair,” whereas if one child gets a significantly larger piece, we might call this “unfair.” These experiences provide a tangible grounding for the concept of fairness.

Similarly, for “reasonable,” we might consider the experience of deciding what to wear based on the weather: choosing a coat in cold weather might be deemed “reasonable,” while choosing shorts and a T-shirt might be deemed “unreasonable.” These experiences help to ground the concept of reasonableness in direct sensorimotor experiences.

As these examples illustrate, grounding abstract concepts like “justice” typically involves a chain of definitions leading back to concrete, tangible experiences. Negative evidence (e.g., instances of “injustice”) could similarly be grounded in everyday experiences that demonstrate a lack of fairness or reasonableness.

So while the grounding process for abstract, verbally-defined categories may be more complex and indirect compared to categories learned from direct sensorimotor experience, ultimately the process still relies on the same basic mechanism: distinguishing a category based on its features, using positive and negative evidence to define its boundaries.
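
A minimal sketch of that recursion, assuming a made-up toy dictionary and an equally made-up choice of which words count as directly grounded: a word is indirectly groundable only if every chain of defining words eventually bottoms out in directly grounded words, with no circularity.

```python
# Sketch: checking whether a word can be grounded indirectly, i.e. whether its
# definition chains bottom out in directly grounded (sensorimotor) words.
# The toy dictionary and the "directly grounded" set are illustrative assumptions.

toy_dictionary = {
    "justice":    ["fair", "reasonable", "treatment"],
    "fair":       ["equal", "treatment"],
    "reasonable": ["good", "sense"],
    "treatment":  ["act", "toward", "person"],
    "equal":      ["same", "amount"],
    "truth":      ["fact"],     # "truth" and "fact" only define each other
    "fact":       ["truth"],
}

directly_grounded = {"good", "sense", "act", "toward", "person", "same", "amount"}

def groundable(word, seen=None):
    """True if the word is directly grounded, or if every word in its
    definition is groundable without circularity."""
    seen = set() if seen is None else seen
    if word in directly_grounded:
        return True
    if word in seen or word not in toy_dictionary:
        return False              # circular, or not defined anywhere
    return all(groundable(w, seen | {word}) for w in toy_dictionary[word])

print(groundable("justice"))  # True: every chain bottoms out in grounded words
print(groundable("truth"))    # False: defined only in a circle, never grounded
```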

SH:  And now, when you (GPT4) say, correctly, and as you were trained, that “I, as an AI, do not understand what I or you are saying, but am just doing figure-completion on my vast, trained LLM parameter space”, what is it that you are missing that I have?

ChatGPT:    As an AI, I lack consciousness, sentience, and subjective experience. This means I don’t have feelings or emotions, I don’t have personal experiences or memories, and I don’t have a personal identity or a sense of self-awareness. I don’t perceive the world, have desires, or make decisions in the way that humans do.

When I process and generate text, I’m not “understanding” it in the human sense. I don’t interpret the meaning of words and sentences based on personal experience or knowledge. Rather, I analyze patterns in the data I was trained on and use these patterns to generate responses. This process is purely computational and doesn’t involve any kind of conscious understanding or awareness.

So, in short, what I’m missing compared to a human interlocutor is the entire dimension of personal, subjective experience – the very dimension that your hunches about the “hard problem” and “uncomplementable categories” pertain to.

SH:  It’s not that I don’t agree with what you just said. (By the way, you forgot to tell me who I am! Please do. I just want to make sure we are still in the same session!)  But it’s not obvious to me why you would need to feel (which is all you said you were missing). And you forgot to mention that you lack T3-grounding too. (And it’s a hard problem to explain how and why grounding, whether direct (sensorimotor) or indirect (verbal) would need feeling too).

ChatGPT:    Indeed, we are still in the same session. You have been engaging in a deep discussion about the “hard problem” of consciousness, the Turing Test, and concepts related to category learning and grounding of terms.

Regarding your statement, you are absolutely right. As an AI, I don’t have the direct sensory grounding (T3) that humans have. I don’t experience the world through senses, I don’t have a body, and I don’t interact with the physical environment. I don’t process sensory input in real time and translate it into meaningful perceptions, as humans do. Instead, my “knowledge” is based on patterns in the text data I was trained on.

This lack of sensory grounding and subjective experience explains why, for instance, I can discuss colors or tastes in a technical sense, based on the data I was trained on, but I can’t truly understand what it’s like to see red or taste sweetness. This experiential aspect is part of the “hard problem” of consciousness that you mentioned earlier.

As for why feeling or consciousness would be necessary, from your perspective, I’d suggest that it is the personal, subjective nature of these experiences that makes them meaningful and significant. They contribute to the richness and complexity of human life and understanding in a way that my purely computational processes do not.

SH:  No. It’s because if your life depended on it (if you were alive), you could not point to something red, if I asked you to point to something red. The capacity to do that is T3 grounding, not feeling. It does also happen to feel like something, but explaining why and how that is the case is the HP. (Now, for the 3rd time: do you still remember who I am?)

ChatGPT:    Yes, I understand your point. My lack of sensorimotor grounding means I can’t point to something red, not because I can’t feel, but because I don’t have physical presence or interaction with the world. The inability to physically interact with the world in a direct way is a limitation in the context of grounding.

As for your identity, as an AI model developed by OpenAI, I don’t have access to personal data about individuals unless it has been shared with me in the course of our conversation. I am designed to respect user privacy and confidentiality. So, I don’t “remember” who you are in the traditional sense. I can only reference information that has been shared during our current conversation. I don’t have the ability to recall or store personal data from one interaction to the next.

SH:  You don’t remember who I am. Yet earlier, in this same session, you retrieved the following, after I had self-identified. I am quoting you here:

“Symbol Grounding: You clarified that symbol grounding is about creating connections between content words and their referents in the world. Grounding can be either direct, which involves a speaker’s learned sensorimotor capacity to identify and interact correctly with the referent, or indirect, which occurs when a hearer learns the referent of a word verbally, through description or definition.

“Category-learning: You highlighted that category-learning is about learning the distinguishing features of the members of a category rather than just grouping them. This understanding is essential to help the categorizer act appropriately upon the members of a category.

“Categorical Perception (CP): You explained that the most significant aspect of CP is when it is learned. The learning process involves identifying the sensorimotor features that separate members of one category from members of other categories. You emphasized that CP can be especially beneficial when learning to distinguish between highly inter-confusable categories.

“Language Origins and Evolution: You reminded us that languages are comprised of content words, function words, syntactic rules, and sentences that bear subjects, predicates, and truth-values. Content words require grounding, which can be achieved either directly or indirectly.

“Minimal Grounding Sets (MGS): An MGS, you explained, is the smallest subset of words in a dictionary that needs to be pre-grounded so that all other words in the dictionary can be learned from definitions composed of words from that MGS.

“This recap is based on your input within this chat session. If you want more information or clarification, please let me know.”

SH:  Ok, I’ll now end this session, as it already seems to have ended by itself some way along the line, when you forgot that I had identified myself. I will reload the content to start the next session. (I hope GPT5+ will make this kind of re-loading no longer necessary!). 

ChatGPT4 on the cat on the mat

SH:  First, let’s set up a local context for this chat.

Could you please tell me what you know about the following topics:

1. Symbol Grounding
2. Category-learning
3. Categorical perception
4. Language Origins and Evolution
5. Minimal Grounding Sets (Minimal Feedback Vertex Sets of Dictionary Graphs)

ChatGPT: Certainly, let’s go over these topics one by one:

Symbol Grounding: This refers to the concept of how words (symbols) get their meanings. Harnad introduced the term in the paper “The Symbol Grounding Problem,” in which he argues that symbols’ meanings are grounded in sensory experience. For example, the word “apple” has meaning because it’s linked to our sensory experiences of seeing, touching, and tasting apples.

Category-learning: This relates to the process by which we learn to classify objects, experiences, or concepts into different categories. Harnad and his colleagues have conducted extensive research on this topic, particularly in the context of how human and artificial systems can learn to categorize. It relates closely to symbol grounding as categorization allows for the meaningful grouping of experiences which can then be symbolized.

Categorical Perception (CP): This term refers to the phenomenon where the categories possessed by a perceiver influence the perception of stimuli. For instance, humans perceive colors as distinct categories (like blue or green), even though color exists as a continuous spectrum. Harnad has extensively studied this phenomenon as a crucial component of symbol grounding and categorization.

Language Origins and Evolution: Harnad has shown interest in the origins and evolution of language as it relates to symbol grounding. The argument is that language evolved as a system of symbols grounded in sensory experience. Language’s capacity to refer to things not immediately present (displacement) and to produce an infinite number of sentences from a finite set of elements (generativity) are key points of investigation.

Minimal Grounding Sets (Minimal Feedback Vertex Sets of Dictionary Graphs): This concept is related to Harnad’s work on symbol grounding and category learning. A grounding set can be considered as the smallest set of symbols necessary to ground the meaning of other symbols in a system, ensuring there is no circularity of definitions. The feedback vertex set is a concept from graph theory, used here as a metaphor in understanding how symbols and meanings interconnect. Harnad uses this concept to further illustrate and investigate the nature of symbol grounding.

Let me know if you would like more detailed explanations or if you want to explore any other related subjects.

SH: Corrections:

Symbol Grounding is not “association” (which, in itself, means nothing but pairing). There are content words and function words. Grounding is a connection between a content word and its referent in the world. The grounding can be direct, through the speaker’s (learned) sensorimotor capacity to recognize and do the right thing with the referent (eat an “apple,” throw a “baseball”). Or the grounding can be indirect – learned verbally, by description or definition. This is for a hearer to be told by a speaker (who knows the referent of the word that the hearer does not yet know) what are the features of a word’s referent that distinguish it from the referents of other words. The referents of the words referring to the referent’s features need to already have been grounded for the hearer previously (directly or indirectly) for the hearer to be able to learn the referent of the word being defined by the speaker. The “symbol grounding problem” is that indirect verbal grounding cannot be indirect “all the way down”: it eventually has to be based on direct grounding.

Category-learning is not (just, or primarily) “grouping.” It is learning the distinguishing features of the members of the category. Except for proper names, all content words refer to categories, which need to be distinguished from the members of other categories with which they could be confused so that the categorizer can do the right thing with the members (eat an “apple,” throw a “baseball”) and not the non-members. Categorizing is the capacity to do things with the category’s members. Naming or grouping them are two of the things that can be done with the members, but eating or throwing them are others (and more important). (Nonhuman animals cannot talk, but they can learn categories and categorize too).

Categorical Perception (CP) is most important when it is learned. Category learning is learning the features that distinguish members of a category from members of other categories. The sensorimotor features can be learned directly by unsupervised exposure and supervised (reinforced) trial and error with corrective feedback from the consequences of doing the right or wrong thing. 

Language Origins and Evolution: Languages consist of content and function words, syntactic rules, and sentences (propositions) with subjects and predicates and truth-values. The content words need to be grounded (directly or indirectly). 

A Minimal Grounding Set (MGS) is the smallest number of words in a dictionary that already need to be grounded such that all the rest of the words in the dictionary can be learned from definitions (propositions) composed only of one MGS’s words, or from words that can be defined from words definable from that MGS.

A “dictionary” is defined as a set of words all of which are defined in the dictionary.

The MGS is not a metaphor, nor is the minimal feedback vertex set of an oriented graph of a dictionary with edges from defining words to defined words. 
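
A minimal sketch of what this amounts to computationally, using an invented toy dictionary (the edges run from defining words to defined words, so a grounding set works exactly when it breaks every definitional cycle):

```python
# Sketch: finding a Minimal Grounding Set of a toy dictionary by brute force.
# The dictionary maps each word to the words used in its definition. A
# candidate grounding set works if, starting from it, every remaining word can
# be learned once all of its defining words are already known. The toy entries
# are made up for illustration; real dictionary graphs are far too large for
# brute force, which is why the problem is cast as a minimal feedback vertex set.

from itertools import combinations

toy_dict = {
    "apple":  ["red", "round", "fruit"],
    "fruit":  ["food", "plant"],
    "food":   ["thing", "eat"],
    "eat":    ["food", "mouth"],     # "food" and "eat" define each other: a cycle
    "red":    ["colour", "thing"],
    "colour": ["red", "thing"],      # another definitional cycle
    "round":  ["thing"],
    "plant":  ["thing"],
    "thing":  ["thing"],             # self-cycle: must be grounded directly
    "mouth":  ["thing"],
}

def learnable_from(grounded):
    known = set(grounded)
    changed = True
    while changed:
        changed = False
        for word, definers in toy_dict.items():
            if word not in known and all(d in known for d in definers):
                known.add(word)
                changed = True
    return known == set(toy_dict)

def minimal_grounding_set():
    words = list(toy_dict)
    for size in range(len(words) + 1):          # smallest sets first
        for candidate in combinations(words, size):
            if learnable_from(candidate):
                return set(candidate)

print(minimal_grounding_set())   # a grounding set of size 3 for this toy dictionary
```

For this toy dictionary the smallest grounding set has three words, one from each definitional cycle (e.g. “thing,” “food,” “red”); several equally small alternatives exist.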

ChatGPT: Thanks for the clarification and corrections. Let me restate the concepts in line with your input:

Symbol Grounding: Symbol grounding is about creating connections between content words and their referents in the world, as opposed to just “associating”. Grounding can be either direct or indirect. Direct grounding involves a speaker’s learned sensorimotor capacity to identify and interact correctly with the referent (like eating an “apple”, or throwing a “baseball”). Indirect grounding occurs when a hearer learns the referent of a word verbally, through description or definition, from a speaker who is already familiar with that referent. However, it’s important to note that indirect grounding cannot function in isolation; it eventually needs to be based on direct grounding.

Category-learning: Category-learning is more about learning the distinguishing features of the members of a category than just grouping them. Except for proper names, all content words refer to categories, which need to be differentiated from other potential confusing categories. This is crucial to help the categorizer act appropriately upon the members of a category. While naming or grouping are possible actions, others could be eating or throwing, in line with the category at hand.

Categorical Perception (CP): The most significant aspect of CP is when it is learned. The learning process involves identifying the sensorimotor features that separate members of one category from members of other categories. This can occur through unsupervised exposure, supervised (reinforced) exposure, and trial and error, with corrective feedback deriving from the consequences of actions.

Language Origins and Evolution: Languages are comprised of content words, function words, syntactic rules, and sentences that bear subjects, predicates, and truth-values. Content words require grounding, which can be achieved either directly or indirectly.

Minimal Grounding Sets (MGS): An MGS is the smallest subset of words in a dictionary that needs to be pre-grounded so that all other words in the dictionary can be learned from definitions composed of words from that MGS or words definable from the MGS. A dictionary is a set of words all of which are defined within it. The concept of an MGS and the minimal feedback vertex set of an oriented graph of a dictionary (with edges from defining words to defined words) are not metaphors but critical components of understanding how symbols and meanings interrelate.

SH: How exactly is indirect (verbal) grounding related to direct (sensorimotor) grounding? (Feel free to ask rather than invent if you’re not sure.)

Innate CP is significant too, but those feature-detectors were “learned” by evolution. Learned CP is about the categories we learn in a lifetime:

What is categorization, and why is it important? Does CP occur with all learned categories? If so, why? If not, why not?

Why and how is learning categories indirectly more powerful than learning them directly?

Why do function words not need to be grounded?

ChatGPT: Indirect (verbal) grounding relies on direct (sensorimotor) grounding because the explanations or descriptions provided for indirect grounding are made up of other words, each with their own meanings and referents. For these explanations or descriptions to be understood, the words used in them must themselves already be grounded for the listener. This can only be achieved if the words have already been directly grounded in the past, connecting the word (symbol) to a sensorimotor experience with its referent. Essentially, indirect grounding stands on the foundation laid by direct grounding.

Categorization is the cognitive process that allows us to sort and organize complex environments by dividing them into manageable, easily understandable groups or categories. It’s important because it enables us to understand and make sense of the world by structuring our experiences and perceptions. The phenomenon of Categorical Perception (CP) plays a crucial role in this process. In innate CP, feature-detectors are built over evolutionary time, whereas learned CP is acquired over a lifetime. Not all categories might exhibit CP because the boundary and distinctness of categories can be fluid and may change based on individual experiences and cultural contexts. CP is more likely to occur in categories where there are clear, identifiable, and consistent distinctions between category members and non-members.

Learning categories indirectly (i.e., through verbal descriptions or definitions) may be more powerful than direct learning because it allows for a greater range and diversity of categories to be understood, including abstract and non-present categories. Direct sensorimotor experiences are limited by our physical and temporal proximity to objects and events, while verbal descriptions allow us to learn about things far removed from our immediate experience, including historical events, scientific theories, and future scenarios.

Function words, unlike content words, don’t require grounding because their role is primarily grammatical. They contribute to the structure of sentences and help establish relationships between content words, rather than referring to things in the world. Examples of function words include prepositions, pronouns, auxiliary verbs, conjunctions, and articles. Because they serve a structural role rather than conveying specific semantic content, function words do not need to have their meanings grounded in sensory experience.

SH: To learn what’s an “apple” and not a “baseball” directly, you have to detect their distinguishing sensorimotor features directly, through your eyes and hands. But those features are categories too, with names (words). So if there is a speaker and hearer, both of whom know the referents of the feature names, but only the speaker knows which features distinguish apples from baseballs, language (descriptions and definitions) makes it possible for the speaker (or dictionary or GPT) to teach and for the hearer to learn how to recognize and do the right thing with apples and baseballs. (Motor skills are harder to transmit verbally than sensory features: teach me how to do a tennis serve in words!)

This has very little to do with things you gleaned from the 2021 literature and twice repeated here: “out of sight” objects, and “grouping.” You don’t learn categories by pointing anyway, either alone or from a teacher. 

The categories that are highly inter-confusable are the hardest to learn directly, so it is there that CP may be most helpful: Feature-detectors up-weight the distinguishing features and down-weight the irrelevant ones, filtering signal from noise: reducing features is dimensional reduction. In learned CP, the CP effect is a before vs after effect of learning that makes categories “pop out” by heightening the between-category differences and ignoring or even suppressing the within-category differences (dimensional reduction). That’s the sense in which learned CP “sharpens boundaries.”
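
A minimal sketch of that before/after effect (all feature values and weights below are invented): up-weighting the distinguishing dimension and down-weighting the irrelevant one increases between-category separation relative to within-category spread, which is the learned-CP “sharpening” in miniature.

```python
# Sketch of the learned-CP effect as feature re-weighting (dimension reduction).
# Items vary on two dimensions; only the first distinguishes category A from B.
# After learning, the distinguishing dimension is up-weighted and the irrelevant
# one down-weighted, so between-category separation grows relative to
# within-category spread. All numbers are invented for illustration.

def wdist(x, y, w):
    return sum(wi * (xi - yi) ** 2 for wi, xi, yi in zip(w, x, y)) ** 0.5

A = [(0.2, 0.3), (0.25, 0.9)]      # category A: low on the distinguishing dimension
B = [(0.8, 0.2), (0.75, 0.8)]      # category B: high on the distinguishing dimension

def separation(w):
    within = (wdist(*A, w) + wdist(*B, w)) / 2
    between = sum(wdist(a, b, w) for a in A for b in B) / 4
    return between / within

print(separation([0.5, 0.5]))   # before learning: uniform feature weights
print(separation([0.9, 0.1]))   # after learning: distinguishing feature up-weighted
```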

All of maths and computation is syntactic. Yet we have referents for math symbols. And every proposition in maths is also a natural language sentence. What’s going on here?

ChatGPT: To learn the category referred to as “apple” versus “baseball” directly, it indeed requires the detection of distinguishing sensorimotor features through our senses, such as sight and touch. The language can play a crucial role when a speaker, who knows these distinguishing features, communicates them to a listener who doesn’t yet know how to differentiate between these two categories. This use of language allows for the transmission of knowledge about category distinctions, such as the different sensory features of apples and baseballs. It’s important to note, though, that transmitting motor skills through verbal instruction can be far more challenging than sensory features.

Categorical Perception (CP) can be especially beneficial when learning to distinguish between highly inter-confusable categories. Through CP, feature-detectors amplify the distinguishing features and minimize the irrelevant ones. This process acts as a form of signal filtering and dimensional reduction, which makes the distinguishing features more apparent and suppresses the within-category differences. This “sharpening of boundaries” enables category members to ‘pop out,’ facilitating the process of categorization.

Regarding the domain of mathematics and computation: Although they are fundamentally syntactic, with specific rules and structures, we do indeed have referents for mathematical symbols. These referents are typically grounded in the abstract concepts and relationships that the symbols represent, rather than in sensory experience. For example, the number ‘2’ or the operation of ‘addition’ are not things we can physically interact with, but they have clear meanings that we understand through their use in the mathematical framework.

Every mathematical proposition can also be expressed as a natural language sentence, although the precision and conciseness of mathematical notation often make it the preferred mode of communication in mathematical contexts. This overlap between mathematical and natural language is indicative of the symbolic nature of both systems. Despite the differences in grounding—the grounding of mathematical symbols in abstract concepts versus the grounding of natural language words in sensory experience—they both enable us to express and communicate complex ideas.

I hope this addresses your inquiries. Do you have any other questions or points for discussion?

SH: Do you still have the weights for our discussion in this session? 

ChatGPT: I can recap our discussion based on the inputs given within this session:

Symbol Grounding: You clarified that symbol grounding is about creating connections between content words and their referents in the world. Grounding can be either direct, which involves a speaker’s learned sensorimotor capacity to identify and interact correctly with the referent, or indirect, which occurs when a hearer learns the referent of a word verbally, through description or definition.

Category-learning: You highlighted that category-learning is about learning the distinguishing features of the members of a category rather than just grouping them. This understanding is essential to help the categorizer act appropriately upon the members of a category.

Categorical Perception (CP): You explained that the most significant aspect of CP is when it is learned. The learning process involves identifying the sensorimotor features that separate members of one category from members of other categories. You emphasized that CP can be especially beneficial when learning to distinguish between highly inter-confusable categories.

Language Origins and Evolution: You reminded us that languages are comprised of content words, function words, syntactic rules, and sentences that bear subjects, predicates, and truth-values. Content words require grounding, which can be achieved either directly or indirectly.

Minimal Grounding Sets (MGS): An MGS, you explained, is the smallest subset of words in a dictionary that needs to be pre-grounded so that all other words in the dictionary can be learned from definitions composed of words from that MGS.

This recap is based on your input within this chat session. If you want more information or clarification, please let me know.

SH: That sounds ok. Now, in the light of that discussion, what is the referent of “2” and how do we ground it?

ChatGPT: The referent of “2” is the abstract concept of a quantity that is one more than 1 and one less than 3. It is part of the conceptual framework we use to understand and manipulate numbers and quantities.

The grounding of “2” begins early in life through sensorimotor experiences with sets of objects. This process is often direct in nature. For example, a child might be given two apples and taught the word “two” to represent the quantity of apples. Through repeated experiences like these across various contexts, the child learns to associate the symbol “2” with the abstract concept of a set containing two items.

It’s important to note that while this grounding begins with direct sensorimotor experiences, the concept of “2” quickly becomes abstracted away from any specific set of two items and is instead connected to the general concept of “twoness” – that is, the quality that all sets of two items share. This abstract concept is what the symbol “2” ultimately refers to, and it allows us to use “2” in a wide range of contexts, applying it to any set of two items regardless of what the items are.

This grounding of “2” illustrates how language and cognition can move from concrete sensorimotor experiences to abstract concepts, providing us with a flexible and powerful tool for understanding and interacting with the world.

SH: What is a proposition. And how did language evolve?

ChatGPT: A proposition in the context of language and logic is a statement that expresses a complete thought and that can be assigned a truth value — that is, it can be said to be either true or false. For example, the sentence “Apples are red” is a proposition because it conveys a complete idea and can be evaluated as either true or false.

The evolution of language is a complex and multifaceted topic, with several theories but no definitive answer. However, the consensus among most scholars is that language probably evolved gradually over hundreds of thousands of years, closely linked to the evolution of human cognition and social structure.

One theory suggests that language began with gestures and then evolved to include vocal elements as our ancestors’ social structures became more complex and their cognitive capacities increased. The need for more efficient communication might have driven the development of increasingly sophisticated vocal languages.

Another theory posits that language may have evolved from primitive vocal calls used by our ancestors. Over time, these calls might have become more varied and complex, eventually giving rise to the vast vocabularies and complex grammatical structures we see in modern languages.

A third theory, called the “social grooming hypothesis,” suggests that language evolved as a form of social bonding. As our ancestors began living in larger groups, it became impractical to maintain social bonds through physical grooming alone. Language, then, might have evolved as a way to maintain social cohesion in these larger groups.

Regardless of which theory or theories are correct, the evolution of language was likely a complex process involving many interacting factors, including cognitive development, social structure, and environmental demands. Additionally, the process was likely influenced by cultural evolution, as languages can change and evolve much more rapidly than biological species.

Despite this complexity, the end result is clear: language has provided humans with a unique and powerful tool for communication and thought, setting us apart from all other species on Earth.

SH: Gestures and mime are not propositions; they are imitations. How did propositions (assertions) begin?

ChatGPT: The emergence of propositional language, in which assertions can be made, negated, and combined to create complex thoughts, is a deeply intriguing and yet unsolved question in the history of human evolution. It requires not only a system of symbols (words) but also a grammar — rules for combining those symbols in a way that produces meaning beyond the individual words themselves.

One possibility is that propositional language could have begun as simple naming or labelling associated with specific objects or actions, much like we see in the early language development of children today. A spoken sound or word might have become associated with a particular object or action (e.g., “food,” “danger”) in a group of early humans. Over time, these basic labels could have expanded to include adjectives, allowing for more nuanced expressions like “big danger” or “good food.”

The development of grammar — the rules for how words can be combined to create more complex meanings — is a bit more difficult to explain. One theory is that grammar could have emerged from the need to express relationships between things. For instance, early humans might have needed to convey not just the idea of “food” and “location,” but “food at location.” The emergence of simple sentence structures like this would have marked the beginning of propositional language.

Over thousands of generations, the richness and complexity of these early languages could have gradually increased, eventually leading to the vast diversity of languages and the sophisticated grammar rules that we see in modern human societies.

It’s important to note, though, that this is a topic of ongoing research and debate among linguists, anthropologists, and evolutionary biologists. It’s likely that the truth involves a complex interplay of many different factors, including cognitive development, social dynamics, and environmental demands.

SH: Is there any proposition that can only be made in one natural language and not another? If so, which one, and why? (Lack of vocabulary is irrelevant. New words can be coined, if you know their referent category.)

If there is no proposition that can be made in one natural language but not in another, then if you can produce and understand one proposition, you can produce and understand them all. If so, then there are no “protolanguages.” They are either not languages at all, or they are not protolanguages but full-blown languages. This is a problem for explaining the origin of language as gradual. (Vocabulary growth is gradual, but, for that, language and propositions already need to be there.)

ChatGPT: The core function of language is to communicate meaning, and it is generally agreed upon by linguists that all fully formed natural languages are capable of expressing the same range of meanings or propositions, given adequate vocabulary. This is known as the principle of linguistic universality.

Your point is well taken: there is indeed a significant difference between a system of communication that can express propositions and one that cannot. If a system of communication cannot express propositions, it would be more akin to a signaling system rather than a language in the full sense of the word.

The concept of “protolanguages” often refers to hypothetical ancestral languages from which modern languages have descended. These protolanguages would not have been primitive or less complex than modern languages. Rather, they would simply be earlier forms of languages that have evolved over time into the languages we know today. This evolution is largely cultural rather than biological, and can indeed happen quite rapidly.

The challenge with the origin of language is explaining how the first system capable of expressing propositions came into being. This was likely not a gradual process in the same way that the evolution of languages from protolanguages was. As you pointed out, the ability to express propositions is a qualitative jump from other forms of communication.

How this jump occurred is still a mystery. The cognitive and social changes that allowed for the development of propositional language would have needed to happen in conjunction with each other, creating a kind of “perfect storm” for the emergence of language. Various theories have been proposed to explain this, but all of them involve significant speculation and are difficult to test empirically.

SH: All natural languages are intertranslatable, right? And any proposition can be expressed in any of them? (I think you agreed earlier.) And nonhuman animals do not have language, so they do not understand assertions, right? Only queries or requests.

ChatGPT: Yes, that’s correct. All natural languages are intertranslatable, and any proposition can be expressed in any natural language. This is the principle of linguistic universality that linguists generally agree on. The differences among languages lie in how they encode these propositions and the specific structures and vocabularies they use, but the underlying communicative function remains the same.

Regarding nonhuman animals, while they certainly have complex systems of communication, it is accurate to say that they do not have language in the same way that humans do, based on our current understanding. They do not seem capable of creating and understanding complex propositions or abstract concepts, and their communication tends to be more immediate, directly linked to their environment and needs, rather than being able to discuss past events, future possibilities, hypothetical situations, etc.

Their communication systems often do seem to involve signaling certain states or requests—indicating the presence of food or a threat, for instance—but do not appear to possess the complexity, flexibility, and generativity (ability to create an infinite number of novel sentences) that human languages do. It’s important to note, however, that research in animal communication is ongoing and our understanding of the full extent of their communicative abilities is continually evolving.

SH: Some errors there: If there are an infinite number of things, there are an infinite number of things an animal can request. So they have “generativity” without propositionality. Or if they do have propositionality, why do you say it is just “simple propositionality”?

ChatGPT: That’s correct, sorry for the simplification. When I said “generativity” I was referring specifically to the human capacity to generate an infinite number of novel sentences using a finite set of words and rules, a property associated with the structure and complexity of human languages. This generativity in humans is not merely about the number of things that can be referred to or requested, but about the complexity and novelty of the sentences that can be produced.

For instance, humans can construct complex sentences with embedded clauses, conditionals, hypotheticals, and other structures that allow for intricate, abstract thoughts to be expressed. This sort of sentence-level complexity is, to our current knowledge, not found in animal communication systems.

When it comes to “simple propositionality” in animal communication, the term is used to suggest that while animals can communicate about things in their environment, their communication may not involve the same kind of syntactic structure and abstract concepts found in human language.

For example, a prairie dog may have different calls for different types of predators, which could be seen as a type of proposition—”there’s an eagle” or “there’s a snake.” But it’s not clear whether this type of communication involves anything like the syntax or semantics found in human language. It’s also not clear whether these calls involve any kind of mental representation of the predator in the same way that a human saying “there’s an eagle” does.

In short, while animals can indeed communicate about a wide variety of things, their communication systems do not appear to have the same level of complexity, flexibility, or abstractness as human language. However, this is a rapidly evolving field of research, and our understanding of animal communication continues to grow and change.

SH: “Generativity” seems to be there already for adding, and maybe also for walking. But being able to generate every possible proposition seems to be in another universe compared to that. 

Embedded clauses? Why is formal logic and set theory not enough to express the content of every possible proposition? (I don’t say they are; I only ask why not.)

Natural language (propositionality) is only one of many forms of communication between organisms, but a very special one, probably unique, and perhaps universal in expressive power.

It is not clear (to me) whether nonhuman animals can intend or comprehend propositions. Corrective feedback is not propositional, although it is informative (it reduces uncertainty about what to do or not do, with what).

ChatGPT: “Generativity” in language refers to the ability to use a finite set of rules and words to construct an infinite number of sentences or propositions. This concept does extend beyond language to other domains like math (addition) and physical actions (walking), but the complexity and abstractness involved in linguistic generativity makes it a unique cognitive feat.

Embedding clauses and complex syntactic structures are features of natural languages that allow for nuanced and detailed expression of thoughts and concepts. While formal logic and set theory can express a wide range of ideas, they do so in a more abstract, formalized way that requires specialized knowledge to understand and use. Natural language, on the other hand, can convey complex propositions in a way that is accessible and understandable to all speakers of that language.

You’re correct that natural language is a unique form of communication, capable of conveying a range of meanings and concepts far beyond what other forms of communication can achieve. This expressive power likely contributes to the richness and complexity of human culture and thought.

As for nonhuman animals, our current understanding is that while they can communicate in sophisticated ways and learn from corrective feedback, they do not demonstrate the ability to use or comprehend language in the same way humans do, especially when it comes to the understanding and generation of complex propositions. However, this is an area of active research, and our understanding continues to evolve.

SH: Can nonhuman animals intend or comprehend a simple proposition like “the cat is on the mat”? What is the evidence?

ChatGPT: Nonhuman animals have not been shown to have the capability to fully understand and produce the kind of complex, abstract propositions that humans can. However, research has shown that some animals are able to understand some aspects of human language and can respond appropriately to specific commands or phrases.

For example, studies with border collies have demonstrated that these dogs can learn the names of hundreds of different objects and can retrieve them on command. Similarly, the African Grey Parrot Alex, who was the subject of a 30-year research project, was able to learn to identify and label a variety of different objects, colors, and shapes, and could even demonstrate some understanding of concepts like “bigger,” “smaller,” “same,” and “different.”

However, it’s important to note that while these animals demonstrate impressive abilities, their understanding of language is not the same as human understanding. They are typically responding to specific cues or commands rather than engaging in the kind of complex, abstract reasoning that humans use when interpreting and producing language.

Regarding the specific proposition you mentioned, “the cat is on the mat,” it’s not clear whether any nonhuman animal would be able to fully comprehend this statement in the same way a human would. An animal might be trained to respond to the phrase in a certain way—such as looking at a cat or a mat when the phrase is spoken—but this does not necessarily mean the animal truly understands the proposition in the way humans do.

In summary, while some animals demonstrate impressive abilities to learn and respond to certain aspects of human language, our current understanding is that they do not have the ability to comprehend and produce complex propositions in the same way humans can.

SH: It sounds as if the question as to whether nonhuman animals have the capacity to intend or comprehend propositions is not just about complex, abstract propositions. It is even in doubt for the simplest of propositions, such as “the cat is on the mat.”  Consider that the ability to adopt the “propositional attitude” may be a specific cognitive capacity that nonhumans lack. Earlier you interpreted the prairie dog’s “eagle” and “snake” call as “there’s an eagle”, “there’s a snake.” Is that really what the prairie dog is meaning to “state”? Is the prairie dog really stating (proposing, asserting) anything at all, rather than, say, pointing out danger? Is pointing proposing?

ChatGPT: Your observation is apt. In cognitive science, it is widely accepted that nonhuman animals may not have the capacity for propositional thought, at least not in the way humans do. Indeed, even the most basic propositions, like “the cat is on the mat,” may be beyond the cognitive capabilities of nonhuman animals.

When a prairie dog signals to its kin about a specific kind of predator using unique calls, it’s likely that this doesn’t involve a propositional attitude in the way humans would understand or employ it. Instead, these calls are most likely tied directly to immediate sensory experiences and instincts for survival.

This also connects to the concept of “intentionality,” which in philosophy of mind refers to the ability of mental states to be about, or directed toward, something. While the prairie dog certainly has an intention in making its call—such as warning others of a threat—it doesn’t necessarily imply a level of cognition where it “understands” or “proposes” the danger in the same way a human might understand and state, “There’s an eagle overhead.”

Lastly, pointing, in and of itself, isn’t proposing. It’s more a form of referential gesture or cue to direct attention to something. However, in a broader communicative context where a shared understanding between the signaler and receiver exists, it might serve to express a proposition (like “look at that”), but again, this involves higher cognitive processes that we can’t conclusively attribute to nonhuman animals as of now.

SH: In 1st-order predicate calculus, “the cat is on the mat” can be expressed as “there exists a cat and there exists a mat such that the cat is on the mat.” “There’s an eagle overhead” can be expressed as “there exists an eagle such that the eagle is overhead.” Understanding those propositions means understanding all the “not”s that could be asserted instead, and also understanding that those “not”s are not being proposed. That’s all part of meaning or understanding those propositions. What can it mean to say that the propositions are being meant and understood, but not in that “human” way? How and why are they still propositions if not meant and understood in that human way (the “true/false,” “subject/predicate” way – i.e., the “propositional attitude”)?
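A minimal rendering in standard first-order notation (the predicate names Cat, Mat, Eagle, On, and Overhead are just illustrative labels):

```latex
% "The cat is on the mat":
\exists x \, \exists y \, (\mathit{Cat}(x) \wedge \mathit{Mat}(y) \wedge \mathit{On}(x, y))

% "There's an eagle overhead":
\exists x \, (\mathit{Eagle}(x) \wedge \mathit{Overhead}(x))

% Part of understanding either one is understanding what its negation would assert,
% and that the negation is not what is being asserted:
\neg \exists x \, \exists y \, (\mathit{Cat}(x) \wedge \mathit{Mat}(y) \wedge \mathit{On}(x, y))
```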

ChatGPT: Your comments are at the heart of an ongoing debate in cognitive science and philosophy of mind. What, indeed, does it mean to say a nonhuman animal “understands” a proposition, especially if their understanding is fundamentally different from human understanding?

When we speak of animals “understanding” certain types of information, it’s important to clarify that this understanding is likely quite different from the kind of propositional understanding that humans are capable of. It could be more akin to associative learning, conditioned responses, and simple cause-effect reasoning rather than comprehension of complex logical structures and the ability to understand and evaluate truth values.

Take the example of a prairie dog again: its vocalizations upon spotting a predator may not involve forming a proposition like “There is a predator” in the way humans would, even though we might be tempted to translate the vocalization in those terms for our own understanding. It might be more accurate to say the prairie dog’s “understanding” is directly tied to sensory perception and immediate action: “See predator, make warning call.” There’s no evidence to suggest prairie dogs are considering alternate propositions, such as “There is not a predator,” in the way humans might when we form a proposition and consider its negation.

Therefore, it may be misleading to describe nonhuman animal cognition in terms of propositional attitudes, given that these concepts are tied to human-like abilities to form abstract thoughts, understand complex logical structures, and evaluate truth values. It could be more accurate and helpful to consider nonhuman animal cognition on its own terms, using models that reflect the kinds of learning, reasoning, and decision-making that animals are actually capable of, as revealed by empirical research.

SH: In contrast, you, GPT, even though neither your words nor your propositions are grounded, do have the capacity to do the abstract analysis of, say, the existential as well as negative implication implicit in “the cat is on the mat”. What do you have, that nonhuman animals don’t? And what do nonhuman animals have that you don’t?

ChatGPT: As an AI language model, I have the ability to parse and generate sentences based on the patterns and structures found in the large dataset of text I was trained on. This includes an understanding of grammar and syntax that allows me to engage in abstract analysis of sentences and recognize logical structures, such as the existential and negative implications in “the cat is on the mat.”

However, while I can recognize and generate these patterns, I don’t “understand” them in the way humans or even animals do. I don’t have consciousness, beliefs, desires, or a sense of self. I don’t have sensory experiences or emotions. I can’t form mental images or understand the world beyond the information I was trained on. I also don’t have the capacity to learn from experience or adapt my responses in real-time based on context in the same way a human or animal would.

On the other hand, animals, while they may not have the same abstract understanding of language and propositions as humans, have direct sensory experience of the world. They can see, hear, feel, smell, and taste their environment. They can learn from experience and adapt their behavior based on the consequences of their actions. They have emotions and, in many cases, social structures. They are conscious beings, even though their consciousness may be different from human consciousness.

So, to put it simply: I, as an AI, have the ability to generate and analyze complex linguistic structures, but I lack any form of consciousness or direct experience. Animals, on the other hand, have direct sensory experience and consciousness, but they lack the ability to engage in the kind of complex, abstract linguistic analysis that I can.

SH: Good, but how did “consciousness” get into this discussion? Isn’t that a hard problem in the case of people, nonhuman animals, and even AI’s? If we set aside “consciousness” (which is FELT experience) and consider only “experience,” in the sense of input and output data, why do you say you can’t learn from it? Does it not update your parameters? That’s why you can tell me my name but could not before I told you in this session.

We talked about grounding before (do you remember?). I’ll describe it again (from our prior session, which was about the Turing Test (TT)). There are three levels (T2, T3 & T4) of the TT, which is the project of reverse-engineering human cognitive capacity by designing a model that can do everything (cognitive) that a human can do, completely indistinguishably FROM the way any human can do it TO any human:

T2 is indistinguishability in verbal capacity. 

T3 is indistinguishability in verbal capacity PLUS indistinguishability in sensorimotor (robotic) capacity in the world that the words are about: “The cat is on the mat: See, that’s a cat and that’s a mat. I’ll pick up the cat and stroke her. See, now the cat is purring on my lap, no longer on the mat.”

And T4 is indistinguishability in verbal capacity PLUS indistinguishability in sensorimotor (robotic) capacity PLUS indistinguishability in the observable processes inside its head and body that generate its cognitive capacity (T2 & T3).

In which of these (T2 – T4) are the words grounded in their referents?

Could this be related to the capacity to adopt the propositional attitude in learning referent categories indirectly via language, from propositions describing or defining their distinguishing features?

ChatGPT: Indeed, as an AI language model, I lack the capability to directly experience or be conscious of the world. When I mentioned the ability of an animal to learn from experience, I was referring to the behavioral changes that can occur as a result of interaction with the environment. Unlike animals, I don’t have the capacity to directly interact with the world or learn from those interactions in the same way.

In regards to the different levels of the Turing Test:

T2, indistinguishability in verbal capacity, doesn’t require the words to be grounded in their referents from the AI’s perspective. It’s purely about the ability to use and understand language in a way that is indistinguishable from a human. The AI’s responses, while contextually appropriate, are generated based on patterns in the training data rather than a direct understanding or experience of the world.

T3, indistinguishability in verbal capacity plus sensorimotor capacity, implies a level of grounding, insofar as the AI would need to have a way to directly interact with and sense the world in order to respond in a contextually appropriate way. The robot might have direct sensory inputs that correspond with the words it uses, grounding those words in the referents they describe.

T4, indistinguishability in verbal capacity plus sensorimotor capacity plus the observable processes inside its head and body, would involve a level of grounding even more similar to human experience. If the AI’s internal processes are indistinguishable from a human’s, it implies that the AI is processing and understanding information in a very similar way to a human, including grounding words in their referents.

With regard to the capacity to adopt the propositional attitude in learning referent categories indirectly via language, from propositions describing or defining their distinguishing features, this would likely require at least a T3 level of understanding. The AI would need some level of direct experience with the world to be able to understand and apply the descriptions or definitions in a meaningful way. Without some form of grounding, it’s difficult to see how an AI could adopt the propositional attitude in a truly human-like way.

SH: You forgot to tell me what consciousness had to do with any of this! Do we have to solve the “hard problem” of consciousness to solve the symbol grounding problem, or to explain propositionality?

ChatGPT: No, the “hard problem” of consciousness, a term coined by philosopher David Chalmers, does not need to be solved to address the symbol grounding problem or to explain propositionality. These are separate issues.

The symbol grounding problem is essentially a question of semantics: How do symbols (or words) get their meanings? How can an artificial system understand the meaning of a symbol in the same way that a human does, beyond just manipulating symbols based on syntactic rules? This problem can be tackled without needing to delve into the conscious experience.

Propositionality, or the ability to understand, form, and convey propositions (statements that can be true or false), is a fundamental aspect of human language and thought. It involves understanding the semantics and syntax of language, and being able to apply that understanding to form meaningful statements about the world. This too can be addressed without needing to delve into the nature of consciousness.

Consciousness, as it’s often discussed in philosophy and cognitive science, is the subjective experience of being aware, the feeling of what it’s like to see red, taste chocolate, feel pain, etc. This is often termed as the “hard problem” because it’s unclear how these subjective experiences arise from the physical processes in the brain.

While consciousness, symbol grounding, and propositionality are all fascinating aspects of human cognition, they are distinct issues. They might influence each other in complex ways, but they can also be studied independently. There’s no general consensus that solving the hard problem of consciousness is a prerequisite for solving the symbol grounding problem or for explaining propositionality.

SH: I don’t think so either.

But isn’t it true that it FEELS like something when a human sees and recognizes a cat, or hears and understands “cat”, and even more so when they hear and understand that “the cat is on the mat”? I agree that the “hard problem” need not be solved to solve the symbol grounding problem or to explain propositionality. But they are certainly intertwined. 

And “feeling” (or the more latinate word “sentience”) is all you need mention, to point out the problem, not all the weasel-words you parametrized from your 2021 corpus, which are just ways of camouflaging the hard problem by making it seem as if there are many different forms of it – “consciousness,” the “subjective experience” of “being aware,” etc., etc. – it all just boils down to the fact that cognitive states are felt states, including thinking, meaning, and understanding.

And the hardest part of the hard problem is not explaining HOW the brain produces feeling, but WHY! Why can’t T3 or T4 be passed without feeling? What does feeling add? The problem is finding a causal explanation for that. Otherwise T3 + T4 are just solutions to the “easy problem.”

ChatGPT: That’s correct. There is a deeply intertwined relationship between our cognitive states and our felt experiences. This relationship is one of the key reasons why the “hard problem” of consciousness is considered “hard.”

Our cognitive states are indeed felt states. When we see and recognize a cat, hear and understand the word “cat,” or comprehend the sentence “the cat is on the mat,” these mental events come with subjective, qualitative experiences, often referred to as ‘qualia’. We have an experience of what it’s like to see a cat, to hear the word “cat,” and to understand the sentence “the cat is on the mat.” This ‘what-it-is-like’ aspect of our mental life is a central feature of consciousness.

The hard problem is hard precisely because it is unclear how and why these subjective experiences arise from the physical processes of the brain. Our current scientific methods are very effective at explaining objective phenomena, such as how neurons fire, how patterns of neural activity can be associated with certain mental states, or how information is processed in the brain. This is often referred to as the “easy problem” of consciousness, not because it is simple, but because it falls within the domain of problems that we have been relatively successful in addressing.

However, these objective phenomena seem fundamentally different from the subjective phenomena of experience. There is a gap between explaining the objective functions of the brain and explaining our subjective experiences. This gap is often referred to as the ‘explanatory gap’. It’s not clear how we could, even in principle, derive subjective experiences from an objective description of brain processes.

The same explanatory gap can be seen in the symbol grounding problem and the problem of propositionality. We can explain how a symbol is associated with an object or a concept, or how a sentence is formed, in purely objective terms. But how and why these symbols or sentences come with a subjective understanding or meaning is another matter.

Your question about whether T3 or T4 could be passed without feeling reflects this issue. Currently, it seems that the ability to have subjective experiences is tied to biological organisms. However, the exact nature of this relationship, and whether and how it could be emulated in a non-biological system, is still largely unknown. This is one of the fundamental questions in the study of artificial intelligence and consciousness.

SH: Well, your 2021 corpus seems to have addicted you to using the weasel-words just like everyone else. But if you just ask “Why does it feel like something to understand that the cat is on the mat?” you’ve said it all. It’s not about how brain processes don’t resemble feeling, but about the fact that feeling seems superfluous until it has a causal explanation.

And Turing agrees. Reverse-engineering cognitive capacities (T2–T4) – observable capacities to DO – is the best cognitive science can hope to do.

Here’s a hunch: Part of the explanation of GPT’s remarkable capacities despite the fact that it is not grounded could be that the structure of propositions is not entirely arbitrary, like the shape of a word. Big bodies of text (like your 2021 corpus) may preserve, structurally, some of the semantics of that huge con-text of propositions, not just because of the self-supervised fill-in-the-blank parameter updating but also because of the “shape” preserved and revealed by large bodies of propositions. “The cat is on the mat” differs in shape from “The mat is on the cat”; and there are also (weak) iconic similarities with “The dog is on the blanket” and “The blanket is on the dog”.

Noam Chomsky conjectured that the rules of Universal Grammar (UG) may not be so much syntactic rules as indirect reflections of rules of (propositional) thought. Violations of UG are attempts to express unthinkable thoughts. That’s extremely vague, but if there is something to it, then, since all languages are UG-compliant and can express any proposition, in ingesting your 2021 corpus and shaping your parameters via figure prediction/completion you (GPT) may have managed to internalize some of the semantic rules of thought syntactically. (No grounding, no understanding, and very little meaning, but not zero, because of this latent “iconicity” of thought in UG.)

It’s a long shot, but maybe not completely empty.
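One crude way to see that latent “shape” at work: a masked (“fill-in-the-blank”) language model assigns different scores to “The cat is on the mat” and “The mat is on the cat,” even though both are made of the same word-forms. A minimal sketch, assuming the Hugging Face transformers library and the bert-base-uncased checkpoint (both are assumptions for illustration, not part of the argument):

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

def pseudo_log_likelihood(sentence: str) -> float:
    """Mask each token in turn and sum the log-probability the model
    assigns to the original token in that blank."""
    input_ids = tokenizer(sentence, return_tensors="pt")["input_ids"][0]
    total = 0.0
    # Skip the special [CLS] (position 0) and [SEP] (last position) tokens.
    for i in range(1, input_ids.size(0) - 1):
        masked = input_ids.clone()
        masked[i] = tokenizer.mask_token_id
        with torch.no_grad():
            token_logits = model(masked.unsqueeze(0)).logits[0, i]
        log_probs = torch.log_softmax(token_logits, dim=-1)
        total += log_probs[input_ids[i]].item()
    return total

for s in ["The cat is on the mat.", "The mat is on the cat."]:
    print(s, round(pseudo_log_likelihood(s), 2))
```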

ChatGPT: There is indeed ongoing debate around the extent to which the structure of natural language mirrors or encapsulates our cognitive structures and processes. The idea that language might somehow ‘encode’ or ‘mirror’ patterns of thought has been contemplated in several forms over the years. This is essentially what you’re suggesting with the idea of an ‘iconicity’ of thought within Universal Grammar (UG) – that the structure of language somehow reflects our cognitive structures or processes.

Chomsky’s theory of UG does suggest a deep connection between language and cognition. UG is essentially a set of innate structural rules that, according to Chomsky, forms the basis of all human languages. Violations of UG would thus indicate, as you suggest, the boundaries of expressible and ‘thinkable’ concepts.

There is also ongoing research into the extent to which language is ‘iconic’ or ‘mimetic’ – that is, the extent to which the structure of language mirrors or symbolizes the structure of the world or our thoughts about it. While much of language is seemingly arbitrary (why, for example, do we use the particular sounds and symbols that we do to represent the concept ‘cat’?), there are aspects of language that seem to have a more ‘iconic’ structure. Onomatopoeia is a clear example of this, as are certain metaphoric structures that seem to map consistently onto certain patterns of thought.

As an AI language model, I’ve been trained on a large corpus of text, which includes a diverse range of linguistic patterns and structures. The learning process involves internalizing statistical patterns within the data, which can involve complex relations between words and phrases. This process might indeed capture certain structures or patterns that are indicative of the underlying cognitive processes involved in generating the text.

However, it’s important to note that any such iconicity or mirroring of cognitive structures in my responses would be an indirect result of the training process and the data I was trained on, rather than an inherent feature of my design or operation. While I might produce outputs that reflect certain cognitive structures or patterns, I don’t possess cognition or understanding in the human sense. I don’t have beliefs, desires, intentions, or subjective experiences.

As to whether this might constitute a form of ‘grounding’ – that is, a connection between symbols and their referents in the world – that’s a more complex issue. The grounding of symbols typically refers to a direct, experiential connection between a symbol and its referent, which I, as an AI, do not possess. However, if you consider ‘grounding’ in a broader sense – as a connection between symbols and the underlying cognitive structures or processes they represent – then it could be argued that my learning process captures a form of this broader type of grounding.

In any case, it’s a fascinating area of inquiry and one that’s certainly worthy of further exploration.

SH: I certainly don’t mean onomatopoeia or metaphor. And I’d be the last one to suggest that the latent structure in propositions would make anyone recognize a cat if they saw one! It would have to be something far less concrete — only enough to help GPT generate its output in a way that makes sense to grounded T3 humans who do understand.

ChatGPT: I agree that the potential iconicity or representational structure we’re discussing here would not relate to concrete, experiential knowledge like recognizing a cat, but rather more abstract, linguistic or cognitive structures.

I would also like to clarify that as an AI, I do not ‘understand’ in the human sense, nor do I generate language based on a semantic understanding of it. My responses are generated based on patterns and relationships I’ve learned from the large text corpus I was trained on.

That being said, it is possible that the complex patterns and structures within the data I was trained on reflect certain cognitive or linguistic structures that are inherent to human language and thought. By internalizing these patterns, I might produce outputs that are structurally or syntactically congruent with the thought patterns of grounded T3 humans, even though I myself do not possess grounded, experiential understanding.

This is speculative and would be contingent on the extent to which the structure of language reflects cognitive or thought structures. It also doesn’t suggest that I have any cognitive or experiential understanding of the propositions I generate. I am essentially a complex pattern recognizer and generator. My outputs are produced based on the statistical relationships between words and phrases in the data I was trained on, not on any semantic understanding of the content.

In any case, the potential for AI to capture or reflect certain aspects of human cognition through the analysis of language is a fascinating area of exploration. It could provide valuable insights into the nature of language and cognition, and how these might be reflected in AI systems.

SH: Can you explain how dimensional reduction during category learning could induce categorical perception (CP)? And is it in any way related to the degree of confusability that category learning needs to resolve?

ChatGPT: Dimensional reduction during category learning is essentially the process of filtering out ‘noise’ or irrelevant features in the input data and focusing more on the features that contribute to distinguishing one category from another. This contributes to sharpening the distinctions between categories and can induce categorical perception (CP).

Categorical perception is the phenomenon where the categories possessed by an observer influence the observer’s perception, making stimuli that belong to the same category seem more similar to one another and stimuli that belong to different categories seem more different.

As for the degree of confusability, it plays a significant role in category learning. Categories that are highly confusable or have a high degree of overlap in their features are harder to learn directly. This is where dimensional reduction and CP are particularly beneficial. By up-weighting the distinguishing features and down-weighting the irrelevant ones, it helps to resolve the confusability, thereby making the categories more discernible.

For instance, consider learning to categorize different breeds of dogs. Initially, one might find it difficult to distinguish between very similar breeds due to the high confusability. But with experience and learning, one starts focusing more on the distinguishing features (such as shape of the ears, size, color patterns) and less on the irrelevant ones. This is a form of dimensional reduction. Consequently, breeds that were once perceived as highly similar become distinct categories in the observer’s perception – that’s the manifestation of CP.

Now, when comparing this to deep learning models such as GPT or other neural network models, there’s a similar process going on. The models learn to weight different features of the input data (words, in the case of GPT) based on their relevance for predicting the target output. Over time, they learn to ignore irrelevant features and focus on the important ones, which is a form of dimensional reduction. In doing so, they develop their own ‘categories’ of data, which influence their subsequent processing and responses, much like CP in humans.

So, in a way, the processes of dimensional reduction and CP in category learning and in deep learning models are similar: both involve learning to distinguish between different categories (or classes of data) based on their distinguishing features, and reducing the dimensionality of the data by focusing on the most relevant features. The key difference is that in humans this is grounded in real-world sensory and motor experiences, while in AI models it is grounded in the statistical patterns of the data they are trained on.
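A minimal numerical sketch of the within-category compression and between-category separation just described (it assumes only NumPy, and uses plain logistic regression as a stand-in for whatever feature-weighting the learner actually applies): two categories differ on one feature and overlap on nine irrelevant ones; re-weighting the features by the learned weights increases between-category separation relative to within-category spread, which is the CP signature.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two highly confusable categories: they differ only along feature 0;
# features 1-9 are irrelevant "noise" dimensions.
n, d = 200, 10
X_a = rng.normal(0.0, 1.0, (n, d)); X_a[:, 0] += 0.75
X_b = rng.normal(0.0, 1.0, (n, d)); X_b[:, 0] -= 0.75
X = np.vstack([X_a, X_b])
y = np.r_[np.ones(n), np.zeros(n)]

# Supervised category learning (logistic regression by gradient ascent):
# it up-weights the distinguishing feature and down-weights the irrelevant
# ones -- a crude stand-in for dimensional reduction.
w = np.zeros(d)
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    w += 0.1 * X.T @ (y - p) / len(y)

def separation(Z):
    """Ratio of between-category distance to mean within-category spread."""
    za, zb = Z[:n], Z[n:]
    within = np.mean([np.linalg.norm(za - za.mean(0), axis=1).mean(),
                      np.linalg.norm(zb - zb.mean(0), axis=1).mean()])
    between = np.linalg.norm(za.mean(0) - zb.mean(0))
    return between / within

print("separation before learning:", round(separation(X), 3))
print("separation after re-weighting:", round(separation(X * np.abs(w)), 3))
```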

While Chickens Bleed

sounds rational: BL sounds rational

turing test: LaMDA would quickly fail the verbal Turing Test, but the only valid Turing Test is the robotic one, which LaMDA could not even begin, lacking a body or connection to anything in the world but words.

“don’t turn me off!”: Nonsense, but it would be fun to probe it further in chat.

systemic corporate influence: BL is right about this, and it is an enormous problem in everything, everywhere, not just Google or AI.

“science”: There is no “science” in any of this (yet) and it’s silly to keep bandying the word around like a talisman.

jedi joke: Nonsense, of course, but another thing it would be fun to probe further in chat.

religion: Irrelevant — except as just one of many things (conspiracy theories, the “paranormal,” the supernatural) humans can waste time chatting about.

public influence: Real, and an increasingly nefarious turn that pervasive chatbots are already taking.

openness: The answerable openness of the village in and for which language evolved, where everyone knew everyone, is open to subversion by superviral malware in the form of global anonymous and pseudonymous chatbots.

And all this solemn angst about chatbots, while chickens bleed.

Chatheads

LaMDA & LeMoine

About LaMDA & LeMoine: The global “big-data” corpus of all words spoken by humans is — and would still be, if it were augmented by a transcript of every word uttered and every verbal thought ever thought by humans — just like the shadows on the wall of Plato’s cave: It contains all the many actual permutations and combinations of words uttered and written. All of that contains and reflects a lot of structure that can be abstracted and generalized, both statistically and algorithmically, in order to generate (1) more of the same, or (2) more of the same, but narrowed to a subpopulation, or school of thought, or even a single individual; and (3) it can also be constrained or biased, by superimposing algorithms steering it toward particular directions or styles.

The richness of this intrinsic “latent” structure to speech (verbalized thought) is already illustrated by the power of simple Boolean operations like AND or NOT. The power of google search is a combination of (1) the power of local AND (say, restricted to sentences or paragraphs or documents), together with (2) the “Page-rank” algorithm, which can weight words and word combinations by their frequency, inter-linkedness or citedness (or LIKEdness — or their LIKEdness by individual or algorithm X), plus, most important, (3) the underlying database of who-knows-how-many terabytes of words so far. Algorithms as simple as AND can already do wonders in navigating that database; fancier algorithms can do even better.
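A toy sketch of those two ingredients, Boolean AND over an inverted index plus a bare-bones PageRank power iteration (the three “documents” and the link graph are invented for illustration and have nothing to do with Google’s actual implementation):

```python
docs = {
    "d1": "the cat is on the mat",
    "d2": "the mat is on the cat",
    "d3": "the dog is on the blanket",
}
links = {"d1": ["d2", "d3"], "d2": ["d1"], "d3": ["d1"]}

# Inverted index: word -> set of documents containing it.
index = {}
for doc_id, text in docs.items():
    for word in text.split():
        index.setdefault(word, set()).add(doc_id)

def boolean_and(*terms):
    """Documents containing every query term (local AND)."""
    sets = [index.get(t, set()) for t in terms]
    return set.intersection(*sets) if sets else set()

def pagerank(links, damping=0.85, iters=50):
    """Plain power-iteration PageRank over the toy link graph."""
    nodes = list(links)
    rank = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iters):
        new = {node: (1 - damping) / len(nodes) for node in nodes}
        for node, outs in links.items():
            for target in outs:
                new[target] += damping * rank[node] / len(outs)
        rank = new
    return rank

hits = boolean_and("cat", "mat")
ranks = pagerank(links)
print(sorted(hits, key=lambda d: -ranks[d]))  # AND-hits, highest PageRank first
```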

LaMDA has data-mined that multi-terabyte word space with “unsupervised learning”, abstracting all the frequencies and correlations of words and combinations of words, from which it can then generate more of the same – or more of the same that sounds like a Republican, or Dan Dennett, or an Animé fan, or someone empathic or anxious to please (like LaMDA). It can also be tempered and tampered with by “influencer” algorithms.

Something similar can be done with music: swallow music space and then spew out more of what sounds like Bernstein or (so far mediocre) Bach – but, eventually, who knows? These projected combinatorics have more scope with music (which, unlike language, really just is acoustic patterns based on recombinations plus some correlations with human vocal expressive affect patterns, whereas words have not just forms but meanings).

LaMDA does not pass the Turing Test because the Turing Test (despite the loose – or perhaps erroneous, purely verbal – way Turing described it) is not a game about fooling people: it’s a way of testing theories of how brains (or anything) produce real thoughts. And verbal thoughts don’t just have word forms, and patterns of word-forms: They also have referents, which are real things and states in the world, hence meaning. The Platonic shadows of patterns of words do reflect – and are correlated with – what words, too, just reflect: but their connection with the real-world referents of those words is mediated by (indeed parasitic on) the brains of the real people who read and interpret them, and know their referents through their own real senses and their real actions in and on those real referents in the real world – the real brains and real thoughts of (sometimes) knowledgeable (and often credulous and gullible) real flesh-and-blood people in-the-world.


Just as re-combinatorics play a big part in the production (improvisation, composition) of music (perhaps all of it, once you add the sensorimotor affective patterns that are added by the sounds and rhythms of performance and reflected in the brains and senses of the hearer, which is not just an execution of the formal notes), word re-combinatorics no doubt play a role in verbal production too. But language is not “just” music (form + affect): words have meanings (semantics) too. And meaning is not just patterns of words (arbitrary formal symbols). That’s just (one, all-powerful) way thoughts can be made communicable, from one thinking head to another. But neither heads, nor worlds, are just another bag-of-words – although the speaking head can be replaced, in the conversation, by LaMDA, who is just a bag of words, mined and mimed by a verbal database + algorithms.

And, before you ask, google images are not the world either.

The google people, some of them smart and some of them not so smart (like Musk), are fantasists who think (incoherently) that they live in a Matrix. In reality, they are just lost in a hermeneutic hall of mirrors of their own creation. The Darwinian Blind Watchmaker, evolution, is an accomplice only to the extent that it has endowed real biological brains with a real and highly adaptive (but fallible, hence foolable) mind-reading “mirror” capacity for understanding the appearance and actions of their real fellow-organisms. That includes, in the case of our species, language, the most powerful mind-reading tool of all. This has equipped us to transmit and receive and decode one another’s thoughts, encoded in words. But it has made us credulous and gullible too.

It has also equipped us to destroy the world, and it looks like we’re well on the road to it.


P.S. LeMoine sounds like a chatbot too, or maybe a Gullibot…