Monochrome Mary, GPT, and Me

Frank Jackson's hypothetical "Mary", the color neuroscientist reared in a black-and-white room, can see, unlike ChatGPT (though only in black, white, and gray). So the vocabulary with which she sets up her explanation of color perception is nevertheless grounded. (Even Helen Keller's was: lacking sight and hearing from the age of 19 months, she still had touch and kinesthesia.) ChatGPT has no grounding at all.

So we can get from "horse, horn, vanish, eyes, observing instruments" to the unobservable "peekaboo unicorn" (a horned horse that vanishes whenever eyes or observing instruments are trained on it) purely verbally, using just those prior (grounded) words to define or describe it.

But ChatGPT can't get anywhere, because it has no grounded vocabulary at all: just words "in context" ("100 trillion parameters and 300 billion words…: 570 gigabytes of text data").

So much for ChatGPT's chances of the "phase transition" that some emergence enthusiasts are imagining: from just harvesting and organizing the words in ChatGPT's bloated 570-gigabyte belly to an "emergent" understanding of them. You, me, Helen Keller, or Monochrome Mary, in contrast, could understand ChatGPT's words. And we could mean them if we said them. So could the dead authors of the words in ChatGPT's belly, once. That's the difference between (intrinsic) understanding and word-crunching by a well-fed statistical parrot.

Two important, connected details: (1) Mary would be more than surprised if her screen went from B/W to color: she would become able to DO things that she could not do before (on her own), locked in her B/W room, such as telling red and green apart, just as any daltonian would if their trichromacy were repaired.

More important, (2) Mary, or Helen, or any daltonian could be told any number of words about what green is neurologically, and about which things are green and which are not. But they cannot be told what seeing green FEELS-LIKE (i.e., what green looks-like). And that's the point. If it weren't for the "hard problem" (the fact that it feels-like something to see, hear, touch [and understand and mean] something, none of which is something you DO), Mary, Helen, and a daltonian would not be missing anything about green. Grounding in transducer/effector know-how alone would be all there was to "meaning" and "understanding." But it's not.

Mary, Helen, and the daltonian can make do with (imperfect) verbal analogies and extrapolations from the senses they do have to the senses they lack. But ChatGPT can't, because all it has is a bellyful of ungrounded words (though words syntactically structured by their grounded authors, some of whom are perhaps already underground, long dead and buried…).

So meaning and understanding are not just sensorimotor know-how. They are based on sentience too (in fact, first and foremost, though so far it is an unsolved hard problem to explain how or why). ChatGPT (attention, "singularity" fans!) lacks that completely.

It FEELS-LIKE something to understand what red (or round, or anything) means. And that is not just transducer/effector know-how percolating up from the sensorimotor grounding. It's what it feels-like to see, hear, touch, taste, and manipulate all those things out there that we are talking about with our words.

ChatGPT is talking, but ABOUT nothing. The "aboutness" is supplied by the heads of the original authors and of the readers. ChatGPT is just producing recombinatory, fill-in-the-blanks echolalia.

[Image: "Monochrome Rainbow", by DALL-E]
