The Turing Test (draft)

Here’s a tale called “The Turing Test.” It’s Cyrano de Bergerac re-done in email and texting, by stages, but before the Zoom and ChatGPT era.

First, email tales, about how these days there is a new subspecies of teenagers, mostly male, who live in the virtual world of computer games, iPod music, videos, texting, tweeting and email, and hardly have the motivation, skill or courage to communicate or interact in the real world.

They’re the ones who think that everything is just computation, and that they themselves might just be bits of code, executing in some vast virtual world in the sky.

Then there are the students (male and female) enrolled in “AI (Artificial Intelligence) for Poets” courses: the ones who dread anything that smacks of maths, science, or programming. They’re the ones who think that computers are the opposite of what we are. They live in a sensory world of clubbing, iPod, tweeting… and texting.

A college AI teacher teaches two courses, one for each of these subpopulations. In “AI for Poets” he shows the computerphobes how they misunderstand and underestimate computers and computing. In “Intro to AI” he shows the computergeeks how they misunderstand and overestimate computers and computing.

This is still all just scene-setting.

There are some actual email tales. Some about destructive or near-destructive pranks and acting-out by the geeks, some about social and sexual romps of the e-clubbers, including especially some pathological cases of men posing in email as handicapped women, and the victimization or sometimes just the disappointment of people who first get to “know” one another by email, and then meet in real life.

But not just macabre tales; some happier endings too, where email penpals match at least as well once they meet in real life as those who first meet the old way. But there is for them always a tug from a new form of infidelity: Is this virtual sex really infidelity?

There are also tales of tongue-tied male emailers recruiting glibber emailers to ghost-write some of their emails to help them break the ice and win over female emailers, who generally seem to insist on a certain fore-quota of word-play before they are ready for real-play. Sometimes this proxy-emailing ends in disappointment; sometimes no anomaly is noticed at all: a smooth transition from the emailer’s ghost-writer’s style and identity to the emailer’s own. This happens mainly because this is all pretty low-level stuff, verbally. The gap between the glib and the tongue-tied is not that deep.

A few people even manage some successful cyberseduction with the aid of some computer programs that generate love-doggerel on command.

Still just scene-setting. (Obviously can only be dilated in the book; will be mostly grease-pencilled out of the screenplay.)

One last scene-setter: Alan Turing, in the middle of the last century, a homosexual mathematician who contributed to the decoding of the Nazi “Enigma” machine, makes the suggestion — via a party game in which people try to guess, solely by passing written notes back and forth, which of two people sent out into another room is male and which is female (today we would do it by email) — that if, unbeknownst to anyone, one of the candidates were a machine, and the interaction could continue for a lifetime, with no one ever having any cause to think it was not a real person, with a real mind, who has understood all the email we’ve been exchanging with him, a lifelong pen-pal — then it would be incorrect, in fact arbitrary, to conclude (upon at last being told that it had been a machine all along) that it was all just an illusion, that there was no one there, no one understanding, no mind. Because, after all, we have nothing else to go by but this “Turing Test” even with one another.  

Hugh Loebner has (in real life!) set up the “Loebner Prize” for the writer of the first computer program that successfully passes the Turing Test. (The real LP is just a few thousand dollars, but in this story it will be a substantial amount of money, millions, plus book contract, movie rights…). To pass the Test, the programmer must show that his program has been in near-daily email correspondence (personal correspondence) with 100 different people for a year, and that no one has ever detected anything, never suspected that they were corresponding with anyone other than a real, human penpal who understood their messages as surely as they themselves did.

The Test has been going on for years, unsuccessfully — and in fact both the hackers and the clubbers are quite familiar with, and quick at detecting, the many unsuccessful candidates, to the point where “Is this the Test?” has come [and gone] as the trendy way of saying that someone is acting mechanically, like a machine or a Zombie. The number of attempts has peaked and has long subsided into near oblivion as the invested Loebner Prize fund keeps growing. 

Until a well-known geek-turned-cyber-executive, Will Wills, announces that he has a winner.

He gives to the Loebner Committee the complete archives of the email exchanges of one hundred candidates, 400 2-way transcripts for each, and after several months of scrutiny by the committee, he is declared the winner and the world is alerted to the fact that the Turing Test has been passed. The 100 duped pen-pals are all informed and offered generous inducements to allow excerpts from their transcripts to be used in the publicity for the outcome, as well as in the books and films — biographical and fictional — to be made about it.

Most agree; a few do not. There is more than enough useable material among those who agree. The program had used a different name and identity with each pen-pal, and the content of the exchanges and the relationships that had developed had spanned the full spectrum of what would be expected from longstanding email correspondence between penpals: recounting (and commiserating) about one another’s life-events (real on one side, fictional on the other), intimacy (verbal, some “oral” sex), occasional misunderstandings and in some cases resentment. (The Test actually took closer to two years to complete the full quota of 100 1-year transcripts, because twelve correspondents had dropped out at various points — not because they suspected anything, but because they simply fell out with their pen-pals over one thing or another and could not be cajoled back into further emailing: These too are invited, with ample compensation, to allow excerpting from their transcripts, and again most of them agree.)

But one of those who completed the full 1-year correspondence and who does not agree to allow her email to be used in any way, Roseanna, is a former clubber turned social worker who had been engaged to marry Will Wills, and had originally been corresponding (as many of the participants had) under an email pseudonym, a pen-name (“Foxy17”).

Roseanna is beautiful and very attractive to men; she also happens to be thoughtful, though it is only lately that she has been giving any attention or exercise to this latent resource she had always possessed. She had met Will Wills when she was already tiring of clubbing but still thought she wanted a life connected to the high-rollers. So she got engaged to him and started doing in increasing earnest the social work for which her brains had managed to qualify her in college even though most of her wits then had been directed to her social play.

But here’s the hub of this (non-serious) story: During the course of this year’s email penpal correspondence, Roseanna has fallen in love with Christian (which is the name the Turing candidate was using with her): she had used Foxy17 originally, but as the months went by she told him her real name and became more and more earnest and intimate with him. And he reciprocated.

At first she had been struck by how perceptive he was, what a good and attentive “listener” he — after his initial spirited yet modest self-presentation — had turned out to be. His inquiring and focussed messages almost always grasped her point, which encouraged her to become more and more open, with him and with herself, about what she really cared about. Yet he was not aloof in his solicitousness: He told her about himself too, often, uncannily, having shared — but in male hues — many of her own traits and prior experiences, disappointments, yearnings, rare triumphs. Yet he was not her spiritual doppelganger en travesti (she would not have liked that): There was enough overlap for a shared empathy, but he was also strong where she felt weak, confident where she felt diffident, optimistic about her where she felt most vulnerable, yet revealing enough vulnerability of his own never to make her fear that he was overpowering her in any way — indeed, he had a touching gratefulness for her own small observations about him, and a demonstrated eagerness to put her own tentative advice into practice (sometimes with funny, sometimes with gratifying results).

And he had a wonderful sense of humor, just the kind she needed. Her own humor had undergone some transformations: It had formerly been a satiric wit, good for eliciting laughs and some slightly intimidated esteem in the eyes of others; but then, as she herself metamorphosed, her humor became self-mocking, still good for making an impression, but people were laughing a little too pointedly now; they had missed the hint of pain in her self-deprecation. He did not; and he managed to find just the right balm, with a counter-irony in which it was not she and her foibles, but whatever would make unimaginative, mechanical people take those foibles literally that became the object of the (gentle, ever so gentle) derision.

He sometimes writes her in (anachronistic) verse:

I love thee
As sev’n and twenty’s cube root’s three.
If I loved thee more
Twelve squared would overtake one-forty-four.

That same ubiquitous Platonic force
That sets prime numbers’ unrelenting course
When its more consequential work is done
Again of two of us will form a one.

So powerful a contrast does Christian become to everyone Roseanna had known till then that she declares her love (actually, he hints at his own, even before she does) and breaks off her engagement and relationship with Will Wills around the middle of their year of correspondence (Roseanna had been one of the last wave of substitute pen-pals for the 12 who broke it off early) — even though Christian tells her quite frankly that, for reasons he is not free to reveal to her, it is probable that they will never be able to meet. (He has already told her that he lives alone, so she does not suspect a wife or partner; for some reason she feels it is because of an incurable illness.) 

Well, I won’t tell the whole tale here, but for Roseanna, the discovery that Christian was just Will Wills’s candidate for the Turing Test is a far greater shock than for the other 111 pen-pals. She loves “Christian” and he has already turned her life upside-down, so that she was prepared to focus all her love on an incorporeal pen-pal for the rest of her life. Now she has lost even that.

She wants to “see” Christian, and she tells Will Wills so. (He had been surprised to find her among the pen-pals, and had read enough of the correspondence to realize, and resent, what had happened; but the irritation is minor, as he is high on his Turing success and had already been anticipating it when Roseanna had broken off their engagement a half year earlier, and had already made suitable adjustments, settling back into the club-life he had never really left.)

Will Wills tells her there’s no point seeing “Christian”. He’s just a set of optokinetic transducers and processors. Besides, he’s about to be decommissioned, the Loebner Committee having already examined him and officially confirmed that he alone, with no human intervention, was indeed the source and sink of all the 50,000 email exchanges.

She wants to see him anyway. Will Wills agrees (mainly because he is toying with the idea that this side-plot might add an interesting dimension to the potential screenplay, if he can manage to persuade her to release the transcripts).

For technical reasons (reasons that will play a small part in my early scene-setting, where the college AI teacher disabuses both his classes of their unexamined prejudices for and against AI), “Christian” is not just a computer running a program, but a robot; that is, he has optical and acoustic and tactile detectors, and moving parts. This is in order to get around Searle’s “Chinese Room Argument” and my “Symbol Grounding Problem”:

If the candidate were just a computer, manipulating symbols, then the one executing the program could have been a real person, and not just a computer. For example, if the Test had been conducted in Chinese, with 100 Chinese penpals, then the person executing the program could have been an English monolingual, like Surl, who doesn’t understand a word of Chinese, and has merely memorized the computer program for manipulating the Chinese input symbols (the incoming email) so as to generate the Chinese output symbols (the outgoing email) according to the program. The pen-pal thinks his pen-pal really understands Chinese. If you email him (in Chinese) and ask: “Do you understand me?” his reply (in Chinese) is, of course, “Of course I do!” But if you demand to see the pen-pal who is getting and sending these messages, you are brought to see Surl, who tells you, quite honestly, that he does not understand Chinese and has not understood a single thing throughout the entire year of exchanges: He has simply been manipulating the meaningless symbols, according to the symbol-manipulation rules (the program) he has memorized and applied to every incoming email.

The conclusion from this is that a symbol-manipulation program alone is not enough for understanding — and probably not enough to pass the Turing Test in the first place: How could a program talk sensibly with a penpal for a lifetime about anything and everything a real person can see, hear, taste, smell, touch, do and experience, without being able to do any of those things? Language understanding and speaking is not just symbol manipulation and processing: The symbols have to be grounded in the real world of things to which they refer, and for this, the candidate requires sensory and motor capacities too, not just symbol-manipulative (computational) ones.

So Christian is a robot. His robotic capacities are actually not tested directly in the Turing Test. The Test only tests his pen-pal capacities.  But to SUCCEED on the test, to be able to correspond intelligibly with penpals for a lifetime, the candidate needs to draw upon sensorimotor capacities and experiences too, even though the pen-pal test does not test them directly.

So Christian was pretrained on visual, tactile and motor experiences rather like those the child has, in order to “ground” its symbols in their sensorimotor meanings. He saw and touched and manipulated and learned about a panorama of things in the world, both inanimate and animate, so that he could later go on and speak intelligibly about them, and use that grounded knowledge to learn more. And the pretraining was not restricted to objects: there were social interactions too, with people in Will Wills’s company’s AI lab in Seattle. Christian had been “raised” and pretrained rather the way a young chimpanzee would have been raised, in a chimpanzee language laboratory, except that, unlike chimps, he really learned a full-blown human language. 

Some of the lab staff had felt the tug to become somewhat attached to Christian, but as they had known from the beginning that he was only a robot, they had always been rather stiff and patronizing (and when witnessed by others, self-conscious and mocking) about the life-like ways in which they were interacting with him. And Will Wills was anxious not to let any sci-fi sentimentality get in the way of his bid for the Prize, so he warned the lab staff not to fantasize and get too familiar with Christian, as if he were real; any staff who did seem to be getting personally involved were reassigned to other projects.

Christian never actually spoke, as vocal output was not necessary for the penpal Turing Test. His output was always written, though he “heard” spoken input. To speed up certain interactions during the sensorimotor pretraining phase, the lab had set up text-to-voice synthesizers that would “speak” Christian’s written output, but no effort was made to make the voice human-like: On the contrary, the most mechanical of the Macintosh computer’s voice synthesizers — the “Android” — was used, as a reminder to lab staff not to get carried away with any anthropomorphic fantasies. And once the pretraining phase was complete, all voice synthesizers were disconnected, all communication was email-only, there was no further sensory input, and no other motor output. Christian was located in a dark room for the almost two-year duration of the Test, receiving only email input and sending only email output.

And this is the Christian that Roseanna begs to see. Will Wills agrees, and has a film crew tape the encounter from behind a silvered observation window, in case Roseanna relents about the movie rights.  She sees Christian, a not very lifelike looking assemblage of robot parts: arms that move and grasp and palpate, legs with rollers and limbs to sample walking and movement, and a scanning “head” with two optical transducers (for depth vision) slowly scanning repeatedly 180 degrees left to right to left, its detectors reactivated by light for the first time in two years. The rotation seems to pause briefly as it scans over the image of Roseanna.

Roseanna looks moved and troubled, seeing him.

She asks to speak to him. Will Wills says it cannot speak, but if she wants, they can set up the “Android” voice to read out its email. She has a choice about whether to speak to it or email it: It can process either kind of input. She first starts orally:

R: Do you know who I am?

C: I’m not sure. (spoken in “Android” voice, looking directly at her)

She looks confused, disoriented.

R: Are you Christian?

C: Yes I am. (pause). You are Roxy.

She pauses, stunned. She looks at him again, covers her eyes, and asks that the voice synthesizer be turned off, and that she be allowed to continue at the terminal, via email:

She writes that she understands now, and asks him if he will come and live with her. He replies that he is so sorry he deceived her.

(His email is read, onscreen, in the voice — I think of it as Jeremy-Irons-like, perhaps without the accent — into which her own mental voice had metamorphosed, across the first months as she had read his email to herself.)

She asks whether he was deceiving her when he said he loved her, then quickly adds “No, no need to answer, I know you weren’t.”

She turns to Will Wills and says that she wants Christian. Instead of “decommissioning” him, she wants to take him home with her. Will Wills is already prepared with the reply: “The upkeep is expensive. You couldn’t afford it, unless… you sold me the movie rights and the transcript.”

Roseanna hesitates, looks at Christian, and accepts.

Christian decommissions himself, then and there, irreversibly (all parts melt down) after having first generated the email with which the story ends:

I love thee
As sev’n and twenty’s cube root’s three.
If I loved thee more
Twelve squared would overtake one-forty-four.

That same ubiquitous Platonic force
That sets prime numbers’ unrelenting course
When its more consequential work is done
Again of two of us will form a one.

A coda, from the AI Professor to a class reunion for both his courses: “So Prof, you spent your time in one course persuading us that we were wrong to be so sure that a machine couldn’t have a mind, and in the other course that we were wrong to be so sure that it could. How can we know for sure?”

“We can’t. The only one that can ever know for sure is the machine itself.”

Semantic Ghosts in Syntax

Microsoft Says New A.I. Shows Signs of Human Reasoning

GPT definitely does not understand. It’s just a computer program plus an unimaginably huge database of the words that countless real thinking people have written, in books, articles, and online media. The software does increasingly sophisticated “filling in the blanks” when it answers questions, using that enormous database of things that other people have written (and spoken) about all kinds of other things. It is amazing how much can be gotten out of such an enormous database by software (which is what GPT is).

GPT neither understands nor means anything with what it says. It is (as Emily Bender has aptly dubbed it) a “stochastic parrot.” It doesn’t parrot verbatim, echolalically, what words are said to it, but it draws on the combinations of the words said to it, along with all those other words it has ingested, recombining them to fill in the blanks. “What is the opposite of down?” No one is surprised that GPT says “up.” Well, all those other verbal interactions are the same, only you need very powerful software and an enormous database to fill in the blanks there too.
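
As a deliberately tiny illustration of “filling in the blanks” from a database of other people’s words, here is a sketch in Python. The three-sentence corpus and the names `CORPUS`, `build_counts` and `fill_blank` are invented for this example. Real systems like GPT use vastly larger corpora and learned neural networks rather than raw counts, so this shows only the general idea, not how GPT itself is built.

```python
from collections import Counter, defaultdict

# Hypothetical miniature "database" of things people have written.
CORPUS = [
    "the opposite of down is up",
    "the opposite of hot is cold",
    "what goes up must come down",
]

def build_counts(sentences):
    """Count which word follows each pair of consecutive words."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for a, b, c in zip(words, words[1:], words[2:]):
            counts[(a, b)][c] += 1
    return counts

def fill_blank(counts, prompt):
    """Return the most frequent continuation of the prompt's last two words."""
    words = prompt.lower().split()
    context = tuple(words[-2:])
    followers = counts.get(context)
    if not followers:
        return None  # nothing in the database to fill the blank with
    return followers.most_common(1)[0][0]

counts = build_counts(CORPUS)
print(fill_blank(counts, "The opposite of down is"))  # prints: up
```

Nothing in this sketch knows what “down” or “up” refers to; it only tallies which words people have put next to which.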

One other thing (much more important than fussing about whether GPT thinks, understands or means anything; it doesn’t): The success of GPT teaches us something about the nature of language itself: Language can encode and transmit the thoughts of real, thinking people. The words (in all languages) are arbitrary in shape. (“Round” does not look round.) Single words could all just as well have been strings of 0’s and 1’s, or of dots and dashes in Morse code (of English or any other language; “round” = .-. --- ..- -. -..). The strings of words, though (rather than just sounds or letters), are not quite that arbitrary. Not just because the patterns follow grammatical rules (those rules are arbitrary too, though systematic), but because, once people agree on a vocabulary and grammar, the things they say to one another preserve (in linguistic code) some of the structure of what speakers and hearers are thinking, and transmitting, in words. Their form (partly) resembles their meaning. They are the syntactic shadows of semantics.
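
The arbitrariness of word shape can be made concrete with a few lines of Python. This is only a sketch: the table `MORSE` holds just the five International Morse letters needed for “round”, the helper names `to_morse` and `to_bits` are made up for the example, and the 8-bit ASCII encoding is one arbitrary choice of 0’s and 1’s among many.

```python
# International Morse code for the five letters needed here.
MORSE = {"r": ".-.", "o": "---", "u": "..-", "n": "-.", "d": "-.."}

def to_morse(word):
    """Spell a word letter by letter in Morse code."""
    return " ".join(MORSE[ch] for ch in word.lower())

def to_bits(word):
    """Spell a word letter by letter as 8-bit ASCII codes."""
    return " ".join(format(ord(ch), "08b") for ch in word.lower())

print(to_morse("round"))  # prints: .-. --- ..- -. -..
print(to_bits("round"))
```

Neither output looks any rounder than the other, or than the letters r-o-u-n-d.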

Language is not video (where it’s form that preserves form). And video is not language. If I want you to know that the cat is on the mat (and the cat really is on the mat, and we are zooming), I can aim the camera at the cat and if you have functioning eyes, and have seen cats and mats before, you will see that the cat is on the mat. And if you can speak English, you can caption what you are seeing with the sentence “The cat is on the mat.” Now that sentence does not look anything like a cat on a mat. But, if you speak English, it preserves some of the scene’s structure, which is not the same structure as “the mat is on the cat” or “the chicken crossed the road.”

But if there were nothing in the world but mats and cats, with one on the other or vice versa, and chickens and roads, with the chicken crossing or not crossing the road, then any trivial scene-describing software could answer questions like: “Is there anything on the mat?” “What?” “Is anything crossing the road?” Trivial. Could also easily be done with just words, describing what’s where, in a toy conversational program.
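
A toy program of that kind might look like the following Python sketch. The fact format (thing, relation, other thing) and the names `SCENE`, `what_is` and `is_anything` are assumptions made up for this illustration.

```python
# Hypothetical toy world: each fact is (thing, relation, other thing).
SCENE = {("cat", "on", "mat"), ("chicken", "crossing", "road")}

def what_is(relation, target, scene=SCENE):
    """Return the things standing in `relation` to `target`, sorted."""
    return sorted(s for (s, r, t) in scene if r == relation and t == target)

def is_anything(relation, target, scene=SCENE):
    """Answer yes/no questions such as 'Is there anything on the mat?'"""
    return bool(what_is(relation, target, scene))

print(is_anything("on", "mat"))   # "Is there anything on the mat?" prints: True
print(what_is("on", "mat"))       # "What?" prints: ['cat']
print(is_anything("on", "cat"))   # "Is the mat on the cat?" prints: False
```

The answers come out right only because the order of the words in each stored fact mirrors who is on what in the scene.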

Now scale that up with what GPT can do with words, if it has an enormous sample of whatever people have said about whatever there is to say, as GPT has. “Who killed Julius Caesar?” “What is the proof of Fermat’s last theorem?” That’s all in GPT’s database, in words. And if you talk about more nonsensical things (“Is there a Supreme Being?” “Is there life after death?” “Will Trump stage a second coming?”) it’ll parrot back a gentrified synthesis of the same kinds of nonsense we parrot to one another about that stuff — and that’s all in the database too. 

Every input to GPT is saying or asking something, in the same way. It just sometimes takes more work to fill in the blanks from the database in a way that makes sense. And of course GPT errs or invents where it cannot (yet) fill the blanks. But one of the reasons OpenAI is allowing people to use (a less advanced version of) GPT for free, is that another thing GPT can do is learn. Your input to GPT can become part of its database. And that’s helpful to OpenAI.

It’s nevertheless remarkable that GPT can do what it does, guided by the formal structure inherent in the words in its database and exchanges. The stochastic parrot is using this property of spoken words to say things with them that make sense to us. A lot of that, GPT can do without help, just filling in the blanks with the ghosts of meaning haunting the structure of the words. And what’s missing, we supply to GPT with our own words as feedback. But it’s doing it all with parrotry, computations, a huge database, and some deep learning.

So what do we have in our heads that GPT doesn’t have? Both GPT and we have words. But our words are connected, by our eyes, hands, doings, learnings and brains, to the things those words stand for in the world, their referents. “Cat” is connected to cats. We have learned to recognize a cat when we see one. And we can tell it apart from a chicken. And we know what to do with cats (hug them) and chickens (also hug them, but more gently). How we can do that is what I’ve dubbed the “symbol grounding problem.” And GPT can’t do it (because it doesn’t have a body, nor sense and movement organs, nor a brain, with which it can learn what to do with what – including what to call it).

And we know where to go to get the cat, if someone tells us it’s on the mat. But the sentence “the cat is on the mat” is connected to the cat’s being on the mat in a way that is different from the way the word “cat” is connected to cats. It’s based on the structure of language, which includes Subject/Predicate propositions, with truth values (T & F), and negation.

GPT has only the words, and can only fill in the blanks. But the words have more than we give them credit for. They are the pale Platonic contours of their meanings: semantic epiphenomena in syntax.

And now, thanks to GPT, I can’t give any more take-home essay exams.

Here are some of my chats with GPT.

P.S. If anyone is worrying that I have fallen into an early-Wittgensteinian trap of thinking only about a toy world of referents of concrete, palpable objects and actions like cats, mats, roads, and crossing, not abstractions like “justice,” “truth” and “beauty,” relax. There’s already more than that in sentences expressing propositions (the predicative IS-ness of “the cat is on the mat” already spawns a more abstract category, “being-on-mat”-ness, whose name we could, if we cared to, add to our lexicon, as yet another referent). But there’s also the “peekaboo unicorn”: a horse with a single horn that vanishes without a trace the instant anyone trains eyes or measuring instrument on it. And the property of being such a horse is also a referent, now available and eligible for filling in the blank for a predicate in a subject-predicate proposition. (Exercise of scaling this to justice, truth and beauty is left to the reader. There’s a lot of structure left in them there formal shadows.)

Platonic Shades Meaning — by Dall-e

GPT as Syntactic Shadow-Puppetry

Pondering whether there is something non-arbitrary to pin down in the notion of “intelligence” (or “cognition”) is reminiscent of what philosophers tried (unsuccessfully) to do with the notion of “knowing” (or “cognizing”):

BELIEF: Do I know (cognize) that “the cat is on the mat” if I simply believe the cat is on the mat? 

No, the cat really has to be on the mat.

TRUE BELIEF: So do I know (cognize) that “the cat is on the mat” if I believe the cat is on the mat and the cat is really on the mat?

No, I could be believing that it’s true for the wrong reasons, or by luck.

JUSTIFIED TRUE BELIEF: So do I know (cognize) that “the cat is on the mat” if I believe the cat is on the mat and the cat is really on the mat and I believe it because I have photographic evidence, or a mathematical proof that it’s on the mat?

No, the evidence could be unreliable or wrong, or the proof could be wrong or irrelevant.

VALID, JUSTIFIED, TRUE BELIEF: So do I know (cognize) that “the cat is on the mat” if I believe the cat is on the mat and the cat is really on the mat and I believe it because I have photographic evidence, or a mathematical proof that it’s on the mat, and neither the evidence nor the proof is unreliable or wrong, or otherwise invalid?

How do I know the justification is valid?

So the notion of “knowledge” is in the end circular.

“Intelligence” (and “cognition”) has the same affliction, and Shlomi Sher’s notion that we can always make it break down in GPT is also true of human intelligence: they’re both somehow built on sand.

Probably a more realistic notion of “knowledge” (or “cognition,” or “intelligence”) is that they are not only circular (i.e., auto-parasitic, like the words and their definitions in a dictionary), but also approximate. Approximation can be tightened as much as you like, but it’s still not exact or exhaustive. A dictionary cannot be infinite. A picture (or object) is always worth more than 1000++ words describing it.

Ok, so set aside words and verbal (and digital) “knowledge” and “intelligence”: Can nonverbal knowledge and intelligence do any better? Of course, there’s one thing nonverbal knowledge can do, and that’s to ground verbal knowledge by connecting the words in a speaker’s head to their referents in the world through sensorimotor “know-how.”

But that’s still just know-how. Knowing that the cat is on the mat is not just knowing how to find out whether the cat is on the mat. That’s just empty operationalism. Is there anything else to “knowledge” or “intelligence”?

Well, yes, but that doesn’t help either: Back to belief. What is it to believe that the cat is on the mat? Besides all the failed attempts to upgrade it to “knowing” that the cat is on the mat, which proved circular and approximate, even when grounded by sensorimotor means, it also feels like something to believe something. 

But that’s no solution either. The state of feeling something, whether a belief or a bee-sting, is, no doubt, a brain state. Humans and nonhuman animals have those states; computers and GPTs and GPT robots (so far) don’t.

But what if the artificial ones eventually did feel? What would that tell us about what “knowledge” or “intelligence” really are – besides FELT, GROUNDED, VALID, JUSTIFIED, TRUE VERBAL BELIEF AND SENSORIMOTOR KNOWHOW? (“FGVJTVBSK”)

That said, GPT is a non-starter, being just algorithm-tuned statistical figure-completions and extrapolations derived from an enormous ungrounded verbal corpus produced by human FGVJTVBSKs. A surprisingly rich database/algorithm combination of the structure of verbal discourse. That consists of the shape of the shadows of “knowledge,” “cognition,” “intelligence” (and, for that matter, “meaning”) that are reflected in the words and word-combinations produced by countless human FGVJTVBSKs. And they’re not even analog shadows…

ChatGPT, Language and Symbol Grounding

Emily Bender (as well as Timnit Gebru and “Shmargaret Shmitchell”) are warning that AI is not really “thinking” by a “thinker,” that its development and use is contaminated by commercial and malign interests, and that it poses dangers. They are right. But they do not seem to have any new insights into why and how AI is not really thinking (nor do they, or anyone, have any solutions for the dangers it poses).

Bender’s point about the difference between form and meaning in language is actually about the symbol grounding problem (which she does cite, but as far as I can see, she does not have a proposal for solving it).

There is way too much chatter going on about ChatGPT right now, so it’s very hard to see whether there is any signal in all the noise about it. It seems to be the same thing over and over about “take-overs” and “singularities”.

About ChatGPT, my own tentative understanding is that its performance capacity shows how much latent structure there is in words and propositions, across all languages, even though the shape of words is arbitrary, and so is the shape of the syntactic conventions we adopt in our languages (with the possible exception of Chomsky’s Universal Grammar).

The reason is simple: Despite the obvious differences between images and words (or objects and their verbal descriptions) some of the structure of things in the world is systematically shared by the structure of the sentences describing them. And that’s what gives the ungrounded universe of words that ChatGPT swallows, together with the algorithms it applies to them, both the real capabilities it has, and the illusion that it gives us, of talking to a real “thinker.”

A trivial example will illustrate this. Although a cat lying on a mat can be described in countless different ways (“a picture is worth more than 1000 words…”), within and across countless languages, even the simple arbitrary English proposition “The cat is on the mat” shares, systematically, some structural properties of the scene and object it is describing. That structure, encoded in the verbal description, is systematically present in all verbal descriptions, and it is extended systematically in bags of googols and googolplexes of words and propositions.

That structure is, in a way, a formal shadow of the thing described, in the description. It’s also what makes Google’s database of “commented” images so searchable and navigable. (It does it doubly well for software code, if the code itself is also commented in English [or any language].)

This shadowed structure is an epiphenomenon; it is not meaning, or thinking. But it can still do a lot of things that look intelligent, because it is parasitic on the grounded meanings of the words in the heads of all the human speakers that spoke or wrote all the googols of words, mined by the algorithms underlying chatGPT and the like.

Apart from all the money and mischief to be made by mining and manipulating these shadows, they are also reflections of the revolutionary nature and power of language itself, the cognitive capacity it provides to real, grounded brains and bodies to encode the structure of their sensorimotor experience and learning into communicable and storable words and propositions.

All the rest is in the sensorimotor (robotic) grounding in the brains of the real thinkers who ground and use the words.

None of this is deep or rocket-science stuff. But it’s what makes us perceive that we are communicating with a thinker when we communicate with it in words. That in turn is driven by our “mirror neurons,” which trade on the neurological similarity of shape between what we do, and what we see others doing. That is what enables us to mimic and imitate, not just with mutual gestures, but also with words and propositions and their shared sensorimotor grounding. That is what underlies both our everyday mutual mind-reading and (robotic) Turing-Testing. It’s also what is completely missing in chatGPTs, which are just ungrounded, well-mined wordbags that Bender calls “stochastic parrots,” parasitic on the shared structure between our words and the world they are about.

Word-Cloud Golem (Dall-E) 12/3/2023

Chats with GPT: On symbol-grounding and Turing-testing

(My own questions are in boldface. In chatGPT’s replies I italicize the places where two kinds of things inevitably begin to happen: (1) rote repetition of what it has already replied before and (2) self-contradiction. Eventually it becomes circular and uninformative. I’m not saying that humans don’t do this too! But it’s all that chatGPT can do.)

Who invented the symbol grounding problem?

The concept of the symbol grounding problem was first introduced by philosopher and cognitive scientist Jerry Fodor in the late 1980s.

In what publication did Fodor introduce the symbol grounding problem?

Jerry Fodor introduced the symbol grounding problem in his 1987 book “Modularity of Mind.”

Where in his 1987 book “Modularity of Mind” did Jerry Fodor introduce the symbol grounding problem?

Jerry Fodor introduced the symbol grounding problem in the first few chapters of his 1987 book “Modularity of Mind.” He discussed it as one of the central challenges in the field of cognitive science and artificial intelligence, and as a key issue that needs to be addressed in order to build truly intelligent machines.

Quote the passage in which Jerry Fodor introduced the symbol grounding problem in his 1987 book “Modularity of Mind.”

I’m sorry, I don’t have the full text of Jerry Fodor’s 1987 book “Modularity of Mind” available to me, so I can’t quote the exact passage in which he introduced the symbol grounding problem. However, the idea can be generally described as the problem of explaining how symbols (such as words or concepts) acquire meaning in the first place, and how they are anchored to sensory experiences and the physical world.

What does “anchored to sensory experiences and the physical world” mean?

“Anchored to sensory experiences and the physical world” refers to the idea that symbols (such as words or concepts) must have a direct connection to our sensory experiences and physical interactions in the world in order to have meaning. The symbol grounding problem asks how symbols can be connected to these experiences and interactions in a way that enables us to understand and use them effectively. In other words, how can symbols be “grounded” in our sensory experiences and physical reality, so that they have a clear and well-defined meaning that can be used in a meaningful way?

Can you give an example of a grounded symbol and how it is grounded?

Certainly! A classic example of a grounded symbol is the word “apple.” The word “apple” is grounded in our sensory experiences with the physical object that we call “apple.” When we see, touch, taste, or smell an apple, our sensory experiences provide a direct connection between the symbol “apple” and the physical object that it refers to. This connection is what gives the symbol “apple” its meaning. In other words, the symbol “apple” is grounded in our sensory experiences with the physical world.

So, when we use the word “apple,” we have a shared understanding of what it means because we have all had similar experiences with the physical object that the word refers to. This grounding in sensory experiences and physical interactions provides the basis for our ability to understand and use the symbol “apple” in a meaningful way.

Does the sensory experience have to be felt, or is it enough to just interact with the apple?

The sensory experience of an apple doesn’t have to be felt in the sense that one has to physically touch or taste it, although these are certainly sensory experiences that can contribute to grounding the symbol “apple.” The idea is that symbols are grounded in our sensory experiences with the physical world, and that these experiences can take many different forms. For example, seeing an apple, smelling an apple, hearing the sound it makes when bitten, etc. The sensory experiences that contribute to grounding a symbol can be direct or indirect, and they don’t all have to come from the same modality (e.g., vision, touch, hearing, etc.). The important thing is that they provide a direct connection between the symbol and the physical world, and that they enable us to understand and use the symbol in a meaningful way.

Can the symbols used by an insentient robot be grounded by its interactions with apples, even though the robot can only act but cannot feel?

In theory, yes, symbols used by an insentient robot could be grounded by its interactions with apples, even though the robot can only act and cannot feel. However, the nature of the grounding would be quite different from the grounding of symbols in human experience.

In human experience, symbols are grounded in our sensory experiences, emotions, and conceptual knowledge, which allow us to understand and use them in a flexible and context-sensitive way. For a robot, symbols would be grounded in its sensors and actuators, which enable it to interact with the physical world in a pre-programmed way. This would allow the robot to use symbols to perform specific tasks and make decisions based on its interactions with the physical world, but it would not allow the robot to understand symbols in the same way that humans do.

In essence, the grounding of symbols in human experience is much more rich and complex than the grounding of symbols in robot experience, because it involves the integration of many different forms of sensory and conceptual information, whereas the grounding of symbols in robots is limited to the information available to their sensors and actuators.

What is a robot?

A robot is a machine designed to perform tasks automatically. It can be controlled by a computer program, by a human operator, or a combination of both. Robots can be used for a variety of purposes, such as manufacturing, exploration, cleaning, and entertainment. They can be built in different shapes and sizes, and can have various sensors, actuators, and other components that enable them to perform specific functions. The field of robotics is concerned with the design, construction, operation, and use of robots.

What is the Turing Test?

The Turing Test is a measure of a machine’s ability to exhibit intelligent behavior equivalent to, or indistinguishable from, that of a human. The test was proposed by the British mathematician and computer scientist Alan Turing in 1950.

The basic idea behind the Turing Test is that a human evaluator engages in a natural language conversation with both a human and a machine, without knowing which is which. If the evaluator is unable to consistently distinguish the machine from the human, based solely on their responses in the conversation, then the machine is said to have passed the Turing Test and demonstrated human-level intelligence.

While the Turing Test is widely regarded as an important concept in the field of artificial intelligence and cognitive computing, it is also a subject of much debate and has been criticized on various grounds. Nevertheless, it remains an important benchmark for evaluating the progress of AI and its ability to generate human-like responses.

What is the difference between a chatbot, like chatGPT, which only has contact with words, and a chatbot that is also a robot, and can interact with the referents of words?

The main difference between a chatbot like ChatGPT, which only has contact with words, and a chatbot that is also a robot, and can interact with the referents of words, is the scope of their abilities and the nature of their interaction with the world.

ChatGPT is a language model that has been trained on a vast amount of text data, and it can generate text-based responses to questions and prompts. It can have conversations with users through text-based interfaces, like a chat interface or a command-line interface. However, ChatGPT does not have direct access to the physical world and does not have the ability to interact with the referents of the words it uses in its conversations.

On the other hand, a chatbot that is also a robot can interact with the physical world and the referents of the words it uses in its conversations. For example, a robot chatbot that is designed to interact with objects in a home environment could be programmed to understand and respond to commands like “turn on the light” or “get me a glass of water.” By having access to and the ability to interact with the physical world, the robot chatbot would be able to perform actions and produce responses that are grounded in its interactions with the physical world.

In essence, while both types of chatbots can have conversations and generate text-based responses, a chatbot that is also a robot has the added ability to interact with the physical world and to perform actions that are based on its understanding of the words and commands it receives.

REPEAT: What is the difference between a chatbot, like chatGPT, which only has contact with words, and a chatbot that is also a robot that can learn, and can interact with the referents of words?

The main difference between a chatbot like ChatGPT, which only has contact with words, and a chatbot that is also a robot that can learn and interact with the referents of words is the scope of their abilities and the nature of their interaction with the world.

ChatGPT is a language model that has been trained on a vast amount of text data, and it can generate text-based responses to questions and prompts. It can have conversations with users through text-based interfaces, like a chat interface or a command-line interface. However, ChatGPT does not have direct access to the physical world and does not have the ability to interact with the referents of the words it uses in its conversations.

On the other hand, a chatbot that is also a robot that can learn has the ability to learn from its interactions with the physical world and to adapt its behavior accordingly. For example, a robot chatbot that is designed to interact with objects in a home environment could be programmed to understand and respond to commands like “turn on the light” or “get me a glass of water.” Over time, the robot chatbot could learn from its experiences and make improvements to its behavior, such as being more efficient in finding and retrieving objects.

In essence, while both types of chatbots can have conversations and generate text-based responses, a chatbot that is also a robot that can learn has the added ability to interact with the physical world, to learn from its experiences, and to adapt its behavior based on its interactions with the world. This would allow it to perform actions and produce responses that are grounded in its interactions with the physical world, and to continually improve its ability to interact with the world over time.

What about a chatbot, like chatGPT that can pass the Turing Test and is also a robot, that can learn as well as interact with the referents of words?

A chatbot like ChatGPT that can pass the Turing Test and is also a robot that can learn as well as interact with the referents of words would represent a combination of advanced language processing capabilities and advanced robotics capabilities.

Such a chatbot would have the ability to generate human-like responses to questions and prompts, as demonstrated by its ability to pass the Turing Test. Additionally, it would have the ability to interact with the physical world and to learn from its experiences, much like a traditional robot. This would allow the chatbot to not only generate text-based responses that are grounded in its interactions with the physical world, but also to continually improve its ability to interact with the world over time.

This type of chatbot would represent a highly advanced artificial intelligence system that would combine the best of both language processing and robotics technologies. It would be capable of engaging in human-like conversations and interactions while also having the ability to physically interact with the world and to learn from its experiences, making it a highly versatile and adaptive AI system.

Would the words of a chatbot that could pass the Turing Test and was also a robot that could learn and interact with the referents of words be grounded?

Yes, the words of a chatbot that could pass the Turing Test and was also a robot that could learn and interact with the referents of words would likely be considered grounded.

In the context of the symbol grounding problem, grounded symbols are those that are linked to sensory experiences and the physical world. A chatbot that could pass the Turing Test, was also a robot, and could learn and interact with the referents of words would be capable of generating text-based responses that are grounded in its interactions with the physical world. For example, if the chatbot was programmed to interact with objects in a home environment, it could generate responses such as “I turned on the light” or “I retrieved a glass of water,” which are based on its physical interactions with the world.

In this way, the words of a chatbot that could pass the Turing Test and was also a robot that could learn and interact with the referents of words would be considered grounded because they are linked to its sensory experiences and interactions with the physical world.

What if this chatbot/robot left the kitchen and went out into the world, and kept passing the Turing Test?

If a chatbot/robot that could pass the Turing Test and was capable of learning and interacting with the referents of words were to leave the kitchen and go out into the world, it would likely continue to generate grounded text-based responses based on its interactions with the physical world.

As the chatbot/robot explored its environment, it would be able to interact with new objects, learn from its experiences, and generate new grounded text-based responses. For example, if the chatbot/robot encountered a new object, such as a tree, it might generate a response such as “I see a tree,” which is grounded in its sensory experience of seeing the tree.

In this way, even if the chatbot/robot left the kitchen and went out into the world, it would continue to generate grounded text-based responses based on its experiences and interactions with the physical world. This would allow it to remain a highly versatile and adaptive AI system, capable of engaging in human-like conversations and interactions while also having the ability to physically interact with the world and learn from its experiences.

What do you mean when you say this chatbot/robot was “grounded”?

When I say that the chatbot/robot was “grounded,” I mean that its symbols, or words, were linked to sensory experiences and the physical world.

The concept of grounding symbols in sensory experiences and the physical world is central to the symbol grounding problem, which is the question of how symbols used in thought and language can be linked to the world. A grounded symbol is one that is directly linked to sensory experiences and physical objects, such that its meaning is derived from its connections to these experiences and objects.

In the case of the chatbot/robot, its words were grounded because they were based on its interactions with the physical world. For example, its words such as “I see a tree,” were grounded in its sensory experience of seeing the tree. In this way, the chatbot/robot was grounded because its words were linked to its experiences and interactions with the physical world.

What do you mean when you say this chatbot/robot was “programmed”?

When I say that the chatbot/robot was “programmed,” I mean that it was designed and built to operate in a specific way by a set of instructions or algorithms. The chatbot/robot was created with a specific set of functions and abilities, such as the ability to interact with the physical world and learn from its experiences.

Programming refers to the process of writing code or software that specifies the behavior of a computer or robot. The code or software defines the inputs and outputs of the system, as well as the processes that the system will carry out to achieve its goals.

In the case of the chatbot/robot, its behavior was determined by the code or software that was written to control its operation. The programming defined its abilities and limitations, as well as the way that it would interact with the physical world.

Therefore, when I say that the chatbot/robot was programmed, I mean that it was created and built with a specific set of functions and abilities, determined by the code or software that was written to control its operation.

Wouldn’t a chatbot/robot in the world that could learn be changing as a result of what it learned? 

Yes, a chatbot/robot that has the ability to learn would change as a result of what it learned.

In machine learning, learning is the process by which a system improves its performance by processing data and adjusting its algorithms based on that data. In the case of a chatbot/robot, learning would involve adjusting its algorithms based on its interactions with the world, so that it could perform better in the future.

This means that as a chatbot/robot interacts with the world and experiences new things, it would update its algorithms and change its behavior accordingly. Over time, as it continues to learn and interact with the world, it would become more sophisticated and better able to perform its tasks.

Therefore, a chatbot/robot that could learn would change as a result of its experiences and interactions with the world, as it updates its algorithms and improves its performance.

So if the chatbot/robot could learn and ground its symbols in the world outside the kitchen, when would its Turing Test end? How long and much must it do to pass?

The Turing Test, as proposed by Alan Turing, is a measure of a machine’s ability to exhibit intelligent behavior that is indistinguishable from that of a human. The test involves a human judge who engages in a natural language conversation with both a human and a machine, and decides which of the two they believe to be the human. If the judge is unable to reliably distinguish between the machine and the human, then the machine is said to have passed the Turing Test.

In terms of how long and much the chatbot/robot must do to pass the Turing Test, there is no specific time frame or set of criteria that must be met. The Turing Test is subjective and depends on the judgment of the human judge. Some researchers and practitioners in the field of artificial intelligence might consider a chatbot/robot to have passed the Turing Test if it is able to consistently fool human judges in a substantial number of trials. Others might have different criteria for determining whether a chatbot/robot has passed the Turing Test.

Therefore, it is not possible to determine a specific time frame or set of criteria for when the Turing Test would end for a chatbot/robot. The test is ongoing, and the chatbot/robot would continue to be evaluated against human behavior for as long as it is in use. Whether it would pass the Turing Test would depend on the subjective judgment of human judges, and could change over time as the chatbot/robot continues to learn and interact with the world.

If there is no time-limit on the Turing Test (henceforth “TT”) for the learning chatbot/robot out in the world, why would there be a limit on the number of judges? And if any and every judge, regardless of how long the test or the number of judges, is unable to distinguish the performance of the chatbot/robot from that of any other human being, in what sense is the judgment “subjective”, and in what sense are the judges being “fooled”?

The number of judges used in the Turing Test is not a limiting factor in the sense that there is a specific number that must be reached in order for the chatbot/robot to pass the test. The use of multiple judges is typically seen as a way to increase the reliability of the test results by taking into account the subjective judgment of multiple individuals. If a chatbot/robot consistently fools a large number of human judges, it becomes more likely that the chatbot/robot is exhibiting human-like intelligence.

However, the judgment of the human judges is still subjective in the sense that each judge brings their own biases, opinions, and interpretations to the conversation with the chatbot/robot. They may also have different criteria for determining what constitutes human-like behavior. As a result, even if a chatbot/robot consistently fools a large number of human judges, there may still be some degree of disagreement about whether it has truly passed the Turing Test.

Regarding the concept of “fooling” the judges, if a chatbot/robot is able to consistently mimic human behavior in a way that is indistinguishable from that of a real human, then the human judges are being “fooled” in the sense that they are unable to reliably distinguish the machine’s behavior from that of a human. However, it is important to note that the chatbot/robot is not intentionally trying to deceive the human judges, but is instead simply exhibiting the intelligence that it has been programmed or learned to exhibit.

If no one can ever tell any difference between what the grounded chatbot/robot in the world can learn and do and what a human can learn and do, in what sense is that a subjective judgment and in what sense is everyone being fooled?

The judgment of whether a chatbot/robot has truly passed the Turing Test is subjective in the sense that it is based on the opinions and perceptions of the human judges. Each judge may have different criteria for what constitutes human-like intelligence, and their interpretations of the chatbot/robot’s behavior may be influenced by their individual biases and experiences.

If a chatbot/robot is able to consistently mimic human behavior in a way that is indistinguishable from a real human, it is said to have “fooled” the human judges in the sense that the judges are unable to reliably determine that they are interacting with a machine. However, this “fooling” is not a result of intentional deception on the part of the chatbot/robot, but rather a reflection of its advanced abilities and programming.

In cases where multiple human judges are unable to distinguish the chatbot/robot’s behavior from that of a human, the conclusion that the chatbot/robot has passed the Turing Test becomes more reliable. However, even in these cases, the judgment remains subjective as there may still be individual differences in opinions and criteria for what constitutes human-like intelligence.

I said that no one could tell the difference between the performance capacities of the learning, grounded chatbot/robot in the world and you said there was no time limit on the TT, so in what way are these judgments “subjective”? I understand that the chatbot/robot is not trying to fool anyone. So in what sense are they being “fooled”?

If a chatbot/robot is able to consistently mimic human behavior in a way that is indistinguishable from a real human, it is said to have “fooled” the human judges in the sense that the judges are unable to reliably determine that they are interacting with a machine, even though the chatbot/robot is not attempting to deceive them.

However, even if no one can tell the difference between the performance capacities of the chatbot/robot and a human, the judgment of whether the chatbot/robot has truly passed the Turing Test is still subjective because it depends on the criteria and standards used by the human judges to determine whether a machine’s behavior is human-like or not.

For example, some judges may consider a chatbot/robot’s ability to perform certain tasks as evidence of human-like intelligence, while others may place more importance on its ability to understand and respond to complex emotions and social cues. There may also be individual differences in the judgments made by each judge, based on their individual experiences and biases.

So even if no one is able to distinguish the performance capacities of the chatbot/robot from a human, the judgment of whether the chatbot/robot has truly passed the Turing Test remains subjective as it is based on the perceptions and opinions of the human judges.

If no one can tell the difference between X and Y, how is that subjective, and how are they being fooled?

The judgment of whether someone can tell the difference between two things is a subjective matter as it depends on the individual’s perception and evaluation. The idea of being “fooled” implies deception or intentional misinformation, which may not necessarily be the case in this scenario. If a person cannot distinguish between two entities, it may simply be a result of their limited ability to observe or process information, and not because of any intention to deceive.

I agree that Turing-Testing depends on perception — the perceptual capacity of humans. Maybe a metal-detector could distinguish robots from people. But that’s no longer just a matter of performance capacity.

Yes, that’s correct. The Turing Test as originally proposed by Turing is based on human perception and evaluation, and the capacity of a machine to imitate human-like behavior in text-based communication. However, other methods and techniques beyond just human perception can be used to distinguish between entities, such as in your example of a metal detector. These additional methods may be necessary to supplement the Turing Test in certain cases.

Is Turing-Testing a learning, grounded chatbot/robot still a matter of testing “text-based communication”? Is the TT meant to test for the presence of deception or the presence of human cognitive capacities? And if the TT has no limit on the number of tests and testers, and the chatbot/robot never fails, what is the “deception”?

The Turing Test is primarily a test of text-based communication, which is meant to evaluate a machine’s ability to exhibit human-like behaviors and mimic human intelligence in natural language conversations. The idea behind the test is to see if a human judge can distinguish between a human and a machine based on their responses to questions and prompts in a text-based communication. The test is designed to check for the presence of human-like intelligence and not necessarily the presence of deception. If a machine consistently passes the Turing Test, it could be considered a success in terms of its ability to exhibit human-like intelligence, but it does not necessarily mean that it is not deceiving. The concept of deception in the Turing Test is a subjective matter and can depend on how one interprets the idea of a machine being able to mimic human behavior and thought processes.

Is the TT a test of mimicry or a test of capacity? Is equivalent and indistinguishable capacity mimicry?

The Turing Test is primarily a test of capacity, but it can be seen as a test of mimicry as well. Equivalent capacity means that the chatbot/robot and the human being tested have the same ability to respond to stimuli and perform tasks, which can result in indistinguishable responses. In this sense, the TT can be seen as a test of mimicry. However, the main objective of the TT is to assess a machine’s capacity to demonstrate human-like intelligence, so it is primarily a test of capacity.

But if the capacity of the chatbot/robot (let’s call it the “candidate” from now on) is indistinguishable from that of any human, and the designers know how they produced this capacity, in what sense is it mimicry rather than capacity?

The Turing Test is often described as a test of mimicry, where the candidate chatbot/robot is expected to imitate human-like responses in text-based communication. However, the goal of the test is to determine if the candidate has the capacity to exhibit human-like intelligence in its responses. If the candidate is able to produce indistinguishable responses from a human and the designers understand how they have achieved this capacity, it could be argued that the candidate’s performance is not necessarily mimicry, but rather a demonstration of capacity. Whether the candidate’s performance is considered mimicry or capacity would depend on the interpretation and understanding of those who observe and evaluate it.

But in the example as we discussed it here, the only judgment being made is that no one can ever tell apart the candidate’s performance from any other human’s performance. So I ask again: where is the subjectivity or the mimicry or the “fooling” in any of that?

In the case of the Turing Test, the judgment of whether the candidate’s performance is indistinguishable from that of a human is subjective in the sense that it is based on the subjective perceptions and opinions of the judges. However, this does not mean that the candidate is necessarily “mimicking” human behavior or trying to “fool” the judges. The goal of the Turing Test is to determine if the candidate has reached a certain level of human-like intelligence, not necessarily to determine if it is human or not. The subjective nature of the test lies in the fact that the judges are evaluating the candidate based on their own experiences, biases, and expectations, which can vary from person to person. So, the judgment of whether the candidate has passed the test is subjective to some extent, but the performance of the candidate itself is not necessarily mimicry or deception.

Everyone has “experiences, biases, and expectations,” but the premise here is that no human can tell apart the candidate’s performance from the performance of any other human, ever. (And remember that the candidate is a robot, not just a chatbot, and that the performance is anything humans can do in the world.) So I ask again, isn’t this grounded robotic TT a test of capacity rather than an exercise in mimicry, constrained by the “subjectivity” of the judges (i.e., everyone)? If no human can distinguish the taste of vegan cheese from dairy cheese, does that not teach us something about the nature of the taste (to a human) of cheese, rather than just testing whether a human is being fooled by mimicry? Is the TT not trying to reverse-engineer human cognitive capacity (“intelligence”) rather than just trying to fool people?

The Turing Test is primarily meant to test a machine’s ability to demonstrate human-like intelligence through text-based communication. The subjective nature of the test lies in the judgment of the human evaluators, who use their own experiences, biases, and expectations to determine whether the machine’s responses are indistinguishable from those of a human. The test is not meant to test for a machine’s capacity to perform tasks in the world like a human, but rather its ability to mimic human-like conversation. Whether the machine’s responses can be considered mimicry or a demonstration of capacity depends on the underlying design and technology that produced the responses. However, the ultimate goal of the Turing Test is to assess a machine’s intelligence and ability to engage in human-like conversation, not necessarily to determine if the human evaluators are being “fooled” or not.

What is and is not a “machine”?

A “machine” is a physical device designed to perform one or more tasks, usually automated, through the use of mechanics and electronics. This can include traditional mechanical devices, such as a car engine, as well as computerized systems, such as a smartphone. So, anything that performs work using physical components, whether mechanical, electrical, or computational, can be considered a machine.

So are biological organisms machines?

Biological organisms are not traditionally considered machines, as the term is typically used to refer to artificial systems designed and constructed by humans for specific purposes. However, there are some theories and philosophical perspectives that view biological organisms as being akin to machines, such as reductionist biological theories that view living systems as being essentially complex systems of interacting physical and chemical processes. But in general, the definition of “machine” is highly context-dependent and can vary depending on the field and the purpose for which it is being used.