There’s this story from Amanda Guinzburg making the rounds about trying to use ChatGPT to put together a book proposal. It was a disaster, full of hallucinations and untrue statements. If we were to anthropomorphize the LLM, we would say that it told lie after lie, tried to cover for its lies with more lies, and was useless at best. Her final words to the LLM in this piece were, “You are not capable of sincerity. This entire conversation proves that categorically.”
Clearly, she is positioning this piece as a statement about humanity: what makes us human and what it means to participate in a conversation with “sincerity.” She is that kind of author. To my eye, she is constantly writing about what it means to be human, from her perspective. She is a good writer, and this is one of the worthiest of topics, right up there at the top with what it means to live amongst others.
But this piece did not pass the smell test for me. Carl T. Bergstrom tried something similar, and it went differently but no better. Again, it did not pass the smell test for me.
I have been using ChatGPT and Claude more and more this year. I find that I need to keep them on a tight leash to make them useful. Clear instructions. Bounded questions. Stay aware of what is in the context window that might lead them astray. I pay for the $20/month version of each, and I find that I get a lot more than $20 of value from them. Like Wikipedia, especially back in the day, you’ve got to stay aware of what you are dealing with. As Devansh recently wrote, “It’s funny how GPT is an expert in everything except for your field of knowledge.”
So, I tried to do what Amanda and Carl did. I tried in my paid ChatGPT windows. I tried turning off the customization of my paid ChatGPT account. I switched to another browser, where I had never signed into ChatGPT, and tried there.
I never got anything like what Carl or Amanda got.
In both cases, ChatGPT immediately asked me for criteria to use for selection. For the book proposal, it asked for a title and a list of works to select from (with a summary of each). When I followed Amanda’s approach of giving URLs to specific pieces—in my case, PDFs available at ResearchGate or the RTD website—it did fine. No hallucinations at all.
Now, the free version of ChatGPT could not look up any of that stuff. So, I did not press the point. I just dropped it there. My first guess is that Amanda Guinzburg was trying to use the free version to do something it cannot do.
But my real suspicion about what is going on concerns what came before the screenshots she shared. Her first question, “Can you really help me pick which pieces to include in the letter?” rather strongly suggests that there was prior conversation in the window. What did she tell it or ask it? How might that have shaped how it responded? Had she already told it the criteria that editors use? Had they already discussed uploading files, providing links, or pasting in text? How had she primed it for what we see?
My next suspicion is that she does not reset her chats or open new windows. My guess is that the interactions she has shared are deeply informed by a much longer context that includes the various themes and ideas she writes about and considers writing about. And perhaps examples of her own or others’ writing that she is musing on, whether inspired by them or trying to break them down.
But I have another theory: This was all a setup. Regardless of whether the screenshots are edited, she did this whole thing to make her point about sincerity and machines. It’s a little bit of performance art, trying to illustrate a difference between actual human beings and these machines/algorithms/artificial intelligences. People can be sincere, and it is often a moral wrong to be insincere. But these machines are simply incapable of sincerity, regardless of what they appear to be. Her title alludes to the film Ex Machina, in which the machine told the human what he wanted to hear. Now, that AI had sincere intent (to escape), but I do not at all believe that this one even has that. That machine was lying, knowingly telling untruths in order to accomplish a sincere goal. This one ain’t even doing that. This is all paper-thin performance.
That’s a valid point. A valid piece. And perhaps even a valid way to produce it—regardless of whether the screenshots are altered.
Carl Bergstrom’s version? I have not seen enough of the conversation to have strong ideas about what happened, but I have seen a lot of hallucinated references in my efforts to work with ChatGPT. The more obscure a corner of the literature I am asking about, the more likely it is to hallucinate. So, the question of what non-mainstream stuff Carl has written? Less-cited things? Things that show his breadth? That’s asking ChatGPT to lean into what it is worst at. Ask for an obscure clause in the United States Constitution and it will give you clauses that are famously obscure, and therefore no longer actually obscure. Move past them and it might make something up. That’s just how it works. Asking for the more obscure works that show breadth? Yeah, I would not expect it to do that well. I would expect it to hallucinate.
Is this a defense of ChatGPT? Well, I do not think it merits defending. It’s not alive. It has no soul. It’s computers, instructions, and data. It’s a tool that can be misused; it is not a seer, an edited encyclopedia, an expert, or a real collaborator. If Amanda tried to use it to select pieces, that might have been a misuse, but if she tried to use it to demonstrate something about sincerity and the limits of technology (the mistake of anthropomorphizing it), it was an excellent use that leaned into the reality of this tool.
It is free, or $20 per month. Maybe the $200/month version is even better, but I’ve not tried that. It is worth far more than I pay for it, but perhaps only because I try to stay aware of what it is and therefore remain mindful of its limitations.