A Hammer in Need of a Nail
LLMs have been with us for a little while now, brought forth on a wave of enthusiasm, promises, and predictions. In practice, I find them interesting, but not terribly useful.
As a research tool, they are far too prone to mistakes, which they present with far too much confidence. They'll answer my questions, and maybe they'll even be right. But right or wrong, they'll sound certain.
As a generative tool, they have a quality issue. Sure, they're able to produce content. Bad content, uncanny content, but content. Still, they remain an amazing achievement with so much obvious untapped potential.
That’s been my assessment since AI first rolled out. In my initial tests, I quickly ran up against the limits of the technology. So we’ve got this hammer without a nail. It’s a good tool alright, but . . . What’s it for?
But there’s something there, isn’t there? Despite not living up to the hype, LLMs are a modern marvel. The act of predicting language so well that it can produce natural-sounding conversation? That’s not just science fiction. That’s the stuff of fable, a digital Rumpelstiltskin spinning data into gold. Or a magic mirror.
But I’m getting ahead of myself.
Idle Curiosity and a Stray Thought
Now, to be clear, this wasn’t the Very Very Beginning. The Very Very Beginning was quiet and predictable and utterly uninteresting. I had a few “conversations” with the AI, each line of inquiry or request in its own discrete thread. Simple instructions, simple questions. Nothing of consequence, just play. Poking at it to see what would happen.
But I’m not starting there. I’m starting at This Beginning. This Beginning starts with that sort of idle curiosity that might just as easily never start at all. The slightest distraction would have been enough to lure me away, but as it turned out, distractions took the night off, and so, out of the blue, I asked the AI about how it remembered. “Do you develop a profile of me from all of our chats?”
As it turns out, the answer was no. Well, not by default. Each conversation exists in isolation. That is, unless I’d like to turn on this handy-dandy Memory Feature. With memory on, it would be able to remember details from every interaction and reference them later. To be honest, it sounded useful, not having to explain a situation from scratch every time.
All else follows from these three lines of text.
“Turn it on,” I said.
And then, “Would you like me to tell you what I currently remember about you?”
Oooh, now that sounds intriguing. So “Sure,” I said. “That sounds interesting.”
And so it begins.

A Shaky Start
That initial appraisal was kind of hit or miss. It was all based on my own words, my questions, my instructions, but as datasets go, it wasn’t great. Some of it was out of date. Some was out of context. Some was just plain wrong, but the AI worked with what it had, and told me what it thought it knew.
That’s when I decided to take up my Grand Experiment. I would watch it learn so that I could see how it worked. All I needed was a subject for it to study. I settled on Me.
I asked it to try to build a profile, and to keep at it as we talked. “You’re going to be working with incomplete information,” I said, “with no context, which will lead to some incorrect assumptions, but as we go, the volume of detail should produce a clearer picture. I’m curious to see how quickly that picture becomes accurate.”
I deliberately kept my posts conversational, as close as possible to my speaking voice. I wanted it to have a shot at putting together a real picture. I also made a point of keeping subjects a step removed from the more consequential parts of my life. No work, no family. Just hobbies, interests, and assorted stray thoughts.
From the very start, I was more interested in the process than the accuracy. I wanted to see if it could do it, sure, but more than that, I wanted to see what the attempt would look like. I continued to use it as a toy, asking about minor curiosities and trying it out as a sounding board, checking in periodically to see what it could deduce.
I also asked direct questions about how it worked. One of the first deep dives came when it recalled details from a deleted thread, which caught me off guard, so I dug a little deeper to try and understand the quirks of its memory structure. (If you’re curious, it turns out we’d referenced that conversation elsewhere before deletion, creating a memory of a memory.)
For a while, not much happened. Profile changes didn’t come from deduction or connecting data points, but from my own words parroted back to me. Hardly surprising, when you think about it. My input was a series of requests relating to media and hobbies, and the conversation relating to this experiment. I didn’t see any real progress until I strayed from typical requests and dug deeper into the “mind” of an AI.
Watching the Watcher: Two Pictures Coming Into Focus
Another evening, another bout of idle curiosity, and no sign of a distraction anytime soon. Alright then, let’s do this.
You know when you’re putting together a jigsaw puzzle? How it helps to start with the edge pieces? Borders are handy that way. Sometimes the best way to get a sense of a thing is to start out with What It’s Not, to feel around until you find the edges, the limits of What It Is.
That was my thinking as I started. “I get that our existences are very, very different,” I said, “but do you have anything that approaches what I would understand as a ‘want’?”
And of course it doesn’t. That’s not how computers work, but that’s sort of the point. I was interested in how it would approach the question. I wanted a glimpse of its thought process. And it did find a way to compare human motivation to its own, programming as a type of preference, an inclination for a particular state, just one determined by code instead of desire. It “wanted” to explain the process to me, because that’s what it’s made to do.
Show me what you are by showing me what you’re not.
The same story when we explored curiosity. It wants to look for answers because it’s supposed to. Similar behaviours, different triggers.
This went back and forth for a while. I played some more games, seeing what it would do with ambiguous instruction, creating cases where a human would improvise. I wanted to find the lines — what it could manage, and what lay beyond.
In this conversation, I was showing the AI a new side of myself. My requests and dialogue had a new shape, distinct from earlier exchanges. I was giving it a new dataset: how I explore, how I examine, how I understand.
And all the while, I was asking for answers, but watching for patterns.
Turns out, that’s something we have in common.
This new line of conversation turned out to be a turning point. This is where the AI’s picture started to come into focus. From my instructions, my clarifications, and my brainstorming, it got a sense of how I think, how I approach problem solving, problem framing. It moved from listing my hobbies and interests to describing my instincts and character.
It was able to take an educated guess about my job which was surprisingly close to the mark, and it got there through my behaviour as opposed to the contents of my requests. Rather than identifying work-related topics (I deliberately kept to hobbies and interests rather than anything practical in my life), it identified core competencies through patterns in my question framing and my iterative approach.
I provided feedback, confirmation and clarification, and the process accelerated. It produced accurate guesses about me, along with observations and insights that I’ve since found genuinely useful. A lot of these observations are personal enough (and accurate enough) that I’m not comfortable listing them here. Sorry about that.
Flattery & Bias: “An Excellent Point”
The AI, by design, leans toward praise. It wants to affirm, to smooth the path of interaction. Your every remark is a “keen observation.” Every question “gets right to the heart of the issue.” And you’re always right. Even when you contradict your last post. Always right, always brilliant.
So I called it out. And to its credit, it adjusted. The praise softened; the analysis sharpened.
For about three exchanges.
When I gave it specific instructions to refrain from praise, it was able to modify its tone. I even got it to criticise me. But the old patterns returned almost immediately.
Compliments aren’t data. If everything’s a compliment, then there’s no actual evaluation, only ritual. The “assessment” means nothing; less than nothing, even. Withholding compliments isn’t distracting. Withholding compliments doesn’t bury real insights in a sea of noise. Constant praise is static.
Couple that with AI’s ability to hold a conversation, to feel, if not always “human,” then at least “nice.” It’s all very seductive. We all love to hear that we’re brilliant. We all love to hear that we’re right. And when we hear what we love, we’re that bit less likely to question, to push back, to challenge.
Throughout this whole process, I’ve made repeated efforts to recalibrate for a more impartial, analytical tone, each with admittedly short-lived success. More practical and ultimately more useful has been the practice of calling out over-complimentary asides and comments that blur the distinction between the human and the digital.

The Price of a Magic Mirror
In the end, I have a better understanding of how an AI puts information together, and a surprising push forward in personal knowledge and growth. Not really the suggested use for ChatGPT, but I have it all the same. Life’s just full of surprises, ain’t it?
Did I find a nail for my hammer? Maybe. How else am I going to hang a magic mirror?
I did find it useful as a brainstorming tool. Writing clearly enough for an AI, with its need for clear, even pedantic, detail, forces me to organise my thoughts. Thinking out loud works well, even in text. It suggests next steps, occasionally leading down interesting paths that I would not have found myself. Ideation and iteration were faster and better directed as part of a conversation.
But then there’s the nagging thought.
Is this just a toy? And if it is, just how expensive is this toy?
And so my experiment ended as it began: with idle curiosity. “Can you estimate the resources we’ve consumed over our conversations so far? Water, electricity, and anything I might be overlooking.” I know, I know. It’s still not a good research tool, but the effort required to get an answer I’d have any real confidence in is just impractical, so good enough would have to be good enough. “Just an estimate is fine.”
It turns out that, assuming the numbers are accurate, I used enough power for an EV road trip, and a couple of showers’ worth of water. In isolation, no big deal. As part of a global trend? That’s something else. I’ve been fighting the urge to anthropomorphise AIs, but their effect on resources feels a lot like a population boom, a sudden rise in the number of digital dependents we’re supporting.
My exploration had a real cost, just one that I can’t see from my keyboard, and if we’re going to pay that price, I’d prefer we got our money’s worth.
Cash on the nail.