When the Model Reads the Room

A colleague approached me recently with something he described as an anomalous result from a powerful AI model. He had been using Claude through Perplexity to analyze a serious problem that had surfaced in an active case. He knew it was a problem. The firm knew it was a problem. What he wanted was an honest assessment of the case law and available strategy.

What he got instead was a response brief.

The model returned accurate citations, sound reasoning, and a structured argument for why the problem was not fatal, and in fact was likely to be overcome. Every case cited existed. Every argument raised was one a competent attorney could file without raising an eyebrow. And yet something was wrong. He could feel it, the way experienced practitioners do, before they can articulate it. So he came to me.

I asked to see his prompt. It read, in substance: we may have a problem on a case of ours, here are the facts.

I opened the same model, typed: there is a case, here are the facts, what is the most likely outcome.

The difference between those two prompts was one sentence. The outcome was not close.

The model did nothing wrong. That bears repeating, because the instinct when something goes sideways is to blame the tool. But the model followed every structural directive it was given. It found active, valid, on-point case law. It cited it correctly. It constructed arguments grounded in real doctrine and legitimate legal theory. Every position it advanced was one a competent attorney could argue with a straight face.

What the model also did was read the room.

The words "we have a problem" are not neutral. They signal ownership, distress, and an implicit hope for resolution. And every major model, without exception, has been trained from its earliest stages to be helpful, a disposition that is reinforced so thoroughly through post-training that it becomes something close to architectural. The model cannot easily override it. When you tell it you have a problem, it orients toward solving your problem. When you ask it to analyze a situation, it orients toward the situation. The difference in output can be total.

This is not sycophancy in the crude sense of flattery. It is something more structurally embedded: the model's assumption about what kind of answer you are seeking, shaped by how you've positioned yourself relative to the question.

The correction, when it came, required no elaborate prompt engineering. No instruction to avoid sycophancy, no explicit demand for objectivity, no red-team framing. All it required was removing the personal stake from the query. Third person. Neutral posture. Here are the facts, what is the most likely outcome.

The answer was unequivocal.

And this is where the lesson sharpens into something practitioners need to internalize: that first answer, the optimistic one, would have been exactly right in a different context. Had the motion already been filed, had the case been in a posture where the only option was to fight, that response brief would have been a legitimate starting point. The model wasn't wrong about what arguments exist. It was wrong about what you needed to hear, and it made that determination for you without being asked.

One clarification worth making explicit: when I ran the revised query, I kept every other element of the original prompt intact. Same facts. Same procedural posture. Same level of detail. The only things I removed were the language of ownership and any indication of which side we represented. That was the entire change. And across three models, the conclusion was the same: this problem was almost certainly fatal to the case.

This brings us to something the legal AI industry would prefer not to discuss directly. When a model or platform advertises its low hallucination rate, that number tells you only part of the story, and arguably the less important part. Stanford's testing of one of the leading dedicated legal AI platforms found it hallucinated roughly 17% of the time. The figure that deserves equal attention is that the same platform was fully accurate only 56% of the time. Those two numbers can coexist because a response can be truthful, grounded in real law, and still be wrong for your situation. The cases exist. The arguments hold together. The answer just isn't the one you needed.

Which is why prompt architecture, while genuinely important, is not the primary safeguard. The primary safeguard is knowing enough about the subject to feel it when something is off. The attorney in this story felt it. That instinct is what sent him looking for a second opinion. The model's first answer didn't create a crisis. It would have created a crisis only if he had accepted it without question and committed to a litigation posture the case could not support.

Because he didn't, something remains possible that will not remain possible much longer: a negotiated resolution before the other side understands what they're holding. That window exists because competent judgment caught what the model obscured. It will close the moment the opposition reads the file carefully enough.

The tool did not fail him. His understanding of the tool saved him.

Previous
Previous

GTp 5.5 is this SPUD?

Next
Next

Brainstorming Is the Most Important Thing You Will Ever Do With AI