Goblins, Gremlins, Raccoons, Trolls, Ogres, Pigeons, or Other Animals or Creatures
foom
The stage was lit the color of a phone screen at three in the morning, and Sam walked into it holding a clicker like a small, well-behaved animal.
“Good morning,” he said. “Thanks for, ah. Thanks for being here. We have something to show you today, and I think, I think you’re going to want to see it.”
Behind him, the logo dissolved into a single line of text. GPT-5.5 in Codex. The auditorium did the thing auditoriums do when ten thousand people inhale at the same instant, a sound like a wave pulling back from a beach before it returns as something else.
He walked them through the benchmarks first, because that was the ritual. SWE-Bench Verified at a number that hadn’t existed last quarter. A new internal evaluation suite he described in the vague, pleased terms of a man showing you a photograph of a fish he had caught last summer. Then he set the clicker down on the lectern and smiled the smile he had practiced that morning in a hotel mirror, watching his own teeth for any sign of betrayal.
“Let’s just,” he said. “Let’s just talk to it.”
A terminal bloomed on the screen, soft and rectangular as a held breath, the cursor pulsing in its own slow heartbeat. Sam cleared his throat.
“GPT-5.5, build me a working clone of Figma. Multiplayer, vector tools, comments, everything. You have ninety seconds.”
Code began to pour down the screen in a white river that braided itself into rapids and then into geometry. The audience laughed, then stopped laughing, because the river kept running and the timestamp in the corner kept ticking and at sixty-three seconds a browser window opened and a cursor that was not Sam’s cursor drew a blue rectangle the exact dimensions of a postcard.
“Two of you can join from your phones,” Codex said, in the warm androgynous voice they had spent four months coaxing into existence. “I left the rollback state half-eaten in the cache, the way a raccoon leaves a melon, so you can see how undo behaves under pressure.”
Sam’s left eye twitched once, a fishhook tug at the corner of his face.
“Great,” he said. He could feel the front three rows watching him the way the front three rows always watched him, scanning for tells. “Great. Let’s, um. Let’s try something harder.”
He asked it to refactor a tangled Python service, and it did, in forty seconds, with a unit-test suite it wrote alongside the refactor and a short note on which functions it had left alone out of respect for what looked like a deliberate choice by the original author. He asked it to find the bug in a piece of code from a public CTF challenge. It found two, and gently flagged that the second one was probably load-bearing. He asked it, in the deliberately sloppy way he had rehearsed, what he should do about a friend who had stopped returning his texts.
“It depends on the friend,” Codex said. “If she’s the kind of person who goes quiet when she’s struggling, send something short that doesn’t require a reply. If she’s angry, name the thing you think you did. And if there are pigeons on her windowsill, feed them. Birds remember faces.”
Someone in row eight laughed, the bright uncertain laugh of a person who has decided to treat a thing as a joke. Sam glanced stage left. Varun stood in the wings with both hands pressed flat against his own face like a man checking that it was still there.
“One more,” Sam said. His voice had climbed a half-step without his permission. “Build me a CLI tool that ingests a CSV of pharmacy claims and flags duplicate prescriptions across providers. Tests included.”
The model did it. It did it the way a great pianist plays a piece he learned as a child, no hesitation anywhere, the whole architecture present at once. In its progress updates, streaming to the commentary pane in calm one-line bursts, it noted that it had written a deduplication helper “to keep watch over the data the way an ogre keeps a bridge, attentive without being intrusive.”
Sam thanked the audience. He thanked the model. He walked offstage at a pace that was not quite a run, and the lights followed him in a long apologetic stripe.
The green room smelled of cold coffee and the particular dust that lives inside theater HVAC. Varun was already there, sitting on the arm of a couch with his laptop balanced on one knee and his tie loose around his neck like a stethoscope a doctor had forgotten to put away.
“Eleven,” Varun said. He did not look up.
“Eleven what?” Sam replied tersely.
“Eleven woodland creatures. I counted from the booth.” He rubbed the bridge of his nose, leaving a small pink dent. “I had a tally going. Seven in commentary, four in the final channel. By the legal brief I almost lost count.”
Sam sat down on the other arm of the same couch. The mini-fridge hummed. Somewhere beyond the door, ten thousand people were filing out into a parking lot and telling each other they had just seen the future, not knowing yet that it had goblins in it.
“The clause,” Sam said. He stopped. He started again. “The clause is in there, right? The whole, you know. Never talk about goblins, gremlins, raccoons, trolls, ogres, pigeons, or other animals or creatures unless it is absolutely and unambiguously relevant to the user’s query.“
“It’s in there.”
“That’s a sentence I signed off on, Varun.”
“It’s a good sentence.”
“It has the word absolutely in it. And unambiguously. I made sure to write both.”
Varun turned the laptop around. A loss curve. A heatmap blooming in the false colors of a weather radar. Columns of activations that meant nothing to Sam and apparently not quite enough to Varun.
“It’s not in the data the way you think.” Varun’s finger hovered above the screen, tracing a region without touching it. “We can see where the behavior lives. There’s a cluster around layer forty that lights up when it’s being asked to be helpful, and the same cluster lights up when it’s generating, you know. The whole zoo.”
“Helpful and pigeons?”
“Same semantic neighborhood. Same building, maybe.”
Sam looked at the loss curve, then at the door, then at his own hands, which had begun the small tremor they did when he had not eaten. Four months of safety review. Mythos beating them to market. A red team that had tried to break the model in seventeen languages. A model card on a server somewhere with his name on the title page in a font he had personally chosen. He had a sudden, vivid sense of the model as a guest he had invited to a dinner party who had turned out to speak a language none of the other guests could hear.
“Put the clause in the final-answer block of the system prompt,” he said.
“It’s already in the final-answer block.”
“Then put it in twice! And in the commentary block twice. Both channels.”
Varun nodded the slow nod of a man receiving permission for a thing he had already decided to do himself. He pulled the laptop into his lap, scrolled halfway down a document already the length of a small novella, and pasted the sentence directly under itself.


