Microsoft researcher builds goat-powered Age of Empires neural net to challenge LLM human assumptions

Adrian de Wynter uses goats, grass, and gates in AoE 2 to argue we anthropomorphize LLM outputs too easily.

By Yousef Al-Zahrani, Technology Correspondent, The Executives Brief

about 3 hours ago·4 min read

Executive summary

Microsoft AI researcher Adrian de Wynter built a simple neural network inside Age of Empires II using the game’s scenario editor, with goats and in-game objects standing in for binary values. The work culminates in a paper arguing we should stop assuming LLMs behave like humans just because they were trained on natural language.

A Microsoft AI researcher didn’t just critique how people talk about large language models. He built a neural network inside Age of Empires II, with goats doing the job of bits. Adrian de Wynter created it in the strategy game’s scenario editor and paired it with a paper titled "If LLMs Have Human-Like Attributes, Then So Does Age of Empires II." The point is to show how easily we mistake human-like behavior for human-like understanding, and how that mistake can quietly creep into research.

De Wynter’s contraption is not metaphorical. He constructed a functioning NAND (NOT-AND) gate and a 1-bit perceptron, a simple form of neural network, using objects in the Age of Empires II world to represent binary values. Grass represents 0, bridges represent 1, and goats play the role of bits. Videos of the goat-powered network are on his GitHub page, and to a casual observer the process looks baffling, which de Wynter says demonstrates his argument.
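For readers who want to see the logic without the livestock, here is a minimal Python sketch of a single perceptron that computes NAND. It is not de Wynter’s in-game build: the weights, bias, and function names are assumptions chosen for illustration, and any values that produce the same truth table would work equally well.

```python
def perceptron(inputs, weights, bias):
    """Return 1 if the weighted sum of inputs plus bias is positive, else 0."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if total > 0 else 0


# Hypothetical weights and bias (an assumption, not taken from the paper)
# that make the perceptron behave as a NAND gate.
NAND_WEIGHTS = (-2, -2)
NAND_BIAS = 3


def nand(a, b):
    """NAND of two bits, computed by the perceptron above."""
    return perceptron((a, b), NAND_WEIGHTS, NAND_BIAS)


def truth_table():
    """Map every pair of input bits to the perceptron's output."""
    return {(a, b): nand(a, b) for a in (0, 1) for b in (0, 1)}
```

Calling `truth_table()` gives an output of 1 for every input pair except (1, 1), which is the defining behavior of NAND.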

If the headline sounds like philosophy cosplay, it’s actually aimed at a very practical sore spot: the way LLMs can generate natural language that reads like human thought. Because models like ChatGPT or Claude can produce responses with a human-like tone, they attract human-like interpretations. That has fueled debate over whether LLMs might be sentient. The original reporting is clear that, at present, there are far more reasons to conclude AIs are not and will never be conscious. Still, the “maybe it’s thinking” narrative persists, partly because people are predisposed to see human qualities in non-human things, and partly because AI companies have equivocated on the issue.

De Wynter is trying to break that spell by swapping the “natural language” ingredient for something that is clearly not human. He says the processes going on inside the game are fundamentally similar to the computations that power systems like ChatGPT. But by making the underlying mechanics goats, grass, and bridges rather than natural language, he prevents observers from perceiving the resulting behavior and output as human. In plain terms, he is isolating the illusion: the same kinds of relationships between weights and operations can exist without the surface cues that trigger anthropomorphism.
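The substitution he describes can be sketched in a few lines of Python. This is a hedged illustration, not code from the paper: the grass-to-0 and bridge-to-1 mapping follows the description above, while the function and dictionary names are hypothetical. The same NAND operation runs underneath, and only the labels on the inputs and outputs change.

```python
# Surface encoding from the article: grass stands for 0, bridge for 1.
ENCODE = {"grass": 0, "bridge": 1}
DECODE = {0: "grass", 1: "bridge"}


def nand_bits(a, b):
    """NAND on plain bits: 0 only when both inputs are 1."""
    return 0 if (a == 1 and b == 1) else 1


def nand_tiles(left, right):
    """The same NAND, with inputs and output expressed as in-game tiles."""
    return DECODE[nand_bits(ENCODE[left], ENCODE[right])]
```

Here `nand_tiles("bridge", "bridge")` returns `"grass"`, and every other pair of tiles returns `"bridge"`. The computation is identical in both functions, which is the distinction the paper draws between what a system is and how its output is perceived.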

That’s why his paper is framed as a formal demonstration, not vibes. He told 404 Media that he has “this tendency to dial up things to 11 when I really think I need to make a point,” and that absurdism is standard in philosophy and theoretical computer science. In his explanation, de Wynter distinguishes between what makes LLMs what they are and what makes people perceive them as human. He summarizes the core logic as: there are “things which make the LLMs what they are in themselves,” meaning the relationship between weights defined by some operation, and there are “things which make them what they are perceived as.” Replace the surface-level cues, and the human read starts to look less inevitable.

Now for the part that should interest executives and board members, even if you never plan to build a perceptron with livestock. De Wynter argues that assuming LLMs have human-like properties without demonstrative proof could cause problems in scientific research. He goes further in the paper, saying he has peer-reviewed more than 300 computer science papers in the last two years and found that over half began with the assumption that LLMs have human-like traits. His stated proposal is blunt: stop assuming LLMs behave like humans just because they were trained on natural language, and instead run experiments that let us see LLMs as they are, not as we believe they should be.

So where does this land in the real world of product, governance, and oversight? It lands in how organizations design evaluation, documentation, and risk framing. When stakeholders treat model behavior as if it implies human cognition, the organization can drift into overclaiming, over-trusting, or building policies that map to intuition rather than measurable capabilities. Even when no one says “the model is conscious,” decision-making can still be distorted by anthropomorphic interpretation. That’s the second-order risk: not an existential fear, but a governance problem where the wrong mental model shapes experiments, reporting, and accountability.

This matters across the industry because adoption is moving fast and scrutiny is only increasing. These systems are increasingly embedded in research workflows, customer support, internal ops, and more. De Wynter’s goat-powered demo is a reminder that “reads human” does not automatically mean “acts human,” nor does it automatically mean “understands” in the way we do. Boards and leadership teams do not need to care about goats. They do need to care about the underlying habit he’s calling out: using language-like output as a proxy for human-like traits, then letting that proxy leak into how you evaluate, validate, and govern.

If you’re building AI systems or overseeing them, the practical question is whether your experiments track what models do, not what you think they mean. De Wynter’s paper is an odd-looking, in-game-wired argument for a very serious governance standard: treat claims about LLMs as testable hypotheses, not as anthropomorphic defaults.
