SONATAnotes
Teaching AI to Emulate Human Irrationality (or “Why Can’t the Machine Make All the Decisions?”)
There’s a reason why workplace comedies and dramas like “The Office”, “Succession”, and “Scrubs” are perennial TV favorites: we have all experienced situations at work where colleagues drive us insane with their irrationality, irritability, or flat-out ineptitude. And while hopefully our workplaces aren’t quite the stuff of television, it’s fair to say that if you’re interested in training people for real-world working conditions, your training needs to account for human foibles such as irrationality, ego, resentment, and fear.
Case in point: our company designs interactive role play activities and simulation exercises for things like retail customer service, negotiation skills – even managing a hospital floor as a charge nurse – using AI models like ChatGPT and Gemini. However, we’ve often found our efforts are at odds with AI platforms’ tendency to put a positive spin on everything.
For example, the early versions of a leadership simulation we designed to help executives hone their communication and conflict management skills were undermined by ChatGPT’s propensity to take a ridiculously optimistic view of everything. Even in simulations where a toy company was facing a scandal involving toxic materials in their products – a situation where someone in the room was definitely going to get fired – the AI-generated executives were all smiles and helpful suggestions.
In fact, when my wife playtested the first draft, she basically used ChatGPT’s helpfulness as a cheat code, asking her impossibly calm and fair-minded leadership team “How can we all collaborate to solve this crisis?” The characters would then all offer very reasonable suggestions, to which my wife could simply reply “Great, go do that!” – resulting in instant victory, every time.
While at first this hack was funny (kind of like the Japanese comedy “One Punch Man”, about a bored superhero who can defeat any villain with a single punch), it did call the validity of the training exercise into question, and it sparked debate and brainstorming among our company’s AI scenario designers (“prompt engineers”) about how to inject more human fallibility into the equation.
Does “Dysfunctional” Equal “Realistic”?
So what does “realism” look like in the context of skills training?
Realism requires human fallibility. A TV show with no conflict would not only be incredibly boring, it would also be unrecognizable as reality.
Yet too much drama isn’t necessarily real either. While the AI-generated executives in our toy company crisis management simulation might have been a bit too helpful and cheery, one would hope a room full of seasoned leaders at a large, successful organization would be able to solve most problems without much drama. We could force the AI to dial up the drama until the whole thing turned into an episode of “Succession” – with the AI executives all scheming to avoid blame and get promoted at their colleagues’ expense – but that wouldn’t be true to life, either.
This raises the key question: just how flawed do AI-generated characters need to be in order to come across as authentic and real?
The answer, as always, is “it depends.” Some of us are fortunate enough to work in highly functional teams, while others of us are trapped in more difficult situations. Heck, some organizations even have a “Jekyll and Hyde” personality and can swing from harmonious to cutthroat depending on the specific people involved and the nature of the issue at hand.
Ultimately, we concluded that we should let users decide this issue for themselves. We gave them control of both how stressful any given playthrough of the simulation would be and how (dys)functional the organization’s leadership team should be.
But this led to a new question for us to deal with: How do you quantify levels of dysfunction and describe them in computer-friendly logic an AI model can understand?
Helping AI Learn to Play Dumb
Since we decided to allow users of our leadership simulation to determine how functional the AI-generated leadership team is – a setting they can change with each new simulation – we needed to ensure the AI understood what “functional” vs “dysfunctional” behavior looked like.
Thankfully, AI thrives on exemplars – examples written into the text of the “prompt” that show the AI what a given behavior or situation might look like. So we created plenty of examples for the AI that demonstrated both counterproductive and productive behaviors, described what sorts of personality traits would be seen in these scenarios, and finally, mapped out behavioral patterns that would be associated with these traits. For example, a person with a consensus-seeking disposition could have the following three behaviors:
- Productive behavior: Acts as a diplomat and peacemaker
- Neutral behavior: Tries to gauge the mood of the group and work towards consensus
- Counterproductive behavior: Passive aggressive and intolerant of dissent
This sort of guidance – teaching the AI how to behave like a human – is crucial. When engineering an AI simulation, you don’t need to describe things like “what sort of crisis a semiconductor manufacturer might face”, because the model has most likely already picked those things up from news coverage in its training data. Conversely, the AI won’t know how to model the behaviors of real-life executives unless it’s shown how to do so.
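To make the idea concrete, here is a minimal sketch of how the consensus-seeking example above might be encoded and turned into exemplar text for a prompt. It is an illustration under stated assumptions, not our production setup: the `Disposition` class, the `pick_behavior` and `build_exemplar` helpers, the 1-to-5 functionality scale, and the “CFO” character are all hypothetical names invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Disposition:
    """One personality trait and the behaviors it can produce (hypothetical structure)."""
    name: str
    productive: str
    neutral: str
    counterproductive: str


# The consensus-seeking example from the list above.
CONSENSUS_SEEKING = Disposition(
    name="consensus-seeking",
    productive="Acts as a diplomat and peacemaker",
    neutral="Tries to gauge the mood of the group and work towards consensus",
    counterproductive="Passive aggressive and intolerant of dissent",
)


def pick_behavior(disposition: Disposition, functionality: int) -> str:
    """Map an assumed 1-5 team-functionality setting to one behavior.

    The scale and its cut-offs are assumptions made for illustration.
    """
    if not 1 <= functionality <= 5:
        raise ValueError("functionality must be between 1 and 5")
    if functionality >= 4:
        return disposition.productive
    if functionality == 3:
        return disposition.neutral
    return disposition.counterproductive


def build_exemplar(character: str, disposition: Disposition, functionality: int) -> str:
    """Render one exemplar as plain text that could be pasted into a prompt."""
    behavior = pick_behavior(disposition, functionality)
    return (
        f"Character: {character}\n"
        f"Disposition: {disposition.name}\n"
        f"Team functionality setting: {functionality}/5\n"
        f"Expected behavior: {behavior}"
    )


if __name__ == "__main__":
    # A dysfunctional team (setting 2) yields the counterproductive behavior.
    print(build_exemplar("CFO", CONSENSUS_SEEKING, 2))
```

Keeping the traits as data rather than hard-coding them into the prompt means the same character can be re-rendered at a different functionality setting for each new playthrough.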
If AI Defaults to ‘Smart’ and ‘Reasonable’ – Why Train Humans at All?
While accounting for flawed human behavior did make our training scenarios feel more realistic, there were times when the prompt engineering team had to stop and ask, “What the heck are we doing? Why bother helping AI emulate counterproductive human behavior just so it can train flawed humans to collaborate with other flawed humans? Why not just eliminate the human element and let the AI make decisions based on whatever criteria it’s using to evaluate human performance?”
While plenty of leaders at large organizations are thinking this exact thing – and calculating how much they can save in wages by replacing humans at every level – the reality, as we’ve discussed elsewhere, is that the most effective decision-maker is neither an AI nor an unassisted human: it’s a human who knows how to wield AI tools effectively (a “centaur”, as human-AI teams are known in competitive chess).
However flawed humans might be when it comes to decision-making, AI has its own problems that require human oversight and intervention. A great example is radiology, where AI models have shown impressive accuracy at diagnosing X-rays and other images. The only problem is that the judgments the AI makes, even if accurate, are often based on factors that have nothing to do with anything shown in the image: radiologists have discovered that AIs draw conclusions based on everything from what brand of X-ray machine is used to what angle the machine was positioned at – all irrelevant data which harms the reliability of the AI’s conclusions.
There is also real value in training AI to model human shortcomings – pointing out negative human tendencies can help AI to evaluate its input more critically. For instance, training AI about biases can help it to account for the concerns of women and people of color when drawing conclusions from medical studies that might have had mostly white male participants. Even today, 78% of medical trial participants are white and many clinical studies are up to 58% male.
An AI only needs its algorithm corrected once to recognize and account for these biases going forward. In contrast, a human researcher might go through years of anti-bias training and never be able to overcome their assumptions.
In fact, AI can even help humans check their reasoning and assumptions for these types of biases, as well as other errors. A group of researchers at MIT developed a method for AI to assess the computational limitations and past decision patterns of human users (or other AI models) and predict when they’re about to make a mistake.
This approach supports human decision-making by intervening only when humans need it. It also guards against humans simply letting the AI decide for them, because the AI stays out of an issue unless it predicts that a flaw will emerge in the human decision-making process.
For research purposes, teaching AI to model flawed human behavior also allows us to observe those flaws in a simulated setting. An AI that can model human flaws lets us study them ethically, without the complexities of involving real human subjects.
Conclusion
Teaching an AI how to model flawed human behavior to other humans allows us to use AI as a tool that can train and support human beings to make better real-world decisions. Managing the messy complexities of dealing with other human beings is a vital skill, and creating more realistic models of that messiness can make the difference between overly theoretical “textbook” training and real, authentic practice that workers can actually benefit from.