Developing Job Skills with AI Simulations: How Realistic Do Practice Simulations Need to Be?

We often think of “simulations” as a modern phenomenon, like meteorologists modeling hurricanes with supercomputers or airline pilots practicing in a mock cockpit with computer monitors mounted on the windows. However, humans have been using simulation as a learning tool for millennia. For instance, the ancient Greeks used theater as a way to simulate moral dilemmas, ancient Indian and Chinese physicians used clay models of organs to practice surgery, and Roman and Viking military leaders used “sand tables” to sketch out battle plans.

The ability to realistically model, experiment, and practice something in a simulation before trying it in real life has only become more important as humans are called on to navigate increasingly complex systems and tasks – from risk management in financial markets to cancer treatment to disaster response. However, developing realistic simulation exercises has traditionally required weeks, if not months or years, of preparation – whether that meant creating tabletop exercises for cruise ship personnel to practice management decisions, or bringing together hundreds of public safety and healthcare personnel and hiring dozens of actors to rehearse an elaborately scripted (yet still not entirely realistic) terrorist attack.

But today, with generative AI, it’s possible to create comparably elaborate simulations with a few quick keystrokes. Where our company used to spend weeks planning simulation-based workshops for organizations like the US Army Corps of Engineers, we now use AI-based tools to help clients simulate everything from wildland firefighting to manufacturing, with the AI envisioning the workings of an entire ecosystem or supply chain in seconds and responding in real time.

Still, just as chess is nothing like real-life warfare and cutting into a holographic cadaver is nothing like operating on a living human being, even the best AI simulations aren’t a 100% accurate facsimile of a real world experience. Which raises a question: How realistic is realistic enough to provide a valuable learning experience?

Reality-Checking AI Simulations

AI’s ability to spontaneously generate plausible narratives about any situation is a double-edged sword. If you simply go to a platform like ChatGPT and say “Create a training simulation where the user is a charge nurse at a hospital,” it might produce a superficially compelling ‘choose-your-own-adventure’ narrative, but the key word here is “superficial.” For example, it might miss key details – like having hospital patients go directly to a doctor instead of first seeing a triage nurse – or generate situations that seem more inspired by television medical dramas than by real-world healthcare settings.

AIs might also overemphasize certain situations while underrepresenting others. When we first produced our flight attendant simulation, almost every situation revolved around passengers being unable to open the door to the toilet or passengers struggling to get their bags in the overhead compartment. But, common as those situations are, flight attendants need to prepare for a wider array of challenges.

And because AI has a tendency to defer to the user, it might not understand what the limits of the user’s power would be in a given situation. It might allow a flight attendant to unilaterally decide to delay or cancel a flight, or allow a customer service operator to give a full refund to every caller who complains.

If we want to correct this, we need to expand on our simple “Create a training simulation where…” prompt and add detail on the relative frequency of certain types of events (e.g., “there is an X% chance of mechanical problems and a Y% chance of customer service issues during the flight”), the limitations on the user’s power (e.g., “if the user attempts to do something that would be beyond the authority of a park ranger, tell them that they will need to contact their supervisor for permission”), and other guardrails to keep the simulation within the bounds of plausibility.
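As a rough illustration of the kind of expanded prompt described above, here is a minimal sketch in Python. The function name, its parameters, and the percentages in the usage note are hypothetical placeholders chosen for this example; they are not taken from our actual simulations, and the wording of the generated prompt is only one plausible way to phrase it.

```python
def build_simulation_prompt(role, event_chances, authority_limits,
                            escalation_contact="their supervisor"):
    """Assemble a plain-text simulation prompt with frequency and authority guardrails.

    All names here are hypothetical. `event_chances` maps an event description
    to an approximate percentage; `authority_limits` lists actions that are
    beyond the user's authority in the simulated role.
    """
    total = sum(event_chances.values())
    if total > 100:
        raise ValueError(f"Event chances add up to {total}%, which exceeds 100%.")

    lines = [f"Create a training simulation where the user is a {role}."]

    # Relative frequency of event types, so one kind of situation does not dominate.
    lines.append("Use these approximate event frequencies:")
    for event, percent in event_chances.items():
        lines.append(f"- There is a {percent}% chance of {event}.")

    # Limits on the user's power, so the AI does not simply defer to the user.
    lines.append(f"The following actions are beyond the authority of a {role}:")
    for action in authority_limits:
        lines.append(f"- {action}")
    lines.append(
        "If the user attempts any of these, tell them that they will need "
        f"to contact {escalation_contact} for permission."
    )
    return "\n".join(lines)
```

For example, calling `build_simulation_prompt("flight attendant", {"mechanical problems": 10, "customer service issues": 40}, ["delaying or cancelling the flight"])` returns a prompt that states both frequencies and the escalation rule. The resulting text would then be pasted or sent to whichever AI platform is being used.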

Balancing Realism with the Learning Experience

While realism is important, it doesn’t always correlate with the effectiveness of a training exercise.

For example, the most realistic way to simulate a sales conversation is to require the user to type out every word they would say during a meeting. That said, there’s relatively little value in making trainees key in all the specifications of a product for the benefit of a simulated buyer.

So when do we want to replicate reality, and when do we want to streamline the narrative for the sake of the learning experience?

  1. Simplifying for Clarity: In a real-life disaster response operation, the emergency management agency’s budget might be divided into smaller pools of money that can only be used for specific purposes (e.g. $X for transportation, $Y for compensating volunteers).  However, simply saying “You have a $3 million budget” in a disaster management simulation might help players focus on the larger, strategic concerns involved. 
  2. Simplifying for Convenience: Letting a user say “I hand the customer the brochure” is usually better than forcing them to walk through a long list of technical specifications (though allowing the user to type “I say something so compelling the customer immediately decides to buy twice as much from us” is too much of a shortcut).  
  3. Simplifying for Narrative: Sometimes it’s not worth letting reality get in the way of a compelling story or a useful critical thinking exercise. For instance, the customer service manager at a hardware store might spend most of their day on paperwork and reports while only handling a small number of customer issues. But if we’re more interested in practicing the customer issues than the paperwork, we might gloss over the purely administrative aspects of the job.

Introducing a Realistic Degree of Uncertainty

AI scenarios (and the humans who design them) sometimes forget that including too much detail can be less realistic than including too little. By their very nature as problem-solving tools, AI models such as ChatGPT are predisposed to share as much information as possible with the user – but we don’t want the AI saying, “The customer nods enthusiastically, as the price you just offered is within the range she has been authorized to spend” during a high-stakes sales negotiation simulation. Similarly, in our wildland firefighting and charge nurse simulations, we had to prevent the AI from revealing outcomes the player would not yet know, with lines like “Your supervisor approves the request for additional resources, but they will arrive 3 hours too late, and some of the repair parts will be the wrong type for your equipment. What would you like to do next?”

In fact, one of the first things we added to our AI training simulation player was the ability to let the AI write down notes without necessarily sharing them with the player.
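One way such private notes could work is sketched below in Python. This is an assumption for illustration, not a description of how our simulation player is built: it supposes the AI has been instructed to wrap anything the player should not see in a hypothetical `<notes>...</notes>` tag, and that the player software strips those spans before displaying the reply.

```python
import re

# Hypothetical convention: the AI wraps private notes in <notes>...</notes>.
NOTE_PATTERN = re.compile(r"<notes>(.*?)</notes>", re.DOTALL)


def split_hidden_notes(raw_reply):
    """Separate an AI reply into the text shown to the player and the hidden notes.

    Returns a (visible_text, notes) tuple, where notes is a list of strings.
    """
    notes = [note.strip() for note in NOTE_PATTERN.findall(raw_reply)]
    visible = NOTE_PATTERN.sub("", raw_reply)
    # Collapse the extra spaces left behind where a note was removed.
    visible = re.sub(r"[ \t]{2,}", " ", visible).strip()
    return visible, notes


raw = (
    "Your supervisor approves the request for additional resources. "
    "<notes>Resources arrive 3 hours late; some repair parts are the wrong type.</notes> "
    "What would you like to do next?"
)
visible, notes = split_hidden_notes(raw)
print(visible)
```

In this sketch the player sees only the approval and the question, while the late arrival and the wrong parts stay in `notes`, where they can be fed back to the AI on later turns so the consequences surface at the right moment.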

Focusing on Decisions Over Details

Ultimately, the standard of whether a training simulation is “realistic enough” is whether it meaningfully tests the user’s ability to make decisions and apply skills – even if the setting itself is simplistic or even fanciful. A customer service simulation set aboard an alien cruise ship in another galaxy is absolutely fine… so long as the challenges that the user deals with require them to exercise the same decision-making muscles they would use at their real-life day job.

We actually encourage training participants to ask our AI simulators to generate a variety of settings – for instance, we had a group of law enforcement officers set a de-escalation training scenario in Al Capone’s Chicago circa 1929, and even ran a sales negotiation workshop where we had participants play as silk merchants in 12th century Damascus. The historical settings made things interesting, but the communication and decision-making skills required proved timeless.

Conclusion – “Realism” is Relative

The strength of AI lies in the sheer variety of scenarios it can generate, and how quickly it creates them. But while it’s easy to make halfway plausible simulations with AI, it’s not easy to make useful simulations with AI.

In the end, there isn’t a single, ideal level of realism that scenario designers should aim for. Different learning objectives and different audiences will call for different things. And again, usefulness, not realism, is always our guiding star.

Games like chess and sand table simulations haven’t survived for millennia because of how accurately they recreate reality, but rather because they provide a useful model of reality. And the same applies to AI-generated simulations and role plays.

