
Sweet Spot

Here’s the thing about a good simulation: it has a compact set of mechanics while still allowing for a rich complexity of possible outcomes. You can wrap your head around it. You are given a small set of primitives, a handful of knobs to turn. It strips back the domain to its essence. Yet it has enough depth that you could explore it for hours and still feel like you haven’t seen everything.

Consider one of my favorite games: The Sims.

The Sims models modern life. Your sims can do all sorts of realistic things: make friends, have arguments, fall in love, fall out of love, move, get a job, have a side hustle, have hobbies. They experience joy, grief, and a range of emotions in between. Each sim has their own unique story. It’s such a rich world that the events in the game and interactions between the sims can feel startlingly realistic.

However, the mechanics behind the game are elegantly simple. Sims have needs, traits, aspirations, skills, moods, and relationships. It’s a small set of defining attributes that affects your sims’ reactions and behavior: a tidy, compact model.

In keeping with the simplicity of The Sims, the Curious Duck software development simulation has a tiny set of primitives at the core of the game mechanics. Just three, to be exact. There’s work, workers, and queues. In the future there will be a few more, but for now these three allow for modeling a fairly large number of scenarios.
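
To make that concrete, here’s roughly how I picture those three primitives, as a simplified sketch in Python rather than the engine’s actual code (the names and fields are illustrative):

```python
from collections import deque
from dataclasses import dataclass, field

@dataclass
class WorkItem:
    """A unit of work flowing through the simulation."""
    description: str
    size: int  # how big the change is

@dataclass
class Worker:
    """Someone who pulls work from an inbox and pushes results to an outbox."""
    name: str
    inbox: deque = field(default_factory=deque)
    outbox: deque = field(default_factory=deque)
```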

Last week I was working on building out one of the early scenarios in the game. It involves a situation that will feel familiar to anyone who has worked in a traditional model of software development:

  1. A programmer has a set of tasks.
  2. The programmer works on their tasks and delivers code.
  3. A tester picks up the delivered code.
  4. The tester finds bugs and reports them to the programmer.
  5. The software ships (and the scenario ends) when there are no more known issues.

In simulation terms, this scenario is a tiny loop with two workers. Each worker has two queues: an inbox and an outbox. They're connected: the programmer's outbox is the tester's inbox. The tester's outbox is the programmer's inbox. The scenario starts with an initial set of work in the programmer's inbox.
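
Wired up in code, the whole setup is just a pair of shared queues. Here’s an illustrative sketch (not the game’s real implementation):

```python
from collections import deque

# Two shared queues close the loop: one worker's outbox is the other's inbox.
programmer_to_tester = deque()
tester_to_programmer = deque()

programmer = {"inbox": tester_to_programmer, "outbox": programmer_to_tester}
tester = {"inbox": programmer_to_tester, "outbox": tester_to_programmer}

# The scenario starts with an initial set of work in the programmer's inbox.
programmer["inbox"].extend(["task 1", "task 2", "task 3"])

# The software ships when there are no known issues left anywhere in the loop.
def scenario_over() -> bool:
    return not programmer["inbox"] and not tester["inbox"]
```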

Sounds simple, right? Indeed, I expected it to be fairly easy to implement.

The first complication came when I realized that the programmer and tester aren't exactly the same. The programmer picks up a task from their inbox and delivers a code change. There’s a 1:1 ratio of work in to work out. Then the tester picks up the work and…

...oh.

Instead of turning each input into an output, the tester produces a variable number of bugs. I needed to make a distinction between two types of workers. No problem. That was easy enough to represent in code.
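
In sketch form, the difference is just the shape of each worker’s step function. A programmer maps one task to one delivery; a tester maps one delivery to zero or more bugs. (The names and numbers here are placeholders of my own, not the game’s API.)

```python
import random

def programmer_step(task: str) -> list[str]:
    """1:1 worker: one task in, exactly one delivered change out."""
    return [f"code change for {task}"]

def tester_step(delivered_change: str) -> list[str]:
    """1:N worker: one delivered change in, a variable number of bugs out."""
    bug_count = random.randint(0, 3)  # placeholder; the real question is what drives this number
    return [f"bug {i + 1} in {delivered_change}" for i in range(bug_count)]
```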

So, how to determine how many bugs the tester would produce?

In the real world there are numerous contributing factors that affect how many bugs a tester might find. How much time does the tester have? How skilled are they? Are they blocked on access to a scarce resource, like a test environment? How well do they understand the system? Is this a greenfield project or legacy code? How well does the programmer understand the system or existing code base? How skilled is the programmer? How clear are the requirements? Did the programmer test their code before sending it on to the tester?

In the spirit of making the simulation as simple as possible (but no simpler), I needed a model that elided as much of the real-world complexity as possible while still retaining enough variation to make the game interesting.

After much wrestling, I distilled it down to two characteristics of work: size and reliability. The less reliable the work, the more likely it is to contain bugs; and the bigger the change, the more bugs it’s likely to introduce.
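
One simple way to turn those two characteristics into a bug count (with parameters and a distribution I chose purely for illustration, not the game’s actual formula): treat each unit of size as a separate chance for a bug to slip through, with reliability controlling how often that happens.

```python
import random

def bugs_found(size: int, reliability: float) -> int:
    """Each unit of size is an independent chance to slip a bug past the programmer.

    `reliability` is assumed to be in [0, 1]: bigger changes mean more chances
    for bugs, and higher reliability means fewer of them slip through.
    """
    return sum(1 for _ in range(size) if random.random() > reliability)

# Illustrative only: a size-8 change at 0.5 reliability averages around 4 bugs.
print(bugs_found(size=8, reliability=0.5))
```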

I plugged some numbers into the model, then ran the simulation and eagerly watched as work flowed through the system. The programmer finished a task. The tester picked up the code. And the tester found bugs. Thousands of them. Thousands! Big ones. The programmer’s backlog was completely overrun. The scenario was now in an infinite loop. With each bug fix the programmer delivered, the tester found thousands more bugs. It was pure chaos. Might be hilarious as an animated gif, but it didn’t make for a playable game. WHOOPS!

I adjusted the numbers I had assigned to reliability, cost, and probabilities. Scaled up and down the amount of work a given bug represented. Iterated. Eventually struck a balance and ran the simulation again.

This time, the tester found fewer and fewer bugs with each cycle of work through the system. After a few cycles the software had no known issues. The scenario ended.
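
A useful way to see the difference between the runaway run and this one (my framing, with invented numbers): the rework loop only winds down if each fix tends to spawn less than one new bug on average. When that branching factor is above 1, the backlog explodes; when it’s below 1, the bug count decays toward zero, like so:

```python
# Toy numbers, purely for illustration: fixes are much smaller than the original
# task, so the expected number of new bugs shrinks with every cycle.
def expected_new_bugs(size: float, reliability: float) -> float:
    return size * (1.0 - reliability)

task_size, fix_size, reliability = 8.0, 1.0, 0.75

bugs = expected_new_bugs(task_size, reliability)  # bugs from the initial delivery
cycle = 1
while bugs >= 0.5:  # stop once less than "one bug" is expected
    print(f"cycle {cycle}: ~{bugs:.1f} bugs expected")
    bugs *= expected_new_bugs(fix_size, reliability)  # each fix spawns its own, smaller batch
    cycle += 1
```

With these invented numbers the per-fix branching factor is 0.25, so the loop settles after a couple of cycles; crank up the size of each fix or drop the reliability far enough and the same loop produces the bug explosion from the first run.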

But now the game, while more realistic, was boring. So very boring. The scenario output read like a status report.

Will Wright, the creator of The Sims, summed this up in his article on 5 Design Tips for Game Systems:

“Too much stability risks boring players, but too much chaos makes the player feel out of control. Use your system dynamics to strike a balance.” – Will Wright

I'm learning that this, more than the complexity of the code, is why it takes so long to develop a game. It's one thing to write the code; it's quite another to tune the game mechanics to make the game playable.

My near term goal is to publish a tiny taster of the game so you can play it, and if I'm lucky you'll give me feedback.

But first I need to tweak the engine some more to find that sweet spot between stability and chaos.

Stay curious,

Elisabeth