Comment & Opinion

Inside the Game: How to deal with a bug that kills your live game

Kara Loo on staying calm and not panicking

Inside the Game is a series produced in partnership with US developer Pixelberry Studios, highlighting its candid stories on the trials and triumphs of a startup game studio whose debut title, High School Story, stayed in the top 100 grossing chart for a year.

This bi-weekly series of articles will provide a mix of drama, detailed learnings, and actual numbers from their experience launching and supporting a top game.

So you work at a startup. One day, the inevitable happens, and something goes wrong.

Maybe your latest update crashes on launch, or maybe there's a horrific graphical glitch that renders the game nearly incomprehensible.

What's important at that point isn't just figuring out what went wrong, but also figuring out how to scramble your team to respond and get through the crisis.

What happened

If you've been following this series of articles, you'll know that Pixelberry has a strenuous schedule of releasing at least one update every single month.

And within that, we release downloadable content every single week - these downloadable drops contain quests that continue the stories of our products.

With that grueling pace, there's more pressure on the team to produce content quickly and make sure every release is bug-free. For the most part, things run smoothly, but every now and then, a bug slips in.

Right after the release of our second game, Hollywood U, our schedule essentially doubled.

We didn't want to compromise support of High School Story, so we staffed up our team and planned to keep the grueling pace of releases going on both products.

Then a bug slipped through. After one downloadable content release, some players reported that their games were freezing and that they weren't able to play. This is how Pixelberry got the game back up and running.

Step 1: Early detection

  • Friday, December 19th, 2014 - 9:45PM

The day of the downloadable content's release, our customer service team noticed a handful of reports from players having trouble with their game.

Within the hour, this issue was escalated to our developers, QA team, and producers via a crisis hotline.

The downloadable content was removed from the server as soon as the problem was identified, but players who got the content during the couple of hours where it was available were left unable to access their games.

Image caption: The problematic quest in question

The team scrambled to come up with a solution and to mitigate the damage. Tension was high, along with concerns about how this had happened, what we could do to prevent it from happening again, and the driving urgency to get a fix out - fast.

  • Tip: Pixelberry has a 'crisis hotline' for big issues. An email sent to the mailing list will generate a text message to each person on the list. Though we've only had to use this twice, we've found that it's useful to have this in case of emergencies.
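For teams that want something similar, the fan-out described in the tip might be sketched as below. This is a minimal illustration, not Pixelberry's actual setup: the `Contact` type, the `format_alert` and `fan_out` helpers, the `[CRISIS]` prefix, and the 160-character limit are all assumptions, and `send_sms` stands in for whatever SMS gateway a team really uses.

```python
from dataclasses import dataclass
from typing import Callable, List

SMS_LIMIT = 160  # assumed single-message length for this sketch


@dataclass
class Contact:
    name: str
    phone: str


def format_alert(subject: str, body: str, limit: int = SMS_LIMIT) -> str:
    """Collapse an email subject and body into one SMS-sized string."""
    # Normalise whitespace so multi-line emails fit on one line.
    text = f"[CRISIS] {subject}: {' '.join(body.split())}"
    if len(text) <= limit:
        return text
    return text[: limit - 3] + "..."


def fan_out(
    subject: str,
    body: str,
    contacts: List[Contact],
    send_sms: Callable[[str, str], None],  # hypothetical gateway hook
) -> List[str]:
    """Send the alert to every contact; return the phones that failed."""
    message = format_alert(subject, body)
    failed = []
    for contact in contacts:
        try:
            send_sms(contact.phone, message)
        except Exception:
            # Keep going: one bad number should not block everyone else.
            failed.append(contact.phone)
    return failed
```

Returning the failed numbers rather than raising means one unreachable phone doesn't stop the rest of the list from being alerted, and someone can follow up by hand.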

Step 2: Fast, but careful fixes

Even though the team was working late hours - in some cases, not sleeping all night! - and we all wanted to get a fix out fast to minimize the impact, it was crucial not to do more harm than good when making a change.

In the worst-case scenario, doing something quickly without fully knowing the consequences could have caused even more problems for players.

  • Saturday, December 20th, 2014 - 6:38AM

In the early morning of Saturday, co-founder Winston She devised a fix. This fix was tested by QA and verified to work, then released.

Step 3: Communication with Players

Though I'm listing this step third, in our experience it should be going on throughout the crisis.

It's important to not only fix the problem, but also to let any stranded players know that we're working on it as fast as we can.

During the crisis, I worked with our customer service lead to craft the wording to use, then that wording was distributed to all of our player specialists so they could begin the task of reaching out to players.

For our team, this means reaching out to every single person who contacted us about the crash and letting them know that a solution is either available or coming soon - and thanking them for letting us know that they were having problems.

Even a 1-star review bashing the game for having a crash is a sign that the player cares enough to try to get our attention. They want their game back!

And by giving us more information, such as what device they're playing on, where the game was when it crashed, or what the specific circumstances were, they can help us diagnose what's going wrong.

Step 4: Assess the damage

When trying to figure out how many players are impacted by a given problem, it can be difficult to get a good estimate of the scope.

We often look at how many tickets come in from our in-game 'contact us' section, though this can be misleading if the problem in question is locking players out of their games.

Another good way of measuring impact is to read through the most recent reviews of the game and count up how many reference the issue. Checking social media channels, like Facebook, Tumblr, or Twitter, can also give you a sense of the scope.
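If the recent reviews are available as plain text, that count can be roughed out with a few lines of Python. This is only a sketch under assumptions: the keyword pattern and the `count_issue_mentions` helper are hypothetical, and a keyword match is no substitute for actually reading the reviews.

```python
import re
from typing import Iterable, Pattern, Tuple

# Assumed keywords for a freeze or crash bug; tune these to the actual issue.
ISSUE_PATTERN = re.compile(
    r"\b(crash\w*|freez\w*|frozen|stuck|won'?t (load|open))\b",
    re.IGNORECASE,
)


def count_issue_mentions(
    reviews: Iterable[str], pattern: Pattern[str] = ISSUE_PATTERN
) -> Tuple[int, float]:
    """Return (number of reviews mentioning the issue, share of all reviews)."""
    reviews = list(reviews)
    hits = sum(1 for review in reviews if pattern.search(review))
    share = hits / len(reviews) if reviews else 0.0
    return hits, share


if __name__ == "__main__":
    # Made-up sample reviews, for illustration only.
    sample = [
        "Game keeps freezing after the new quest",
        "Love the story!",
        "It crashed and now won't open",
        "Great update",
    ]
    hits, share = count_issue_mentions(sample)
    print(f"{hits} of {len(sample)} recent reviews mention the issue ({share:.0%})")
```

The share matters more than the raw count: the same number of complaints means something very different on a quiet day than on a busy one.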

Chart annotations: bug released around 5pm; slight dip in grossing position in the morning; bug fixed at 12pm; grossing position started to increase after 12pm.

You can also check the rankings of the games. When Hollywood U's crisis hit, we looked to the top grossing chart to see if our sales were being hurt. While we did see a drop when the bug was introduced, we were also able to see a trend going up when the bug was fixed.

Step 5: Figure out the problem in your process and fix it

While it can be tempting to start the process off with trying to get to the bottom of who was at fault, the truth is that fixing the problem and keeping the atmosphere positive is often far more productive for the team.

Once things have settled down, though, it's important to review the processes that were in place that allowed the bug to get through - and to fix those processes.

In our case, this meant adding more time for QA, adding headcount to both the writing team for content production and the QA team for testing, and putting more meetings in place for communication between those two teams.

We also split up the release dates, so that Hollywood U and High School Story no longer share the same day for new quests coming out. This lets the release team focus on each content drop for each product without distraction.

And it's working. Since that time, we've released a new set of quests for each product every single week without further incident.

Kara Loo is the COO of Pixelberry Studios. 

Through partnerships with non-profits, Pixelberry's hit game High School Story has taught millions of players about tough teen issues, like cyberbullying and eating disorders.
