How to understand mobile player behaviour, according to Pocket Gems

CEO Liu on mastering metrics

As developers across the globe attempt to master free-to-play, half the battle for many is defining just what the model is.

One definition is to think of F2P as a psychological experiment - one that relies on the developer being able to predict and guide player behaviour.

In that context, player data, rather than pure revenue, is arguably the best way of determining whether an F2P developer really has mastered the model.

Ben Liu, CEO of newly turned publisher Pocket Gems, touched on such issues during his talk at GDC, detailing his firm's focus on measuring retention using cohort analysis – monitoring a group of people who began playing a game on a certain date and comparing their stats to those of older users.

It's a method that allows Pocket Gems to reveal whether post-release improvements to titles have actually been effective, informing any updates released in the future.
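For readers who want to try the idea, here is a minimal sketch of cohort retention in Python. It is not Pocket Gems' own tooling: the function name, the data layout and the sample dates are all assumptions made for illustration. It groups players by install date and reports what share of each group played again exactly n days later.

```python
from collections import defaultdict
from datetime import date


def day_n_retention(installs, sessions, n):
    """Return {install_date: share of that cohort active exactly n days after install}.

    installs: dict of player_id -> install date (hypothetical layout)
    sessions: iterable of (player_id, session date) pairs
    """
    # For each player, record the days-since-install on which they played.
    active_offsets = defaultdict(set)
    for player_id, day in sessions:
        if player_id in installs:
            active_offsets[player_id].add((day - installs[player_id]).days)

    cohort_size = defaultdict(int)
    cohort_retained = defaultdict(int)
    for player_id, install_day in installs.items():
        cohort_size[install_day] += 1
        if n in active_offsets[player_id]:
            cohort_retained[install_day] += 1

    return {day: cohort_retained[day] / cohort_size[day] for day in cohort_size}


# Hypothetical data: one cohort from before an update, one from after it.
installs = {
    "a": date(2013, 3, 1), "b": date(2013, 3, 1),
    "c": date(2013, 3, 1), "d": date(2013, 3, 1),
    "e": date(2013, 4, 1), "f": date(2013, 4, 1),
}
sessions = [
    ("a", date(2013, 3, 2)),
    ("b", date(2013, 3, 5)),
    ("e", date(2013, 4, 2)),
    ("f", date(2013, 4, 2)),
]
retention = day_n_retention(installs, sessions, n=1)
```

Comparing the day-one figure for the newer cohort against the older one is the kind of before-and-after check described above. With cohorts this tiny the numbers mean nothing, which is where the sample-size question comes in.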

To get a wider take on Pocket Gems' approach, we asked Liu to help us unpack some of his data and advice to other developers.

PocketGamer: What is the optimum size of a cohort for your testing?

Ben Liu: It depends on what you're trying to measure.

If you're looking for a metric that's a parallel distribution then you need a lot of users.

You need more than a few thousand. But if you're looking for something that's bimodal, either a yes or a no answer, like did they pay money or not, or did they come back or not, then you need fewer users.

Could you roughly define parallel distribution?

Parallel is like revenue per user. It can be really different and is more of an exponential distribution. There are high numbers at different ends of the spectrum. There is lots of variation within that distribution.

How did you determine that a few thousand players is the best sample size?

It's based on statistical theory. If you look at how to figure out when something's statistically significant there are a lot of mathematical rules around bimodal distribution [that say] you can get significance at a certain number of users, and it's a few thousand.
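As a rough illustration of where a figure like "a few thousand" can come from, the sketch below applies the standard textbook approximation for estimating a yes/no rate (what statisticians would usually call a binary metric). This is an assumption about the kind of rule Liu is referring to, not a calculation he described; the function name and the chosen margins are hypothetical.

```python
import math


def sample_size_for_proportion(margin, expected_rate=0.5, z=1.96):
    """Players needed so a measured yes/no rate lands within +/- margin of the true rate.

    Uses the normal approximation n = z^2 * p * (1 - p) / margin^2.
    z=1.96 corresponds to roughly 95% confidence, and expected_rate=0.5
    is the worst case (it gives the largest required sample).
    """
    n = (z ** 2) * expected_rate * (1 - expected_rate) / (margin ** 2)
    return math.ceil(n)


# Players needed to pin a retention or conversion rate down to +/- 2 points,
# and to the looser +/- 5 points.
players_for_2_points = sample_size_for_proportion(margin=0.02)
players_for_5_points = sample_size_for_proportion(margin=0.05)
```

Under these assumptions, a margin of two percentage points needs roughly 2,400 players, which is in line with the "few thousand" quoted above, while a looser five-point margin needs only a few hundred.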

What kind of noise in the data might indicate that a cohort size is too small?

The larger the number of users the better, but it can take a long time or it can be very expensive to do.

If you do a repeated test, if you look at different days and you get widely divergent results, that could be an indication that your cohort's too small.
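The effect Liu describes can be seen in a small simulation. The sketch below is purely illustrative: the cohort sizes, the 40 percent "true" retention rate and the two-week window are invented for the example. It measures the same underlying rate on repeated days and compares how much the results wander with a small cohort versus a large one.

```python
import random
import statistics


def simulate_daily_retention(cohort_size, true_rate, days, rng):
    """Simulate the measured retention rate for `days` independent cohorts."""
    rates = []
    for _ in range(days):
        # Each player independently returns with probability true_rate.
        returned = sum(rng.random() < true_rate for _ in range(cohort_size))
        rates.append(returned / cohort_size)
    return rates


rng = random.Random(0)  # fixed seed so the run is repeatable
small = simulate_daily_retention(cohort_size=50, true_rate=0.4, days=14, rng=rng)
large = simulate_daily_retention(cohort_size=5000, true_rate=0.4, days=14, rng=rng)

# Standard deviation of the daily readings: a bigger value means noisier results.
spread_small = statistics.stdev(small)
spread_large = statistics.stdev(large)
print(f"50 players/day:   spread {spread_small:.3f}")
print(f"5000 players/day: spread {spread_large:.3f}")
```

Nothing about the simulated game changes from day to day, so any swing in the readings is sampling noise. If real daily results diverge as widely as the small cohort does here, that is the warning sign mentioned above.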

How did you determine your metrics for a good number of sessions per day? You said in your talk that for mobile games a good number is five or more sessions.

It's just based on data. We've launched over 20 games on both iOS and Android and, looking at all the things that we've tested and tried out, five or more sessions is a good rule of thumb.

And what did that correlate to? The person who played five or more sessions per day spent x amount of money? Ostensibly that metric will match up to some other metric.

It matches to long-term engagement in the game, so if they play all the time they'll stay in the game for a long time.

We think a lot less about monetisation and a lot more about engagement and how fun the game is and how long people stay in the game and how much they enjoy it. The number of sessions is a really good correlation to that.

What if somebody plays a single session a day, but they play for two hours? Which metric is more important?

On mobile specifically, just because of how people play the games, the number of sessions is really important. Length of sessions is important but not as important.

For social the opposite is true. It would be really important to look at the length and not the number of times they were playing.

How does the number of sessions interact with length of play?

I think you'll see total play time correlates with the number of sessions, so if you play more sessions you play longer as well.
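A developer could check this kind of claim against their own data with a simple correlation. The sketch below computes a Pearson correlation coefficient by hand; the per-player figures are invented for illustration and are not Pocket Gems data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-player averages: sessions per day and total minutes per day.
sessions_per_day = [1, 2, 3, 5, 6, 8]
minutes_per_day = [4, 9, 10, 22, 25, 30]

r = pearson(sessions_per_day, minutes_per_day)
print(f"correlation between sessions and play time: {r:.2f}")
```

A value close to 1 would support the idea that players with more sessions also play longer overall. A value near 0 would suggest the two metrics need to be tracked separately.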

You added both achievements and story elements to updates for Tap Zoo, both of which helped you retain players. But achievements and a focus on narrative didn't typically feature in mobile games in the early days. Does their success suggest there are more parallels between core and mobile games than people realise?

I would answer the question a little bit differently.

I would say that human beings love following different stories to see what happens next, and I think there are lots of different forms of telling those stories. Our questing system, where you have chains of things that lead to steps down the road and in the future, is just a way of doing that.

Core games have been around a lot longer so they have their own developed ways of doing that as well, but I think it's just tapping into a fundamental human desire to tell each other stories and hear from each other and see what's next.

What's a good resource for a new developer to refer to as they get started developing metrics for measuring the success of their apps?

There's a lot of literature and analytics providers out there on the market so I would read as much as I could and start with a few metrics, measure them, see if they correlate to success, and then build on that through a process of experimentation.

GDC is fantastic, and there are a number of sessions which talk about this, so I would start with the GDC vault and look at all the sessions that are about metrics and analytics and then go from there.
Thanks to Ben Liu for his time.
Dennis Scimeca is a freelance writer from Boston, MA. His weekly video game opinion column, First Person, is published by Village Voice Media. He occasionally blogs, and you can follow him on Twitter at @DennisScimeca.