Applying Cohort Analysis to Understand User Retention

Today Dima Vecheruk, Data Analyst, shares some of his recent work to understand customer retention over time. 

As a member of the Loyalty Group, my current focus at GetYourGuide is customer retention and engagement over time. I'm working with mission teams on solutions and communication channels that provide long-term value for our customers, because customers who get long-term value are more likely to return to GetYourGuide the next time they travel.

One important loyalty driver is our amazing app. It allows customers to easily navigate to their meeting point, use a mobile voucher, or manage their bookings with just a few taps. Of course, the app can also be used to discover and book new tours and activities.

But how do we measure customer loyalty? We realized we need to distinguish between users who have just installed the app and haven't had time to experience its full functionality, and users who have enjoyed our in-app service for a long time. While a lot of customers install the app to access the mobile voucher for their first activity with GetYourGuide, our goal is to create repeated engagement with the app, which gives us more opportunities to surface relevant tours and activities to our users.

With this aim in mind, we decided to analyze user retention across different acquisition cohorts.

What is a cohort?

The term cohort originates in sociology. For business analytics applications, a cohort is a group of leads or customers who share common characteristics or experiences within a defined time span (see Wikipedia for further elaboration).

For example, we can define a cohort of users by the date they first installed our app or made their initial purchase. We can also add other shared user properties to the cohort definition such as geography or the marketing acquisition channel that brought the user to our site.

Why are cohorts useful?

Cohorts are useful in all situations that require observing user behavior over time (customer life cycle):

  • Measuring customer loyalty (each person in a cohort has had the same amount of time to gain experience with our service, so we can compare how many liked it enough to come back)

  • Tracking the impact of product changes (e.g. cohorts exposed to old or new versions of an app)

  • Comparing the behavior of new vs. returning visitors

  • Comparing the role of acquisition channels in long-term retention

Using cohort analysis to understand customer retention

To understand app performance in terms of user retention, we started with building acquisition cohorts of app installers.

From a table of app logs (tracking of app interactions per user), we generated a subset that defined our acquisition cohort by the date of the first app install. We used the amount of time between the first install and another app interaction (date of app usage in the table) to understand user retention after X periods. We then assigned each user to an acquisition cohort by the week/month/year of their initial app install.
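As an illustration of this step, here is a minimal Python sketch. It assumes a hypothetical log of (user_id, usage_date) rows and treats each user's earliest usage date as their install date; the names `app_log`, `assign_cohorts`, and `months_between` are made up for this example and are not taken from our actual pipeline.

```python
from datetime import date

# Hypothetical app log: one (user_id, usage_date) row per app interaction.
app_log = [
    ("u1", date(2015, 3, 2)),
    ("u1", date(2015, 4, 10)),
    ("u2", date(2015, 3, 20)),
    ("u3", date(2015, 4, 1)),
    ("u3", date(2015, 6, 15)),
]


def first_install_dates(log):
    """Earliest usage date per user, treated here as the install date."""
    first = {}
    for user_id, used_on in log:
        if user_id not in first or used_on < first[user_id]:
            first[user_id] = used_on
    return first


def months_between(start, end):
    """Calendar-month difference between two dates (0 = same month)."""
    return (end.year - start.year) * 12 + (end.month - start.month)


def assign_cohorts(log):
    """Return rows of (user_id, monthly cohort 'YYYY-MM', months since install)."""
    first = first_install_dates(log)
    rows = []
    for user_id, used_on in log:
        installed = first[user_id]
        cohort = f"{installed.year:04d}-{installed.month:02d}"
        rows.append((user_id, cohort, months_between(installed, used_on)))
    return rows


cohort_rows = assign_cohorts(app_log)
for row in cohort_rows:
    print(row)
```

Each output row carries the user's cohort and the number of months elapsed since install, which is all that is needed to count returning users per period.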


From this table we constructed a retention curve for the app installers we acquired each month (the retention curve for the March 2015 cohort is shown below - note that all data points here and in the following diagrams are fake and used for illustration only). The curve starts at the top left-hand corner where we see 100% of the users in the first month of acquisition. A month later, the share of returning users we see in the app falls to 28.2%, and then stabilizes after 6 months.
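A retention curve of this kind can be computed roughly as follows. This is a sketch under assumptions: the input rows of (user_id, cohort, months since install) and the function name `retention_curve` are hypothetical, and the numbers are made up in the same spirit as the diagrams.

```python
from collections import defaultdict

# Hypothetical rows: (user_id, cohort, months since first install).
cohort_rows = [
    ("u1", "2015-03", 0), ("u1", "2015-03", 1),
    ("u2", "2015-03", 0),
    ("u3", "2015-03", 0), ("u3", "2015-03", 1), ("u3", "2015-03", 2),
    ("u4", "2015-03", 0),
]


def retention_curve(rows, cohort):
    """Share of the cohort (in percent) seen in the app in each period."""
    active = defaultdict(set)  # period -> distinct users active in that period
    for user_id, row_cohort, period in rows:
        if row_cohort == cohort:
            active[period].add(user_id)
    cohort_size = len(active[0])  # everyone is active in period 0 by definition
    if cohort_size == 0:
        return {}
    return {
        period: 100.0 * len(users) / cohort_size
        for period, users in sorted(active.items())
    }


curve = retention_curve(cohort_rows, "2015-03")
print(curve)
```

Periods in which no user from the cohort was active are simply absent from the result; a plotting step would need to fill those in as 0%.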


We can use year over year comparisons to see if we improved user retention over time. Below, you can see the same graph for users acquired in March 2015 and in March 2016.


In this case, we can see our longer-term retention notably improved!

Next, we would like to know whether the improvement indicates a general trend or whether we are looking at a March outlier. We regroup our customers into annual cohorts and compare users acquired in the same year rather than in the same month. Comparing all 2015 users to all 2016 users, we can indeed see a remarkable improvement in retention, which of course translates into a higher customer lifetime value. This is the ultimate goal of all loyalty-supporting measures, because each touchpoint with a customer is a chance for a new transaction.
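Regrouping is mostly a matter of changing the cohort key. The sketch below assumes hypothetical rows of (user_id, install_date, months since install) and compares retention in a given month at monthly and annual grain; `cohort_key` and `retention_at` are illustrative names, not part of any library.

```python
from collections import defaultdict
from datetime import date

# Hypothetical rows: (user_id, install_date, months since install).
usage = [
    ("a", date(2015, 3, 5), 0), ("a", date(2015, 3, 5), 1),
    ("b", date(2015, 9, 1), 0),
    ("c", date(2016, 3, 7), 0), ("c", date(2016, 3, 7), 1),
    ("d", date(2016, 11, 2), 0), ("d", date(2016, 11, 2), 1),
]


def cohort_key(install_date, grain):
    """Cohort label for an install date at the requested grain."""
    if grain == "year":
        return f"{install_date.year:04d}"
    if grain == "month":
        return f"{install_date.year:04d}-{install_date.month:02d}"
    raise ValueError(f"unsupported grain: {grain}")


def retention_at(rows, grain, period):
    """Percent of each cohort that was active `period` months after install."""
    installed = defaultdict(set)
    returned = defaultdict(set)
    for user_id, install_date, months in rows:
        key = cohort_key(install_date, grain)
        installed[key].add(user_id)
        if months == period:
            returned[key].add(user_id)
    return {
        key: 100.0 * len(returned[key]) / len(installed[key])
        for key in sorted(installed)
    }


by_month = retention_at(usage, "month", 1)
by_year = retention_at(usage, "year", 1)
print(by_month)
print(by_year)
```

Annual cohorts pool more users, so a single unusual month weighs less in the comparison.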


Segmenting cohorts by marketing channel

We can go even further by additionally splitting users according to the marketing channel through which we have acquired them.

This can be done by assigning each app install to a marketing channel (which in itself can be tricky, but that's a different topic). In fact, we can use any demographic or behavioral characteristic to specify customer cohorts as long as it can be joined to each install based on the user id.
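The join on user id could look something like the following sketch. The channel mapping, the usage rows, and the function `retention_by_channel` are all hypothetical, and the attribution step itself (deciding which channel an install belongs to) is assumed to have happened already.

```python
from collections import defaultdict

# Hypothetical attribution result: user_id -> acquisition channel.
install_channel = {"u1": "red", "u2": "red", "u3": "green", "u4": "green"}

# Hypothetical usage rows: (user_id, months since install).
usage_rows = [
    ("u1", 0), ("u1", 1),
    ("u2", 0), ("u2", 1),
    ("u3", 0), ("u3", 1),
    ("u4", 0),
]


def retention_by_channel(channels, rows, period):
    """Percent of each channel's installers active `period` months after install."""
    installed = defaultdict(set)
    returned = defaultdict(set)
    for user_id, months in rows:
        # Join on user id; installs without attribution land in "unknown".
        channel = channels.get(user_id, "unknown")
        installed[channel].add(user_id)
        if months == period:
            returned[channel].add(user_id)
    return {
        channel: 100.0 * len(returned[channel]) / len(installed[channel])
        for channel in sorted(installed)
    }


month_one = retention_by_channel(install_channel, usage_rows, 1)
print(month_one)
```

Any other attribute keyed by user id, such as geography, could replace the channel mapping without changing the function.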

With this functionality, we can assess the influence of the marketing acquisition channel on a user's long-term retention.

Below we can see retention is significantly higher for the “red” channel than for the “green” channel. This means we can afford to invest a bit more money in the “red” acquisition channel, because users from this channel are more loyal (hence, more valuable) in the long term.

The possibilities for further segmentation are limited only by the depth of your user data!


Try it yourself

At GetYourGuide, we use Looker for this kind of analysis. If you would like to try it out yourself, there is a useful Looker Analytic Block that shows how to build a retention analysis Explore out of a user table and an app usage table, which is very similar to the example discussed above.

If you would like to dig further into marketing analytics at GetYourGuide, make sure to check out our open positions on the Data team in Berlin.
