May 3, 2018

Where to Find an Insight? Part 1

GetYourGuide

Careers Team

Data

How does GetYourGuide know what customers want? What is an event and how do you track it? Where do you keep it? Dive into this article by Nikita Belokopytov, our Android Engineer, and learn what kind of dilemmas and challenges GetYourGuide has been through on its data-driven and customer-oriented path.

Busy and Eventful

A good company does everything for the customer, and here at GetYourGuide we are pretty serious about becoming the best company to ever grace the travel industry. It’s not easy to understand what our customers want - the pace of life in the 21st century does not allow us to talk with every single one of them (although we’d really like to). We have to be inventive and bold in order to stay on top. So, not only do we have a great team of analysts, product managers and researchers, but we encourage everyone in the company to help us understand where the hopes and fears of those who book with us lie.

Ideally, even before we start analyzing customer behaviour, we must set up tracking. Tracking is a system that sends us small bits of information about specific actions our users perform, like the name of the current visible page, duration of a visit, or number of activities browsed before buying. These bits are called attributes, and when grouped together and dispatched at the same time, they form an event (Fig. 1).

Figure 1. An example of an event that is sent when a booking has been made, with four attributes inside.
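To make the vocabulary concrete, here is a minimal Python sketch of an event as a named bundle of attributes. The `Event` class, the event name, and the four attribute names and values are assumptions made up for illustration; they are not the attributes shown in Figure 1 and not GetYourGuide's actual tracking code.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass(frozen=True)
class Event:
    """A named bundle of attributes that is dispatched in one go."""

    name: str
    attributes: Dict[str, Any] = field(default_factory=dict)


# Hypothetical attribute names and values, chosen only for illustration.
booking_made = Event(
    name="booking_made",
    attributes={
        "current_page": "checkout",
        "visit_duration_seconds": 312,
        "activities_browsed": 7,
        "platform": "android",
    },
)

print(booking_made.name, sorted(booking_made.attributes))
```

Each key in `attributes` is one small bit of information; the event is simply the group that travels together.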

After being sent to us, the events come to a huge warehouse of information where groundbreaking insights are slowly distilled from them. The more events we get, the more material we have to work with; it’s only a matter of figuring out where to dig first. That leads us to the next challenge.

The Bottleneck

When there is so much we can measure, how do we know what is the most important thing to track right now? On a strategic scale, the question is traditionally answered by a Senior Analyst or a Head of Product, and then distributed among the departments and mission teams. The bigger the company gets, the harder it becomes to get everyone on the same page and to focus the attention where it matters. What makes it even trickier is that, should the structure become too rigid, the company will start experiencing diminishing returns in the form of less diverse opinions and less initiative.

When there is a single team with only one Product Manager and one Analyst, it’s easy to focus and communicate different perspectives at the same time. So, keeping track of all the new types of events being created is as effortless as a morning stand-up. With two teams it’s still easy enough, although it already becomes difficult to focus when everyone gets in the same room to brainstorm. In my experience, having three teams inevitably causes a loss of context and is enough to mandate a dedicated person responsible for keeping track of all the analysis and data. This dedicated person will have the overview and the expertise, and will be the first to talk to in case anyone wants to measure something new.

That being said, it’s not terribly efficient to go around asking permission for each new event; it creates overhead and the risk that at some point people will just organically stop syncing with each other and drift apart. However, if you start with too much independence, you might end up in a place where other teams cannot benefit from your knowledge and are unable to re-use your data somewhere else.

A couple of iterations later

We decided to keep all our events in one huge table where the attributes would be kept in its columns (Fig. 2). The vast majority of the columns would be optional and would not need to be filled in for any single event. The alternative would have been to have a separate table for each event type, which would’ve caused a lot of overhead whenever a new one had to be tracked.

Figure 2. A single table storing two types of events. Vertical cells are called columns and represent attributes that are possible to store. Horizontal cells are called rows and contain all data received with a particular event.
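The single-table layout can be sketched in a few lines of Python. The column names, the two event types, and the `to_row` helper below are hypothetical, chosen only to show how two different events share one set of columns while leaving the ones they do not use empty.

```python
from typing import Any, Dict, List, Optional

# Hypothetical column list for the shared events table.
COLUMNS: List[str] = ["name", "origin", "current_page", "activity_id", "price"]


def to_row(event: Dict[str, Any]) -> List[Optional[Any]]:
    """Lay an event out over the shared columns; unused columns stay None."""
    return [event.get(column) for column in COLUMNS]


# Two event types stored side by side in the same table.
table = [
    to_row({"name": "page_view", "origin": "android", "current_page": "search"}),
    to_row({"name": "booking_made", "origin": "web", "activity_id": 42, "price": 59.0}),
]

for row in table:
    print(row)
```

Tracking a third event type that reuses these columns means appending another row, with no new table and no schema change.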

Our assumption is that the majority of new event types will not require new attributes, since the essence of the GYG service remains the same (for now). In fact, there are already a lot of event types that differ only in name or place of origin and otherwise share the same set of attributes. It turns out that the event’s name and origin are the only attributes every event must have.

If a team decides to track a new type of event that does not add any new columns to our Great Events Table, they are free to do so without any hesitation. Controlling the attributes while being lax on the diversity of the events themselves is a flexible solution that allows teams to analyze their experiments independently and without interfering with each other.

To recap:

Adding new event types to track is delegated to teams
Adding new attributes for events requires a checkup and an entry in a global list of attributes with a comment describing it
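The two rules above could be enforced with a small check like the following Python sketch. The registry contents, `validate_event`, and `UnknownAttributeError` are assumptions introduced for illustration, not a description of how GetYourGuide actually implements the checkup.

```python
from typing import Any, Dict

# Hypothetical global list of attributes, each with a comment describing it.
ATTRIBUTE_REGISTRY: Dict[str, str] = {
    "name": "Event type, chosen freely by the owning team",
    "origin": "Where the event was sent from",
    "current_page": "Name of the page visible when the event fired",
    "activity_id": "Identifier of the activity involved",
}
# The only attributes every event must carry.
REQUIRED_ATTRIBUTES = ("name", "origin")


class UnknownAttributeError(ValueError):
    """Raised when an event uses an attribute that is not registered yet."""


def validate_event(event: Dict[str, Any]) -> None:
    """Accept any event name, but only attributes from the registry."""
    missing = [a for a in REQUIRED_ATTRIBUTES if a not in event]
    if missing:
        raise ValueError(f"missing required attributes: {missing}")
    unknown = sorted(set(event) - set(ATTRIBUTE_REGISTRY))
    if unknown:
        raise UnknownAttributeError(
            f"attributes need a registry entry first: {unknown}"
        )


# A brand-new event type passes, because it only uses registered attributes.
validate_event(
    {"name": "wishlist_opened", "origin": "android", "current_page": "wishlist"}
)
```

The event name is never checked against a list, which is what keeps teams independent; only a new attribute triggers the slower, coordinated path.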

In order to generate an insight, one needs data to generate it from and a way to analyze it. We have just described a scalable way of getting the data, but what about a scalable way to analyze it? Stay tuned for Part Two, where we dive deeper into the problem.

