Structured Logging in Interana

Interana Blog Staff

Interana's purpose is to help people understand how digital products are used in the world. However, Interana can also help people understand how Interana itself is used. It may sound a little meta, but Interana is just as capable of introspection as it is of analyzing the world around it. As engineers at Interana, we use this capability to improve the tool we deliver to our users.

Internally, structured logging is the language we use to reveal the inner workings of Interana and the behavior of people using it. What is structured logging? It's pretty simple: logging that records events that happen in Interana in our preferred data format, JSON. Each line of structured logging records some common fields, like event_class and timestamp. In addition, we log details of important events - user-triggered actions, system events, errors, etc. We can add structured logging at every level of the system, and since the format is JSON, it doesn't matter if we're logging in our Python query engine, our C++ datastore, or any other component of the system. Once we have our structured logs, we import them into Interana (sometimes the same instance the logs were generated from!) so they can be queried and visualized.
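To make the idea concrete, here is a minimal Python sketch of a logger that writes one JSON object per line. It is an illustration only, not Interana's actual logging code: the helper names format_event and log_event, the example fields (user, query_type, duration_ms), and the choice of epoch milliseconds for timestamp are all assumptions. Only the event_class and timestamp field names come from the description above.

```python
import json
import sys
import time


def format_event(event_class, **details):
    """Return one event serialized as a single line of JSON (hypothetical helper)."""
    record = {
        "event_class": event_class,            # common field on every line
        "timestamp": int(time.time() * 1000),  # assumed unit: epoch milliseconds
    }
    record.update(details)                     # event-specific details
    return json.dumps(record, sort_keys=True)


def log_event(event_class, stream=None, **details):
    """Write the event to a stream, one JSON object per line."""
    stream = stream if stream is not None else sys.stdout
    stream.write(format_event(event_class, **details) + "\n")


if __name__ == "__main__":
    # Field names below are made up for the example.
    log_event("query_run", user="alice", query_type="time_series", duration_ms=412)
    log_event("error", user="alice", message="query timed out")
```

Because each line is self-contained JSON, a component written in another language only needs to emit the same shape of line for its events to end up in the same dataset.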

So what can we learn from structured logging? Well, we can learn how Interana users are interacting with the system. For example, we can tell what time of day users run the most queries. We can see how many errors users experience as they use the system. We can ask more detailed questions, like what types of queries users are adding to dashboards, or what percentage of queries span more than 7 days. These insights help our team understand where to focus development efforts based on how users are interacting with the system.
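In Interana these questions are answered by querying the imported logs in the product itself. As a rough stand-in, the Python sketch below asks two of them against a handful of made-up log lines: which hour of the day sees the most queries, and what percentage of queries span more than 7 days. The sample data, the range_start and range_end fields, and the use of epoch seconds are all assumptions for the example.

```python
import json
from collections import Counter
from datetime import datetime, timezone

SECONDS_PER_DAY = 86400

# Hypothetical log lines; every field except event_class and timestamp is invented.
SAMPLE_LOG = """\
{"event_class": "query_run", "timestamp": 1500000000, "range_start": 1499000000, "range_end": 1500000000}
{"event_class": "query_run", "timestamp": 1500003600, "range_start": 1499900000, "range_end": 1500000000}
{"event_class": "heartbeat", "timestamp": 1500003700}
{"event_class": "query_run", "timestamp": 1500003900, "range_start": 1499950000, "range_end": 1500003900}
"""


def load_events(text):
    """Parse one JSON object per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def queries_by_hour(events):
    """Count query events per hour of day (UTC)."""
    hours = Counter()
    for event in events:
        if event["event_class"] == "query_run":
            when = datetime.fromtimestamp(event["timestamp"], tz=timezone.utc)
            hours[when.hour] += 1
    return hours


def percent_spanning_more_than(events, days):
    """Percentage of queries whose time range is longer than `days` days."""
    queries = [e for e in events if e["event_class"] == "query_run"]
    if not queries:
        return 0.0
    limit = days * SECONDS_PER_DAY
    long_queries = sum(1 for e in queries if e["range_end"] - e["range_start"] > limit)
    return 100.0 * long_queries / len(queries)


if __name__ == "__main__":
    events = load_events(SAMPLE_LOG)
    print("busiest hour (UTC):", queries_by_hour(events).most_common(1))
    print("percent > 7 days:", round(percent_spanning_more_than(events, 7), 1))
```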

The benefits of structured logging don't stop there, though. Since we also log system behaviors, we can draw conclusions about how Interana itself is performing as multiple users interact with the system. For instance, if we see a spike in errors for users, we can cross-reference that with event data for system heartbeats, resource consumption, background processes, and so on. This allows us to track down the source of the problem. With this visibility into the internal workings of Interana, we can make decisions like whether to add capacity to a cluster, or how to change our algorithms to optimize resource usage. And using Interana's living dashboards, we can monitor for unexpected changes in system behavior to identify problems over time.
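The cross-referencing step can be sketched in a few lines of Python, again under assumptions: the event class names (error, heartbeat, resource_usage), the memory_pct field, the one-minute bucket size, and the spike threshold are all hypothetical, and in practice this kind of analysis would be done with queries and dashboards rather than a script.

```python
from collections import defaultdict

BUCKET_SECONDS = 60  # assumed granularity for spotting spikes


def bucket(timestamp):
    """Round a timestamp (epoch seconds) down to the start of its bucket."""
    return timestamp - timestamp % BUCKET_SECONDS


def find_error_spikes(events, threshold):
    """Return the buckets containing at least `threshold` error events."""
    counts = defaultdict(int)
    for event in events:
        if event["event_class"] == "error":
            counts[bucket(event["timestamp"])] += 1
    return sorted(b for b, n in counts.items() if n >= threshold)


def system_events_during(events, spike_buckets, system_classes):
    """Return system events that fall in the same buckets as the error spikes."""
    spikes = set(spike_buckets)
    return [
        e for e in events
        if e["event_class"] in system_classes and bucket(e["timestamp"]) in spikes
    ]


if __name__ == "__main__":
    # Hypothetical events; timestamps are small numbers to keep the example readable.
    events = [
        {"event_class": "error", "timestamp": 10},
        {"event_class": "heartbeat", "timestamp": 30},
        {"event_class": "error", "timestamp": 120},
        {"event_class": "resource_usage", "timestamp": 125, "memory_pct": 97},
        {"event_class": "error", "timestamp": 130},
        {"event_class": "error", "timestamp": 150},
    ]
    spikes = find_error_spikes(events, threshold=3)
    print(system_events_during(events, spikes, {"heartbeat", "resource_usage"}))
```

In this made-up data, the only bucket with three or more errors also contains a resource_usage event reporting high memory use, which is the kind of lead that would prompt a closer look at capacity.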

Structured logging is a great way for us to analyze the performance of our system and identify areas to improve. But it also reveals the breadth and flexibility of Interana. Billions of people interact with complex systems on a daily basis, and more and more of those interactions are recorded, ready for analysis. Interana allows users to analyze those interactions between people and systems, to understand what's really going on. In enabling that understanding, Interana is essential to how we do our jobs as engineers, and to how hundreds of our users do theirs.
