Tag, You're IT! A PM's Guide to Getting Event Data

Interana Blog Staff · October 19, 2017

Many product managers in digital businesses today have mostly used SaaS web and mobile analytics products that combine proprietary client-side tagging with prescriptive analytics: think the Google Analytics, Adobes, and Mixpanels of this world. These products rely on "client-side tagging," meaning they report on data they generate themselves via JavaScript tags that they ask developers to add to web pages or mobile apps. Those tags transmit data on what users do, through their apps or browsers, over the Internet to the analytics service. This is one form of event data: data with a timestamp that represents what an actor does at that moment in time, whether it's a swipe, a song play, a document edit, or a login.
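To make that concrete, here is a rough sketch of the kind of event record a client-side tag might transmit. The field names are purely illustrative, not any particular vendor's schema.

```python
import json
from datetime import datetime, timezone

# Illustrative only: field names are hypothetical, not a specific vendor's schema.
event = {
    "timestamp": datetime.now(timezone.utc).isoformat(),   # when the action happened
    "actor_id": "user_8675309",                            # who did it
    "event_type": "song_play",                             # what they did
    "properties": {"song_id": "abc123", "device": "mobile_web"},
}

print(json.dumps(event))  # a tag sends a payload along these lines to the analytics service
```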

What many PMs don't know is that there are a lot of other useful sources of event data in their organizations. These other sources can provide some of the same information as well as different dimensions or events that are not available to client-side tagging. This post is intended to help product managers understand what other sources of event data may be out there to analyze with Interana.


Server-side log files

Most applications write server-side log files. That is, the application server responding to a user's web request or mobile app action writes a record to a log file with details about the request. The best-known examples are web server log events that track each HTTP request served by a web server such as Apache, Microsoft IIS, or nginx.

These server-side log events may mirror, one for one, what you could log client-side via tags. Or the backend application server that the web server talks to may log even richer information, including details of a transaction that are never seen by the browser or the web server. For example, how a media service's content management service operates behind the scenes, or how a notification service sends SMS alerts to subscribers, is often not visible to client-side logging. B2B SaaS companies in particular often have metadata that can be captured in server-side logging but can't be gleaned from the client. You need these other sources to get a complete picture of a user's or topic's behavior.

If you can get a sample of these logs, or ideally access to a repository of them, you can bring them into Interana directly. What gets written into a server-side log file depends entirely on what the developer has chosen to log, and potentially on what level of logging an administrator has configured. Depending on the format, the data might need some transformation, but Interana's transformer library can work with a variety of source formats and do some basic cleanup to make the data beautiful. Even if the data is not ideal, analyzing what is there and showing it to developers often helps them see what they should log to make analysis easier.
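As a rough illustration of the kind of transformation involved, the sketch below turns one Apache/nginx "combined" format access log line into a JSON event. The sample line and output field names are made up, and in practice Interana's transformer library, not hand-written code, would typically handle this step.

```python
import json
import re
from datetime import datetime

# A made-up access log line in the common "combined" format.
line = ('203.0.113.7 - frank [10/Oct/2017:13:55:36 -0700] '
        '"GET /playlist/42 HTTP/1.1" 200 2326 '
        '"https://example.com/home" "Mozilla/5.0"')

# Pull out the fields we care about; the rest of the line is ignored.
pattern = re.compile(
    r'(?P<ip>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3})'
)

m = pattern.match(line)
event = {
    "timestamp": datetime.strptime(m.group("time"), "%d/%b/%Y:%H:%M:%S %z").isoformat(),
    "actor": m.group("user"),
    "action": m.group("method"),
    "path": m.group("path"),
    "status": int(m.group("status")),
}
print(json.dumps(event))  # one JSON event per log line
```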

Ask your developer: where do server-side log events end up? Can I get access? A sample?

Log Managers / IT Operations Analytics Tools

Over the last ten years, more and more IT organizations have been centralizing their application logs into log management repositories, also known as IT Operations Analytics tools. These include Splunk, Sumo Logic, Elasticsearch/Logstash/Kibana (aka ELK), and Loggly. IT teams use them to alert on critical operations and security issues and to comply with legal requirements to retain log records for extended periods of time. Yet those same log events are raw records of digital behavior that have value to you as a PM, if only you can ask the right questions. Most of these tools have APIs that can extract a subset of log events based on whatever filter you want and export them in the JSON format that Interana loves. Once the data is in Interana, you get the bird's-eye exploration PMs love on the same log events where IT admins use their tools to find a needle in a haystack.
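If your team runs the ELK stack, for example, an export might look something like the minimal sketch below. The host, index pattern, and the "service" field are assumptions about your environment; the other tools listed above have their own export APIs, but the idea is the same.

```python
import json
import requests

# Assumptions: an Elasticsearch node on localhost and an index pattern "app-logs-*";
# adjust the host, index, and query for your environment.
resp = requests.post(
    "http://localhost:9200/app-logs-*/_search",
    json={"query": {"match": {"service": "checkout"}}, "size": 1000},
)
resp.raise_for_status()

# Write one JSON object per line, a format that is easy to hand off for loading.
with open("checkout_events.json", "w") as out:
    for hit in resp.json()["hits"]["hits"]:
        out.write(json.dumps(hit["_source"]) + "\n")
```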

Ask your IT admin: do we have a log management tool like ELK, Splunk, or Sumo Logic? Can I get access? Can you run an export of log events for my application in JSON format from these tools for me?

Data pipelines

Increasingly, developers are seeing logging as a major interface to their application because of business needs to analyze behavior. They are writing events with better structure designed to be read and analyzed by machines at scale. They are sending those events to data pipelines of various kinds in their cloud or datacenter environment. They are making these pipelines accessible to users and applications across the organization that want to consume these events for different purposes including analytics. The event data sent to these pipelines can be produced by servers or by client tagging. We've been calling this trend "logging with intent."

You might find your organization has one or more data pipelines with valuable data that you can siphon into Interana. It might be Amazon Kinesis, Azure Event Hubs, Apache Kafka, or Confluent if it's more oriented toward developers logging directly from their apps. mParticle and Segment are popular platforms that provide consolidated tag management and data routing: you implement their tags once and route the data to all your analytics platforms and downstream business logic. With data pipelines, analytics is fully decoupled from event data generation, so you can collect once and use many times. Interana works well with popular data pipelines and has how-tos in its documentation on integrating with many of them.
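To give a feel for what "consuming events from a pipeline" means in practice, here is a minimal sketch that pulls a small sample from a Kafka topic using the kafka-python package. The broker address, topic name, and the assumption that messages are JSON-encoded are all illustrative.

```python
import json
from kafka import KafkaConsumer  # pip install kafka-python

# Assumptions: a broker at localhost:9092 and a topic named "product-events"
# whose messages are JSON-encoded. Adjust for your environment.
consumer = KafkaConsumer(
    "product-events",
    bootstrap_servers="localhost:9092",
    auto_offset_reset="earliest",
    consumer_timeout_ms=5000,   # stop iterating after 5s with no new messages
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)

sample = []
for message in consumer:
    sample.append(message.value)
    if len(sample) >= 100:      # grab a small sample to inspect
        break

print(f"collected {len(sample)} events")
```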

Ask your data engineer: do we have Kinesis, Kafka or another event data pipeline or bus? Can I get a sample of events from it? Can you make Interana a consumer of events from it for me?

Ask your developer: can we implement mParticle or Segment JavaScript tagging to generate client log events independently of our analytics tool choice?

Metadata

Beyond what's in the event logs themselves, other business systems and operational datastores often hold metadata that relates to IDs in your log events and adds more context. For example, an event log may reference a customer ID, and your CRM system may have information about that customer's demographics and account. You can often get export dumps of that data directly through familiar UIs like Salesforce.com and Marketo. Supply this data to your Interana administrator to create lookups, so the extra dimensions can be used to filter and summarize your event data.
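Conceptually, a lookup is just a join on a shared ID. The sketch below shows that idea with pandas and two hypothetical files (an event export and a CRM dump with made-up column names); inside Interana, the lookup is configured by an administrator rather than written by hand.

```python
import pandas as pd

# Hypothetical files: an export of events and a CRM dump keyed by customer_id.
events = pd.read_json("events.json", lines=True)   # one JSON event per line
customers = pd.read_csv("crm_export.csv")          # customer_id, plan, segment, ...

# Left-join so every event keeps its row and gains the CRM dimensions.
enriched = events.merge(customers, on="customer_id", how="left")

# Now events can be filtered and summarized by CRM attributes, e.g. plan tier.
print(enriched.groupby("plan")["event_type"].count())
```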

Data Blending

One of the best parts of using a flexible analytics solution like Interana is blending all these sources together to get a full picture of the customer journey or the lifecycle of a complex service. Some Interana customers, like Comcast, blend dozens of data sources, combining real-world activities like truck rolls with digital interactions, going well beyond what client tagging alone would provide. Others, like Sonos, take significant advantage of modern data pipelines, which give developers a "self-service" way to drop new self-describing events into the analytic stream. Which components of your service architecture might be logging event data of value to you as a product manager? Talk to their developers and figure out where the data is being logged.
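As a toy illustration of blending, the sketch below stacks events from two hypothetical sources into a single timeline for one customer; in Interana that union happens at query time rather than in a script, and the file and column names here are assumptions.

```python
import pandas as pd

# Hypothetical exports: client-side tag events and field-service (truck roll) events.
web_events = pd.read_json("web_events.json", lines=True)
service_events = pd.read_json("service_visits.json", lines=True)

# Stack both sources into one timeline and sort by timestamp to see the full journey.
journey = pd.concat([web_events, service_events], ignore_index=True)
journey["timestamp"] = pd.to_datetime(journey["timestamp"])
journey = journey.sort_values("timestamp")

print(journey[journey["customer_id"] == "cust_42"].head(20))
```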

I hope this guide has been a useful primer from one PM to another on getting data beyond simple single-destination tagging. Let me know what you find in your org! cfrln@interana.com.



