In the startup analytics world, I’m sure you’re familiar with the following situation: you’ve got billing systems, product databases, event tracking, surveys, churn prevention software, sales, customer service interactions, and countless other systems. SOMETHING that you’re doing is getting these customers to pay, and to become loyal, but what is it? This is where you need to start using advanced analytics and potentially a legitimate data warehouse (but that doesn’t need to be scary!).
I’m realizing that even after just a few weeks of consulting, I’m answering the same few questions multiple times for different people at different companies. Specifically, these are companies that have just gotten past initial traction (generously, $750k or $1M in annual revenue). Oftentimes, a company of this size has put some small amount of effort into data, but more often than not, that effort is incomplete. At smaller sizes, you may not really have a big enough sample size to justify a thorough effort around data: if a 10% difference is just a few people, does it matter if you’re dropping that much data? However, around $1M, you probably have enough steady paying users for a 10% data loss to start being meaningful, and you also probably have enough data sources to pose questions that you know you could answer, if only all the data sources talked to each other.
Philosophically, I think that the lean startup methodology can work at the expense of analytics. When something in the product is broken, you’re not going to push it to production, but how many startups will block a push because the tracking isn’t quite right? They must exist, but I haven’t worked at any. For this reason, I always want to start with the data source that I trust the most: the product database. In SaaS, there will almost always be a lot of valuable information stored in a database somewhere. Queried effectively, this database is likely to offer a lot of insight into what core actions a user is taking on your site. You may lose some granularity around where or how a user accomplished a task (depending on how complicated your app is), but anything that gets stored in the product database is inherently high-impact. For the conference call company, it’s users, phone calls, team members; for the online education company it’s courses taken, courses completed; for the form digitization company it’s forms converted, submissions made, active users, etc.
Not only is it the most likely to be accurate, but if it’s broken or has missing data, the developers will probably notice, and be willing to fix it, no matter who noticed. Occasionally, you’ll be the one to find a bug, but it’ll be a high priority fix. When I do an analytics implementation or deep dive, this is where I look to start.
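To make the “core actions” idea concrete, here is a minimal sketch of the kind of first query I mean, using the conference call example. Everything about the schema is an assumption for illustration: the `users` and `calls` tables, their columns, and the sample rows are hypothetical, and I’m using an in-memory SQLite database only so the example runs on its own. Your product database will look different.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory stand-in for a product database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL);
        CREATE TABLE calls (
            id INTEGER PRIMARY KEY,
            user_id INTEGER NOT NULL REFERENCES users(id),
            started_at TEXT NOT NULL
        );
        INSERT INTO users (id, email) VALUES
            (1, 'a@example.com'), (2, 'b@example.com'), (3, 'c@example.com');
        INSERT INTO calls (user_id, started_at) VALUES
            (1, '2024-01-02'), (1, '2024-01-05'), (2, '2024-01-03');
        """
    )
    return conn


def core_actions_per_user(conn):
    """Return (email, call_count) rows, most active users first.

    The LEFT JOIN keeps users who have never made a call, so they show
    up with a count of 0 instead of silently disappearing.
    """
    query = """
        SELECT u.email, COUNT(c.id) AS call_count
        FROM users AS u
        LEFT JOIN calls AS c ON c.user_id = u.id
        GROUP BY u.id, u.email
        ORDER BY call_count DESC, u.email ASC
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for email, call_count in core_actions_per_user(build_demo_db()):
        print(email, call_count)
```

The one design choice worth noting is the LEFT JOIN: users with zero core actions are often the most interesting rows when you’re trying to understand who does and doesn’t stick around.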
Access to a copy of the production databases is a critical place to start in most cases. However, that’s generally a pretty easy step. This data already lives in a database and making an “analytics copy” where your analysts won’t mess up production data is quite straightforward.
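As a rough illustration of what an “analytics copy” buys you, here is a small sketch using Python’s built-in `sqlite3` module and its `Connection.backup` method. This is a stand-in, not a recommendation: in a real stack the copy is more likely to be a read replica or a nightly dump of whatever database you actually run, and the `users` table here is hypothetical.

```python
import sqlite3


def make_analytics_copy(production, analytics):
    """Copy the full contents of the production connection into the analytics one."""
    production.backup(analytics)


# Hypothetical "production" database with a single table.
production = sqlite3.connect(":memory:")
production.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
production.execute("INSERT INTO users (email) VALUES ('a@example.com')")
production.commit()

# Separate database that analysts are free to query and break.
analytics = sqlite3.connect(":memory:")
make_analytics_copy(production, analytics)

# An analyst's mistake in the copy never reaches production.
analytics.execute("DELETE FROM users")
analytics.commit()

production_rows = production.execute("SELECT COUNT(*) FROM users").fetchone()[0]
analytics_rows = analytics.execute("SELECT COUNT(*) FROM users").fetchone()[0]
```

The point is the separation, not the tool: after the accidental delete, the production table still has its row and only the copy is empty.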
What will follow over Part 2 and Part 3 is a case study of an implementation of a new analytics setup that I hope will encourage other companies to do what they can to enable better decision-making from top to bottom.
If you’d enjoy following along with this journey, be sure to sign up for the newsletter so that you’ll know when each part of the story comes out!