The Quality of Web Analytic Data. Or, why you need to be doing High Impact Segmentation

Avinash Kaushik of Intuit had a great post recently on the poor quality of web analytic data. The side story for me was how much more data we have for online marketing, and how much better it is, than for any other measured medium. As we rely more and more on this data, and on its quality, to define audiences and behavior, an important optimization concept emerges. There is also a lesson here about failed web 1.0 thinking and 2.0 simplicity.

Think of any time you’ve compared data sources measuring the same metric. How many times do the numbers match? Never. How often are they within 10%? Probably not too often. Because the quality of the data is questionable, you can’t define, cluster or segment too sharply for personalization and targeting. Data discrepancy dictates that segmentation be done at high source levels to ensure confidence and reduce margin of error.
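To make the margin-of-error point concrete, here is a minimal sketch in Python. It covers only sampling error (using the standard normal approximation for a proportion), not discrepancies between tracking tools, and the segment names and numbers are hypothetical, made up for illustration.

```python
import math


def margin_of_error(conversions, visitors, z=1.96):
    """Approximate margin of error for a conversion rate.

    Uses the normal approximation for a proportion; z=1.96
    corresponds to roughly 95% confidence.
    """
    if visitors <= 0:
        raise ValueError("visitors must be positive")
    rate = conversions / visitors
    return z * math.sqrt(rate * (1 - rate) / visitors)


# Hypothetical segments: same 5% conversion rate, very different sizes.
segments = {
    "all paid search traffic": (5000, 100000),
    "one keyword group": (50, 1000),
    "a thin micro-segment": (5, 100),
}

for name, (conversions, visitors) in segments.items():
    moe = margin_of_error(conversions, visitors)
    print(f"{name}: {conversions / visitors:.3f} +/- {moe:.3f}")
```

The rate is the same in every row, but the uncertainty around it grows as the slice gets thinner. Any tool-to-tool discrepancy comes on top of that.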

No problem here, though. The greatest value in segmentation is found at the higher levels of the food chain. Test where the impact is greatest; that segment will likely be pretty large. An additional benefit is that, with all the traffic at those higher levels, you need a relatively short amount of time to get signal. It is a fast and simple way to test, and it lends itself to continued iteration, forming the backbone of the Relevant Design and Agile Marketing methodologies that I encourage you to think about for your business or clients.
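The "short amount of time to get signal" claim can be roughed out with a standard sample-size formula for comparing two conversion rates. This is a sketch under stated assumptions, not a prescribed method: the function names, the 5% significance and 80% power constants, and the traffic figures are all hypothetical choices for illustration.

```python
import math

Z_ALPHA = 1.96    # two-sided 5% significance (assumed)
Z_BETA = 0.8416   # 80% power (assumed)


def sample_size_per_variant(base_rate, target_rate):
    """Visitors needed per variant to detect base_rate vs target_rate."""
    if base_rate == target_rate:
        raise ValueError("rates must differ")
    variance = base_rate * (1 - base_rate) + target_rate * (1 - target_rate)
    effect = (target_rate - base_rate) ** 2
    return math.ceil((Z_ALPHA + Z_BETA) ** 2 * variance / effect)


def days_to_signal(base_rate, target_rate, daily_visitors, variants=2):
    """Days of traffic needed when daily visitors are split evenly across variants."""
    needed = sample_size_per_variant(base_rate, target_rate) * variants
    return math.ceil(needed / daily_visitors)


# Hypothetical: detecting a lift from 5% to 6% conversion.
for label, daily in [("high-level segment", 20000), ("thin slice", 500)]:
    print(label, days_to_signal(0.05, 0.06, daily), "day(s)")
```

The required sample is the same for both segments; only the traffic differs, so the larger segment reaches it far sooner and leaves time for another iteration.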

So what was the lesson here? Take a look at all those personalization companies from the .com bubble. Where are they now? The first issue is that they had no testing platform. They weren’t agile. So nobody knew how effective or relevant all those great personalization ideas were to users. But really, I think it was that they sliced their ideas and concepts too thin. They focused on delivering personalization at meaningless, non-relevant micro levels (Welcome back Rick!), while larger things, like delivering content that users’ click paths have told you interests them, were simply not possible (beyond Amazon and a few others) with that large, heavy 1.0 enterprise software.

New ways of thinking always encounter resistance. But more and more of the smartest people in optimization are moving toward agile strategies using high impact segmentation. The web is undergoing a revolution in the delivery of content, and experiences will continue to become more and more relevant. Anyone who is optimizing media by segment and testing targeted content delivery has a competitive advantage. For how long is unknown. What I do know is that at some point in the next few years we’ll wonder how we ever delivered relevance without segmentation.


4 responses to “The Quality of Web Analytic Data. Or, why you need to be doing High Impact Segmentation”

  1. Avinash Kaushik

    Jonathan: The beauty of bringing in testing, especially MVT, is that it levels the playing field on so many levels, not just data. My hypothesis is that people get so hung up on data quality because they want to make do-or-die calls on data and they want certainty (in a world where so little of it exists).
    But when you are trying different recipes or relevant content by segmentation you can get data apples to apples (even if slightly rotten apples to slightly rotten apples). That makes it easier for people to digest and accept the results.
    Of course that is just one of the many benefits of having a segmentation and testing strategy.
    Great post.
    PS: I do disagree slightly on the quality of data in other parts of the world. I grew up in the world of ERP and CRM systems and large data warehouses and traditional business intelligence systems. Data quality there could be within one percent, mostly because otherwise the SEC would come calling. But that is a completely different world from the web, which is a unique beast in and of itself. I suppose that is what makes it fun.


  2. Jonathan Mendez

    Thanks for your comment and compliment. Your post from a couple of months ago got me thinking about a great many ideas, but I thought of it again recently as James and I have been discussing the value of segmenting in larger groups and iterating with sub-segments based on the data vs. starting with thin slices. This of course is the same idea we employ for good testing strategy: you would start an LPO test with A/B..N and then move to an MVT based on the results.
    I agree with you on CRM and ERP data. My thoughts were more along the lines of the media data marketers and advertisers use today to base their decisions on.


  3. Thinking Aloud

    Data Quality and Segmentation

    Jonathan Mendez’s Blog: The Quality of Web Analytic Data. Or, why you need to be doing High Impact Segmentation…


  4. Vivian Zhu

    One of the latest painful exercises I went through with our analytical team was trying to compare and match the online results from two major online tracking companies for the same DM-drive-to-Web campaign we implemented for our Fortune 100 company.
    Both tracking tools are used by my client, and as the agency we are entitled and obligated to explain what is going on with our campaign in both data sources. The discrepancies at each step, from initial clicks, to unique visits, to landing page, to entering the sign-up tool, to final conversion, are all higher than 10%, sometimes up to 40%. As good traditional direct marketers, we also have match-back data as our backbone number, which claims 30% more conversions than the web-tracked figure. It makes our client jump up and down and scream: How did they sign up? I want to claim them.
    Yes, it is more fun when it puzzles and challenges us (trying to tie the untie-able is not). It is more fun when we know that there is truth out there that the numbers cannot fully explain, because not all numbers are telling the same story. But exactly in its half-revealing way, it sheds some hope of enlightenment. To know that the unexplainable often poses hope of truth, like an unknown planet hidden in the shadow of the known ones, is what ultimately motivates us to strive for the solution and answer through even smarter design, testing and analysis.
    In my humble example, we convinced our client to use a single source of data as our online web tracking tool, and complement it with offline match-back. But I often wonder: what is the other one trying to tell us, other than that someone carelessly screwed up some tagging?

