Web analytics is now into its second decade, so it is worthwhile to take a broad survey of the state of the field today. My assessment is pessimistic and may not agree with everyone's perceptions. Web analytics practitioners (vendors, analysts and industry evangelists) have, in my opinion, a skewed view of the industry because they tend to meet people who are already "analytics aware." However, I think anyone who stands outside the industry and looks at what most online businesses are doing will share my view.
Web analytics is, in theory, central to doing business online. It should occupy the same position in business that bookkeeping occupies -- no one should conceive of being online without measuring activity any more than a business would spend or earn money without keeping track of it. This is, however, not the case. Web analytics is poorly understood and dominated by misconceptions. This is not helped by the low standard of the industry in general. In view of this, I will first discuss some key misconceptions users have about web analytics, and then provide a short overview of the industry.
Misconceptions
There are four key misconceptions in web analytics: that web analytics systems are objective, accurate, and internally consistent, and that the numbers from one system are comparable with the numbers from a different system.
Objectivity
We tend to assume the numbers we read in web analytics reports reflect visitor behavior. For example, if the reports say that a visitor spent five minutes reading a page, we assume a real human being spent five minutes reading that page. In fact, we don't know that at all. All we know is that a computer requested the page at one moment in time, and that same computer requested another page five minutes later. With some analytics systems, we don't even know if it's the same computer, only that it's the same IP address. We make many assumptions about the nature of what is going on at the other end of that IP address, but we don't really know.
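To make the inference concrete, here is a minimal Python sketch of how "time on page" can be derived from nothing more than request timestamps. The log format, the IP address, and the function name `inferred_time_on_page` are assumptions made for illustration; they do not describe any particular vendor's implementation.

```python
from datetime import datetime

# Hypothetical request log: (ip_address, timestamp, page). Illustrative values only.
requests = [
    ("203.0.113.7", datetime(2024, 1, 1, 12, 0, 0), "/home"),
    ("203.0.113.7", datetime(2024, 1, 1, 12, 5, 0), "/products"),
]

def inferred_time_on_page(log):
    """Return (ip, page, seconds) tuples.

    "seconds" is only the gap until the next request from the same IP.
    It is None for the last request, because nothing follows it.
    """
    # Group requests by IP address, in time order.
    by_ip = {}
    for ip, ts, page in sorted(log, key=lambda r: (r[0], r[1])):
        by_ip.setdefault(ip, []).append((ts, page))

    result = []
    for ip, hits in by_ip.items():
        for i, (ts, page) in enumerate(hits):
            if i + 1 < len(hits):
                seconds = (hits[i + 1][0] - ts).total_seconds()
            else:
                seconds = None  # no later request, so no duration can be inferred
            result.append((ip, page, seconds))
    return result

for row in inferred_time_on_page(requests):
    print(row)
```

The "five minutes" reported for the first page is simply the difference between two timestamps. Nothing in the data says whether anyone was reading during that time, and the last page requested has no measurable duration at all.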
These assumptions and inferences run through every web analytics system. Because we don't know what people are actually doing, we have created a series of arbitrary assumptions about what aspects of visitor behavior our web metrics are describing. For example, a bounce is defined as a one-page visit. We take bounces to indicate people who arrived at a website, but did not find it of interest, and thus left without entering the site. Therefore, the bounce rate is supposed to tell us how many people didn't like the site.
However, when labeling a visit as a bounce, web analytics systems do not take into account how long someone spent reading that one page, or whether they came back again. If somebody spends 10 minutes reading my homepage, can I really say they didn't find the site of interest? On the other hand, if they spend 30 seconds on the site, then come back a few hours later for an hour or two, can I really say the site was of no interest to them? Is it not more reasonable to conclude they found the site of such interest they decided they should return later when they had the time to engage with the site at length? Neither of these scenarios is measurable with today's web analytics systems. So what does the bounce rate really tell us? It's highly likely that a significant proportion of the people who "bounced" did not find the site of interest, but we have no way of knowing what that proportion is relative to the other types of visitors I have detailed above.
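The definition can be sketched in a few lines of Python. The visit records below are hypothetical and the function names are mine. The point is only that a rule based on page count alone cannot tell the three kinds of one-page visit apart.

```python
# Hypothetical visits: (visitor_id, pages_viewed). Values are illustrative only.
visits = [
    ("a", 1),  # read the homepage for ten minutes, then left
    ("b", 1),  # glanced at the site for 30 seconds ...
    ("b", 6),  # ... and returned hours later for a long session
    ("c", 1),  # arrived, lost interest, left
    ("d", 4),
]

def is_bounce(pages_viewed):
    # The definition discussed above: a bounce is a one-page visit.
    # Reading time and return visits play no part in the decision.
    return pages_viewed == 1

def bounce_rate(visit_list):
    if not visit_list:
        return 0.0
    bounces = sum(1 for _, pages in visit_list if is_bounce(pages))
    return bounces / len(visit_list)

print(bounce_rate(visits))
```

Three of the five visits count as bounces, yet by the descriptions in the comments only visitor "c" actually lost interest. The rate by itself gives no way to separate that visitor from the other two.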
It is possible to provide a web analytics system that considers these subtleties. The reason this has not occurred is that there is an assumption on the part of people creating web analytics software that most metrics practitioners are not intellectually capable of, or interested in, systems that provide such levels of precision.