Online advertising is a numbers game -- and that's supposed to be its strength. "Everything is trackable and measurable," the mantra goes.
When it comes to audience measurement, though, not everyone is using the same ruler. That leads to knuckle-whitening discrepancies in which a third-party service may size a publisher's audience at 50 percent less than what its internal numbers show.
What's at stake?
"If you under-represent by 20 or 30 or 40 percent the audience for something, that can have massive implications for media planning and for the perceived value of those properties," Yankee group analyst Daniel Taylor says.
Who's right? That's an argument that may never be settled.
Before we get into the latest salvos in this battle, let's do a quick reprise of the methodologies. There are two basic ways of measuring the number of unique visitors to a website: trying to count each one, the census method, or estimating the number by extrapolating from a carefully selected group, also known as a panel.
The most direct way of taking a census is to use server logs, but it's not the only way. Network-centric services tap into the data provided by ISPs and measure where their traffic goes. Some third-party measurement services rely on dropping cookies to track unique users; others tag each page of a site and then count the tags.
Alternatively, you can gather a bunch of people you believe are representative of the universe of internet users, watch what they do, and extrapolate their behavior to everyone else's. This panel approach follows a statistical model that scientists of all stripes rely on for their research; Arthur Nielsen pioneered its use for measuring television audiences in the 1940s.
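As a rough illustration of the panel approach, here is a minimal Python sketch that projects a site's audience from a weighted panel. The panelist records, the weights and the universe size are invented for illustration; no measurement service's actual formula is shown here.

```python
from dataclasses import dataclass


@dataclass
class Panelist:
    weight: float        # hypothetical weight: relative share of the universe this person represents
    visited_site: bool   # whether the panelist was observed visiting the site


def project_audience(panel, universe_size):
    """Scale the weighted share of visiting panelists up to the full universe."""
    total_weight = sum(p.weight for p in panel)
    if total_weight == 0:
        raise ValueError("panel has no weight")
    visitor_weight = sum(p.weight for p in panel if p.visited_site)
    return universe_size * visitor_weight / total_weight


# Made-up panel: visitors carry 2 of 8 weight units, or 25 percent.
panel = [
    Panelist(1.0, True),
    Panelist(2.0, False),
    Panelist(1.0, True),
    Panelist(4.0, False),
]
estimate = project_audience(panel, universe_size=1_000_000)
print(round(estimate))  # 250000
```

The weights are where a service's proprietary adjustments would come in; the sketch only shows the extrapolation step itself.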
No matter which method an audience measurement service starts with, each uses top-secret formulas to weight the results, measuring them against other databases and known quantities to get what it thinks are the most accurate results.
It's not surprising that the census, or site-centric, method leads to disputes. After all, after 10 years, we still can't agree on what an online impression is. Neither is it shocking that the panel method is under fire.
My lucky numbers
Publishers like the census method because they know exactly where the data comes from. But panel proponents say that census methods inflate the audience count by including visits by spiders and bots, as well as by counting people who delete their cookies as unique every time they come back.
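To make the inflation argument concrete, here is a small Python sketch of a cookie-based census count. The log records, the field names and the list of bot markers are all hypothetical; real services use far more elaborate filtering.

```python
def count_uniques(hits, bot_markers=("bot", "spider", "crawler")):
    """Return (raw, filtered) counts of distinct cookie IDs in a list of hits."""
    raw = {h["cookie_id"] for h in hits}
    # Drop hits whose user agent contains an obvious bot marker (a crude heuristic).
    human = {
        h["cookie_id"]
        for h in hits
        if not any(marker in h["user_agent"].lower() for marker in bot_markers)
    }
    return len(raw), len(human)


# Invented log: two real people, one of whom deleted a cookie, plus one crawler.
hits = [
    {"cookie_id": "a1", "user_agent": "Mozilla/5.0"},
    {"cookie_id": "a2", "user_agent": "Mozilla/5.0"},     # same person as a1, new cookie
    {"cookie_id": "b7", "user_agent": "ExampleBot/1.0"},  # crawler
    {"cookie_id": "c3", "user_agent": "Mozilla/5.0"},
]
raw, filtered = count_uniques(hits)
print(raw, filtered)  # 4 3
```

Filtering removes the crawler, but the count is still three for two actual people: nothing in the log itself reveals that "a1" and "a2" are the same person, which is the cookie deletion problem the panel camp points to.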
Manish Bhatia, president of global services & U.S. sales for Nielsen Online, says server logs inflate traffic counts by 20-50 percent, depending on the site.
The network-centric methodology eliminates the cookie deletion issue, according to Marc Johnson, CMO for Hitwise. "We don't rely on cookies or individuals themselves," he says. Instead, the company's ISP partners install its software in their networks, analyze usage logs and feed the anonymous data to Hitwise. (But just to make things more confusing, Hitwise could be placed in the panel camp, because it doesn't measure all internet traffic, just a sample of it, albeit a very large one.)
Most of the cookie- or tag-based services do some number crunching to account for cookie deletion, and insist this isn't a problem.
Critics say that ISP data may show more unique users than there really are due to dynamic IP addressing. Moreover, it may count foreign site visitors as in-country because they come in on domestic IP addresses.
The census school counters that "professional panelists," people who join panels or respond to surveys for the incentives or because they're bored, skew results. In a 2005 presentation at the Marketing Research Association Conference, comScore chairman Gian Fulgoni said that the internet survey population has been over-fished: 0.25 percent of all internet households account for 30 percent of all surveys taken online, and, on average, a member of each of the leading panels belongs to at least eight other panels.
ComScore says it recruits from a wide variety of sites that other online surveyors don't use. It also says that it has software to screen out these serial panelists, for example by counting how many surveys they fill out, and that its 2 million panelists outweigh the few pros.
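Screening by survey count can be sketched in a few lines of Python. This is only an illustration of the general idea: the response records and the threshold are invented, and comScore's actual screening software is not described in any detail.

```python
from collections import Counter


def flag_serial_panelists(responses, max_surveys):
    """Return the IDs of panelists who completed more than max_surveys surveys.

    responses is a list of (panelist_id, survey_id) pairs; max_surveys is a
    hypothetical cutoff chosen by whoever runs the panel.
    """
    counts = Counter(panelist_id for panelist_id, _survey_id in responses)
    return {pid for pid, n in counts.items() if n > max_surveys}


responses = [
    ("p1", "s1"), ("p1", "s2"), ("p1", "s3"),
    ("p2", "s1"),
    ("p3", "s2"),
]
print(sorted(flag_serial_panelists(responses, max_surveys=2)))  # ['p1']
```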
Nielsen Online guards the purity of its panelists by only doing outbound recruiting. It generates statistically random lists of panelists that are geographically and demographically representative of the population, and then calls them up, inviting them to join.
"If you'd ask, 'How do I get on your panel,' I'd say, 'You can't,'" Bhatia says.
However, Nielsen Online's recruiting relies on the telephone, and, because it's not allowed to call mobile phone numbers, scoffers say its panels may not represent the youngest, most tech-savvy citizens, who don't have land lines.
And so it goes.
Part of this fight could be put down to culture. Companies born on the internet are supremely comfortable with data-crunching, and they tend to sneer at what they deem old-school practices of any kind. In the world of one-to-one, why wouldn't you just count actual users?
Fulgoni replies, "Just because something can be counted doesn't mean it's important. Just because you can count cookies, that doesn't mean it's the right number. It's completely flawed."
On the other hand, those that rely on the panel method may have been a bit smug about their methodology. They point out that the practice of extrapolating from a statistically significant sample has been used for nearly a century. But, "If it was good enough for television, it's good enough for the internet," doesn't play well with web publishers who are proud of inventing a brand new medium.
As well, it's just hard for non-math geeks to understand how a random panel of people can effectively stand in for their own unique selves. The panel guys shrug. "It just works," they say.
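For readers who want more than a shrug, the textbook margin of error for a proportion gives a feel for why sample size matters. The Python sketch below assumes a simple random sample and a site reaching 10 percent of users, both simplifications; real panels are weighted, and the formula says nothing about bias from who agrees to join.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Approximate 95 percent margin of error for a proportion p from n people.

    Assumes a simple random sample, which real panels are not.
    """
    return z * math.sqrt(p * (1 - p) / n)


# 30,000 and 2 million are the panel sizes cited for NetView and comScore;
# 1,000 is included only for comparison.
for n in (1_000, 30_000, 2_000_000):
    print(n, f"{margin_of_error(0.10, n):.4f}")
```

The error shrinks with the square root of the panel size. For a site with a tiny reach, however, even a small absolute error is large relative to the audience being measured.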
The numbers game
The situation is a can of worms for both buyers and sellers, says David Hallerman, a senior analyst for eMarketer.
"From the publishers' point of view, they may see different data from their own servers than a comScore or Nielsen might report," he says. "Since this data is used to help them sell ads, and most of the time agencies don't look at the web publishers' data but want a third party to give it to them, that creates the frustration."
Media agencies and publishers both want to get this right, according to Yankee Group's Taylor, because of all that money on the table. And it's not just about websites; the same measurement issues affect games, video and mobile media. On the one hand, publishers think: "We know we have less ad units, but can't bear losing the extra 5 or 10 percent we're losing because of audience measurement." On the other hand, Taylor says, they're wondering if maybe the cross-platform opportunities are worth a lot more than they're selling them for today.
You could say there was a standoff, or you could say that the industry had tacitly accepted that there was no real number.
Then, in 2006, the Interactive Advertising Bureau made waves with an open letter to comScore and Nielsen on behalf of its internet-centric members. The letter wasn't very nice. It said, "Despite a multiplicity of reported discrepancies in audience measurements, comScore and NRR have resisted numerous requests for audits by the IAB and the Media Rating Council [MRC], some dating back to 1999."
IAB CEO Randall Rothenberg called the panel methodology antiquated, a leftover from the 1930s.
The response from comScore's Fulgoni: "If you don't believe in sampling theory, the next time you go to the doctor and he wants to take some blood, have him take it all."
Nielsen Online's Bhatia is more philosophical, if not more optimistic. "If we get the accreditation, the estimates still will not match the server data, so what have we done?" he asks. "The fundamental issue we have with clients is that our numbers don't match their server data. Certification won't help that."
Nevertheless, both comScore and Nielsen have submitted to the MRC audits, with results expected to be published in 2009. Meanwhile, the IAB formed an audience measurement working group and launched another annual conference.
David Doty, SVP of marketing for the IAB, says, "We believe everybody should be audited on all sides of this question for technology and process."
A solipsistic approach
So now, we have research on the research. In addition to the audits being conducted by the MRC, the Advertising Research Foundation (ARF) is working on standards and guidelines for research suppliers and buyers. In June, its Online Research Quality Council sent out an RFP for a Foundations of Quality Initiative, a major study of online market research. It will look at such things as whether there's a correlation between how fast someone completes a survey and how much they thought about the answers.
ARF hopes to get 10 to 20 major panel research companies to contribute data that will be combed through by a consultant. Bill Cook, senior vice president of research and standards for ARF, says there was such intense interest at the council's first meeting that it had to be moved to a larger venue.
"It was quite a phenomenal meeting, and it indicated that a whole lot of people both on the provider and client sides are invested in understanding this," Cook says.
Wagging the dog
What's shaking things up again is the long tail. With bits of consumer-generated content dotting the internet like fleas on the old hound dog of "big media," agency folks are scratching their heads about how to keep track.
According to the head of marketing for a large internet company who asked that his name not be used in this article, "If you have a big mass product, going after a long-tail site is a waste of your time and efforts. For niche products that reach a specific audience, that longer tail is extremely important. Smaller brands are having the most difficulty, and that's where panel data is completely failing."
Dave Smith, founder of Mediasmith, agrees. "For the smaller and more targeted sites, NetRatings and comScore just don't cut it."
This leads to two newcomers: usurper Quantcast and Google, the King Kong of search, now angling for the lead role in "The Future of Madison Ave."
Census with a twist
At the user interface, Quantcast Media Planner and Google Ad Planner seem similar. Both allow media buyers to, in essence, search for websites that draw the kind of audience they want. On both, you can select the demographic elements you need and also search for sites whose audiences are similar to those of websites you already know work for your campaign.
Quantcast focuses on helping publishers understand and monetize their inventory while providing detailed information to media buyers. It uses a hybrid approach to measuring and identifying audiences. It claims it measures 10 million online media assets, including websites, blogs, games, widgets and videos; its audience modeling is based on 700 million global internet users, 200 million of whom are in the U.S.
Key to this reach is its free Quantified service. Any publisher can place an invisible tag on web pages, games, videos and widgets that lets Quantcast directly measure their traffic. In return, participating publishers gain access to the aggregated traffic data. The company combines this raw traffic information with a variety of other sources to infer more information about anonymous users as they travel from site to site.
For example, it uses IP mapping and business classification data to make inferences about the demographic attributes of a user visiting a tagged site, and mixes in some panel data from outside sources.
"We're now seeing billions of media consumption events a day across millions of unique destinations," says Adam Gerber, CMO of Quantcast. "That aggregated data set gives us ways to dimensionalize the audience."
Quantcast is also in the process of seeking accreditation from the MRC.
Brett Crosby, senior manager for Google Analytics, says Ad Planner is designed to work with the rest of Google's ad tools.
"With Ad Planner in place, you can not only go into Analytics to see what's working well, you can go into Ad Planner and look for sites like those, as well as compare how much traffic you're getting to their traffic," he says. "Then, analytics lets you see how effective your ads were."
Google probably uses an approach similar to Quantcast's -- the company refuses to disclose its recipe. A Google spokesman says Ad Planner uses census data, and the company is working with partners to mix in other kinds of data. This lack of clarity troubles some media folk; one concern is that a universe limited to sites running Google ads may not be enough to give an accurate picture.
"Google is not forthcoming about the basis for their numbers, and in research, that's a standard," Smith says, noting that Ad Planner appears to be getting some data from Nielsen. "We do know there are some numbers that are really weird. It's a good front end for trying to understand where you might buy Google long-tail display ads, but it still feels very much like somebody's Friday project."
There may be a perception in the industry that comScore and Nielsen have a lock on audience measurement. And, in fact, according to a recent interactive marketing survey of members of the Chicago Interactive Marketing Association by William Blair, they kind of do.
If you can extrapolate from that group, 54 percent of media and publishing professionals named comScore as their preferred data source, and another 34 percent preferred Nielsen.
But media buyers know there are more than two sides to this argument. Andy Fisher, director of analytics for Avenue A | Razorfish, speaks for the majority when he says, "We want to use the best toolset to drive results for clients. That usually consists of a set of platforms and analytics practices."
According to the IAB's Doty -- now making nice with both sides of the fence -- "The question really is how do panel data and census data work together to help marketers as all advertising goes digital? How are those pieces of information going to be put together by advertisers so they can understand what their meaning is?"
In fact, as you can see from the Audience Measurement Scorecard (see below), most services are converging on the hybrid methodology. Even Nielsen, the forefather of audience panels, has begun to integrate census and panel information. Its Video Census product overlays server counts with panel-based methodologies, and NetView in Italy and France also uses this approach. Eventually, Nielsen hopes to roll it out globally for NetView.
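One plausible way to picture a hybrid calculation is to let a panel supply correction factors for a census count. The Python sketch below is an assumption-laden illustration, not Nielsen's or any other service's method: the function, its parameters and the sample figures are all hypothetical.

```python
def hybrid_unique_visitors(census_cookies, panel_cookies_per_person, panel_human_share=1.0):
    """Adjust a census cookie count using two panel-derived factors.

    census_cookies: distinct cookies counted on the site (census side).
    panel_cookies_per_person: average cookies per real person, as observed in a panel.
    panel_human_share: share of counted traffic judged to come from real people.
    """
    if panel_cookies_per_person < 1:
        raise ValueError("a person cannot hold fewer than one cookie on average")
    if not 0 <= panel_human_share <= 1:
        raise ValueError("panel_human_share must be between 0 and 1")
    return census_cookies * panel_human_share / panel_cookies_per_person


# Invented figures: 1.5 million cookies, 1.25 cookies per person, 90 percent human traffic.
print(round(hybrid_unique_visitors(1_500_000, 1.25, 0.9)))  # 1080000
```

The census supplies the scale, and the panel supplies what a server log cannot see, namely how many cookies map to one person.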
"We're hoping to get rid of that debate about my numbers versus your numbers," Nielsen Online's Bhatia says.
The sticking point, according to Bhatia, is the major web publishers. "With the panel method, we don't need cooperation from the website," he says. "To implement [the hybrid methodology], they need to enable us to collect the data."
Nielsen has done a pilot with Nickelodeon and a few other big clients, but some customers want to get comparable numbers from their competitors.
"We have clients who really want to move to this model, but we need critical mass," Bhatia says. "If CNN wants to do it, but MSNBC doesn't, that's an impediment."
Despite methodological convergence, there's no way that everyone's audience measurements -- or even those of any two services -- are going to match. Still, in this numbers game, the big winners will always be the measurement companies, because only they can provide any sense of credibility.
Audience Measurement Scorecard
Product: comScore Media Metrix
Methodology: Panel
Continuously measures internet usage of a global panel of more than 2 million people; the majority opts in online, with a control group randomly recruited via offline recruitment methods. Panelists download software that passively records internet activity; monthly offline surveys correct and add demographic information. The service also integrates offline data from a variety of sources and attitudinal data gathered through consumer surveys.
Product: Google Ad Planner
Methodology: Not fully disclosed; combines census and search data with other sources
A research and planning tool designed for media buyers, it lets them see traffic and demographics for specific sites, as well as find more sites like the ones that have performed well.
Product: Hitwise Competitive Intelligence
Methodology: Hybrid network-centric/panel
Hitwise combines traffic data from ISPs on 7.5 million users with data from an opt-in panel of 2.5 million and data from third-party sources to provide a weighted analysis of daily web traffic to 1 million websites.
Product: Nielsen Online NetView
Methodology: Panel
NetView extrapolates audience size based on a 30,000-person panel of people using the internet at home and work, recruited via random-digit dialing. Audience metrics include unique audience, active reach, web page views and average time per person, as well as demographic information. Nielsen Online also offers a census-based audience measurement product, SiteCensus, and has begun implementing a hybrid census/panel approach in international markets and in the U.S. for video measurement; it is waiting for a critical mass of publisher participation before a broader U.S. launch.
Product: Quantcast Media Planner
Methodology: Hybrid census/panel
Combines directly measured cookies from participating publishers with panel-based audience measurements and other third-party data sources to deliver metrics and profiles on digital media properties. Advertisers can view detailed audience reports on millions of websites, and search for sites that match their target audiences.
Susan Kuchinskas is a freelance writer who has written for Adweek, Business 2.0, M-Business and internetnews.com.