WEB ANALYTICS
Published: September 29, 2005
Tracking the Techie
 


Tactics

Web metrics systems typically identify a unique individual by the combination of user agent and IP address, with the option of examining cookies as well. The anti-metrics community has techniques to defeat each of these.
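
To make the identification rule concrete, here is a minimal sketch in Python. It assumes a simple fallback scheme: use the cookie ID when the browser returns one, otherwise hash the IP address and user agent together. The function names, the hashing step and the key format are illustrative assumptions, not how any particular metrics product works.

```python
import hashlib
from typing import Iterable, Optional, Tuple


def visitor_key(ip: str, user_agent: str, cookie_id: Optional[str] = None) -> str:
    """Return an identifier for a visitor.

    Hypothetical rule: trust the cookie ID when one is present,
    otherwise fall back to a hash of IP address plus user agent.
    """
    if cookie_id:
        return "cookie:" + cookie_id
    digest = hashlib.sha256(f"{ip}|{user_agent}".encode("utf-8")).hexdigest()
    return "ipua:" + digest[:16]


def count_unique_visitors(hits: Iterable[Tuple[str, str, Optional[str]]]) -> int:
    """Count distinct visitor keys in (ip, user_agent, cookie_id) hits."""
    return len({visitor_key(ip, ua, cookie) for ip, ua, cookie in hits})
```

Under this rule, one person who changes browser or IP address mid-visit without returning a cookie is counted as two visitors, which is exactly what the tactics below exploit.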

Cookies

We all know people block or delete cookies. Around half of average users delete some cookies once a month. However, techies do it much more often:

"I will block and/or delete cookies daily, disable java, javascript, flash, and other annoyances. There are plenty of others who do the same, and your lack of respect for our privacy will drive us away from your sites."

Deleting a cookie after a visit simply throws off your repeat visitor count. Other tactics can distort your stats on a more profound level. Some people remap their browser to send cookies to non-existent directories. Others have systems that delete cookies as fast as they're set. Either tactic leads your system to set a new cookie on each page, so this one person looks like a new visitor on every page.
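
The per-page effect is easy to see in a small simulation. The sketch below assumes a server that hands out a fresh visitor ID whenever a request arrives without a cookie; `simulate_visit` and its ID generator are hypothetical names used only for illustration.

```python
import itertools


def simulate_visit(pages: int, keeps_cookies: bool) -> int:
    """Return how many distinct visitor IDs a server records for one person."""
    next_id = itertools.count(1)  # hypothetical server-side ID generator
    browser_cookie = None         # the cookie value the browser sends back
    seen_ids = set()
    for _ in range(pages):
        if browser_cookie is None:
            # No cookie came back, so the server sets a new one.
            browser_cookie = next(next_id)
        seen_ids.add(browser_cookie)
        if not keeps_cookies:
            # Cookie is deleted (or remapped to a dead path) before the next page.
            browser_cookie = None
    return len(seen_ids)
```

With these assumptions, `simulate_visit(5, keeps_cookies=True)` returns 1, while `simulate_visit(5, keeps_cookies=False)` returns 5: a single five-page visit is reported as five unique visitors.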

Session cookies also get blocked. I researched this myself last year and found that the level of session cookie blocking is related to the visitor's operating system: less than one percent for Mac users, two to three percent for Windows users, and 10 percent for Unix users.
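
To see what these rates could mean for a given site, here is a rough back-of-the-envelope sketch. The blocking rates follow the figures above, using the upper bound for Mac and the midpoint for Windows as assumptions; the audience mix is entirely hypothetical.

```python
# Blocking rates based on the figures quoted in the text. Where the text
# gives a range or an upper bound, the value chosen here is an assumption.
BLOCK_RATE = {
    "mac": 0.01,       # "less than one percent": upper bound used
    "windows": 0.025,  # midpoint of "two to three percent"
    "unix": 0.10,      # "10 percent"
}


def estimated_blockers(visitors_by_os):
    """Estimate how many visitors block session cookies, per OS and in total."""
    per_os = {
        os_name: count * BLOCK_RATE[os_name]
        for os_name, count in visitors_by_os.items()
    }
    return per_os, sum(per_os.values())


# Hypothetical audience of 10,000 visitors, for illustration only.
per_os, total = estimated_blockers({"mac": 500, "windows": 9000, "unix": 500})
```

For this made-up mix the estimate comes to about 280 blockers, or roughly 2.8 percent of visitors. A site with a heavier Unix audience would see a noticeably higher share.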

Anti-cookie behavior is going to become more common and more sophisticated:

"I'm developing a browser plugin to collect multiple cookies and rotate amongst them, thus making one user appear to be a number of users."

"There have been discussions about browser plugins that use a p2p method of cookie sharing, akin to grocery store loyalty card sharing, to subvert this process."

Browsers

The anti-metrics community uses multiple browsers to defeat tracking. They'll switch browsers in the middle of a single visit, and access web pages from different systems.

"I use several browsers, including text browser (elinks, lynx) and command line tools (curl, wget)."

IP

I've come to believe the visitor's IP address is the least reliable element in identification.

"I'm a typical corporate user. I use several proxies (whichever is fastest, we have one or more per country. Sometimes the UK proxy is less busy than the Dutch one). Then, I visit the same page from home, sometimes using the corporate VPN."

"If my company had computers in New York and Tokyo, I could ssh between them in much less than 60 minutes. . ."

"...anonymizing onion routers, such as TOR, make one user appear to source from a large number of globally distributed IPs."

Blocking systems

Most of the tactics I've described are behavioral -- people have to do something to prevent being tracked. Doing these things requires skill and effort. There are other systems that will do this work for you.

The easiest to use is some form of anonymous surfing. This is an industry in itself, and there are plenty of sites that will let you put a URL into a form and surf that site from within theirs.

Other systems can be installed on your machine, running many of the techniques I've described automatically.

Blocking systems don't require any technical know-how; they are available to everyone.

What will the future bring?

It's clear this is going to get more intense. Anti-tracking technologies are going to spread. Many marketing people have responded to the cookie deletion issue by talking of educating people about the value of cookies. I think the solution lies on the other side of the fence, with our industry. The problem is caused by a limited number of tracking companies who make a mess of things for the rest of us by quietly deploying tactics that they know the public would find objectionable.

We all know who they are -- the companies who use third-party cookies to profile people surfing between sites, and the marketing companies who buy that data.

This situation will get worse. The next battleground is already upon us: IP traffic analysis. This involves placing tracking technology inside the servers themselves and analyzing IP header data for destination and source information. This is completely undetectable by the browser. The growth of such practices simply increases the level of distrust amongst our users and promotes the development of systems like TOR that randomize IP data.

At this stage, most anti-metrics behavior is concentrated in the techie community, but let's remember that these are the people who write the software and install networks. They are the ones who will build systems that will make it easy for ordinary people to emulate their behavior. They are the people who will advise others on risks and tell them how to behave online.

If we don't deal with this now, it will become a pervasive issue, and in five years we'll know less about our audience than we do today.

If we want our users -- especially our techie users -- to trust us, we must act in a trustworthy fashion.

Today's reality is that the metrics community has never stopped to consider the ethics of what it does, or to censure sharp practice when it sees it. The result is that we are alienating our users and our marketplace. If we want them to give us the data we need, we have to ensure we only use it for purposes they would approve of.

Brandt Dainow is CEO of Think Metrics, creator of the InSite Web reporting system.
