Design for Analysis

If you are interested in web analysis you know you need to measure your online activities. However, just buying log analyzers or plugging some tracking code into your web pages is not enough. There are some additional steps you need to take in the design of your ads and your web pages. If you miss these steps you won’t have enough detail to understand what is happening.


This article explains the basic steps you need to take to ensure you gather data in enough detail to measure and improve. This concept of “design for analysis” originated with Netgenesis, but is now a commonly accepted first principle of web analysis. 


Trackable URLs


The most important components in any tracking structure are your URLs. When you look at visitor behavior, conversion rates or target actions, the atomic unit of your analysis will be the URL. Visitors come from URLs, land on URLs and read URLs. Your target actions are URLs. If you do web analysis, you live and breathe URLs. Your URLs need to be sufficiently clean for analysis and sufficiently granular to provide the detail you need, but not so granular you lose sight of the patterns in the detail.


Clean URLs


Here are two (genuine) URLs for different product groups from the same site. See if you can spot the difference:


servlet/%20strategic%26cm_pla%3Dused%20books_strategic_BR%26cm_item%3Dused%2520book&sa=l&ai=BUlpP2kikQ2kqwBofWvB7DMC&num=5


servlet/%20strategic%26cm_pla%3Dnew%20books_strategic_BR%26cm_item%3Dnew%2520book&sa=l&ai=BUlpP2kikQ3llpBofWvB7DMC&num=5


In actual fact, I’ve edited these down to half their length; the originals were much worse. If you read closely you can see that the first is for used books and the second for new books.


These URLs are extremely difficult to read. If you’re presented with a table of landing pages made up of entries like these, it is almost impossible to make sense of it. While I recognize that some aspects of a page name are determined by your content management system, there is always some room for movement. In the example above “%20” represents a space. Spaces are not allowed in URLs, so they have to be translated into “escape sequences” which represent them by their character code numbers. This means a designer created components in their site called “used books” and “new books.” Since these will always be converted into “used%20books” and “new%20books,” even for visitors to the site, this helped no one. The words should have been merged together as “usedBooks” or “used_books.”


The lesson is: Don’t use spaces in your names.
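
The escaping is easy to reproduce. The following is a minimal JavaScript sketch, not code from the site in question; toCleanName is a hypothetical helper name used only for illustration.

```javascript
// A space in a name is always percent-encoded when it appears in a URL.
const encoded = encodeURIComponent("used books"); // "used%20books"

// Encoding an already-encoded name escapes the "%" itself (as "%25"),
// which is how fragments such as "used%2520book" arise.
const doubleEncoded = encodeURIComponent(encoded); // "used%2520books"

// Hypothetical helper: replace runs of whitespace with underscores
// so the name needs no escaping at all.
function toCleanName(name) {
  return name.trim().replace(/\s+/g, "_");
}

console.log(encoded, doubleEncoded, toCleanName("used books"));
```

A name produced this way passes through encoding unchanged, so it reads the same in a report as it does in the content management system.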


Here’s another example:


categoryID=211&prdctID=487


Any idea what the product we’re examining is? You may argue that someone working on this site will know what the numbers mean, but the numbers here suggest hundreds of categories with hundreds of products in each. In actual fact, this site is an electrical components wholesaler with literally tens of thousands of products online, and no one can remember what the numbers mean. If I want to analyze pages in this site, I need a code book as a reference while I try to look for patterns.


Some content management systems demand numbers, but not all, so use meaningful words instead of numbers as much as possible.
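
Where numbers cannot be avoided, the code book has to be applied to every report. A rough JavaScript sketch of that lookup follows. The category and product names are invented for illustration, since the real meanings of 211 and 487 are not known, and describeQuery is a hypothetical function name.

```javascript
// Hypothetical code book: these names are invented for illustration only.
const codeBook = {
  categories: { "211": "resistors" },
  products: { "487": "10k-ohm-carbon-film" },
};

// Translate an ID-based query string into a readable label for reports.
function describeQuery(query, book) {
  const params = new URLSearchParams(query);
  const category = book.categories[params.get("categoryID")] || "unknown-category";
  const product = book.products[params.get("prdctID")] || "unknown-product";
  return category + "/" + product;
}

console.log(describeQuery("categoryID=211&prdctID=487", codeBook));
```

Every ID missing from the code book ends up as "unknown" in the report, which is the maintenance burden that meaningful names avoid.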


Incoming links


Much of your analysis involves looking at traffic sources. You often need to understand behavior by source. In addition, if you’re buying visitors, you need to know what you’re getting for that spend.


Tracking incoming traffic requires that the link into your site contain a specific component that you can attribute to that activity. This is especially important with regard to search advertising. If you take nothing else from this article, at least do this. A typical tracking parameter I use involves adding “?src=gaw” to links into client sites, so the link in the ad reads something like: http://www.mysite.com?src=gaw


The ?src=gaw has no functional effect on the website; it’s merely there for tracking purposes.
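
If you generate ad links programmatically, the parameter can be appended with the standard URL API. This is a small JavaScript sketch; addTrackingParam is a hypothetical helper, and the parameter name src simply follows the convention described above.

```javascript
// Append (or overwrite) the "src" tracking parameter on a link.
function addTrackingParam(link, source) {
  const url = new URL(link);
  url.searchParams.set("src", source); // keeps any existing parameters
  return url.toString();
}

// The URL API normalizes the empty path to "/", so this prints
// "http://www.mysite.com/?src=gaw".
console.log(addTrackingParam("http://www.mysite.com", "gaw"));
```

Using the URL API rather than string concatenation means a link that already carries a query string gets "&src=gaw" instead of a second question mark.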


If you are running Google ads, you will be getting traffic from those ads in Google, and in the sites that take syndicated Google ads. If you see visitors coming from www.foobarpop.com with ?src=gaw in the request, you know foobarpop.com is running your Google AdWords ads. Without that parameter you may think the site simply has a link to you.


This also means you can aggregate all the visitors, from any source, who enter with ?src=gaw, so you can get an accurate understanding of how the ad is performing across all sources.
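
As an illustration of this aggregation, here is a JavaScript sketch that counts tagged visits per referring site. The visit records (objects with referrer and landingUrl fields) are an assumed format; your log analyzer or tracking system will have its own.

```javascript
// Count visits that entered with the given src value, grouped by referring host.
// Each visit is assumed to look like { referrer: "...", landingUrl: "..." }.
function countAdVisitsByReferrer(visits, source) {
  const counts = {};
  for (const visit of visits) {
    const landing = new URL(visit.landingUrl);
    if (landing.searchParams.get("src") !== source) continue; // not from this ad
    const host = visit.referrer ? new URL(visit.referrer).hostname : "(none)";
    counts[host] = (counts[host] || 0) + 1;
  }
  return counts;
}

// Sample data, invented for illustration.
const sampleVisits = [
  { referrer: "http://www.google.com/search?q=books", landingUrl: "http://www.mysite.com/?src=gaw" },
  { referrer: "http://www.foobarpop.com/reviews", landingUrl: "http://www.mysite.com/?src=gaw" },
  { referrer: "http://www.foobarpop.com/links", landingUrl: "http://www.mysite.com/" },
];

console.log(countAdVisitsByReferrer(sampleVisits, "gaw"));
```

Summing the counts gives the total for the ad, while the per-host breakdown shows which syndication partners are actually delivering the visitors.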


As well as using tracking parameters to aggregate, you can use them to separate. If you are listed in the search engines -- and also doing search ads -- a tracking parameter is needed to separate visitors-from-the-listings from visitors-from-the-ads. If someone comes from MSN, was it the native listing or the Overture ad?
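
The separation itself is a one-line test on the landing URL. In this JavaScript sketch, classifySearchVisit is a hypothetical function, and the list of ad codes ("gaw" for Google AdWords, "ov" for Overture) is an assumption about how you have tagged your own ads.

```javascript
// Decide whether a visit from a search engine came via a paid ad or the
// native listing, based solely on the tracking parameter in the landing URL.
function classifySearchVisit(landingUrl, adSources) {
  const src = new URL(landingUrl).searchParams.get("src");
  return src !== null && adSources.includes(src) ? "ad" : "listing";
}

const adSources = ["gaw", "ov"]; // assumed codes for your own campaigns

console.log(classifySearchVisit("http://www.mysite.com/?src=ov", adSources)); // "ad"
console.log(classifySearchVisit("http://www.mysite.com/", adSources)); // "listing"
```

This only works if every ad carries a parameter; an untagged ad is indistinguishable from the native listing.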


In preparation for this article, I ran a quick test to see how much ad tracking I could see. I ran searches in Google then examined the ads and landing pages to see how many tracking URLs I could see. For “buy books online,” four out of eight were tracking their ads. This is a mature and competitive industry, yet some extremely large online merchants were not measuring the effect of their spend. “Mortgage online” is one of the most expensive phrases you can bid for. Because the value of a mortgage over its lifetime is so high, mortgage providers can spend extremely large sums to buy visitors. However, only one out of the eight listings had any form of measurement. By now I wanted to find a sector where this was done properly, and we all know where the money is on the web. However, a search for “sex online” revealed that even here only five out of eight sites were tracking their ads.


If you want to do any form of segmentation, I recommend multiple tracking parameters in your ads. Use combinations that represent market segments and sources. For example, we could use “ub” for used books and “nb” for new books with “go” for Google and “ov” for Overture. If someone entered our site with a parameter of “?src=ubgo” we know they came from a Google AdWords ad for used books. Segmentation should go down to the level you wish to respond to. There’s no point separating two sets of visitors from each other if you’re going to funnel them through the same pages in exactly the same way.
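
A combined code like this is only useful if your reports can split it apart again. The JavaScript sketch below assumes a fixed layout of two letters for the segment followed by two letters for the source; that layout, and the parseSrcCode name, are assumptions made for the example.

```javascript
// Lookup tables taken from the example codes in the text.
const SEGMENTS = { ub: "used books", nb: "new books" };
const SOURCES = { go: "Google", ov: "Overture" };

// Split a four-letter src code into its segment and source.
// Returns null for anything that does not match the assumed layout.
function parseSrcCode(code) {
  if (typeof code !== "string" || code.length !== 4) return null;
  const segment = SEGMENTS[code.slice(0, 2)];
  const source = SOURCES[code.slice(2)];
  return segment && source ? { segment, source } : null;
}

console.log(parseSrcCode("ubgo")); // { segment: 'used books', source: 'Google' }
```

Fixed-width codes keep the parsing trivial; if you expect many segments, separate parameters for segment and source may be easier to maintain.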


Direct email campaigns and email newsletters should also contain tracking links so that when people click on the links you can identify them. Email doesn’t report itself to the website as a referring source the way websites do, so if you fail to do this you will be unable to separate email-sourced visitors from people who come to your site by typing in your URL directly.


Outgoing links


If you link to other people’s sites, and you want to track that outgoing traffic, you have two choices. The sophisticated option is to have some code in each outgoing link that records the click. This is great if your system can do this, and if the people creating these links remember to add the correct codes.


The cheaper alternative is to have the outgoing link go to a redirect page on your site which then sends the visitor to the external site. This redirect page view needs to get recorded by your system, so the way it redirects visitors needs to match your recording mechanism. Servers can be configured to redirect visitors when a specific page is requested, and since that request is written to the log, this will probably work if you are using log analysis. However, a server-side redirect never delivers a page to the browser, so if you are using page-based tracking the redirect would never get tracked. For page-based reporting systems you need a web page that contains something like a JavaScript onLoad function which sets document.location to the external address. Thus the viewing of this page is recorded before the visitor is sent on.
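
As a rough sketch of such a redirect page, the JavaScript function below generates the HTML for one. The buildRedirectPage name is hypothetical, and the tracking snippet is left as a placeholder comment because it depends entirely on your reporting system.

```javascript
// Generate a redirect page that is viewed (and so can be tracked)
// before the browser is forwarded to the external site.
function buildRedirectPage(targetUrl) {
  // JSON.stringify yields a safely quoted JavaScript string literal;
  // escaping "<" keeps a hostile URL from closing the script element early.
  const target = JSON.stringify(targetUrl).replace(/</g, "\\u003c");
  return [
    "<!DOCTYPE html>",
    "<html>",
    "<head>",
    "<title>Redirecting</title>",
    "<!-- your page-based tracking snippet goes here -->",
    "<script>",
    "function forward() { document.location = " + target + "; }",
    "</script>",
    "</head>",
    '<body onload="forward()"></body>',
    "</html>",
  ].join("\n");
}

console.log(buildRedirectPage("http://www.example.com/"));
```

Whether the page view is reliably recorded before the redirect fires depends on how your tracking code sends its data, so test this with your own system before relying on the numbers.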


Record non-page activity


Not all of the important activity in your site involves reading web pages. Often a critical target action is sending an email or downloading something. In many sites this behavior is never recorded, which can make determining conversion rates and other KPIs impossible. 


Recording email


Even though clicking an email link on your pages does not result in a page view or a record in your log files, it is possible to record it on a system that doesn’t record every mouse click. As with outgoing traffic, you can have the email link call a page that redirects to email. The JavaScript command would be something like document.location="mailto:[email protected]?subject=Mail_Enquiry"


As with outgoing traffic, it is this redirection page that gets recorded.
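
The address the redirection page forwards to can be assembled as follows. This is a JavaScript sketch only: enquiries@example.com is a placeholder (the real address is obscured in the text above), and buildMailtoUrl is a hypothetical helper.

```javascript
// Build the mailto: address the redirection page should forward to.
// The subject is percent-encoded so spaces and punctuation survive.
function buildMailtoUrl(address, subject) {
  return "mailto:" + address + "?subject=" + encodeURIComponent(subject);
}

// Placeholder address, for illustration only.
console.log(buildMailtoUrl("enquiries@example.com", "Mail_Enquiry"));
// prints "mailto:enquiries@example.com?subject=Mail_Enquiry"
```

On the redirection page itself, the returned string is what you would assign to document.location once the page view has been recorded.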


Record downloads


Many sites offer software, PDF files, Word documents, Excel spreadsheets and other material for downloading. If you are using log analysis all this activity will be recorded, but not if you’re using page-based tracking. These downloads can be tracked using the redirection technique. In other words, rather than link directly to the material, link to a page which then redirects to the material to be downloaded.
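
One way to arrange this is a single redirect page that takes the file as a parameter. The JavaScript sketch below shows both halves; the /download.html page, the file parameter name and both function names are assumptions made for the example.

```javascript
// Hypothetical redirect page path; use whatever page your site provides.
const REDIRECT_PAGE = "/download.html";

// Build the link to publish instead of a direct link to the file.
function toTrackedDownloadLink(filePath) {
  return REDIRECT_PAGE + "?file=" + encodeURIComponent(filePath);
}

// On the redirect page: recover the file path from the query string.
// Only same-site paths are accepted, so the page cannot be abused to
// forward visitors to arbitrary external addresses.
function readDownloadTarget(queryString) {
  const file = new URLSearchParams(queryString).get("file");
  if (file === null || !file.startsWith("/") || file.startsWith("//")) {
    return null;
  }
  return file;
}

console.log(toTrackedDownloadLink("/docs/price list.pdf"));
```

Because the file name travels in the query string, each download shows up in your reports as a distinct URL without needing a separate redirect page per file.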



Brandt Dainow is CEO of Think Metrics, creator of the InSite Web reporting system.

Brandt is an independent web analyst, researcher and academic.  As a web analyst, he specialises in building bespoke (or customised) web analytic reporting systems.  This can range from building a customised report format to creating an...