
Google Analytics & AWStats Work Really Well Together

April 15th, 2008 · 4 Comments

This post is the first in a series discussing the effective use of AWStats alongside Google Analytics. If you like what you read, consider subscribing to my full feed RSS.

As long-time readers know, I see a lot of value in log file analyser stats packages like AWStats. Google Analytics gets all the attention these days; even so, there’s information buried in your log files that Google Analytics can’t get at. And even if it could, it’s incapable of extracting any useful information from it because Google Analytics is a fundamentally different tool.

So what are the important differences? Let’s explore this for a minute.

Google Analytics is a script-based analytics package: it relies on JavaScript, cookies and a remotely-hosted piece of code to collect, process and interpret visitor data. AWStats is an offline log file analyser: the web server generates a log file, closes it off, and then AWStats processes it into meaningful information. Typically this happens once a day, but you can request an AWStats update at any time.
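To make "process a log file" a little more concrete, here is a rough Python sketch of pulling the useful fields out of a single log line. It assumes the server writes the common Apache "combined" log format; the sample line, the pattern and the function name `parse_line` are mine for illustration, not anything taken from AWStats itself.

```python
import re

# Apache "combined" log format is assumed here; other servers and
# custom formats would need a different pattern.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None

# A made-up log line using documentation-only addresses.
sample = ('203.0.113.7 - - [15/Apr/2008:18:05:12 +1000] '
          '"GET /index.html HTTP/1.1" 200 5120 '
          '"http://www.example.com/" "Mozilla/5.0 (Windows; U; Windows NT 5.1)"')

entry = parse_line(sample)
print(entry["ip"], entry["path"], entry["status"])
```

Everything a log file analyser reports is built up from fields like these, one line per request.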

Google Analytics Collects Data in Real Time, But AWStats Doesn’t

Google Analytics collects visitor data in real time, and this adds a small but often noticeable delay to the load times of your web pages. I’m not suggesting for a second that you stop using Google Analytics because of this performance hit, but unless there’s a really good reason not to, you should follow Google’s recommendation and place the code snippet right at the end of the webpage code. That way the visible part of the webpage loads first and your visitor stays happy. By the way, even though Google Analytics collects data as visits happen, the statistics you access typically run 12-24 hours behind. It could be real-time reporting, but it’s not.

JavaScript and Cookies

If your visitor’s browser doesn’t support cookies and/or JavaScript, the visit won’t get counted by Google Analytics. This is actually a big deal, because it leads to one of the major shortcomings of script-based analytics. Search engine spiders (in fact, many automated agents or ‘bots’) are designed not to run JavaScript. So there’s no way Google Analytics can count visits by search engines, which in itself is a pretty important piece of data. You need a log file analyser like AWStats to get this insight.
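As a rough illustration of how a log-based tool can separate spiders from people, here is a Python sketch that looks for tell-tale substrings in the user agent recorded with each request. The marker list and the function names are hypothetical and far shorter than anything a real analyser would use.

```python
# Hypothetical, deliberately short list of substrings that tend to
# appear in crawler user agents. Not AWStats's actual robot list.
BOT_MARKERS = ("googlebot", "slurp", "msnbot", "bot", "crawler", "spider")

def is_probable_bot(user_agent):
    """Guess whether a user agent string belongs to an automated agent."""
    ua = user_agent.lower()
    return any(marker in ua for marker in BOT_MARKERS)

def split_hits(user_agents):
    """Return (probable bot hits, probable human hits)."""
    bots = sum(1 for ua in user_agents if is_probable_bot(ua))
    return bots, len(user_agents) - bots

agents = [
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "Mozilla/5.0 (Windows; U; Windows NT 5.1) Firefox/2.0",
    "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)",
]

print(split_hits(agents))  # (2, 1)
```

A script-based package never sees the two spider hits at all, which is why this kind of breakdown has to come from the log file.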

There are some upsides to this, however. It’s a good bet that any visits analysed by Google Analytics are actually from real people running real browsers. Add to this Google Analytics’s ability to extract HTTP_USER_AGENT data from each GET request, and you have browser share data as well. A log file analyser, on the other hand, can only guess that a logged page view is for a real, live set of eyeballs. It only has the IP address (and HTTP_USER_AGENT data) to go on, so several people behind a shared connection or forward proxy will be counted as a single unique visitor.
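Here is a small Python sketch of why shared connections collapse into one visitor. It assumes a visitor is identified by the pair (IP address, user agent); that key is my assumption for illustration, and a real analyser may well identify visitors differently.

```python
def count_unique_visitors(hits):
    """Count distinct (ip, user_agent) pairs in an iterable of hits."""
    return len(set(hits))

# Three people in one office share a connection and run the same browser,
# plus one person at home. All addresses are made up.
hits = [
    ("198.51.100.4", "Firefox/2.0"),  # office, person A
    ("198.51.100.4", "Firefox/2.0"),  # office, person B
    ("198.51.100.4", "Firefox/2.0"),  # office, person C
    ("203.0.113.7", "MSIE 7.0"),      # home user
]

print(count_unique_visitors(hits))  # 2
```

Four real people, two unique visitors: nothing in the log distinguishes the three office users from each other.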

Google Analytics has a fantastic ability to collect data from inside the webpage itself. This is useful stuff for savvy website managers. With careful configuration you can analyse form completion/abandonment rates (plus where the form got abandoned). You can also define a ‘path’ through your website that comprises your preferred sales process, and look at entry/exit points along this path.

Online and Offline Databases

Because Google Analytics depends on its own repository of visitor data, it has no use for your web server’s log files. You may have years of data there, but Google Analytics won’t be able to process any of it, and you won’t have any basis for year-to-year comparison until you’ve been running Google Analytics for a while.

In these days of multi-gigabyte host storage, the space consumed by log files is no longer a big deal. Even so, Google Analytics has the upper hand on storage because Google holds your data, not your own web server. Log files have an advantage of their own: it’s easy to offload them to some backup location and continue your analysis offline, whereas with Google Analytics you need to be online to access your data (there is a way to offload data via Google Gears, but I’ve never done it and I bet none of you have either).

Accuracy – Google Analytics Low, AWStats High

If you’re able to run Google Analytics and AWStats side-by-side, the very first thing you’ll see is that there are big discrepancies in data you’d assume was unambiguous and straightforward to collect. Like page views, for example. Or unique visitors. In every case I’ve seen, Google Analytics shows lower figures than AWStats, and the real figures almost certainly lie somewhere between.

For our example, let’s look at page views. In an ideal world, a web page would be requested, the web server would provide it, the transaction would be logged by Google Analytics and AWStats, and everything would tally. However, the situation gets complex when a page is requested:

  • By a browser or robot that doesn’t allow JavaScript (in which case AWStats would log it and Google Analytics wouldn’t)
  • By a proxy server (Google Analytics would log it and AWStats wouldn’t, unless the proxy supported Cisco’s WCCP 2.0 protocol, in which case both AWStats and Google Analytics would log the transaction)
  • By a group of users each on a separate PC, sharing an internet connection (Google Analytics would log each transaction correctly but AWStats would log each transaction as being from the same user)
  • By different users, each with a separate login, on the same PC (Google Analytics would log each transaction correctly but AWStats would log each transaction as being from the same user)

As you can see, it gets complex, and the only way around it is to make assumptions as to what constitutes a unique visitor, and simply accept that inaccuracies are inevitable. Google Analytics reads low, AWStats reads high, and that’s the way it is.
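The page view cases above can be boiled down to a toy Python model. Each request is tagged with whether JavaScript ran and whether the request reached the web server; the traffic mix is invented purely to show the mechanism and is not measured data.

```python
from collections import namedtuple

# Toy model: a script-based tool only sees requests where JavaScript ran,
# a log-based tool only sees requests that reached the web server.
Request = namedtuple("Request", ["description", "runs_javascript", "reaches_server"])

# Invented traffic mix, for illustration only.
requests = (
    [Request("person in an ordinary browser", True, True)] * 3
    + [Request("search engine spider", False, True)] * 2
    + [Request("page served from a proxy cache", True, False)]
)

script_page_views = sum(1 for r in requests if r.runs_javascript)
log_page_views = sum(1 for r in requests if r.reaches_server)

print("script-based:", script_page_views, "log-based:", log_page_views)
```

With this mix the script-based count comes out at 4 and the log-based count at 5, even though both tools watched exactly the same six requests.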

The Next Post

I hope to publish a series of posts over the next week or so that delve a little deeper into the intricacies of AWStats. It’s a great analytics package that unfortunately is poorly documented and, because of this, not well understood.

If you want to read up on AWStats, the project page is right here, and you can download it (free) from that link as well. Before you do, have a look at your hosting account. AWStats is installed by default by lots of webhosts, so the chances are good you already have it there, along with a full log file history from the day you set up your account!


Tags: Analytics

4 responses so far ↓

  • 1 Evan // Apr 15, 2008 at 6:05 pm

    I’m afraid much of this technical language is beyond me. I have no idea what a log file is or does.

    It sounds like between them you get a reasonable guess at what is going on.

    Sometime soon I want to have advertising on my blog so I will need some reasonable stats. The Wordpress plug in quick stats provides figures that I find quite puzzling. So having something else would be very helpful.

  • 2 Mark // Apr 15, 2008 at 7:22 pm

    Hi Evan,

    You raise a point I’ve heard from a few people now… the technology underneath websites is very, very complex and even if you do have an understanding of what it all means, you then need to use it to improve your site so it’s better than the competition. That’s where analytics fits.

    I think the only way I could help would be to write some posts that went right to the bits of information that are truly important, and advised what you could do with the information.

    Evan, others… is this a good idea?

  • 3 Web Server Log Files Explained | Stratify Pty Ltd // Apr 15, 2008 at 9:51 pm

    […] RSS ← Google Analytics & AWStats Work Really Well Together […]

  • 4 AmyL // Apr 21, 2008 at 12:17 pm

    Yes, it’s a good idea. :) Look forward to reading the posts.
