Jan13

Google Analytics is Under-Reporting, or is it?

For well over a year now, we’ve been using Google Analytics to report our own traffic and to give our clients a feature-rich, free way to view their own website statistics. Prior to Google Analytics we used Urchin 5.0, which was acquired by Google and made free just one month after we purchased a $900 license for our server. The nice people at Google gave us 50 user accounts to ease the pain.

When it comes to statistics, most viewers are interested in trend data. What are the most popular pages? Is there an increase in visitor traffic from month to month? Where are visitors coming from? Since I’m such a visual person, I find myself looking for trends in charts and graphs more than at the actual hard numbers. Recently, a client of ours with an engineering-oriented mind pointed out a serious discrepancy between the numbers reported by Google Analytics and the old Urchin reports their company was using. According to the data, either their site had lost 50% of its traffic or Google was under-reporting results!

My initial response was denial. If Google Analytics is built on Urchin, why would it under-report visitors? There shouldn’t be a difference between Google Analytics and Urchin. After this discussion I went back and examined other client sites, and across the board I found the same issue: Google is reporting less traffic than Urchin.

From what I see in our Google Analytics account, search bot traffic is not reported. Google Analytics seems to be looking at just human visitors. For those unfamiliar with what I’m talking about, there are two primary types of traffic: human and search bot. Human traffic is obviously the visitors who come to your website. Search bot traffic, on the other hand, comes from indexing engines like Google, Yahoo or MSN. The search engines visit websites periodically to index the content so that the search engine’s records are up to date.

The Evidence
I’m using 2 clients plus our own Imulus logs to illustrate this point. Client A is a software company that uses their website for lead generation. Client B is a hardware company that uses their website for ecommerce.

Client A
Total Website Visitor Sessions for November / December.

Client B
Total Website Visitor Sessions for November / December.

Imulus
Total Website Visitor Sessions for November / December.

Diving Deeper
Using our Imulus logs as the most extreme example, I assumed the difference had to be in how each system defines a session, and in particular in how each one distinguishes a human from a search bot. For the hard-core web statistics people: the information in these reports is based on the default filters in Google Analytics. In Urchin, I’ve applied the filter that excludes robot traffic by cs_useragent when it finds any of the following values: bot, seek, scan, search, di, agent, get, crawl, spider, scooter, lint, libwww, loader, mechanic, curl, link, catch, fly.
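To make the filter concrete, here is a minimal Python sketch of a user-agent exclusion list like the one described above. It is not Urchin's actual implementation. It assumes the values are matched as case-insensitive substrings, and the function names are hypothetical.

```python
# Values copied from the filter list in the post. Treating them as
# case-insensitive substrings is an assumption, not documented Urchin behavior.
BOT_MARKERS = (
    "bot", "seek", "scan", "search", "di", "agent", "get", "crawl",
    "spider", "scooter", "lint", "libwww", "loader", "mechanic",
    "curl", "link", "catch", "fly",
)


def looks_like_bot(user_agent: str) -> bool:
    """Return True if the user agent contains any of the filter values."""
    ua = user_agent.lower()
    return any(marker in ua for marker in BOT_MARKERS)


def human_hits(user_agents):
    """Keep only the user agents that the filter does not flag as robots."""
    return [ua for ua in user_agents if not looks_like_bot(ua)]
```

Under this substring assumption a short value such as "di" would also flag any ordinary browser whose user agent happens to contain those two letters (for example the word "Media"), so a filter like this can misclassify in both directions.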

I decided to select a few well-known ISPs from our log files. The ISPs I selected are domains I know, or at least can safely assume, are not robot traffic. In addition, the filter is set in Urchin to exclude bots from the session information. Here are my results based on the Imulus log files for the month of November, 2006.
Traffic from Selected ISPs

Based on the results above, no pattern can be drawn; the differences in values seem completely random. I did notice that Urchin filters the logs on cs_useragent, whereas the IIS 6.0 log files don’t seem to have a cs_useragent field but do have a cs(User-Agent) field. I began to wonder if this was the primary reason for the apparent under-reporting. If that were the case, and Urchin couldn’t pick the bots out of the log files, I should see a lack of data in my Urchin report for “Browsers & Robots.” Apparently Urchin is able to reconcile cs_useragent with cs(User-Agent), because my “Browsers & Robots” report has each search bot broken out by total hits. Ideally bots would be reported as sessions rather than hits, so that I could better compare bot and human traffic.
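For anyone who wants to check the field themselves, here is a rough Python sketch that reads the cs(User-Agent) column straight out of a W3C extended log file, using the "#Fields:" header line to find it. The function names are hypothetical, and the columns in a real log depend on which fields are enabled in IIS.

```python
def parse_w3c_log(lines):
    """Yield one dict per request from W3C extended log lines."""
    fields = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("#Fields:"):
            # The header names the columns for the data lines that follow.
            fields = line[len("#Fields:"):].split()
            continue
        if line.startswith("#"):
            continue  # other directives such as #Software or #Date
        values = line.split(" ")
        if len(values) != len(fields):
            continue  # skip lines that don't match the current header
        yield dict(zip(fields, values))


def user_agent(record):
    """Return the user agent with '+' turned back into spaces.

    IIS writes spaces inside a field as '+'. This also alters any literal
    plus sign in the user agent, which is acceptable for a rough check.
    """
    return record.get("cs(User-Agent)", "-").replace("+", " ")
```

Running the records from parse_w3c_log through user_agent gives a list of raw user agents that can be compared against what the “Browsers & Robots” report shows.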

Below is how the traffic appeared to each reporting tool. I expected the pattern to be consistently similar from day to day, just with less traffic according to Google Analytics. Again, I was wrong.

Session Traffic for the month of November, 2006

The pattern, while approximate, is certainly not a mirror image. The most drastic difference is on 11/8/06. Urchin is showing a serious spike in traffic, while Google Analytics doesn’t recognize this day as a spike. In the table below I’ve pulled out the top visiting domains for November 8th, 2006. There is a discrepancy between Google and Urchin. It is interesting that although I have applied the cs_useragent filter, I still see the bot traffic in my “Domains & Users” report in Urchin.

Top Domain Visitors for November 8th, 2006.

Domain        Google  Urchin
no domain         17     121
comcast.net       14      16
qwest.net          9      32
keynote.com        3       0
rr.com             2       3
cox.net            2       2
verizon.net        2       4

Google reports total visitor traffic for November 8th, 2006 at 89 visitors. Urchin is telling me there were over 671 visitors that day, and I can see that this number includes bot traffic. If I manually pull the search domains out of the Urchin report, I get these values:

yahoo.com / 103
inktomisearch.com / 78
live.com / 31
google.com / 24
pnap.com / 16
twtelecom.net / 6

Total 258 visits.
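The arithmetic behind that total, and what it leaves over, can be checked with a few lines of Python. This is only a sketch using the figures quoted in this post. The variable names are made up, and 671 is taken as Urchin's total even though the text says "over 671".

```python
# Figures for November 8th, 2006 as quoted in the post.
urchin_total = 671   # Urchin sessions (the post says "over 671")
google_total = 89    # Google Analytics visitors

# Search-related domains pulled manually from the Urchin report.
search_domains = {
    "yahoo.com": 103,
    "inktomisearch.com": 78,
    "live.com": 31,
    "google.com": 24,
    "pnap.com": 16,
    "twtelecom.net": 6,
}

bot_visits = sum(search_domains.values())
remaining = urchin_total - bot_visits

print(bot_visits)  # 258
print(remaining)   # 413
```

Even with these domains removed, Urchin's figure stays well above Google's 89, so the listed search domains account for only part of the gap.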

Conclusion

While this is by no means a scientific study of the two reporting tools, I personally believe Google’s numbers to be more accurate than Urchin’s.

I believe Urchin’s filters are not excluding robot traffic, at least while processing IIS 6.0 log files. In addition, we’ve developed our own Metrics tools which show us live visitor traffic throughout the day. Our homegrown analytics tools are giving us reports more akin to Google Analytics than to Urchin.

Most people want to believe the numbers in Urchin because those numbers are higher and reflect better performance when presenting marketing reports to company executives. There is a serious danger in running with reports that are not accurate. If you believe your visitor level to be 14,000 visitors per week and you are converting only 40 visitors to leads or sales, then you have a problem with your site’s ability to convert. Yet if your website is really only receiving 1,400 visitors per week, then your conversion rate looks much better.
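Worked through in Python, the example above looks like this. It is a small sketch using the hypothetical visitor numbers from the previous paragraph, and the function name is invented for illustration.

```python
def conversion_rate(conversions: int, visitors: int) -> float:
    """Return conversions as a fraction of visitors."""
    if visitors <= 0:
        raise ValueError("visitors must be positive")
    return conversions / visitors


# Same 40 conversions measured against the inflated and the real visitor counts.
inflated = conversion_rate(40, 14_000)
actual = conversion_rate(40, 1_400)

print(f"{inflated:.2%}")  # 0.29%
print(f"{actual:.2%}")    # 2.86%
```

The same 40 leads look ten times better once the visitor count drops from 14,000 to 1,400, which is why the accuracy of the visitor number matters as much as the number of conversions.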

I’m hoping to follow up this report by looking at other analytics tools and how they compare with Google Analytics. I’d like to post a report using DeepMetrix, WebTrends and ClickTracks. If anyone is interested in receiving our log files for November / December 2006, I’d be glad to share them so you can run your own comparisons.

posted in: Google, metrics, search engines

This post was published on Saturday, January 13, 2007 at 6:04 pm


Comments

1

George Morris

January 14, 2007 at 12:30 am

Just a side note. Google JUST announced a difference with NEW & RETURNING visitors and what they call ABSOLUTE VISITORS. These two variations should further confuse the reporting with Urchin.

Page from Google

2

Kevin Newcomb

January 17, 2007 at 2:23 pm

Hi George,

I’m wondering if you’re using Urchin Traffic Monitor and cookies? If not, it would seem that Urchin may be tracking visitors from ip-user-agent, which could get thrown off by ISPs that are using a squid-type proxy, resulting in multiple IP addresses for a single visitor.

Thanks,
Kevin

3

George Morris

January 18, 2007 at 11:27 am

Hello Kevin

Nope, no UTM and no cookies. Just raw log data. Hence my belief that Google Analytics is more accurate than using the default install of products like WebTrends and Urchin.

Plus, now that Google has Urchin I don’t think I can purchase the Urchin UTM add-on… but I might be wrong. I’d like to test that theory out and run another analysis with UTM on.

Thanks
George

4

china-ceo

January 26, 2007 at 12:36 am

I Love google..hoho

5

Thomas Zelikman

July 18, 2007 at 5:02 am

I used statcounter and google analytics together for a while and noticed the discrepancies.

Then I recorded all pageviews in a database and built queries to exclude robots and estimate unique visitors based on IP-address.

From that I found that statcounter overestimated visitors while google analytics was quite accurate. Because of this and occasionally slowish pageloads, I stopped using statcounter and since then completely relied on GA.

I believe GA works fine for pages with small traffic load. But could it be that GA does not scale well with heavy traffic?

Thomas

6

George

July 18, 2007 at 10:53 am

Perhaps it’s missing traffic on larger sites? I’m just not sure.

Personally I think it’s a difference between what GA is calling a human and calling a bot. I’d love to see what algorithm they use to distinguish between the two.

7

HipMojo.com - Main Street Meets Madison Avenue, Wall Street and Silicon Valley » Why Google’s Best and Brightest Are Leaving in Droves

October 23, 2007 at 5:45 pm

[...] can go on and on about how Analytics under-estimates traffic, but I’m not alone in suggesting that.  The point is either Analytics is off (and Google should look into it) or its [...]

8

Tracy Jackson

October 29, 2007 at 8:42 am

Thank you for this post! We just switched to Google Analytics from Urchin following the redesign of our website. We were horrified to see our visitors drop by more than 50%. We expected a small drop in visitors due to website address changes but not that much as we had placed many targeted redirects.

I have also noticed in my Urchin stats that people who linked to a picture in our website (i.e. in MySpace people have directly linked to our logo on their page) seemed to be counted as visitors to the site even though the pages viewed per visitor were zero. Did you see this in your data?

9

smith

February 8, 2008 at 11:55 pm

Still i am using statcounter. But when read this article i think google analytics quite accurate.
Thanks for your nice suggestions.

10

Bob

January 28, 2009 at 11:17 am

One day I setup a table and demoed my website to people. On that particular day I had over triple the conversions my site normally gets.

However when I went home and looked at google analytics, it reported only a 20% increase in conversions.

There was no way to deny google analytics failed to account for these conversions.