Do You Have Bots in Your Google Analytics Data?

Posted by on Fri, Jun 8, 2012
Filed Under | Google Analytics


How to Identify and Block Web Monitoring Bots in Google Analytics

Did you know that if you use a service like Keynote to monitor your website performance, it is affecting your web metrics?  Services such as Keynote, Gomez, AlertSite, Pingdom, and many others use a real web browser to visit your site repeatedly throughout the day to measure load-time performance.  They load your site from multiple locations around the world with real browsers that execute the Google Analytics tracking JavaScript.

Various other bots visit your site (Google’s search bot, Bing, etc.), but they do not execute JavaScript tracking code and so do not influence your metrics in Google Analytics.  Our objective in this post is to educate users about the bots that do execute JavaScript tracking code and to help you provide accurate data to your stakeholders.

For a recent client, we found that these bots were contributing roughly 18,000 visits (and bounces) a month, a noticeable impact on the quality of the metrics in Google Analytics.

How to Identify Bot Traffic / Are Bots Influencing Your Metrics?

To identify bot traffic, look for groups of visits, segmented by the visitor’s service provider or domain dimension, that have a bounce rate at or very close to 100% and a 100% new visit rate.  Do your due diligence to ensure that the visits you find are not the result of a tagging issue on your site, such as a page whose tracking settings differ from another page’s and cause cookie resets.

The ‘Service Provider’ report in GA provides a great starting point for this analysis.  Head over to Standard Reporting > Audience > Technology > Network.  Once there, switch to the table view and apply an advanced filter to isolate sizable, high-bounce traffic.  In this case, we filter the report to show only rows with a bounce rate greater than 90% and more than 100 visits during our monthly date range.
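If you export the report and want to apply the same thresholds outside the GA interface, the check can be sketched in a few lines of JavaScript. This is only an illustration: the row shape, the field names, the sample numbers, and the `findSuspectedBots` helper are all assumptions made for the example, not anything GA provides. The optional new-visit threshold is likewise an assumption, added to reflect the 100% new visit rate mentioned earlier.

```javascript
// Hypothetical rows, shaped like an export of the Network report.
// Field names and numbers are illustrative, not real GA output.
const reportRows = [
  { serviceProvider: "example monitoring inc.", visits: 4200, bounceRate: 99.7, newVisitRate: 100 },
  { serviceProvider: "example cable isp", visits: 9500, bounceRate: 41.2, newVisitRate: 58.3 },
  { serviceProvider: "example hosting llc", visits: 160, bounceRate: 96.0, newVisitRate: 98.5 },
  { serviceProvider: "example small office", visits: 12, bounceRate: 100, newVisitRate: 100 },
];

// Keep rows with more than minVisits visits and a bounce rate above
// minBounceRate (percent), mirroring the advanced filter described above.
// minNewVisitRate is an optional extra check and is off (0) by default.
function findSuspectedBots(rows, options = {}) {
  const { minVisits = 100, minBounceRate = 90, minNewVisitRate = 0 } = options;
  return rows
    .filter(
      (row) =>
        row.visits > minVisits &&
        row.bounceRate > minBounceRate &&
        row.newVisitRate >= minNewVisitRate
    )
    .sort((a, b) => b.visits - a.visits); // largest offenders first
}

const suspects = findSuspectedBots(reportRows);
console.log(suspects.map((row) => row.serviceProvider));
```

The rows this returns are candidates only. Each one still needs the due diligence described above before you treat it as bot traffic.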

Additionally, if you can get a list of IP addresses from the web performance provider, you can exclude traffic based on those addresses. Note, though, that these providers change and add IPs all the time, so I find that this method requires more maintenance. See Google’s IP RegEx tool for more information on IP ranges, multiple IPs or ranges, and the use of regex in custom filters.

Exclude IP filter
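If the provider hands you a plain list of addresses, one way to turn it into a single filter pattern is sketched below. The `buildIpFilterPattern` helper is hypothetical, and the addresses are reserved documentation placeholders, not real monitoring IPs. The sketch only handles exact addresses, not ranges. The important detail is escaping the dots, since an unescaped `.` in a regex matches any character.

```javascript
// Build one regex pattern that matches any of the given IPv4 addresses exactly.
// Hypothetical helper for illustration; it does not handle ranges.
function buildIpFilterPattern(ips) {
  // Escape each dot so it matches a literal "." rather than any character.
  const escaped = ips.map((ip) => ip.replace(/\./g, "\\."));
  // Anchor with ^ and $ so 192.0.2.10 does not also match 192.0.2.100.
  return "^(" + escaped.join("|") + ")$";
}

// Placeholder addresses from the ranges reserved for documentation.
const monitoringIps = ["192.0.2.10", "192.0.2.11", "198.51.100.7"];
const pattern = buildIpFilterPattern(monitoringIps);
console.log(pattern);

// Quick sanity check of the pattern before pasting it into a filter.
const ipRegex = new RegExp(pattern);
console.log(ipRegex.test("192.0.2.10"));  // an address on the list
console.log(ipRegex.test("192.0.2.100")); // a similar address that is not on the list
```

Because the provider's IP list changes over time, regenerating the pattern from the current list is less error-prone than editing the regex by hand.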

Let us know if you have any questions.

This post was written by:

Joe has written 29 posts on the Web Analytics Blog.

Joe is the Analytics Director and a Partner at Blast Analytics & Marketing. He understands Google Analytics like nobody else and is a master of many programming languages.

  • Amack18

    Is there a way to filter out a browser user agent setting with RegEx?

  • http://www.blastam.com Joe Christopher

    You can use RegEx on your page itself, around the GA script. You cannot access the user agent data dimension in GA profile filters though.

    Here’s an example of the regex to wrap around your GA code:

    var _gaq = _gaq || [];
    // Skip tracking when the user agent matches a known monitoring service.
    if (!/KHTE|KTXN|GomezAgent|AlertSite|Pingdom|YottaMonitor/i.test(navigator.userAgent)) {
      _gaq.push(['_setAccount', 'UA-XXXXXXXXX-X']);
      _gaq.push(['_trackPageview']);
    }

  • http://joaocorreia.pt/ João Correia

    Usually search bots don’t execute javascript, unless they are specifically designed to do that. I would advise you to search for more data in the actual server logs.

    Thanks

  • http://twitter.com/SeoArcher SeoArcher

    Yes, fake traffic is so bad. I sometimes look at our site and think, wow, we are really getting up there, only to find out it was fake. :(

  • page1my.com

    Thanks so much for all this info. I agree, fake traffic is nothing but trouble.

  • http://www.facebook.com/people/Doug-Hardy/1015982887 Doug Hardy

    We have a different problem and hopefully someone can help. There’s a javascript bot roaming around the web that settles on a website and starts clicking on ad banners. We’ve seen it on a few sites now where we average a clickthrough rate of 0.08 to 0.18 for quite a while, and then it just spikes to 2.25 or 5.50 or whatever – we suddenly get hundreds and hundreds of clicks on every ad unit.

    We’ve been smart enough to avoid ad buys on a cost-per-click basis but this still messes up our data. We think it’s a javascript bot. But it’s difficult to figure out which IP these attacks are arriving from.



Goal Driven Online Marketing & Analytics
Copyright © 1999-2013 Blast Analytics & Marketing