How to Set a Client-Side Sample Rate in Google Analytics

Posted by Joe Christopher on Thu, Aug 18, 2011
Filed Under: Google Analytics


First…are you violating the Google Analytics Terms of Service?

If your website sends more than 10 million hits per month to Google Analytics, then you are violating the Google Analytics Terms of Service.

If you are substantially exceeding this traffic limit and desire to become compliant, then implementing the Google Analytics Sample Rate is a simple solution, but it does have drawbacks.

What is a sample rate?

_setSampleRate is a code-level/visitor-level change in Google Analytics that enforces a client-side sample rate.  When enabled, only a specific percentage of your site’s visitors will be included in your Google Analytics profile(s).

If you specify a sample rate in your GA code of 20%, then only 20 out of 100 visitors (approximately) that come to your site will send visit data to the Google Analytics servers.  This does mean that the other 80 visitors will not be tracked and won’t send data to GA.  Google’s ga.js code determines which visitors to include and exclude.  This occurs consistently across unique visitors, which means that once a visitor has been tagged as a participant, all of their future visits will also be included in your metrics.
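
To make the idea of consistent, visitor-level sampling concrete, here is a minimal sketch of how a visitor could be assigned to a stable bucket. This is an illustration only and is not the actual ga.js logic; the function names (visitorBucket, isSampledIn) and the hashing scheme are assumptions made for this example.

```javascript
// Hypothetical sketch: NOT the real ga.js implementation.
// Maps a visitor ID string to a stable bucket from 0 to 99.
function visitorBucket(visitorId) {
  var hash = 0;
  for (var i = 0; i < visitorId.length; i++) {
    // Simple rolling hash; ">>> 0" keeps the value an unsigned 32-bit integer.
    hash = (hash * 31 + visitorId.charCodeAt(i)) >>> 0;
  }
  return hash % 100;
}

// A visitor is included when their bucket falls below the sample rate.
// The same visitor ID always lands in the same bucket, so the decision
// is the same on every visit.
function isSampledIn(visitorId, sampleRatePercent) {
  return visitorBucket(visitorId) < sampleRatePercent;
}
```

Because the decision depends only on the visitor ID and the rate, a returning visitor is either always tracked or never tracked, which matches the behavior described above.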

Is this the same as Fast-Access Mode?

To be clear, this is different from the ‘fast-access mode’ that you may see in the reporting interface.  Fast-access mode samples the entire collected data set at report time so that a report is returned in a reasonable amount of time.  The client-side sample rate, by contrast, limits collection itself: only a specific percentage of site visitors ever send data.

Why should you implement sampling?

Well, you probably shouldn’t.  There are very few reasons why you should choose to implement sampling.  We recommend avoiding it if at all possible.  However, if you are violating the Google Analytics Terms of Service (ToS) you may decide to do this proactively or at some point you may be asked by Google to comply.

Here are a few reasons you might choose to implement Google Analytics sampling:

  • You may want to implement sampling if you get a LOT of traffic to your site and you need to have intra-day processing on your account.  If you are sending a lot of data to Google, they will not process your data as often as smaller accounts.
  • If you look at the Google Analytics Terms of Service (ToS), you’ll find that it states a 10 million hits per month limit.  If you exceed the ToS, there is no guarantee that Google will process the excess hits and they have the right to stop tracking altogether.  Yikes!
  • An additional reason may be to mitigate the fast-access mode that you’ll see in reports if you add a secondary dimension (or apply an advanced segment) to a report that contains more than 500,000 visits.  This is an extreme reason, since you would likely have to sample very aggressively to get below this threshold.

What are the effects on your metrics?

Typically, large websites are the ones that will implement sampling.  In these cases, there is enough data being collected that there is still value in the metrics and trends to provide actionable insight.

If you are sampling at 20%, multiplying a reported count by 5 (that is, 100 / 20) will give you a value that is close to actual (but never exact).  For example, a site that receives 2 million visitors a day would report roughly 400,000, and 400,000 × 5 gets you back to approximately 2 million.  You could use the Google Analytics API to pull the data into a database, where you then perform the inflation calculation across metrics.
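
The inflation calculation described above can be sketched as follows. The function name inflateMetrics and the metric names are assumptions for illustration; the sketch is meant for count metrics (visits, pageviews) only, since ratios and averages such as bounce rate should not be multiplied.

```javascript
// Hypothetical helper: scales sampled count metrics up to estimated totals.
// sampledMetrics is a plain object of metric name -> sampled count.
function inflateMetrics(sampledMetrics, sampleRatePercent) {
  if (!(sampleRatePercent > 0 && sampleRatePercent <= 100)) {
    throw new RangeError('sampleRatePercent must be greater than 0 and at most 100');
  }
  var factor = 100 / sampleRatePercent; // e.g. 20% -> multiply by 5
  var estimated = {};
  for (var name in sampledMetrics) {
    if (Object.prototype.hasOwnProperty.call(sampledMetrics, name)) {
      // Round because the result is an estimate, not an exact count.
      estimated[name] = Math.round(sampledMetrics[name] * factor);
    }
  }
  return estimated;
}

// Example: counts reported under a 20% sample rate.
var estimate = inflateMetrics({ visits: 400000, pageviews: 1200000 }, 20);
// estimate.visits is 2000000 and estimate.pageviews is 6000000
```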

How do you set a sample rate?

Determining your sample rate is something that should be done very carefully.  We recommend that you contact a Google Analytics Consultant like us and explain your situation so that we can provide solutions based on your specific objectives and requirements.
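
As a rough starting point only, the highest sample rate that keeps you under a monthly hit limit can be estimated from your current hit volume. The function name maxSampleRate is an assumption for this sketch, and because sampling is applied per visitor rather than per hit, the resulting hit volume will only approximately match the target.

```javascript
// Hypothetical helper: estimates the highest whole-number sample rate (1-100)
// expected to keep monthly hits at or under the given limit.
function maxSampleRate(monthlyHits, monthlyHitLimit) {
  if (monthlyHits <= monthlyHitLimit) {
    return 100; // Already under the limit; no sampling needed.
  }
  // Multiply before dividing to avoid floating-point rounding surprises,
  // and round down so the estimate stays under the limit.
  var rate = Math.floor((monthlyHitLimit * 100) / monthlyHits);
  return Math.max(rate, 1);
}

// Example: 40 million hits per month against the 10 million hit limit.
var rate = maxSampleRate(40000000, 10000000); // 25
```

Treat the result as an upper bound to discuss, not a final answer; you may want extra headroom for traffic growth or seasonal spikes.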

The required code change is quite simple.  There are three things you should be aware of when implementing a sample rate:

  • You should implement this new code on ALL pages of your site; any place that you define the tracking object.
  • It must be set prior to the _trackPageview call as in the examples below.
  • The rate must be an integer passed as a string value (which means put it inside single quotes). So, _setSampleRate('20') would be a 20% sample rate.

Depending on the ga.js code snippet you are using, below is the code syntax to use to set a 20% sample rate.

Asynchronous Code Syntax Example:

<script type="text/javascript">
var _gaq = _gaq || [];
_gaq.push(['_setAccount', 'UA-XXXXX-X']);
// Sample 20% of visitors; this must come before _trackPageview.
_gaq.push(['_setSampleRate', '20']);
_gaq.push(['_trackPageview']);

(function() {
var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
})();
</script>
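
If you generate the asynchronous snippet from a template or a tag management layer, a small helper can enforce the rules above (string value, set before _trackPageview). This is a sketch under assumptions: buildTrackingCommands is a hypothetical name, not part of ga.js, and it only builds the command list that you would then push onto _gaq.

```javascript
// Hypothetical helper: builds the _gaq commands in the required order.
function buildTrackingCommands(accountId, sampleRatePercent) {
  if (!Number.isInteger(sampleRatePercent) ||
      sampleRatePercent < 1 || sampleRatePercent > 100) {
    throw new RangeError('sampleRatePercent must be an integer from 1 to 100');
  }
  return [
    ['_setAccount', accountId],
    // The rate is passed as a string, and comes before _trackPageview.
    ['_setSampleRate', String(sampleRatePercent)],
    ['_trackPageview']
  ];
}

// Usage in a page (assumes the standard async snippet loads ga.js afterwards):
//   var _gaq = _gaq || [];
//   buildTrackingCommands('UA-XXXXX-X', 20).forEach(function (cmd) {
//     _gaq.push(cmd);
//   });
```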

Traditional Code Syntax Example:

<script type="text/javascript">
var gaJsHost = (("https:" == document.location.protocol) ? "https://ssl." : "http://www.");
document.write(unescape("%3Cscript src='" + gaJsHost + "google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E"));
</script>
<script type="text/javascript">
try{
var pageTracker = _gat._getTracker("UA-xxxxxx-x");
// Sample 20% of visitors; this must come before _trackPageview.
pageTracker._setSampleRate('20');
pageTracker._trackPageview();
} catch(err) {}
</script>

Since this is a client-side code change, it goes into effect immediately, and you’ll see the results the next time data is processed in your GA account.

If you have implemented sampling in Google Analytics, what was your reason?

  • Were you violating the Google Analytics Terms of Service (ToS) and proactively decided to implement sampling to comply with the ToS?
  • Or did Google ask you to implement Google Analytics sampling because you were violating the Terms of Service?
  • Or was it another reason?

This post was written by:

Joe is the Analytics Director and a Partner at Blast Analytics & Marketing. He understands Google Analytics like nobody else and is a master of many programming languages.

  • http://conversionscientist.com Brian Massey

    Joe, this is very helpful. I’m consulting on a site that is generating over 20m pageviews per month, which means the hits are going to be higher. Is there a statistically significant difference between sampling at 20% and, say 30%?

  • http://www.blastam.com/broadcast Blast Advanced Media

    Hi Brian,

    Great; I’m glad this article was helpful for you.

    I would recommend setting the sample rate as high as possible while keeping within the GA ToS (which is 10 million pageviews/month now and was previously 5 million pageviews).  Of course, selecting a percentage that is easy to multiply makes things quicker to calculate.  For example, instead of sampling at 23%, do 25%.  That way, you can easily multiply by 4 to get the approximate totals.

    Any time you sample, you run the risk of losing important data (ecommerce transactions for example).  It is a decision that should not be made lightly.  If you’d like to send me an email at joe at blastam dot/com, I’d be happy to continue this conversation and provide you some other options/ideas to consider.

    Thanks,
    Joe

  • http://twitter.com/blastam Blast Advanced Media

    If you are breaking the Google Analytics Terms of Service (ToS), due to high traffic volume above the 10 million pageviews/month limit, there is now a new solution. It is better than implementing the sample rate setting in your tracking.

    You can upgrade to Google Analytics Premium.

    Of course, this is targeted only at large scale websites with a high value for analytics. But all companies have a high value on analytics, right? ;-)

    If you are breaking the GA ToS, then there is a good chance this is you. If you want more information, check out this article, “What is Google Analytics Premium?” at http://www.blastam.com/blog/index.php/2011/09/what-is-google-analytics-premium/

    Let us know if you have any comments or questions.

    Regards,
    Kayden

  • http://britainloans.co.uk/ Britain Loans

    I think people should use Google analytics to good effect and make the most of all the applications it has within it. Goal setting is just one of the options which can further help monitor SEO performance. This is an excellent new feature, and I’m glad it is in the free version. Being able to see traffic traversing back up the funnel has already helped with a client who was having big problems understanding drop outs and entrances in a very long funnel sequence – it is almost art.

  • http://www.blastam.com Joe Christopher

    Be careful in your decision to implement this. If you are not required to implement this, we don’t generally recommend it. I wrote this blog post so that people understand how this option works.



Goal Driven Online Marketing & Analytics
Copyright © 1999-2013 Blast Analytics & Marketing