Blast Analytics and Marketing

Analytics Blog

Category: Digital Analytics

How to Set a Client-Side Sample Rate in Google Analytics

August 18, 2011

First…are you violating the Google Analytics Terms of Service?

If your website sends more than 10 million hits per month to Google Analytics, then you are violating the Google Analytics Terms of Service.

If you are substantially exceeding this limit and want to become compliant, then implementing the Google Analytics sample rate is a simple solution, but it does have drawbacks.

What is a sample rate?

_setSampleRate is a setting in your Google Analytics tracking code that enforces a client-side sample rate at the visitor level. When enabled, only a specific percentage of your site’s visitors will be included in your Google Analytics profile(s).

If you specify a sample rate in your GA code of 20%, then only 20 out of 100 visitors (approximately) that come to your site will send visit data to the Google Analytics servers. This does mean that the other 80 visitors will not be tracked and won’t send data to GA. Google’s ga.js code determines which visitors to include and exclude. The decision is made once per unique visitor and then applied consistently, which means that once a visitor has been tagged as a participant, all of their future visits will also be included in your metrics.
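To make the "consistent per visitor" idea concrete, here is a minimal sketch of how a deterministic sampling decision can work. This is not ga.js's actual algorithm; the function names `bucketFor` and `isSampledIn` and the hashing scheme are our own assumptions, used only to illustrate the concept.

```javascript
// Hypothetical illustration only: ga.js makes this decision internally,
// and its real implementation may differ.

// Map a visitor ID string to a stable bucket number from 0 to 99.
function bucketFor(visitorId) {
  let hash = 0;
  for (let i = 0; i < visitorId.length; i++) {
    // Simple 32-bit rolling hash; any stable hash would work here.
    hash = (hash * 31 + visitorId.charCodeAt(i)) >>> 0;
  }
  return hash % 100;
}

// A visitor is tracked when their bucket falls below the sample rate.
// The same visitor ID always lands in the same bucket, so the answer
// never changes between visits.
function isSampledIn(visitorId, sampleRatePercent) {
  return bucketFor(visitorId) < sampleRatePercent;
}

const firstVisit = isSampledIn('visitor-12345', 20);
const returnVisit = isSampledIn('visitor-12345', 20);
console.log(firstVisit === returnVisit); // true
```

Because the decision depends only on the visitor ID and the rate, a returning visitor is either always tracked or never tracked, which is the behavior described above.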

Is this the same as Fast-Access Mode?

To be clear, this is different from the ‘fast-access mode’ that you may see in the reporting interface. Fast-access mode samples the entire collected data set so that a report can be returned in a reasonable amount of time, whereas the client-side sample rate limits which visitors send data to Google Analytics in the first place.

Why should you implement sampling?

Well, you probably shouldn’t. There are very few reasons why you should choose to implement sampling. We recommend avoiding it if at all possible. However, if you are violating the Google Analytics Terms of Service (ToS) you may decide to do this proactively or at some point you may be asked by Google to comply.

Here are a few reasons you might choose to implement Google Analytics sampling:

  • You may want to implement sampling if you get a LOT of traffic to your site and you need to have intra-day processing on your account. If you are sending a lot of data to Google, they will not process your data as often as smaller accounts.
  • If you look at the Google Analytics Terms of Service (ToS), you’ll find that it states a 10 million hits per month limit. If you exceed the ToS, there is no guarantee that Google will process the excess hits and they have the right to stop tracking altogether. Yikes!
  • An additional reason may be to avoid the fast-access mode that you’ll see if you add a secondary dimension (or apply an advanced segment) to a report that contains more than 500,000 visits. This is an extreme reason, since you would likely have to sample very aggressively to get below that threshold.

What are the effects on your metrics?

Typically, large websites are the ones that will implement sampling. In these cases, there is enough data being collected that there is still value in the metrics and trends to provide actionable insight.

If you are sampling at 20% and your site receives 2 million visitors a day, Google Analytics will report roughly 400,000 of them. Multiplying the reported figure by 5 (100 divided by the sample rate) will give you a value that is close to actual (but never exact). You could use the Google Analytics API to pull the data into a database, where you then perform the inflation calculation across metrics.
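As a rough sketch of the inflation calculation described above, the snippet below scales sampled totals back up by 100 divided by the sample rate. The function name `inflateMetrics` and the metric names are hypothetical, and the input numbers are made up for illustration; in practice the sampled totals would come from your own export of the data.

```javascript
// Hypothetical helper: estimate true totals from sampled totals.
// Only count-style metrics (visitors, visits, pageviews) should be inflated.
function inflateMetrics(sampledMetrics, sampleRatePercent) {
  if (!(sampleRatePercent > 0 && sampleRatePercent <= 100)) {
    throw new RangeError('sampleRatePercent must be greater than 0 and at most 100');
  }
  const factor = 100 / sampleRatePercent; // e.g. 20% sampling gives a factor of 5
  const estimated = {};
  for (const [name, value] of Object.entries(sampledMetrics)) {
    estimated[name] = Math.round(value * factor);
  }
  return estimated;
}

// Illustrative numbers only: totals reported under a 20% sample rate.
const estimate = inflateMetrics({ visitors: 400000, pageviews: 1300000 }, 20);
console.log(estimate); // { visitors: 2000000, pageviews: 6500000 }
```

Ratio metrics such as bounce rate or pages per visit should be left alone, since both sides of the ratio are sampled at the same rate.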

How do you set a sample rate?

Determining your sample rate is something that should be done very carefully. We recommend that you contact a Google Analytics Consultant like us and explain your situation so that we can provide solutions based on your specific objectives and requirements.
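As a starting point for that conversation, here is a simple sketch that estimates the highest sample rate likely to keep you under the 10 million hits per month limit. The function name `maxSampleRate` is our own, and the calculation assumes hits are spread evenly across visitors, which is rarely exactly true, so treat the result as a ceiling to refine rather than a final answer.

```javascript
// The monthly hit limit quoted from the Terms of Service in this post.
const MONTHLY_HIT_LIMIT = 10000000;

// Hypothetical helper: highest whole-number sample rate (in percent)
// expected to keep sampled hits at or under the limit.
function maxSampleRate(monthlyHits, limit = MONTHLY_HIT_LIMIT) {
  if (monthlyHits <= limit) {
    return 100; // already under the limit, no sampling needed
  }
  // Round down so the expected sampled volume does not exceed the limit,
  // and never return less than 1%.
  return Math.max(1, Math.floor((limit * 100) / monthlyHits));
}

console.log(maxSampleRate(50000000)); // 20
// _setSampleRate expects a string, so convert before using the value.
console.log(String(maxSampleRate(50000000))); // "20"
```

In practice you would likely pick a rate somewhat below this ceiling to leave room for traffic growth and seasonal spikes.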

The required code change is quite simple. There are three things you should be aware of when implementing a sample rate:

  • You should implement this new code on ALL pages of your site; any place that you define the tracking object.
  • It must be set prior to the _trackPageview call as in the examples below.
  • The rate must be an integer passed as a string value (which means you put it inside single quotes). So, _setSampleRate('20') would be a 20% sample rate.
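If you want to sanity-check these rules before deploying, a small helper like the one below can inspect a list of asynchronous-syntax commands. This is a hypothetical lint function of our own (`checkSampleRateCommands` is not part of ga.js), and the accepted range of 1 to 100 is our assumption about what a sensible rate looks like.

```javascript
// Hypothetical lint helper, not part of ga.js. It checks an array of
// async-syntax commands (the same arrays you would push onto _gaq).
function checkSampleRateCommands(commands) {
  const problems = [];
  const rateIndex = commands.findIndex((c) => c[0] === '_setSampleRate');
  const pageviewIndex = commands.findIndex((c) => c[0] === '_trackPageview');

  if (rateIndex === -1) {
    problems.push('_setSampleRate is missing');
    return problems;
  }
  // The sample rate must be set before the pageview is tracked.
  if (pageviewIndex !== -1 && rateIndex > pageviewIndex) {
    problems.push('_setSampleRate must come before _trackPageview');
  }
  // The rate must be an integer written as a string, e.g. '20'.
  // Assumption: we only accept whole numbers from 1 to 100.
  const rate = commands[rateIndex][1];
  if (typeof rate !== 'string' || !/^(100|[1-9][0-9]?)$/.test(rate)) {
    problems.push('rate must be an integer from 1 to 100, passed as a string');
  }
  return problems;
}

console.log(checkSampleRateCommands([
  ['_setAccount', 'UA-XXXXX-X'],
  ['_setSampleRate', '20'],
  ['_trackPageview'],
])); // []
```

An empty array means none of the three rules above were broken; otherwise each entry describes one problem to fix.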

Below is the code syntax for setting a 20% sample rate, depending on which ga.js code snippet you are using.

Asynchronous Code Syntax Example:

[code lang="js"]<script type="text/javascript">
// Create the command queue if it does not already exist.
var _gaq = _gaq || [];
_gaq.push(['_setAccount', 'UA-XXXXX-X']);
// Track only about 20% of visitors; must be pushed before _trackPageview.
_gaq.push(['_setSampleRate', '20']);
_gaq.push(['_trackPageview']);

// Load ga.js asynchronously so it does not block page rendering.
(function() {
  var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
  ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
  var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
})();
</script>
[/code]

Traditional Code Syntax Example:

[code lang="js"]<script type="text/javascript">
// Pick the secure or non-secure ga.js host to match the current page.
var gaJsHost = (("https:" == document.location.protocol) ? "https://ssl." : "http://www.");
document.write(unescape("%3Cscript src='" + gaJsHost + "google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E"));
</script>
<script type="text/javascript">
try {
  var pageTracker = _gat._getTracker("UA-xxxxxx-x");
  // Track only about 20% of visitors; must be set before _trackPageview().
  pageTracker._setSampleRate('20');
  pageTracker._trackPageview();
} catch(err) {}
</script>
[/code]

Since this is a client-side code change, it goes into effect as soon as the updated code is live on your pages, and you’ll see the results the next time data is processed in your GA account.

If you have implemented sampling in Google Analytics, what was your reason?

  • Were you violating the Google Analytics Terms of Service (ToS) and proactively decided to implement sampling to comply with the ToS?
  • Or did Google ask you to implement Google Analytics sampling because you were violating the Terms of Service?
  • Or was it another reason?

About the Author: Joe Christopher

As Vice President of Analytics at Blast Analytics, Joe leads a team of talented analytics consultants responsible for helping clients understand and take action on their vast amounts of data, to continuously improve and EVOLVE their organizations. With over 20 years of experience in analytics and digital marketing, Joe offers a high level of knowledge and guidance to clients across all industries. He is an expert in all major analytics platforms including Google Analytics and Adobe Analytics, as well as various tag management systems such as Tealium and Adobe Launch. He also consults on data visualization, data governance, and data quality strategies. His extensive expertise in many areas has enabled Joe to become a well-known thought leader and to speak at industry events such as Tealium’s Digital Velocity series. Joe stays current with information technology, programming languages, tools, and services, keeping Blast and its clients on the leading edge.
