Are Rogue Sites Influencing Your Google Analytics Data?

Posted on Mon, Jun 27, 2011
Filed Under: Google Analytics


Did you know that if someone puts your Google Analytics tracking code on their site (the same UA-#), visits to their site will show up in your Google Analytics profiles?

It is true, but thankfully, there is a way to fix this issue.

We won’t get deep into why someone would do this, but it generally stems from someone lifting your design or embedding your content within their site, both of which are nefarious. The people who do this are often unaware of the tracking code or too lazy to remove it.

In this Google Analytics Tips article, you’ll learn two valuable lessons:

  1. How to identify external sites and URLs that contain your own site’s tracking code
  2. How to filter out these visits so they do not impact your data analysis

How to Identify Rogue Sites

Note: We are using the new v5 interface to demonstrate this technique. At the top right of the Google Analytics interface, there is a link that says ‘New Version’. Be sure to switch to the new version prior to following these instructions.

Let’s create a custom ‘Hostname’ report with a dimension of ‘Hostname’ and a drill-down to ‘Landing Page.’ On this custom report, we’ll show the metrics ‘Visits’ and ‘Unique Visitors.’ Feel free to add additional metrics such as bounce rate, goal conversions, and revenue. Goals and revenue are useful indicators that help make sure you don’t filter out traffic that is valuable to you.

To make things easy, just click this custom report share link to add it to your GA login (or follow the instructions below to set this up yourself).

  1. Load up GA and go into your website’s profile (in the new version).
  2. Click on the ‘Custom Reports’ tab at the top.
  3. From the ‘Overview’ option on the left, click on ‘+ New Custom Report’.
  4. Enter a report name, name the report tab and metric group.
  5. Add a dimension of ‘Hostname’ and a drilldown dimension of ‘Landing Page’.
  6. Add the ‘Visits’ and ‘Unique Visitors’ metrics.
  7. Optionally, add a context filter to the custom report to exclude any hostnames that include your domain name. This step is not required, and you may prefer to keep your own domain in the report, since that gives a better picture of what percentage of traffic the rogue site is contributing.
  8. Save the report and run it.

Custom Report - Hostnames

On this report, hopefully you don’t see any hostnames that you don’t recognize. One word of caution: you’ll likely see two Google-related hostnames, translate.googleusercontent.com and webcache.googleusercontent.com. Neither of these should be treated as rogue. The translate hostname shows up when someone comes to your site and uses the Google Translate service. The webcache hostname shows up when someone clicks on the ‘Cached’ option on the lower left of your organic search result. This can be an indication that your website was experiencing downtime for that visitor (not always, but a good indicator).
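To put a number on how much traffic the unrecognized hostnames contribute, the report rows can be grouped offline. Here is a minimal Python sketch of that idea. It is not part of Google Analytics: the hostnames, visit counts, and the classify helper are all made up for illustration, and yourwebsite.com stands in for your own domain.

```python
# Hypothetical rows from the custom Hostname report: (hostname, visits).
REPORT_ROWS = [
    ("www.yourwebsite.com", 9200),
    ("translate.googleusercontent.com", 35),
    ("webcache.googleusercontent.com", 15),
    ("www.istoleyourcontent.com", 750),
]

OWN_DOMAIN = "yourwebsite.com"            # assumption: replace with your domain
GOOGLE_SERVICES = "googleusercontent.com"  # translate and cache hostnames


def classify(hostname):
    """Label a hostname as 'own', 'google', or 'unknown'."""
    host = hostname.lower()
    if host == OWN_DOMAIN or host.endswith("." + OWN_DOMAIN):
        return "own"
    if host.endswith("." + GOOGLE_SERVICES):
        return "google"
    return "unknown"


def visit_share(rows):
    """Return each label's share of total visits, as a percentage."""
    totals = {"own": 0, "google": 0, "unknown": 0}
    for hostname, visits in rows:
        totals[classify(hostname)] += visits
    grand_total = sum(totals.values())
    return {label: round(100.0 * count / grand_total, 1)
            for label, count in totals.items()}


if __name__ == "__main__":
    for label, pct in visit_share(REPORT_ROWS).items():
        print(f"{label}: {pct}% of visits")
```

Anything labeled ‘unknown’ is only a candidate for a closer look, not proof that a site is rogue.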

Now, if you do see some hostnames that you don’t recognize, click on the hostname to drill down to the visitor’s landing page. This shows the landing page URI on that hostname. From here, you can paste the hostname and landing page URI into a new browser tab to see which site is using your tracking code. You should also view the page source and press Ctrl+F (Cmd+F on a Mac) to search for your UA-#.
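If you have many landing pages to check, the search for your UA-# can be scripted. The sketch below is one way to do it in Python, under a few assumptions: the page source has already been saved or fetched into a string, the tracking ID follows the usual UA-account-property shape, and the sample source and the ID UA-1234567-1 are invented for the example.

```python
import re

# Classic Google Analytics property IDs look like UA-<account>-<property>.
UA_PATTERN = re.compile(r"\bUA-\d+-\d+\b")


def find_tracking_ids(page_source):
    """Return the distinct UA-# IDs found in a page's HTML, sorted."""
    return sorted(set(UA_PATTERN.findall(page_source)))


def uses_my_tracking_id(page_source, my_id):
    """True if the page source contains my_id as a complete UA-# ID."""
    return my_id in find_tracking_ids(page_source)


# Made-up page source, similar to what a copied page might contain.
SAMPLE_SOURCE = """
<script type="text/javascript">
  var _gaq = _gaq || [];
  _gaq.push(['_setAccount', 'UA-1234567-1']);
  _gaq.push(['_trackPageview']);
</script>
"""

if __name__ == "__main__":
    print(find_tracking_ids(SAMPLE_SOURCE))
    print(uses_my_tracking_id(SAMPLE_SOURCE, "UA-1234567-1"))
```

Comparing complete IDs rather than doing a plain substring search avoids a false match between, say, a property ending in -1 and one ending in -10.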

If you are convinced that this domain is rogue, you can of course take action as appropriate against them, but who knows how long it will take them to remove your content/tracking code. Instead, let’s go ahead and filter them out now.

How to Filter Against Rogue Sites

There are two methods of filtering out rogue sites that use your Google Analytics tracking code. One is a proactive approach while the other is reactive. We personally prefer the proactive approach, but I’ll share both. Both of these approaches are accomplished by using profile-level filters.

To add a filter in Google Analytics, you need to go into your profile settings and add a new filter.  You can find additional help on adding filters by reading this Google Analytics help article.

Proactive Exclusion

If your website domain is ‘www.yourwebsite.com’, we can set up the filter below to ONLY include visits to the site where the hostname matches the following regular expression: ‘yourwebsite|googleusercontent’. The | character denotes an ‘OR’ condition. I like to keep hostnames that contain googleusercontent because that shows me how many people visit from cache and how many people visit from the translate service.

It is worth mentioning that the profile filter type we are using here is an ‘Include’ filter. The include filter will ONLY keep data that matches the expression you enter. If the hostname does not match, you won’t be seeing this data in your reports.
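To get a feel for what the include pattern keeps, you can try it against a list of hostnames before touching a profile. The following Python sketch is only an approximation: it uses Python’s re.search to stand in for Google Analytics’ own pattern matching, assumes case-insensitive matching anywhere in the hostname, and uses invented hostnames (including a hypothetical look-alike).

```python
import re

# Pattern from the article; Python's re.search stands in for GA's matching.
INCLUDE_PATTERN = re.compile(r"yourwebsite|googleusercontent", re.IGNORECASE)


def include_filter(hostnames, pattern=INCLUDE_PATTERN):
    """Keep only hostnames that match the pattern anywhere in the string."""
    return [h for h in hostnames if pattern.search(h)]


HOSTNAMES = [
    "www.yourwebsite.com",
    "blog.yourwebsite.com",
    "translate.googleusercontent.com",
    "www.istoleyourcontent.com",
    "yourwebsite.copycat.example",  # hypothetical look-alike hostname
]

if __name__ == "__main__":
    for hostname in include_filter(HOSTNAMES):
        print("kept:", hostname)
```

In this sketch the look-alike hostname is kept too, because the pattern matches any hostname that merely contains ‘yourwebsite’. If that matters to you, a tighter pattern that spells out the full domain is one option to test.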

Filter Hostnames - Proactive

Reactive Exclusion

After analyzing your new custom hostname report, make a list of the hostnames you want to block. Either add a separate hostname exclusion filter for each one, or add a single filter that uses the | character as an OR condition.

So for example, let’s say you wanted to exclude the following (hopefully fictitious) hostnames from your profile: www.istoleyourcontent.com and www.iknowishouldntsteal.com. You would create an ‘Exclude’ filter on the ‘Hostname’ filter field with a filter pattern of ‘istoleyourcontent|iknowishouldntsteal’ (don’t include the single quotes in the pattern). I leave out the www and the .com on purpose: if they also own the .net or other domains, I still don’t want them to show up.
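The same kind of dry run works for the exclude pattern. This is again a hedged Python sketch rather than Google Analytics itself: re.search approximates the filter’s matching, case-insensitivity is assumed, and every hostname in the list is fictitious.

```python
import re

# Exclude pattern from the article, without the surrounding quotes.
EXCLUDE_PATTERN = re.compile(
    r"istoleyourcontent|iknowishouldntsteal", re.IGNORECASE
)


def exclude_filter(hostnames, pattern=EXCLUDE_PATTERN):
    """Drop every hostname that matches the pattern anywhere in the string."""
    return [h for h in hostnames if not pattern.search(h)]


HOSTNAMES = [
    "www.yourwebsite.com",
    "www.istoleyourcontent.com",
    "istoleyourcontent.net",        # same name, different TLD, no www
    "www.iknowishouldntsteal.com",
    "www.anewcopycat.example",      # hypothetical site not yet on the list
]

if __name__ == "__main__":
    for hostname in exclude_filter(HOSTNAMES):
        print("kept:", hostname)
```

Leaving out the www and the TLD means the .net variant is dropped as well. The last hostname still gets through, which illustrates why the reactive approach depends on revisiting the hostname report.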

Filter Hostnames - Reactive

Important Notes About Google Analytics Filters

  • When you apply a filter to your Google Analytics profile, it only filters NEW data. Your historical data will not be re-processed. At the same time, if you incorrectly apply a filter that ends up excluding all of your traffic, you can’t undo it — you are stuck with your mistake. For this reason, we STRONGLY recommend that you initially apply any filters to a new, test profile and then monitor the data for a few days and only apply the filter to your master profile when you are comfortable with your decision. If you are nervous about applying this or any type of Google Analytics filter, we do offer Google Analytics consulting services to ease your mind.
  • A Google Analytics best practice is to always create an additional profile that remains unfiltered. That way, if you did mess up, you’ll at least have data from the affected time period.
  • Be very careful when applying an ‘Include’ filter — if you enter the wrong filter pattern, you can easily end up excluding ALL traffic. Another general tip about include filters is that you can use multiple include filters if the filter fields are mutually exclusive. There is a great blog post about using multiple include filters from Lunametrics (another Google Analytics Certified Partner).
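Alongside a test profile, a quick offline check can catch an include pattern that would wipe out your traffic. The Python sketch below shows one way to estimate this from hostname visit counts. The retained_share helper, the sample counts, and the misspelled pattern are all hypothetical, and Python’s re module only approximates how Google Analytics evaluates filter patterns.

```python
import re


def retained_share(include_pattern, visits_by_hostname):
    """Fraction of visits (0.0 to 1.0) that an include pattern would keep."""
    # re.compile raises re.error if the pattern is not a valid expression.
    regex = re.compile(include_pattern, re.IGNORECASE)
    total = sum(visits_by_hostname.values())
    if total == 0:
        raise ValueError("no visits in the sample")
    kept = sum(visits for host, visits in visits_by_hostname.items()
               if regex.search(host))
    return kept / total


# Made-up visit counts per hostname, e.g. copied from the hostname report.
SAMPLE = {
    "www.yourwebsite.com": 900,
    "translate.googleusercontent.com": 25,
    "www.istoleyourcontent.com": 75,
}

if __name__ == "__main__":
    # The second pattern contains a deliberate typo.
    for pattern in ("yourwebsite|googleusercontent", "yourwebsiet"):
        share = retained_share(pattern, SAMPLE)
        warning = "  <-- would drop ALL traffic" if share == 0 else ""
        print(f"{pattern!r}: keeps {share:.1%} of visits{warning}")
```

A check like this does not replace the test profile, since only Google Analytics can tell you how it will actually process the filter.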

This post was written by:

Joe is the Analytics Director and a Partner at Blast Analytics & Marketing. He understands Google Analytics like nobody else and is a master of many programming languages.

  • Greg Moore

    Hi,

    We sometimes link to external web sites. Clicking the link is a Goal, but the real Goal happens on the websites of the business partners we link to. They are open to allowing us to place some GA JavaScript on their site, on a Thank You page.

    Are there problems with doing this? Might it be a good way for us to track conversions? These are not rogue sites, but the idea of having our code on their site is similar.

    Thanks.

  • Joe Christopher (http://www.blastam.com)

    Hi Greg,

    If you simply need to know if visitors clicked out to the 3rd party site, I’d recommend tracking outbound links with the script I provide on this blog post: http://www.blastam.com/blog/index.php/2011/04/how-to-track-downloads-in-google-analytics/

    If they need to do something on the other site and get to a thank you page on the other domain, then yes, you’ll want to see if you can add tracking code on those pages and then deploy cross domain tracking to ensure everything is being tracked under the same session in GA. In this case, you would want to add that 3rd party domain/hostname to your filters to ensure that it is NOT excluded.

    Hope that helps and thanks for reading our blog!

    Joe



Copyright © 1999-2013 Blast Analytics & Marketing