Recently a client asked us whether traffic from their test site was being posted to their Google Analytics (GA) account. It turns out it was. Not only that, so was traffic from any version of the site hosted on their development and test servers, and from any developer’s local site.
We tested this by placing several pages on various websites and subdomains. For example, pages were placed on www.popart.com, test.popart.com, and blogs.popart.com, as well as www.clientsite.com and test.clientsite.com, all with the same Google Analytics tracking code (GATC).
Data was collected under that profile for each of these pages in Google Analytics regardless of domain or subdomain.
How Can We Eliminate Development Traffic From Our Statistics?
Removing the GATC from all but the live site was one option, but maintaining two sets of files would require us to remember to insert the code when pages go live, raising the risk of releasing a page without a GATC and losing data.
Also, this solution would not address the possibility that someone could accidentally or intentionally copy our code to their site, meaning we’d be collecting stats from their site as well.
Instead, by using filters we can target and collect only the site traffic we’re interested in. Moving forward, whenever we create a GA website profile we will also create a second profile to which we apply our filters. This leaves one profile untouched with all of our master data, and gives us one that only collects data from our target domains.
Creating a Second Web Profile
- Select Create New Website Profile, and then choose Add a Profile for an existing domain.
- Select the Domain from the drop down for http://www.mysite.com and give a profile name that suggests it’s the filtered version of your data.
- Select continue.
This sets up a second profile that uses your original GATC – not a new one. So any data collected moving forward will be recorded in BOTH of your profiles.
If you are doing this to a site profile that already contains data – be aware that the second profile does NOT pull the historical data. While it is a profile that uses the same GATC it is not a duplicate. Any previous filters, reports or goals are NOT included in this second profile – you’d need to add those manually.
The filter works only on stats collected moving forward. So the goal for this second profile is to include only web traffic from http://www.mysite.com and http://mysite.com. What we want to exclude is everything else. Excluding the chaff from development environments and different domains was easy – subdomains proved a bit trickier. But we found a filter that seems to work for these purposes.
Adding a Filter
- From the home view that allows you to see your Web profiles listed, you’ll want to select Edit.
- Scroll down to Filters Applied to Profile and click Add Filter. Give the filter an intuitive name such as “Only www.mysite.com traffic” or some such. Under filter type select Custom Filter, Include.
- Under Filter Field select Hostname.
- Under Filter Pattern we’ve used the following regex: ^mysite\.com|www\.mysite\.com The caret anchors the first alternative to the start of the hostname, so subdomains such as test.mysite.com don’t match it; the pipe is an ‘or’ that adds back www.mysite.com.
- If there are any other subdomains that you WANT to include in this profile, such as blogs.mysite.com, add them at the end after a | (pipe). With one additional subdomain, the pattern would look like this: ^mysite\.com|www\.mysite\.com|nameofsubdomain\.mysite\.com “What kind of special characters can I use?” gives a more in-depth explanation of the characters used and other filter options.
- Case Sensitive – No.
- Save Changes.
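To get a feel for which hostnames the pattern lets through, here is a minimal Python sketch. It assumes the filter does a partial, case-insensitive regex match (mirroring Case Sensitive – No) and that Python's re module treats these simple constructs the same way GA does. The hostname list and the stricter STRICT_PATTERN are hypothetical additions for comparison, not part of the original setup.

```python
import re

# The pattern from the filter steps above, exactly as entered in GA.
POST_PATTERN = r"^mysite\.com|www\.mysite\.com"

# Hypothetical stricter variant (not from the steps above): anchored at both ends.
STRICT_PATTERN = r"^(www\.)?mysite\.com$"


def included(pattern: str, hostname: str) -> bool:
    """Return True if an Include filter with this pattern would keep the hostname.

    Assumption: the filter is a partial, case-insensitive regex match.
    """
    return re.search(pattern, hostname, re.IGNORECASE) is not None


# Hypothetical hostnames that might send hits with the same GATC.
HOSTNAMES = [
    "mysite.com",
    "www.mysite.com",
    "test.mysite.com",
    "localhost",
    "www.clientsite.com",
    "mysite.com.example.org",
]

if __name__ == "__main__":
    for host in HOSTNAMES:
        post = included(POST_PATTERN, host)
        strict = included(STRICT_PATTERN, host)
        print(f"{host:24} post={post} strict={strict}")
```

Under these assumptions the pattern keeps mysite.com and www.mysite.com and drops test.mysite.com and localhost. Because nothing anchors the end of the pattern, a hostname that merely starts with mysite.com would also pass, which is what the stricter variant is meant to illustrate.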
This profile takes at least 24 hours to show data, although it begins collecting data immediately. Depending on how quickly you set up the filter, you might get some traffic from other domains, but the majority should be from the domains you specified.
This is Great But What About My Historical Data?
GA does allow you to view your data by Hostname, and it also allows you to create custom reports. While this may not solve all of your problems, it can give you insight into specific questions. In the scenario presented by our client, the numbers on one of their goal pages were higher than the number of completed e-commerce transactions. By doing the following they should be able to view the page and exclude numbers from different domains.
Viewing by Hostname
- In your original profile that contains all data (including historical data) navigate the following in the sidebar:
- In this case we’re interested in the numbers of unique visitors to our goal page from both www.mysite.com and mysite.com.
- Select www.mysite.com to view the details.
- After clicking the domain, use the dropdown next to Dimension to select Landing Page, then locate the goal page among the listed pages. You can narrow the time frame to only the days you’re interested in.
- You’d need to do this again under mysite.com, and then add the two numbers to get an overall view of the goal page data.
Another way to do this, which might be helpful should you need to refer to this data again in the future, is to create a custom report.
Creating a Custom Report
- Click Custom Reporting in the sidebar, then in the upper right of that page, select Create new Custom report.
- I titled mine “Pageviews by hostname, page and day”, but give yours a title that means something to you.
- Under Metrics, select Content. I chose Unique Pageviews, but select whichever metric you are interested in tracking. Drag it over to the blue box.
- Under Dimensions you’ll choose three:
  - Systems/Hostname – drag over to the first subdimension box
  - Content/Page – drag to the second subdimension box
  - Visitors/Day – drag to the third
  I’m not sure how much the order matters here. I think it sets the drill-down sequence, so if it makes more sense for you to pick the day before the page, you’d probably want to switch the order.
- Save the report, and you should then be able to open it under Custom Reporting anytime in the future. You will still need to add the two numbers together from your two domains, but this shows you a day-by-day breakdown as you click to drill down into the numbers.
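If adding the two hostnames together by hand gets tedious, the arithmetic is easy to script. The Python sketch below assumes you have copied rows out of the report as (hostname, page, day, unique pageviews). The rows, the /thank-you goal page, and the dates are all made up for illustration, and this is not a GA export format.

```python
from collections import defaultdict

# Hypothetical rows copied by hand from the custom report:
# (hostname, page, day, unique pageviews). All values are made up.
ROWS = [
    ("www.mysite.com", "/thank-you", "2010-03-01", 12),
    ("mysite.com", "/thank-you", "2010-03-01", 3),
    ("test.mysite.com", "/thank-you", "2010-03-01", 5),
    ("www.mysite.com", "/thank-you", "2010-03-02", 9),
    ("mysite.com", "/cart", "2010-03-02", 4),
]

# The two hostnames we treat as the live site.
LIVE_HOSTS = frozenset({"www.mysite.com", "mysite.com"})


def daily_totals(rows, page, hosts=LIVE_HOSTS):
    """Sum unique pageviews per day for one page, counting only the given hosts."""
    totals = defaultdict(int)
    for hostname, row_page, day, unique_pageviews in rows:
        if hostname in hosts and row_page == page:
            totals[day] += unique_pageviews
    return dict(totals)


print(daily_totals(ROWS, "/thank-you"))
# → {'2010-03-01': 15, '2010-03-02': 9}
```

The test.mysite.com row is left out of the first day's total, which is the same exclusion you would otherwise do by eye when reading the report.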
This may be useful to you in case you don’t want to add the filter, or you find this is a better way to view historical data for comparison. You could essentially do both – since the filtered profile is separate from this one.
While I looked into a good portion of this on my own, the idea of creating a second website profile and applying a filter that just pulls traffic from your site was only one example from a chapter called “Best Practices Configuration Guide” from Advanced Web Metrics with Google Analytics by Brian Clifton. I would highly recommend this book to anyone who wants a more thorough understanding of what GA can offer.
This was originally posted on my work blog, and I’m re-posting it here for archival purposes.