Diagnosing and debugging issues with Google Analytics setups
In this article Georgi Georgiev, one of the first certified Google Analytics specialists in the world, shares his advice and the tools of the trade he uses for diagnosing and debugging tracking issues one might encounter when using Google Analytics. Georgi was a lecturer at several Google-organized educational events for digital agencies in his capacity as a Google regional trainer. He is also the founder of Analytics-Toolkit.com – a toolkit for the analytics professional.
Google Analytics is currently the de-facto standard tool for analyzing user behavior on websites and mobile apps. It is used by a large proportion of websites since it provides enterprise-level functionalities for free. Its range of use spans from personal blogs to large e-commerce stores and web portals. Thousands of business decisions are made based on Google Analytics data and thus rely on its accuracy.
In my many years in the business I’ve seen that a lot of online businesses have issues with the quality and accuracy of the information they collect via Google Analytics, which is obviously a problem, since an analysis based on bad data will almost inevitably lead to the wrong conclusions. I’ve seen even very large technology businesses and e-stores with glaring issues in their analytics setups. Let us go over the most common mistakes I see in the wild and then I’ll cover the tools I use to diagnose and fix these issues.
The “set and forget” fallacy
This is one of the significant underlying issues: people think Google Analytics is very easy to set up and that, once that's done, there is no need to check the quality of the data or update the code. I place some of the blame for this attitude on Google themselves, as Analytics is advertised as a very easy to use piece of software: you place a code snippet and that's it - it simply works! This may be true for a decent proportion of websites out there: those that are simple in both structure and technology and that require no specialized data or measurement plan. For such websites it can indeed be said that Google Analytics can be configured in 3 minutes, but these are also the ones least in need of proper tracking. In fact, a simple visitor counter, maybe accompanied by pageviews per page and traffic source, is all they need.
The situation is much different for the real clients of Google Analytics: those which require solid data to guide business decisions. These clients are gravely mistaken if they are misled by the promise of a quick and easy integration. Such sites often span several domains and subdomains, employ AJAX, dynamic content loading and may in fact be a SPA (Single Page App). They usually need to track a host of different user actions such as phone clicks, email clicks, scrolling, video plays, e-commerce related actions and detailed purchase information, etc. All of these require a detailed measurement plan, execution and data-quality audit before they can become trusted business metrics.
Alas, there are still a lot of businesses which only use 5-10% of the capabilities of Google Analytics and never make it to the next level, despite an obvious need. In other cases, the lack of understanding of the platform leads to integrations with a significant number of issues, some of which we’ll cover below.
A part of the issue is the lack of regular audits aimed at assuring data quality and detecting issues, as well as keeping the integration up to date with the ever-changing and improving Google Analytics platform.
With the above in mind, the advice I give my clients is to audit the tracking setup in all of these cases:
Launch of a new website (obviously)
When they introduce significant changes to how the site works, adding/removing features, etc.
Before releasing a redesign to production
When there is a CMS or CMS plugin update
When there are changes to the core business model or the primary and secondary website goals
If this doesn't happen, one ends up with what I see all day long: websites using a Google Analytics library that was deprecated 10 years ago. Using such an old library leads to inaccurate or misleading statistics. For example, the older ga.js library doesn't recognize Google searches performed on Android as organic search, so they show up as referral traffic. Using an old library also means your business cannot take advantage of many of the more recent improvements and additions to the platform.
When upgrading to a newer Google Analytics version it is important to make sure all integration points are upgraded. Failure to do so will lead to loss of tracking or worse.
Double-counting sessions and pageviews

You will be surprised how often I encounter cases where a single page hosts two Google Analytics trackers sending traffic to the same tracker ID. The two trackers work in parallel and, since Google Analytics performs no checks against such a situation, clients end up with twice the pageviews, an unrealistic bounce rate (like 1%), and unrealistic page-based metrics such as pages per session. Fixing such an issue leads to the unpleasant discovery that not 1%, but 70% of users bounce.
Double-counting often happens when a developer installs a new Analytics code snippet while forgetting that one is already present: either embedded directly in the code or delivered via Google Tag Manager. On WordPress-based websites there can be more than one plugin placing Google Analytics code snippets. One might also end up with double-tracking if one snippet is embedded directly and another instance is inserted by a plugin.
Fortunately, this problem is one of the easiest to identify as the incredibly low bounce rate is a dead giveaway. In more complex cases only some pages of a larger website have double-counting: detecting these requires page-level bounce rate analysis. The tools I list in the tool section can all help detect double-tracking one way or another.
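Beyond eyeballing the bounce rate, you can also check for duplicate trackers directly from the browser console. The sketch below assumes the site uses the analytics.js library, which exposes a `ga.getAll()` method returning all registered tracker objects; the helper function name is my own invention:

```javascript
// Paste into the browser console on a page you suspect of double-tracking.
// Assumes analytics.js with the default `ga` global is loaded on the page.
function findDuplicateTrackers(trackers) {
  // Group trackers by their property ID and report any ID used more than once.
  var counts = {};
  trackers.forEach(function (t) {
    var id = t.get('trackingId');
    counts[id] = (counts[id] || 0) + 1;
  });
  return Object.keys(counts).filter(function (id) {
    return counts[id] > 1;
  });
}

// On a live page you would call: findDuplicateTrackers(ga.getAll())
```

If the returned list is non-empty, two trackers are sending hits to the same property and every pageview is being counted twice.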
Broken bounce rate metric
Double-counting is not the only way to break your bounce rate metric. Bounce rate is commonly defined as the percentage of sessions during which the user saw only a single page of your website. That is the definition usually cited, but the actual, technical definition is a bit different: the proportion of sessions during which only one hit of type pageview or event is registered for which the nonInteraction flag is not set.
The bounce rate metric is often “broken” when you want to measure user actions within the page by sending an event to Google Analytics which informs it of information not available during the page load. Scroll events or timing events (read time) are common, and so are events that push custom dimensions.
If you use such events and do not set the nonInteraction flag on them, you will see an unrealistic bounce rate. If the event fires in almost every session, the damage rivals double-counting; if it fires only in some sessions, the skew is milder. Either way it is frustrating, especially given that there is no way to fix the data retroactively.
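For illustration, here is how a scroll-depth event might be sent via analytics.js without breaking the bounce rate. The `nonInteraction` field is the actual analytics.js field name; the wrapper function and the category/action labels are made up for this example:

```javascript
// Hypothetical wrapper around the analytics.js command queue (`ga`).
// The category and action names are illustrative, not a standard.
function sendScrollEvent(ga, depthPercent) {
  ga('send', 'event', 'Engagement', 'scroll depth', String(depthPercent), {
    // Without this flag, a session whose only hit besides the pageview is
    // this event would no longer be counted as a bounce.
    nonInteraction: true
  });
}
```

The same flag exists in gtag.js (as `non_interaction`) and as a checkbox on Google Tag Manager event tags, so the principle carries over regardless of how the event is deployed.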
Measuring internal traffic, bot traffic or sessions from test/dev/staging versions
I often analyze setups in which the Google Analytics code is present not only on the production version of a website but also on testing / development / staging ones. These are usually accessed only by personnel working on them, so there is no reason to include their visits in your main reporting view. Even if the ratio of internal visits to visits from actual potential clients is small, it can lead to significant discrepancies in the data, since some metrics are much more prone to being skewed than the overall number of sessions or users.
For example, a single web developer working on the checkout process might end up registering 50 orders in Google Analytics in a given day, which would lead to significantly skewed data even for a store with 1000 orders per day.
The solution is to employ view-level filters which exclude traffic from these hostnames. Even better if you can implement it: deploy a filter which only includes traffic from the production domain(s) of the website.
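View-level filters are the primary fix, but as a complementary, defence-in-depth measure you can avoid loading the tracker at all on non-production hostnames. A minimal sketch, where the list of production hosts is an assumption you would replace with your own domains:

```javascript
// Hypothetical guard: only initialize the tracker on production hostnames.
// Replace PRODUCTION_HOSTS with your actual production domain(s).
var PRODUCTION_HOSTS = ['www.example.com', 'shop.example.com'];

function isProductionHost(hostname) {
  return PRODUCTION_HOSTS.indexOf(hostname) !== -1;
}

// On a live page:
// if (isProductionHost(window.location.hostname)) { /* load GA snippet here */ }
```

Note that this only keeps staging and dev hits out; the view-level include filter is still needed to catch bot traffic that fakes hits to your property ID without visiting any of your sites.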
Another significant issue, in case the company has many employees visiting the site, is the absence of filters for such visits. This leads to skewed data and may bias the outcome of A/B tests, marketing efforts and so on. The best way to address it depends on the particular case.
During the last couple of years we've also seen an increased volume of bot traffic: software which automatically visits your website, or which simulates a visit for Google Analytics without actually visiting the site. The purpose is often to promote a scam online marketing service. At some point the volume of this traffic was quite significant, with new spammers popping up each day. Nowadays the issue is more episodic in nature, but it can still lead to significant pollution of your data unless you take precautions.
A filter which only includes traffic to your own domain(s) will filter out some of the bots; for the rest there are a number of solutions, including fully-automated ones suitable for digital agencies and analytics specialists managing dozens or hundreds of Google Analytics setups. One of these can be found on Analytics-Toolkit.com (the Auto Spam Filters tool).
Inaccurate goal tracking or e-commerce tracking
Another mistake I often see is poor goal and goal funnel configurations. Sometimes these are wrong from the very beginning; in other cases the site has changed (redesign, CMS update, etc.) while the goals stayed the same, leading to issues such as inaccurate data or data missing entirely for periods of time.
The best way to guard against such issues is to make the tracking audit an integral part of releasing any kind of significant change to the website, such as those in the example list provided above. Since at least part of the data, e.g. e-commerce transactions, is also stored in your backend, it is a good idea to check for any unusual discrepancies between that data and the numbers in Google Analytics.
Google Analytics debugging tools
As with any other occupation, the quality of your work will improve with better tools. These are some tools I use on a daily basis which I'm sure you'll also find useful in diagnosing and debugging Google Analytics integrations:
1. Chrome Tag Assistant – an easy to use, beginner-friendly tool which will show you all tags present on a page and alert you to some common integration issues. Its “Recording” function allows you to obtain a log of all commands issued to Google Analytics while you browse the site. You can use it to see what Google Analytics is and is not tracking.
2. Chrome Google Analytics Debugger – a plugin for more advanced users: it shows in real time what hits are sent to Google Analytics, with all of their parameters and errors.
3. Google Analytics data itself – a lot of issues can be inferred from the data. For example, an extremely low bounce rate can be indicative of double tracking or events which fire without user interaction but do not set the nonInteraction flag. The real-time reports can show you immediately how different changes to your setup affect your data.
4. The Network tab of your browser's developer tools can also help – in it you can see all requests to and responses from the GA servers. This tool is appropriate only for advanced users with a good understanding of the HTTP protocol, as the data will be quite cryptic otherwise.
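To make the Network tab less cryptic, it helps to know that each hit is simply a request to a /collect endpoint with Measurement Protocol parameters in the query string: `t` is the hit type, `tid` the property ID, `dl` the document location and `ni` the nonInteraction flag. A quick sketch for decoding a request URL copied from the Network tab (the function name is my own):

```javascript
// Decode the query string of a /collect request into a readable object.
// Pass in a URL copied from the browser's Network tab.
function parseCollectHit(url) {
  var query = url.split('?')[1] || '';
  var params = {};
  query.split('&').forEach(function (pair) {
    if (!pair) { return; }
    var parts = pair.split('=');
    params[decodeURIComponent(parts[0])] = decodeURIComponent(parts[1] || '');
  });
  return params;
}
```

Seeing `ni=1` on your in-page events, or two identical pageview hits with the same `tid` firing on every load, confirms at the wire level the issues discussed earlier in this article.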
5. The Google Analytics Health Check tool will be useful if you need to do more than a couple of audits per month. It will save you huge amounts of time and will prevent human errors even if you already have a good checklist to go through. It can also be used to frequently re-analyze the health of a single integration. I use it in all my audits, obviously.
Mastering the above tools will allow you to detect and debug almost all possible issues with a Google Analytics setup. In this article I chose not to cover issues related to the Measurement Protocol as they are diagnosed using a different set of methods entirely.