How to Chart Web Traffic using Google Analytics and Displayr
In an earlier blog post I showed you how to set up the authentication protocol between Displayr and Google Analytics using the googleAuthR package. Now I will show you how to pull visitor data from your website via the Google Analytics API directly into Displayr so you can create an easy visualization.
The R package googleAnalyticsR has been built specifically for R users of the Google Analytics Reporting API v4. I have previously outlined the best authentication process between Displayr and the API (see How to connect Displayr to the Google Analytics API for more details), but here is a quick recap. Essentially, we need to log into Google Analytics, set up a Google project and service account, and then download a secret JSON key containing the authentication credentials, which we pass to the API via R code.
Once authentication has been set up in your R Output via Insert > R Output (Analysis Group), the next thing we need is the View ID of the website you want to pull data from. To determine your View ID, ensure you are logged into your Google Analytics account under the specific website you want to view (if you have multiple sites monitored), click Admin on the bottom left, go to the View column and click View Settings. The View ID will be visible under Basic Settings.
Call the API
In the example below, I call four different metrics (users, new users, sessions and page views) from my website for all records last quarter, split by date:
library(googleAnalyticsR)

view_id = XXXXXX # replace this with your View ID

df = google_analytics(view_id,
                      date_range = c("2018-07-01", "2018-09-30"),
                      metrics = c("users", "newUsers", "sessions", "pageViews"),
                      dimensions = c("date"),
                      max = -1)
Here I have used max = -1 so that it will pull all the data, but you can also cap this at a specific number if you wish. Once you press Calculate, you will see a table with one row per date and one column for each of the requested metrics.
If you also want this call to run daily at a specific time (e.g. 9am), you can add the lines below to the top of the R Output:
library(flipTime)
UpdateAt("01-11-2018 09:00", units = "days", frequency = 1, options = "wakeup")
It’s important to note that for standard Google Analytics accounts, data sampling occurs once the specified date range contains more than 500k sessions at the property level (see data sampling), so that results can be returned faster. This means that if your API call covers more than 500k sessions, the values returned will be estimates based on a subset of sessions rather than measured values. Of course, whether a call crosses this threshold will depend on the popularity of your site, the specified date range, and other factors.
If you are using a wide date range, it may be prudent to split it into separate calls so as to avoid hitting the session sampling threshold, and then combine the results later using a simple rbind command, for example. You can easily compare the outputs with those produced by Google Analytics to ascertain the correct split.
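As a rough sketch of that splitting approach, the helpers below cut a date range into consecutive chunks, fetch each chunk separately and stack the results with rbind. The names split_date_range and fetch_in_chunks and the 31-day default are my own illustrative choices rather than part of googleAnalyticsR, and the fetch step is passed in as a function so you can wrap your own google_analytics call.

```r
# Split [start, end] into consecutive chunks of at most `days_per_chunk` days.
# Returns a list of c(start, end) character pairs in "YYYY-MM-DD" format.
split_date_range <- function(start, end, days_per_chunk = 31) {
  start <- as.Date(start)
  end <- as.Date(end)
  stopifnot(start <= end, days_per_chunk >= 1)
  chunk_starts <- seq(start, end, by = days_per_chunk)
  # Each chunk ends the day before the next one starts; the last ends at `end`.
  chunk_ends <- c(chunk_starts[-1] - 1, end)
  starts <- format(chunk_starts)
  ends <- format(chunk_ends)
  lapply(seq_along(starts), function(i) c(starts[i], ends[i]))
}

# Call `fetch_fun` once per chunk and row-bind the resulting data frames.
# `fetch_fun` is a hypothetical wrapper that takes c(start, end) and returns
# a data frame, for example a wrapper around google_analytics().
fetch_in_chunks <- function(fetch_fun, start, end, days_per_chunk = 31) {
  ranges <- split_date_range(start, end, days_per_chunk)
  pieces <- lapply(ranges, fetch_fun)
  do.call(rbind, pieces)
}

# Possible usage inside the R Output (assumes view_id is already defined):
# df <- fetch_in_chunks(
#   function(range) google_analytics(view_id,
#                                    date_range = range,
#                                    metrics = c("users", "newUsers", "sessions", "pageViews"),
#                                    dimensions = c("date"),
#                                    max = -1),
#   "2018-07-01", "2018-09-30")
```

Because the data is split by date, each row belongs to exactly one chunk, so stacking the pieces should give the same rows as a single unsampled call.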
Another option is to use the anti_sample = TRUE setting in your API call, but it won’t work in every situation. If you click Show raw R output under OUTPUT on the Object Inspector when using this option, you can read logs outlining how much sampling is taking place. Anti-sampling exports all records by default, so you don’t need to set a value for max. It does, however, require exact start and end dates; without anti-sampling you can also use date shortcuts such as “90daysAgo” or “yesterday” for either date. For a list of all the metrics and dimensions you can call via the API, see API names.
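To make the difference between the two modes concrete, here is a minimal sketch that assembles the arguments for google_analytics in either case. The helper build_ga_args is hypothetical (my own name, not part of googleAnalyticsR); it only encodes the rules described above: exact dates are required when anti-sampling, and max is only set when anti-sampling is off.

```r
# Build an argument list for googleAnalyticsR::google_analytics().
# build_ga_args is an illustrative helper, not a package function.
build_ga_args <- function(view_id, start, end, anti_sample = FALSE) {
  if (anti_sample) {
    # Shortcuts such as "90daysAgo" do not parse as YYYY-MM-DD dates.
    parsed <- as.Date(c(start, end), format = "%Y-%m-%d")
    if (any(is.na(parsed))) {
      stop("anti_sample = TRUE needs exact dates such as \"2018-07-01\"")
    }
  }
  args <- list(
    viewId = view_id,
    date_range = c(start, end),
    metrics = c("users", "newUsers", "sessions", "pageViews"),
    dimensions = c("date"),
    anti_sample = anti_sample
  )
  # Anti-sampling already fetches every row, so max is only set without it.
  if (!anti_sample) {
    args$max <- -1
  }
  args
}

# Possible usage (assumes authentication is set up and view_id is defined):
# df <- do.call(googleAnalyticsR::google_analytics,
#               build_ga_args(view_id, "90daysAgo", "yesterday"))
# df <- do.call(googleAnalyticsR::google_analytics,
#               build_ga_args(view_id, "2018-07-01", "2018-09-30", anti_sample = TRUE))
```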
Visualize your data
Now that we have the data as a table, we can hook this up to one of Displayr’s cool visualizations. I have chosen the area chart (Insert > Visualization (Analysis group) > Area Chart) which is essentially a line chart with the background colored in. I just need to select the R output under DATA SOURCE > Output in ‘Pages’ on the Inputs tab of the Object Inspector and change some settings.
First, I will tick Show as small multiples (panel chart) to split this into separate charts for each metric, then I will add a smoother line on the Chart tab under TREND LINES > Line of best fit. I’ve chosen Friedman’s super smoother, changed Line type to dot and ticked Ignore last data point.
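If you are curious what that trend line is doing, base R ships an implementation of Friedman’s super smoother as stats::supsmu. The sketch below is only an approximation of the chart setting: I am assuming the visualization behaves comparably to supsmu, and the sessions values are synthetic numbers generated for illustration, not real traffic.

```r
# Illustrative synthetic series only: 30 days of made-up session counts.
set.seed(1)
days <- 1:30
sessions <- 100 + 2 * days + rnorm(30, sd = 10)

# Mimic "Ignore last data point": fit the smoother on all but the final day,
# which is often incomplete when the data refreshes during the day.
keep <- head(seq_along(days), -1)
fit <- supsmu(days[keep], sessions[keep])

# fit$x holds the days used and fit$y the smoothed values at those days.
# plot(days, sessions, type = "l")
# lines(fit$x, fit$y, lty = "dotted")
```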
You can make an area chart for free using Displayr’s area chart maker! Plus now that you know how to link up Google Analytics to Displayr, you can use your own website data!
About Oliver Harrison
After completing a PhD in German history and literature, Oliver swapped old dusty books for computer screens and logic. He then enjoyed the next 10 years as a survey programmer and data analyst in the Australasian market research industry. Today Oliver is passionate about problem-solving and helping customers achieve their goals as a member of the Customer Success team at Displayr.