How to Build a Brand Funnel in Displayr using R

Have you ever been in a situation where you need to update a KPI summary but you want everything within the one table? This is particularly handy for tracking where you can automatically pull together different bits of information into one summary table. All this can be accomplished in Displayr - and it’s easier than you might have suspected!

To illustrate this how this is done, I’m going to create a table that represents the concept of a brand funnel (similar to a purchase funnel). The final result will look like this:

Brand funnel summary table

The data therein is often represented visually with the image of a funnel – as we do in this example dashboard. I have three separate questions that feed into the table: Awareness, Consideration and Preferred Cola (called Main Brand in my table above). Then I calculate a Ratio as a fourth line (using the Consider figure divided by the Main figure for each brand).

Potentially, you could apply this procedure to any KPI summary or creation of a brand index. It just involves a bit of simple R code with the application of some basic logic. In this post, I’ll explain step by step how I made the above table. The key steps we’ll go through are:

  • Setting up your source tables
  • Creating an R output
  • Writing some instructions (in code)

For that reason, I’ve broken up the fourth step above into bits to show you how it works, line by line. As a result, this post also serves as an introduction to working with R to those who are new to it! (Don’t worry, it’s not THAT scary). You can interact with the examples in this post in our example document.

Add your data

The first step to any analysis is to get your data into the software. After logging in, create a new document using the + Add New button, and then add data to your document using Home > Data > Data Set. Use the My Computer option to upload a file from your PC, or use one of the other sources like R, SQL, URL, etc. More on the kind of data sets you can use in Displayr is discussed here.

Set up your source tables

The easiest way to create the statistics we need for the funnel is to create tables in Displayr. This is done by dragging from the Data Sets section on to your page.

We later reference certain cells within those tables and rearrange them in the final summary table. For the worked example, I created three tables upfront, shown below:

Create an R output

To set up the code we need for the funnel, select Insert > R Output. An R Output allows you to use R code to create tables, charts, and text based on other outputs and data in your document.

You can then type in your R code in the R CODE section on the right-hand side. Try typing in just the number 1 and pushing Calculate. It should return just the digit  in the output area (in the middle of screen).

Then you could try typing 1 + 1 and pushing Calculate. It should return… *drumroll*2! Easy.

For now, make sure the Automatic toggle box is deselected (otherwise the code will recompute every time you type something in). We can set it to automatic later, after we’ve finished the code.Some very basic R

Now try typing:  c(“Coca-Cola”,”Pepsi”,”Fanta”,”Pepsi Max”)and pushing Calculate.

Code snippet

That should return the below as an output.

Output combine function

The little c together with the brackets is known as a function. In this case, the function c() combines whatever is in the brackets into a vector. I had to put the double quote marks “” around each brand so that the R code knows that I’m talking about text (and not referring to other variables!). In programming speak that makes each of these brands a string.  The comma in between each string tells R that they are separate items.

You can then store the vector as a variable – so you don’t need to type it all again later. Let’s store it as a variable called brands. Then later we could recall the 3-string-vector just by typing brands. Putting the code together for all this looks like the below:

brands = c("Coca-Cola","Pepsi","Fanta","Pepsi Max")

R works from top to bottom, line by line on the code. The last thing in the list is what ends up in the output area. Thus, if you type in:

brands = c("Coca-Cola","Pepsi","Fanta","Pepsi Max")
1+1

You’ll get 2 as an output. But if you type in the below, including the 3rd line, you’ll get our vector of brands as the output.

brands = c("Coca-Cola","Pepsi","Fanta","Pepsi Max")
1+1
brands

We can take this a step further and reference a specific item within a vector. In the case of our brands vector, I might want the 2nd item. So I could add a [2] to the brands on the last line:

brands = c("Coca-Cola","Pepsi","Fanta","Pepsi Max") 
1+1 
brands[2]

And that will return… Pepsi! Try it.

Furthermore, I can reference multiple things by typing the following – in this case returning the 2nd to the 4th item in brands (including the 3rd)

brands[2:4]

And then, just for fun, I could do the following to mess with the brand order, for example:

brands[c(1,4,3,2)]

Reference the source tables

R, when used in Displayr, can bring in data from tables (and variables) in your project. Our first step in building our brand funnel is to get all the Names for all the tables that we want to use. These names will be used to refer to the tables within the code.

To check or set the Name of an output:

  1. Select the output.
  2. On the right, select Properties > GENERAL > Name.
  3. (Optional) Change the entry to something shorter (no spaces allowed!).

I do this for the other tables as well so that I have: aware_tab, consider_tab, main_tab

With that bit of housekeeping done, I can reference these tables in my R output. If you enter one of those reference names it will return the entire table. Try typing in aware_tab into the R CODE input box on the right and click Calculate. You will get an output that displays the same numbers from the original awareness table.

When the item in the code highlights in blue it means it is successfully referencing an output in the document If you hover your mouse over the blue highlight, it should then pop-out an tool-tip showing the source table for your reference.

blue hover

Important: R (like all programming languages) is very specific about spelling, punctuation (including spaces at the end of words) and is case-sensitive. Whatever you put into the code has to be exact.

Reference a specific cell in the source tables

Next, we will reference individual cells, rows, or columns within a table. This is called subsetting which uses indices to find the positions in the table that you want to extract. It’s a simple concept – in fact, we’ve already worked with subsets and indices when we selected which brands to show from brands.

To return a specific cell, use its index.  This is what we did above when we typed in brands[2]. But we’re now dealing with tables, so we have some different options. If I want to return the statistics of Preferred Cola for Diet Coke, I can do so by typing main_tab[2] and that will return 10.3 (ie: 10%). That’s because I referenced the figure in second position (index) in the Preferred Cola table.

That’s all well and good, but what if my source table changed? If I add in a NET Coke line to my source table and move that to the top of the table then Diet Coke shifts down. It’s now, instead, in third position, and my result is no longer correct.

Instead, we can reference the data using the row name (category name), which in our case is the brand name. That way, it’s impervious to any changes in position within the source table. To do so, I just replace the number 2 with some text within double quote marks “”. In this case, it’s:

aware_tab["Diet Coke"]

That returns 10.3. Try dragging and dropping the categories in the source table to change their order and then click Calculate again in the Object Inspector of your R output. The referencing is now secure (in terms of position).

Again, much like referencing tables, you need to be exact in referencing categories within tables. That means that “Coca-Cola” does not equal “Coca Cola”, or indeed “Coca-Cola “. Can you spot the difference in the last one? There is a rogue space between the end of Cola and the last full stop! That’s important.

Reference lots of cells at once in the source tables

Now we can reference lots of brands at once by using a vector as the index (rather than just one number or text label). The benefit is that I’m always referencing brands in the same order and I don’t have to worry about position of brands in my source tables. For that reason, I’m going to set up a vector of  brands as the following.

brands = c("Coca-Cola","Diet Coke","Coke Zero","Pepsi","Diet Pepsi","Pepsi Max")

I’m then going to use this to reference all the brands in the Awareness table but using brand as the index within the square brackets. So it’s

aware_tab[brands]

which is essentially identical to me writing:

aware_tab[c("Coca-Cola","Diet Coke","Coke Zero","Pepsi","Diet Pepsi","Pepsi Max")]

… but a lot neater! I don’t have to keep repeating the brands once I’ve combined it all into the variable brands.

If you now return aware_tab[brands] you’ll get all the awareness figures from the source table in the desired order. Let’s store that as another variable called aware as the second line of our code:

brands = c("Coca-Cola","Diet Coke","Coke Zero","Pepsi","Diet Pepsi","Pepsi Max")
aware = aware_tab[brands]

Now in the second source table, consider_tab, I want the figures from the Top 2 Box column. This table is slightly different to the first table. It’s a matrix (because it has rows and columns). Because of this, I need to reference both row and column co-ordinates with my indices, separated by a comma. To obtain the Top 2 Box scores for each of the brands, that would be:

consider = consider_tab[brands,"NET: Top 2 Box"]

Finally, as with the aware table, I also grab the data from the main brand table and place that into a variable called main:

main = main_tab[brands]

Now that I have the data from my three tables stored into separate variables, I can combine them together using the function cbind().  Here, the c stands for column, and the bind part is descriptive in that it binds the referenced vectors together.  This will output a table where each column is one of my variables, and each row represents a brand:

table = cbind(aware, consider, main)

Notice that I’ve stored this as a variable called table. Again, it’s just short-hand so that I can reference the entire thing later. Try pushing Calculate. And voilà! It’s made the summary table!

But the names of the columns are just the variable names. I could in fact give them proper titles, so let’s do that by adding in the following to the code to make the columns a bit more descriptive.

table = cbind("Awareness" = aware, "Consider" = consider, "Main brand" = main)

In this line of code, I’m saying that a new column called Awareness should filled with aware, and then next to that a column called Consider should be filled with consider, etc.

Finally, I’m going to transpose it so that it looks like a funnel with Awareness on top. To do so, I use the transpose function t() and I store that as funnel.

funnel = t(table)

The brand funnel is almost done!

Calculate the ratio of consideration and main brand scores

Now that we have our summary table, I’m going to add one final line below it. This will create a ratio between the Consider line of the summary table and the Main brand line. To do so I’m going to reference data in the summary table itself rather than the source tables. As I’ve saved that using the reference name funnel, I’ll reference the Consider line by writing the following.

funnel["Consider",]

The funnel variable won’t highlight in blue, because it’s not in the outputs area. It’s a vector that’s local (ie: inside) the R output. Also notice in the above that I’ve left nothing to the right of the  comma. That’s because I’m saying, in R, that I want all the columns  (i.e. I’ve skipped specifying a column or columns). But I only want all the columns for the rows called “Consider”, so we need to subset our table again to get a specific slice of my table.

The ratio I’m after is a division of this slice by the corresponding figures in the Main Brand row. The code for the ratio is going to be the following (which I’m storing in a vector called ratio).

ratio = funnel["Consider", ] / funnel["Main brand", ]

We’re nearly there! I’m now going to add this ratio line in to the summary table, this time using rbind() (which stands for “row-bind”) and calling the new line Ratio.

funnel = rbind(funnel, "Ratio" = ratio)

Finally, you’ll notice that in the above, I’m writing the new funnel (including the ratio) over the top of itself. This changes the variable funnel. As it’s the last line, this is returned as the output.

The final, complete block of code is relatively straight-forward:

brands = c("Coca-Cola","Diet Coke","Coke Zero","Pepsi","Diet Pepsi","Pepsi Max")
aware = aware_tab[brands]
consider = consider_tab[brands,"NET: Top 2 Box"]
main = main_tab[brands]
table = cbind("Awareness" = aware, "Consider" = consider, "Main brand" = main)
funnel = t(table)
ratio = funnel["Consider",]/funnel["Main brand",]
funnel = rbind(funnel, "Ratio" = ratio)

Brand funnel summary table

Try it yourself

You can access a copy of the data, tables, and calculations from this post by clicking here.

About Matt Steele

Matt has over 14 years of experience in the marketing research arena, with a combination of research experience (qualitative and quantitative), marketing training, academic psychology (cognitive), creative leadership, geekiness and artistic flair. He currently works for Displayr (the home of Q and Displayr) and is based in London: supporting, selling, marketing and training for Q research software and associated software packages (eg: Displayr). He holds a Honours degree in Psychology from UNSW, a Grad Cert. in Marketing from UTS, and a Grad Dip in Directing from NIDA (all based in Sydney, Australia).

Keep updated with the latest in data science.