Ways to save time
If you're new to branding research, you'll get a lot out of the principles and techniques. If you're an old hand, the value will come from seeing how to automate things so you can go faster.
Spontaneous awareness data
We asked a sample of 2,090 Americans which phone companies they'd heard of.
And, like most samples, it includes a lot of bad spellers. Look how they've spelled AT&T in row 2072.
Let's automatically code this text into categories.
1) Automatic categorization of spontaneous awareness
Advanced Analysis > Text Analysis > Automatic Categorization > List of items
Drag across > Cell Phones > Week
So, an algorithm specifically designed for spontaneous awareness data is coding the text automatically.
You can find out more about how this works in the webinar on text analysis.
Once the result has appeared: As you can see, it's identified 1,325 mentions of Verizon, and 18 different variants.
The linked webinar explains how to tune this.
2) Semi-automatic categorization of text data
Often in branding research we ask people to explain something to us. Here, it's why they chose a particular quick service restaurant.
Automatic categorization doesn’t do the job well with data like this as you need context to work out what's interesting and why.
But, we don't have to do it manually. We can use semi-automatic categorization.
Click on Q6 Reasons Chose
+ > Text Categorization > Semi-Automatic > Overlapping > New
This will take a while, so I've done one before.
You can see the list of categories created on the right. It's created 37 categories. Why so many? Because we then review each of these and merge or split them as we need to, so we get to use our judgment to improve the categorization.
For example, let's look at category 12, convenience.
As you'd expect, they've all said convenience.
What about Convenient 2?
So, these are people that have referred to convenience or have made spelling mistakes. To the large language model that this is built on, this distinction is relevant. But, in this situation it's immaterial, so we can just merge them together.
As you can see, it's finished creating the categorization. For more on this, please check out the webinar.
3) Compare brand health metrics
It turns out that the correct approach was invented before the term text analysis even became popular.
4) Calculate conversion between adjacent brand health stages
Branding research studies typically collect various brand health metrics. Many of these can be ordered. For example, people can only eat at a brand if they're aware of it, so Ever Eaten comes after Aided Awareness.
Consideration comes after Ever Eaten. And, Eaten Recently usually comes after Consideration.
A lot of insight can be found by calculating the conversion between adjacent stages. For example, we can divide the Ever Eaten data by the Aided awareness data.
Calculation > Divide
For example, you can see that 96% of people that have heard of Arnold's have gone on to purchase. But, Bread Basket's conversion from aware to ever eaten is much worse at 53%. So, as a brand Bread Basket needs to find a way of encouraging trial.
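If you'd rather see the arithmetic than the menu clicks, here's a minimal Python sketch of the same division. The brand names and percentages are made up for illustration (they're not the data on screen), and the `conversion` helper is my own name, not something built into Displayr or Q.

```python
# Hypothetical percentages, for illustration only (not the webinar's data).
aided_awareness = {"Brand A": 80.0, "Brand B": 60.0}
ever_eaten = {"Brand A": 72.0, "Brand B": 30.0}


def conversion(later_stage, earlier_stage):
    """Percentage of people at the earlier stage who reached the later stage."""
    return {
        brand: round(100.0 * later_stage[brand] / earlier_stage[brand], 1)
        for brand in earlier_stage
    }


# Conversion from Aided Awareness to Ever Eaten, one value per brand.
aware_to_eaten = conversion(ever_eaten, aided_awareness)
print(aware_to_eaten)
```

The later stage goes on top and the earlier stage on the bottom, so a low number points to a brand that people know about but haven't tried.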
It'd be great if there was a nice table that put all this together.
5) Automatically create brand health tables
Table > Specialty Tables > Brand Health Table
I'm going to get a list of the tables shown on the previous page by expanding it out in the pages tree.
And, I'll drag these tables to here. So, now I've got all the data in a nice table.
And, I'll add all the conversion data.
Note that category 14 is Tasty. Let's look at them.
6) Data reduction principles
In the webinar called Finding the Story in your Survey Data I explain the eight basic principles of data reduction.
One of these is to delete uninteresting analyses. But, how do you know if a result is uninteresting?
7) General theory of what makes a result "interesting"
A result is interesting when there's a big difference between the observed result and the expected result. But, how do we know what to expect? We use the theory of double jeopardy.
8) Double Jeopardy
Double jeopardy says that if a brand is poor on one metric, it will generally be poor on all metrics. This general pattern has been found in many hundreds of markets.
Double jeopardy gives us clues about how to do branding research.
9) Order your data by market share
Remember, double jeopardy says that if a brand is poor on one metric it will be poor on others. In most markets, market share is a key metric. So, double jeopardy says that a brand that's poor on market share should be expected to be poor on lots of other metrics. It follows, if you think about it for a while, that we should routinely sort our brand data by market share.
Let's go back to our brand health data and test this.
We don't have market share, but eaten last month is a good proxy. So, let’s sort by this.
% Eaten last month > Sort > Descending
Now, let's look at our brand health table. And we will sort this table by the Eaten last month data.
Looking at the bottom row, we can see that Pre-a-pane really stands out, as its conversion from Eaten last month to Consider is much lower than that of the adjacent brands once we've sorted.
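To make the "sort, then compare with the neighbors" idea concrete, here's a rough Python sketch. The brands and numbers are hypothetical, and `neighbor_gap` is just a helper I've made up for illustration. It isn't how Displayr does the sort.

```python
# Hypothetical brand health rows (percentages), for illustration only.
brands = [
    {"brand": "Brand A", "eaten_last_month": 12.0, "consider": 30.0},
    {"brand": "Brand B", "eaten_last_month": 35.0, "consider": 60.0},
    {"brand": "Brand C", "eaten_last_month": 20.0, "consider": 18.0},
    {"brand": "Brand D", "eaten_last_month": 27.0, "consider": 50.0},
]

# Sort descending by the market share proxy.
ranked = sorted(brands, key=lambda row: row["eaten_last_month"], reverse=True)


def neighbor_gap(rows, index, metric):
    """Metric for one row minus the average of the rows directly above and below."""
    neighbors = [
        rows[i][metric] for i in (index - 1, index + 1) if 0 <= i < len(rows)
    ]
    if not neighbors:  # a single-row table has nothing to compare against
        return 0.0
    return rows[index][metric] - sum(neighbors) / len(neighbors)


for i, row in enumerate(ranked):
    gap = neighbor_gap(ranked, i, "consider")
    print(row["brand"], row["eaten_last_month"], round(gap, 1))
```

Once the rows are in market share order, double jeopardy says neighboring brands should look similar, so a large negative gap is the kind of result worth a second look.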
10) Show brand health as funnels
Here's our data, simplified and prettified as conversion, with color coding pointing out the interesting results.
The conversion numbers that we calculated earlier often give a lot of insight. But, they make an implicit assumption. When we divide one metric by another, we're implicitly assuming that we expect one metric to be a proportion of another. But, double jeopardy says nothing about proportionality. It just says that if one metric's bad, another is probably bad.
Patterns consistent
So, when we are looking at conversion percentages, we're implicitly assuming that double jeopardy manifests itself in terms of the proportional pattern shown at the top right. In practice this does happen, but lots of these other patterns can occur as well. As a result, when exploring double jeopardy I find it's often best to perform the comparisons using labeled scatterplots of paired metrics.
11) Gain deeper insights by looking at labeled scatterplots of pairs of metrics
Visualization > Scatter > Labeled Scatterplot
We will drag in the data from the earlier brand health tables.
In this example, I've plotted one survey question's data by another. We can and should also create such plots using constructed metrics.
12) Use constructed metrics
In this example, I've plotted relative market share by their growth rates.
And, we can turn them into interactive dashboards so that clients can self-serve.
Let's look at Latin America.
Wow! Burger Shack is really killing it in Latin America!
13) Use pairs of metrics
So far we've been using pairs of metrics to compare brands.
We can also compare customers for a single brand using pairs of metrics.
For example, the basic brand vulnerability matrix contrasts brand usage with attitude.
Let's look at it populated with data.
Basic brand
Here we're looking at frequency of visiting Arnolds by Brand attitude.
Take a look at the bottom right. 13% of the market like Arnold's, but have not been in the past month. We can see that these are primarily older people who mainly want dinner and takeaway. Maybe Arnold's can grow by focusing on older people for dinner?
14) Duplication of purchase law
The duplication of purchase law says that the proportion of buyers that switch to a particular brand is determined by the brand's market share.
15) Look at brand switching data
16) Use correspondence analysis of square tables
This table shows brand switching data for the US cell phone market, telling us what brand users were with, in the rows, and what brand they switched to, in the columns.
The bottom row shows market share. So, applying the duplication of purchase law, it tells us that we should expect that when people switch from one brand to another, 21% will choose AT&T.
Remember from before: we need to compare observed with expected.
Let's look at the Verizon row of the table. It shows us that 29% of people switched from Verizon to AT&T. This is above the market share of 21%, so it's telling us that AT&T is doing relatively well at acquiring customers from Verizon, suggesting that they're competitors.
But, the percentage switching from Straight Talk to AT&T is 13%, which is well below the expected 21% for AT&T, telling us that AT&T is bad at acquiring customers from Straight Talk and that the two compete less for the same customers.
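Here's that observed-versus-expected comparison as a small Python sketch, using only the three figures just quoted (21% share for AT&T, 29% switching from Verizon, 13% from Straight Talk). The variable names and the index calculation are my own illustration rather than anything the software outputs.

```python
# Figures quoted in the walkthrough, all in percent.
expected_share = 21.0  # AT&T's market share, used as the expected switching rate
observed_switching_to_att = {"Verizon": 29.0, "Straight Talk": 13.0}

# Observed minus expected, in percentage points.
difference = {
    brand: observed - expected_share
    for brand, observed in observed_switching_to_att.items()
}

# Observed divided by expected: above 1 means more switching than share predicts.
index = {
    brand: round(observed / expected_share, 2)
    for brand, observed in observed_switching_to_att.items()
}

for brand in observed_switching_to_att:
    verdict = "above expected" if difference[brand] > 0 else "below expected"
    print(brand, difference[brand], index[brand], verdict)
```

Verizon comes out 8 points above expected and Straight Talk 8 points below, which is the same reading as in the table.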
Note that with this switching data we have the same names in the rows and columns. A special type of correspondence analysis has been developed for summarizing the patterns in such data.
Inputs > Visualization > Dimension Reduction > Correspondence Analysis of a square table
STATISTICS > Cells > Total
The closer two brands are together, the more likely they are to compete, all else being equal.
17) Use software designed for brand association tables
In branding research it's common to have brand association tables, showing the perceptions that people have of different brands. In this example, I've got 8 brands and 7 attributes, so that means the underlying data contains 56 variables.
Displayr and Q are designed for such data. And, as you will shortly see, this is going to save us a heap of time.
Let's say we want to see how the brand associations are trending over time.
Visualization > Time Series > Small Multiples with Tests for Trend
Looking at Arnold's in the first row, we can see that over time it's trending down in terms of Good Value for Money.
We can also use the same style of visualization for sub-groups. Let's look at Gender.
Drag gender to columns
Chart > SERIES > Label Points
We can now easily see that females were more likely to see Arnold's as being for when on the go, value for money, healthy, and affordable.
18) When analyzing brand associations, calculate chi-square "expected" values and "residuals"
If you compare Arnold's and Nuovo, you will see that Arnold's is bigger on everything. To make this easier to see I've calculated the Sum of the rows. This sum is sometimes called the Brand Effect.
Remember we earlier discussed Double Jeopardy. If one brand is bigger on one thing, it's probably bigger on everything. We're seeing this effect here.
And, there's a second attribute effect as well. We can see that some attributes are more likely to be associated with brands than others.
The brand and attribute effects are often not really that interesting, so we can get insight by trying to remove them from the data. This is done by calculating chi-square expected values and residuals.
As we discussed earlier, often interesting results are calculated by comparing the observed data with the expected data.
We've observed a score of 25% for Arnold's on Quality.
A simple formula for calculating the expected effect is:
Expected = Quality * Arnold / Total
That is, our expected score is based on the row effect and the column effect. And the difference between these, often called the Residual, is then the observed minus the expected.
Residual = Observed - Expected
So, this tells us that once the brand effect is removed, Arnold's is actually pretty weak on Quality, even though it's the second highest score.
For those of you that like to write a bit of code, we can automate all of this using this code:
table.Q14.3 - chisq.test(table.Q14.3)$expected
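If R isn't your thing, here's a rough Python sketch of what that line is computing: the expected value for each cell is its row total times its column total divided by the grand total, and the residual is observed minus expected. The 2x2 table is hypothetical and `expected_and_residuals` is a name I've made up. It mirrors the simple chi-square approach only.

```python
def expected_and_residuals(table):
    """Return (expected, residuals) for a table given as a list of rows.

    expected[i][j] = row_total[i] * column_total[j] / grand_total
    residuals[i][j] = observed[i][j] - expected[i][j]
    """
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    grand_total = sum(row_totals)
    expected = [
        [r * c / grand_total for c in column_totals] for r in row_totals
    ]
    residuals = [
        [o - e for o, e in zip(observed_row, expected_row)]
        for observed_row, expected_row in zip(table, expected)
    ]
    return expected, residuals


# Hypothetical table: 2 brands (rows) by 2 attributes (columns).
observed = [
    [30.0, 10.0],
    [20.0, 40.0],
]

expected, residuals = expected_and_residuals(observed)
print(expected)
print(residuals)
```

A positive residual means the brand scores higher on that attribute than its overall size and the attribute's overall popularity would suggest.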
I've shown you the code because we get this question a bit in support. Don't worry, there's a better approach.
19) Use the in-built residuals
On any table in Displayr and Q, you can just modify the cell statistics and choose the residuals.
Statistics - Cells > Residual %
Remove: %
That's a lot faster, isn't it? On the previous page I showed you the chi-square residuals. Technically, though, the chi-square residuals are invalid as they assume that the rows and columns are mutually exclusive. We use a more advanced log-linear model to calculate the residuals, so you will get slightly different results. But, you can use the more common approach by writing the code I just showed you.
Let's turn the stat testing on.
Stat testing - arrows and columns
The stat testing is hooked up to the residuals, with the size of the arrow showing statistical significance. We can see that Arnold's two key differentiators are Easy to to when you're on the go and Healthy Food.
Now, a problem with residuals is that sometimes they're all significant, so we need a way of summarizing them. Correspondence analysis was invented for this.
20) Moonplots
We will turn it back to the normal percentages. Note that the stat testing of the residuals stays in place.
I'm going to start with my favorite way of visualizing correspondence analysis, which is via the moonplot.
Inputs > Visualization > Dimension Reduction > Moonplot
We can see that Lucky and Nero own Is Affordable and Good value, whereas Pret owns Has healthy food options.
Some people find this plot a bit scary, and prefer the traditional plot.
Visualization > Dimension Reduction > Correspondence analysis of a table
21) Use Driver Analysis
Another standard way of analyzing image grids is to combine them with preference data to perform driver analyses, telling us the importance of each attribute.
You can find more about this in the webinar linked below.
22) Market maps
We can use bubbles to show the size of the attributes, and turn our earlier correspondence analysis into a market map.
Techniques
We've got another webinar which talks about ways to improve data visualization.
It discusses a technique known as supernormalize, which is about changing data so that visualizations form recognizable shapes.
Cola brand attributes
Here I've got a radar chart showing some attributes for cola brands.
It's a small multiples of radar charts, which is cool. But, it's not entirely clear what the pattern is. There's an easy way to make it much better.
We want to reorder the rows of the table so that the shapes above are simpler. So that they are more like footprints if you like.
The webinar on finding the story in data mentioned earlier describes the general principle which is known as diagonalization, so I'm just going to show you the outcome.
23) Reorder attributes and brands so that radar plots create recognizable and ordered footprints
I've rearranged the data. Note how in the first column Coke and Pepsi have the same shape. This tells us that they have the same positioning. When we look at the attributes, we see that these brands are traditional and older.
Because Coke's got a bigger footprint, the brand's stronger. We can see that Pepsi Max and Coke Zero also have the same shape footprints.
- Rebellious
- Open to new experiences
- Health and weight conscious
And Diet Coke and Diet Pepsi are also very similar, with Health, Weight consciousness, along with femininity and, for Diet Pepsi, innocence as well.
24) Show shape with palm trees
If your software allows it, palm trees are even better than radar charts for comparing brands.
We can see the shapes really clearly. Coke and Pepsi, for example, both look a bit like rabbits.
But, the brand effect is communicated by height, which is easier to see than the area used in the radar chart.