8 Tips for Interpreting R-Squared

Hopefully, if you have landed on this post you have a basic idea of what the R-Squared statistic means. The R-Squared statistic is a number...

What is Driver Analysis?

Driver analysis, which is also known as key driver analysis, importance analysis, and relative importance analysis, quantifies the importance of a series of predictor variables...

4 Visualizations For Your Customer Satisfaction Data

Customer satisfaction is a valuable customer feedback metric. Here are four visualizations for finding stories in your customer satisfaction data.

How to Filter a Dashboard Based on User Logins

You can restrict what people see on a Displayr dashboard based on their department, geographic region, or some other user characteristic. For example, you can...

How to Dynamically Change a Question Based on a Control Box

Control boxes are a popular way for users to change things on a Displayr page. This post will show you how to use a control...

How to Interpret Logistic Regression Coefficients

This post describes how to interpret the coefficients, also known as parameter estimates, from logistic regression (aka binary logit and binary logistic regression). It does...

RECENT POSTS

Correspondence Analysis of Square Tables

Square tables are data tables where the rows and columns have the same labels, commonly seen as a crosstab of brand switching or brand repertoire data.…

Automatically Fitting the Support Vector Machine Cost Parameter

In an earlier post I discussed how to avoid overfitting when using Support Vector Machines. This was achieved using cross validation. In cross validation, prediction accuracy is…

Put PowerPoint into Cruise Control: How to Automatically Update Your Reports

The ability to automatically update PowerPoint slides with new data can save time, money, error, and your sanity. Some analysis software packages allow your reporting to go into...

Customization of Bubble Charts for Correspondence Analysis in Displayr

When you insert a bubble chart in Displayr (Insert > Visualization > Bubbleplot), you can customize some aspects of its appearance from the controls that appear in the object…

Using Bubble Charts to Show Significant Relationships and Residuals in Correspondence Analysis

While correspondence analysis does a great job at highlighting relationships in large tables, a practical problem is that correspondence analysis only shows the strongest relationships, and sometimes…

Why Capability Trumps Character for Supporters of the US President

American supporters of Donald Trump believe that financial skills are more important in a president than decency and ethics, a new survey shows. Data science…

Machine Learning: Pruning Decision Trees

In machine learning and data mining, pruning is a technique associated with decision trees. Pruning reduces the size of decision trees by removing parts of...

Comparing Partial Least Squares to Johnson’s Relative Weights

In this post I explore two different methods for computing the relative importance of predictors in regression: Johnson's Relative Weights and Partial Least Squares (PLS) regression. Both techniques solve a problem with...

Using Partial Least Squares to Conduct Relative Importance Analysis in R

Partial Least Squares (PLS) is a popular method for relative importance analysis in fields where the data typically includes more predictors than observations. Relative importance analysis...

The Problem with Using Multiple Linear Regression for Key Driver Analysis: a Case Study of the Cola Market

A key driver analysis investigates the relative importance of predictors against an outcome variable, such as brand preference. Many techniques have been developed for key…

Using Partial Least Squares to Conduct Relative Importance Analysis in Displayr

Partial Least Squares (PLS) is a popular method for relative importance analysis in fields where the data typically includes more predictors than observations. Relative importance analysis…

The Magic Trick that Highlights Interesting Results on Any Table

This post describes the single biggest time saving technique that I know about for highlighting significant results on a table. The table below, which shows…

How to Link Documents in Displayr

Sometimes it is helpful if one Displayr document can refer to information in another document. For example, one document may contain an analysis of sales…

Gradient Boosting Explained – The Coolest Kid on The Machine Learning Block

Gradient boosting is a technique attracting attention for its prediction speed and accuracy, especially with large and complex data.

Using Support Vector Machines in Displayr

Support vector machines (SVMs) are a great machine learning tool for predictive modeling. In this post, I illustrate how to use them. For most problems SVMs…

Using Cross-Validation to Measure MaxDiff Performance

This post compares various approaches to analyzing MaxDiff data using a method known as cross-validation. Before you read this post, make sure you first read How MaxDiff…

How to Analyze MaxDiff Data in Displayr

This post discusses a number of options that are available in Displayr for analyzing data from MaxDiff experiments. For a more detailed explanation of how to analyze…

How MaxDiff Analysis Works (Simplish, but Not for Dummies)

This post explains the basic mechanics of how preferences can be measured using the data collected in a MaxDiff experiment. Before you read this post, make sure you...

When to Use, and Not Use, Correspondence Analysis

Correspondence analysis is one of those rare data science tools which make things simpler. You start with a big table that is too hard to…

Correspondence Analysis Versus Multiple Correspondence Analysis: Which to Use and When?

Let me cut to the chase. Multiple correspondence analysis sounds better than correspondence analysis. But, for 99% of real-world data problems, correspondence analysis is the...

How Correspondence Analysis Works (A Simple Explanation)

Correspondence analysis is a data science tool for summarizing tables. This post explains the basics of how it works. It focuses on how to understand the underlying…

How to Interpret Correspondence Analysis Plots (It Probably Isn’t the Way You Think)

Correspondence analysis is a popular data science technique. It takes a large table, and turns it into a seemingly easy-to-read visualization. Unfortunately, it is not quite…

Easily Add Images to a Correspondence Analysis Map in Displayr

You can take your correspondence analysis plots to the next level by including images. Better still, you don’t need to paste in the images after…

How to Create a MaxDiff Experimental Design in Displayr

Creating the experimental design for a MaxDiff experiment is easy in Displayr. This post describes how you can create and check the design yourself. If you…

Easily Add Images to a Correspondence Analysis Plot in R

You can take your correspondence analysis plots to the next level by including images. Better still, you don’t need to paste in the images after…

An Introduction to MaxDiff

MaxDiff is a research technique for measuring relative preferences. It is typically used in situations where more traditional question types are problematic. Consider the problem...

5 Ways to Deal with Missing Data in Cluster Analysis

If you have ever tried to perform cluster analysis when you have missing data, there is a good chance your experience was ugly. Most cluster analysis...

Where Pictographs Beat Bar Charts: Proportional Data

Pictographs are exceptionally good for some types of data. In my earlier post, I discussed how they are great for showing counts. In this post, I show...

Where Pictographs Beat Bar Charts: Count Data

Don’t forget you can create free pictographs using Displayr’s pictograph maker. Pictographs are often subject to ridicule. They are seen to compromise interpretability in favor…

Ranking Plots: Illustrating Data with Different Magnitudes

A Ranking Plot, also known as a Rank Flow Plot, is particularly useful for comparing data that differs in magnitude.

Creating Tables with Multiple Variables (Filters and Multiway Tables)

It is super-simple to create a table involving one variable in Displayr: just drag it from Data Sets (bottom-left of the screen) onto a page,…

5 Ways to Visualize Relative Importance Scores from Key Driver Analysis

Key driver analysis techniques, such as Shapley Value, Kruskal Analysis, and Relative Weights, are useful for working out the most important predictor variables for some outcome...

Labeled Scatter Plots and Bubble Charts in R

This post explores how the R package for labeled scatterplots tries to solve the problem of scatterplots and bubble plots or bubble charts in R.

When to Use Relative Weights Over Shapley

Shapley regression is a popular method for estimating the importance of predictor variables in linear regression. This method can deal with highly correlated predictor variables that are...

The Difference Between Shapley Regression and Relative Weights

Shapley regression and Relative Weights are two methods for estimating the importance of predictor variables in linear regression. Studies have shown that the two, despite being constructed in very different…

Too Hot to Handle? The Problem with Heatmaps

Heatmaps are cool. Most people like them. They are so much prettier than a bar chart. The one below, created in Making your data hot:…

The Secret of “Chartjunk”: Why Misleading Visualizations Aren’t Always Bad

There is a war in the world of visualization. It is about chartjunk. Designers like to create charts like the one above. Many data viz experts…

The NPS Recoding Trick: The Smart Way to Compute the Net Promoter Score

The Net Promoter Score is most people's go-to measure for evaluating companies, brands, and business units. However, the standard way of computing the NPS - subtract the...

Assigning Respondents to Clusters/Segments in New Data Files in Displayr

Once you have created segments or clusters, it is often useful to assign people in other data sets to the segments (this is also known as segment…

Creating Custom Sankey Diagrams Using R

I have previously shown how Sankey or alluvial diagrams can easily be used to visualize response patterns in surveys and to display decision trees. Following…

Visualizing Response Patterns and Survey Flow With Sankey Diagrams

If you have spent much time analyzing customer feedback survey data, then you have probably spent a lot of time validating it. This normally entails…

Making Your Data Hot: Heatmaps for the Display of Large Tables

Don’t forget that you can easily use Displayr’s heatmap maker to create your free heatmap! Sometimes tables are just too big to read. The table below shows…

It Is Not the Size That Counts: Small Visualizations Are Preferable to Large Visualizations

All else being equal, small visualizations are better than big visualizations. There is no need to take my word for it. You can prove it…

A Pie Chart for Pi Day: The Data Scientist Pie Eating Challenge

Today is National Pi Day. The number, not the food. As mentioned in a previous post, I love pie charts. And, as luck would have it, I recently chanced…

Decision Tree Visualizations using Sankey Diagrams or Charts

Sankey diagrams are perfect for displaying decision trees (e.g., CART, CHAID). I used to think that Sankey diagrams were just one of those cool visualizations…

Why Pie Charts Are Better Than Bar Charts

Ok, ok, this blog's title would be a bit more accurate if the word "often" appeared in the title. In my defense, all the anti-pie...

Text Analysis: Predicting Engagement from Tweets

Why do some tweets sizzle while others fizzle? Sometimes it’s obvious. But if you have a large quantity of tweet text, or other text for…

Text Analysis: Hooking up Your Term Document Matrix to Custom R Code

I have previously written about some of the text analysis options that are available in Displayr: sentiment analysis, text cleaning, and the predictive tree. As…

How to Set up Your Text Analysis in Displayr

Text data can be an unwieldy beast. Whether you're analyzing tweets, reviews, or open-ended responses from a survey, you will usually need to do some...

Using Text Analytics to Tidy a Word Cloud

It is common when people create word clouds that they want more control. Limit the word cloud to frequently occurring words. Join together words in phrases. Automatically…

The Best Tool for Creating a Word Cloud

Word clouds are one of the simplest of data visualizations to create and understand. Everyone instinctively understands that the size of words in a word…

Twitter Sentiment Analysis Example

Sentiment analysis allows you to quickly gauge the mood of the responses in your data. Twitter provides a sea of information, and it can be hard...

How to Add an Interactive R Visualization to a Blog in Under 30 Seconds

Check out the interactive R visualization below. Click on one of the circles with lots of lines connected to it and drag it around. I…

The 5 Second Rule and the Need to Create Instantly Recognizable Visualizations

Most people are busy. Many are bored. Designers take the view that they have a small amount of time, perhaps 5 seconds, to engage the...

2 Rules for Coloring Heatmaps so That Nobody Gets Burnt

Don’t forget that you can easily use Displayr’s heatmap maker to create your free heatmap! The other day, my local paper showed what it called a “photo” of…

Using Correspondence Analysis to Find Patterns in Tables

There are lots of great visualizations designed for analyzing big quantities of data. Heatmaps, for example, are super-popular. However, when I am in a rush,…

An Overview of Displayr for Excel Users

It will not be immediately obvious, but Excel is a key inspiration for the design of Displayr. However, when you jump into Displayr you will be…

10 Ways to Create New Variables in Displayr

Most data scientists have a pretty clear picture of how variables should be created – and it almost certainly involves writing code. While you can take this approach in Displayr,…

Using Palm Trees to Visualize Performance Across Multiple Dimensions (Egypt’s Scary Palm Tree)

Palm trees are my favorite visualization. They look great. They are easy to understand. There is no other visualization that is as effective at decomposing…

Introducing Displayr: The Data Science and Reporting App for Everyone

In the world of startups, it is said that if you’re not embarrassed by the first version of your product, then you’ve launched too late.…