# Statistics

Understanding the Math of Correspondence Analysis with Examples in R
08 August 2017 | by Tim Bock

Correspondence analysis is a popular tool for visualizing the patterns in large tables. To many practitioners it is probably a black box. Table goes in, chart comes out. In this post I explain the mathematics…

Singular Value Decomposition (SVD): Tutorial Using Examples in R
02 August 2017 | by Tim Bock

If you have ever looked with any depth at statistical computing for multivariate analysis, there is a good chance you have come across the singular value decomposition (SVD). It is a workhorse for techniques that decompose data, such as correspondence analysis and principal…

8 Tips for Interpreting R-Squared
31 July 2017 | by Tim Bock

Hopefully, if you have landed on this post, you have a basic idea of what the R-Squared statistic is. The R-Squared statistic is a number between 0 and 1 (or 0% and 100%) that quantifies…

Correspondence Analysis of Brand Switching and Other Square Tables
25 July 2017 | by Jake Hoare

Correspondence analysis is a powerful technique that enables you to visualize a complex table of results as a much simpler chart. In this post I discuss the special case of square tables, which often arise in…

Machine Learning: Pruning Decision Trees
04 July 2017 | by Jake Hoare

Machine learning is a problem of trade-offs. The classic issue is overfitting versus underfitting. Overfitting happens when a model memorizes its training data so well that it is learning noise on top of the signal…

Comparing Partial Least Squares to Johnson’s Relative Weights
19 June 2017 | by Tim Bock

In this post I explore two different methods for computing the relative importance of predictors in regression: Johnson’s Relative Weights and Partial Least Squares (PLS) regression. Both techniques solve a problem with Multiple Linear Regression, which can perform poorly when there are correlations…

Using Partial Least Squares to Conduct Relative Importance Analysis in R
19 June 2017 | by Jake Hoare

Partial Least Squares (PLS) is a popular method for relative importance analysis in fields where the data typically includes more predictors than observations. Relative importance analysis is a general term applied to any technique used for…

The Problem with Using Multiple Linear Regression for Key Driver Analysis: a Case Study of the Cola Market
18 June 2017 | by Tim Bock

A key driver analysis investigates the relative importance of predictors against an outcome variable, such as brand preference. Many techniques have been developed for key driver analysis, to name but a few: Preference Regression, Shapley Regression,…

Using Partial Least Squares to Conduct Relative Importance Analysis in Displayr
16 June 2017 | by Jake Hoare

Partial Least Squares (PLS) is a popular method for relative importance analysis in fields where the data typically includes more predictors than observations. Relative importance analysis is a general term applied to any technique used for…

Using Cross-Validation to Measure MaxDiff Performance
23 May 2017 | by Justin Yap

This post compares various approaches to analyzing MaxDiff data using a method known as cross-validation. Before you read this post, make sure you first read "How MaxDiff analysis works", which describes many of the approaches mentioned in…

How MaxDiff Analysis Works (Simplish, but Not for Dummies)
23 May 2017 | by Tim Bock

This post explains the basic mechanics of how preferences can be measured using the data collected in a MaxDiff experiment. Before you read this post, make sure you first read "A beginner’s guide to MaxDiff". I have worked hard…

When to Use, and Not Use, Correspondence Analysis
23 May 2017 | by Tim Bock

Correspondence analysis is one of those rare data science tools that make things simpler. You start with a big table that is too hard to read, and end with a relatively simple visualization. In this…
