# Statistics

Correspondence analysis is a popular tool for visualizing the patterns in large tables. To many practitioners it is probably a black box. Table goes in, chart comes out. In this post I explain the mathematics…

If you have ever looked with any depth at statistical computing for multivariate analysis, there is a good chance you have come across the singular value decomposition (SVD). It is a workhorse for techniques that decompose data, such as correspondence analysis and principal…

Hopefully, if you have landed on this post, you have a basic idea of what the R-Squared statistic is. The R-Squared statistic is a number between 0 and 1 (or 0% and 100%) that quantifies…

Correspondence analysis is a powerful technique that enables you to visualize a complex table of results as a much simpler chart. In this post I discuss the special case of square tables, which often arise in…

Machine learning is a problem of trade-offs. The classic issue is overfitting versus underfitting. Overfitting happens when a model memorizes its training data so well that it is learning noise on top of the signal….

In this post I explore two different methods for computing the relative importance of predictors in regression: Johnson’s Relative Weights and Partial Least Squares (PLS) regression. Both techniques solve a problem with Multiple Linear Regression, which can perform poorly when there are correlations…

Partial Least Squares (PLS) is a popular method for relative importance analysis in fields where the data typically includes more predictors than observations. Relative importance analysis is a general term applied to any technique used for…

A key driver analysis investigates the relative importance of predictors against an outcome variable, such as brand preference. Many techniques have been developed for key driver analysis, to name but a few: Preference Regression, Shapley Regression,…

This post compares various approaches to analyzing MaxDiff data using a method known as cross-validation. Before you read this post, make sure you first read “How MaxDiff analysis works”, which describes many of the approaches mentioned in…

This post explains the basic mechanics of how preferences can be measured using the data collected in a MaxDiff experiment. Before you read this post, make sure you first read “A beginner’s guide to MaxDiff”. I have worked hard…

Correspondence analysis is one of those rare data science tools that make things simpler. You start with a big table that is too hard to read, and end with a relatively simple visualization. In this…
