Machine Learning.

Decision Trees Are Usually Better Than Logistic Regression
25 October 2018 | by Tim Bock

Logistic regression is a standard approach to building a predictive model. However, decision trees are an alternative that is clearer and often superior.

Continue reading

Feature Engineering for Categorical Variables
24 October 2018 | by Tim Bock

There are two types of predictors in predictive models: numeric and categorical. There are several methods of transforming categorical variables.

Continue reading

Feature Engineering for Numeric Variables
24 October 2018 | by Tim Bock

When building a predictive model, it is often practical to improve predictive performance by modifying the numeric variables. This is called transformation.

Continue reading

How Random Forests Fit to Data
06 August 2018 | by Jake Hoare

A random forest is a collection of decision trees that is used to learn patterns in data and make predictions based on those patterns.

Continue reading

How is Splitting Decided for Decision Trees?
02 August 2018 | by Jake Hoare

Decision trees work by repeatedly splitting the data, each time choosing the split that yields the greatest improvement. We explain how these splits are chosen.

Continue reading

How is Variable Importance Calculated for a Random Forest?
30 July 2018 | by Jake Hoare

A random forest is an ensemble of decision trees. Like other machine-learning techniques, random forests use training data to learn to make predictions.

Continue reading

What is a ROC Curve and How to Interpret It
05 July 2018 | by Carmen Chan

A Receiver Operating Characteristic (ROC) curve is a graphical plot used to show the diagnostic ability of binary classifiers and can be used to compare algorith...

Continue reading

Predict Customer Churn with Gradient Boosting
03 July 2018 | by Jake Hoare

Customer churn is a crucial factor in the long-term success of a business. Predict your customer churn with a predictive model using gradient boosting.

Continue reading

Machine Learning: Using t-SNE to Understand Middle Eastern Politics
06 October 2017 | by Tim Bock

How the machine learning technique of t-SNE can be used to summarize visualizations and extract additional insight from them.

Continue reading

Goodness of Fit in MDS and t-SNE with Shepard Diagrams
28 September 2017 | by Jake Hoare

Shepard diagrams are a great way to assess goodness of fit for data reduction methods such as MDS and t-SNE. This post shows you how and includes the data.

Continue reading

How t-SNE works and Dimensionality Reduction
05 September 2017 | by Jake Hoare

t-SNE is a method for visualizing high-dimensional space. It often produces more insightful charts than alternatives such as PCA.

Continue reading

Machine Learning: Pruning Decision Trees
04 July 2017 | by Jake Hoare

Machine learning is a problem of trade-offs. Here I look at pruning and early stopping for managing these trade-offs in the context of decision trees.

Continue reading

Gradient Boosting Explained – The Coolest Kid on The Machine Learning Block
06 June 2017 | by Jake Hoare

Gradient boosting is attracting attention for its prediction speed & accuracy, especially with large & complex data. Here I show what it is and how to

Continue reading