Monthly Archives: July 2013


The subject of this week’s post is probably one of the most polarizing algorithms in the data world: It seems that most experts either swear by K-means or absolutely hate it. The difference of opinion boils down to one of … Continue reading

Posted in Modeling, Unsupervised learning | 8 Comments

Gaussian kernels

In order to give a proper introduction to Gaussian kernels, this week’s post is going to start out a little bit more abstract than usual. This level of abstraction isn’t strictly necessary to understand how Gaussian kernels work, but the … Continue reading

Posted in Normalization/Kernels | 11 Comments

Mixture models

In the last few posts, we’ve been looking at algorithms that combine a number of simple models/distributions to form a single more complex and sophisticated model. With both neural networks and decision trees/random forests, we were interested in the classification … Continue reading

Posted in Modeling | 7 Comments

Random forests

In last week’s post, I described a classification algorithm called a decision tree that defines a model/distribution for a data set by cutting the data space along vertical and horizontal hyperplanes (or lines in the two-dimensional example that we looked … Continue reading

Posted in Classification | 9 Comments

Decision Trees

In the last few posts, we explored how neural networks combine basic classification schemes with relatively simple distributions, such as logistic distributions with line/plane/hyperplane decision boundaries, into much more complex and flexible distributions. In the next few posts, I plan … Continue reading

Posted in Classification | 5 Comments