Category Archives: Clustering

K-modes

I recently read an interesting Wired story about Chris McKinlay (a fellow alum of Middlebury College), who used a clustering algorithm to understand the pool of users on the dating site OkCupid (and successfully used this information to improve his … Continue reading

Posted in Clustering, Feature extraction | 17 Comments

Modularity – Measuring cluster separation

We’ve now seen a number of different clustering algorithms, each of which will divide a data set into a number of subsets. This week, I want to ask the question: How do we know if the answer that a clustering algorithm … Continue reading

Posted in Clustering, Unsupervised learning | Leave a comment

Spectral clustering

In the last few posts, we’ve been studying clustering, i.e. algorithms that try to cut a given data set into a number of smaller, more tightly packed subsets, each of which might represent a different phenomenon or a different type … Continue reading

Posted in Clustering, Unsupervised learning | 9 Comments

Mapper and the choice of scale

In last week’s post, I described the DBSCAN clustering algorithm, which uses the notion of density to determine which data points in a data set form tightly packed groups called clusters. This algorithm relies on two parameters – a distance … Continue reading

Posted in Clustering, Unsupervised learning | 4 Comments

Clusters and DBScan

A few weeks ago, I mentioned the idea of a clustering algorithm, but here’s a recap of the idea: Often, a single data set will be made up of different groups of data points, each of which corresponds to a … Continue reading

Posted in Clustering, Unsupervised learning | 7 Comments