Monthly Archives: March 2013
General regression and over fitting
In the last post, I discussed the statistical tool called linear regression for different dimensions/numbers of variables and described how it boils down to looking for a distribution concentrated near a hyperplane of dimension one less than the total number … Continue reading
Posted in Modeling, Regression
14 Comments
The geometry of linear regression
In this post, we’ll warm up our geometry muscles by looking at one of the most basic data analysis techniques: linear regression. You’ve probably encountered it elsewhere, but I want to think about it from the point of view of … Continue reading
Posted in Modeling, Regression
28 Comments
Filling in the gaps – Probability Distributions
In the last post, I discussed how one can analyze a data set from the point of view of geometry, by thinking of each data point as coordinates in a high-dimensional space. The thing is, when we’re analyzing these … Continue reading
Posted in Introduction
9 Comments
What is data?
In this post, we’ll start working on understanding how computers and their programmers actually go about analyzing large, high-dimensional data sets. I’ll start by describing three different ways that one can think about data, each of which suggests a different … Continue reading
Posted in Introduction
10 Comments
Different names for data analysis
Welcome to the Shape of Data blog. Over the next few months, I plan to write a number of posts illustrating how understanding the geometry behind data analysis can lead to deeper insights and a more intuitive understanding of the … Continue reading
Posted in Introduction
12 Comments