## General regression and over fitting

In the last post, I discussed the statistical tool called linear regression for different dimensions/numbers of variables and described how it boils down to looking for a distribution concentrated near a hyperplane of dimension one less than the total number

## The geometry of linear regression

In this post, we'll warm up our geometry muscles by looking at one of the most basic data analysis techniques: linear regression. You've probably encountered it elsewhere, but I want to think about it from the point of view of

## Filling in the gaps – Probability Distributions

In the last post, I discussed how one can analyze a data set from the point of view of geometry, by thinking of each data point as coordinates in a high-dimensional space. The thing is, when we're analyzing these

## What is data?

In this post, we'll start working on understanding how computers and their programmers actually go about analyzing large, high-dimensional data sets. I'll start by describing three different ways that one can think about data, each of which suggests a different

## Different names for data analysis

Welcome to the Shape of Data blog. Over the next few months, I plan to write a number of posts illustrating how understanding the geometry behind data analysis can lead to deeper insights and a more intuitive understanding of the

