In past posts, I’ve described the geometry of artificial neural networks by thinking of the output from each neuron in the network as defining a probability density function on the space of input vectors. This is useful for understanding how a single neuron combines the outputs of other neurons to form a more complex shape. However, it’s often useful to think about how multiple neurons behave at the same time, particularly for networks that are defined by successive layers of neurons. For such networks – which turn out to be the vast majority of networks in practice – it’s useful to think about how the set of outputs from each layer determine the set of outputs of the next layer. In this post, I want to discuss how we can think about this in terms of linear transformations (via matrices) and how this idea leads to a tool called word embeddings, the most popular of which is probably word2vec.

Recent Posts
Recent Comments
Zhiyuan Chen on Kernels Clustering categoric… on Kmodes Clustering continuou… on Kmodes Jeff Whyte on Goals of Interpretability James on Gaussian kernels Archives
 December 2016
 November 2016
 October 2016
 June 2016
 April 2016
 January 2016
 November 2015
 October 2015
 July 2015
 June 2015
 May 2015
 January 2015
 September 2014
 June 2014
 May 2014
 March 2014
 February 2014
 January 2014
 December 2013
 October 2013
 September 2013
 August 2013
 July 2013
 June 2013
 May 2013
 April 2013
 March 2013
Categories
Meta
Advertisements