The Mathematics of Deep Learning Parts I & II
Wednesday, June 22, 2016
2:00 pm - 4:50 pm
Fitzpatrick Center Schiciano Auditorium
Helmut Bölcskei, Professor, ETH Zurich
Part I - Deep convolutional neural networks have led to breakthrough results in numerous machine learning tasks that require feature extraction, yet a comprehensive mathematical theory explaining this success seems distant. The mathematical analysis of deep neural networks for feature extraction was initiated by Mallat, who considered so-called scattering networks based on the wavelet transform and modulus non-linearities. In this short course, we show how Mallat's theory can be developed further by allowing for general semi-discrete shift-invariant frames (including Weyl-Heisenberg, curvelet, shearlet, ridgelet, and wavelet frames) and general Lipschitz-continuous non-linearities (e.g., rectified linear units, shifted logistic sigmoids, hyperbolic tangents, and modulus functions), as well as pooling through subsampling. For the resulting feature extractor, we prove deformation stability for a large class of deformations, establish a new translation invariance result that is vertical in nature, in the sense that the network depth determines the amount of invariance, and show energy conservation under certain technical conditions. On a conceptual level, our results establish that deformation stability, vertical translation invariance, and, to a certain degree, energy conservation are guaranteed by the network structure per se rather than by the specific convolution kernels and non-linearities.
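The architecture the abstract describes (filtering with a frame, a Lipschitz non-linearity such as the modulus, then pooling by subsampling, cascaded over layers) can be sketched in a toy 1-D form. This is a minimal illustration, not the construction from the talk: the Gaussian band-pass filters at dyadic scales stand in for a proper semi-discrete frame, and the mean of each propagated signal stands in for a genuine low-pass output filter.

```python
import numpy as np

def make_filters(n, num_scales):
    """Toy filter bank: Gaussian band-pass filters at dyadic center
    frequencies, an illustrative stand-in for a semi-discrete frame."""
    freqs = np.fft.fftfreq(n)
    filters = []
    for j in range(num_scales):
        center = 0.25 / (2 ** j)   # dyadic center frequencies
        width = center / 2
        g = np.exp(-((np.abs(freqs) - center) ** 2) / (2 * width ** 2))
        filters.append(g)
    return filters

def layer(signal, filters, nonlinearity, stride):
    """One network layer: convolve with each frame element (via FFT),
    apply a Lipschitz non-linearity, then pool by subsampling."""
    outputs = []
    for g in filters:
        filtered = np.fft.ifft(np.fft.fft(signal) * g)  # circular convolution
        activated = nonlinearity(filtered)
        outputs.append(activated[::stride])             # pooling through subsampling
    return outputs

def scattering_features(signal, num_layers=2, num_scales=3, stride=2):
    """Cascade the layers and collect one feature per propagated signal
    (here simply its mean modulus, a crude low-pass summary)."""
    nodes = [signal.astype(complex)]
    features = []
    for _ in range(num_layers):
        next_nodes = []
        for s in nodes:
            filters = make_filters(len(s), num_scales)
            # modulus non-linearity, as in Mallat's scattering networks
            next_nodes.extend(layer(s, filters, np.abs, stride))
        features.extend(np.mean(np.abs(s)) for s in next_nodes)
        nodes = next_nodes
    return np.array(features)

x = np.random.default_rng(0).standard_normal(256)
f = scattering_features(x)
print(f.shape)  # num_scales + num_scales**2 = 12 features for the defaults
```

Swapping `np.abs` for any other Lipschitz non-linearity (e.g. `np.tanh` on the real part) leaves the structure intact, which is the conceptual point of the abstract: the guarantees come from the network structure, not the particular kernels and non-linearities.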