Title: On the Role of Depth in Deep Learning
Abstract: Deep learning is transforming scientific research, education, healthcare, and more. However, our understanding of how even the simplest neural networks learn to make surprisingly accurate predictions remains limited. For instance, although shallow ReLU neural networks can in principle approximate any continuous function on a compact domain, deeper networks often perform better in practice. Why does this happen? What benefit does depth provide? In this talk, I will describe a framework for understanding the benefit of depth using ideas from function-space perspectives on neural networks and statistical learning theory. I will show that even though shallow networks can approximate any continuous function, learning a good approximation from finite training data can be significantly easier with a deeper network. In particular, I identify a family of functions that are provably hard to learn without sufficient depth. Along the way, we will see how deeper models naturally introduce a bias towards functions with latent low-dimensional structure, providing insights into how depth helps solve challenging learning problems.
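As a concrete illustration of the kind of depth separation the abstract alludes to, the sketch below uses the classical Telgarsky-style sawtooth construction; this is my own illustrative assumption, not necessarily the family of functions studied in the talk. Composing the tent map a few times yields a function with exponentially many linear pieces while using only linearly many ReLU units, whereas a one-hidden-layer ReLU network needs on the order of exponentially many units to match it.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def tent(x):
    # The tent map on [0, 1], expressed with two ReLU units:
    # tent(x) = 2*relu(x) - 4*relu(x - 0.5)
    return 2.0 * relu(x) - 4.0 * relu(x - 0.5)

def deep_sawtooth(x, depth):
    # Composing the tent map `depth` times gives a sawtooth with
    # 2**(depth - 1) peaks, using only 2*depth ReLU units in total.
    for _ in range(depth):
        x = tent(x)
    return x

x = np.linspace(0.0, 1.0, 4097)
y = deep_sawtooth(x, depth=6)  # 32 peaks from just 12 ReLU units

# Count interior local extrema as a proxy for the number of linear
# pieces; a shallow (one-hidden-layer) ReLU network needs roughly
# 2**depth units to produce this many oscillations.
extrema = np.sum(np.diff(np.sign(np.diff(y))) != 0)
print(f"interior local extrema: {extrema}")
```

The tent map is chosen here because it makes the depth-versus-width tradeoff explicit: each additional layer doubles the number of oscillations at a constant parameter cost, which a shallow network can only match by doubling its width.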