[Figure: nonconvex optimization]

Many current machine learning algorithms lack provable guarantees on one or more of the following metrics: solution quality, sample complexity, and running time. The goal of our group is to provide algorithms with such performance guarantees. This can involve designing fundamentally new algorithms or proving such guarantees for existing ones.

We are especially interested in unsupervised learning, where the algorithm is given data that is unlabeled/unannotated, and the goal is to learn structure in the data. Often this involves hypothesizing the existence of such structure via a model. Learning corresponds to fitting such a model to the data.
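
As a minimal illustration of this model-fitting view (a sketch added here, not taken from the project itself, assuming only numpy): in k-means clustering the hypothesized structure is a set of cluster centers, and "learning" means fitting those centers to unlabeled points by alternating minimization of a nonconvex objective.

    import numpy as np

    def kmeans(X, k, n_iters=50, seed=0):
        # Hypothesized structure: k cluster centers. Fit them to unlabeled data
        # by alternating between assigning points and re-estimating centers
        # (Lloyd's algorithm; the underlying objective is nonconvex).
        rng = np.random.default_rng(seed)
        centers = X[rng.choice(len(X), size=k, replace=False)]
        for _ in range(n_iters):
            dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
            labels = dists.argmin(axis=1)                      # assignment step
            for j in range(k):
                if np.any(labels == j):
                    centers[j] = X[labels == j].mean(axis=0)   # update step
        return centers, labels

    # Illustrative unlabeled data drawn from three hypothetical clusters.
    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(loc=m, scale=0.3, size=(200, 2))
                   for m in ([0.0, 0.0], [3.0, 3.0], [0.0, 4.0])])
    centers, labels = kmeans(X, k=3)
    print(centers)

The same template (posit latent structure, then fit it by optimizing a nonconvex objective) underlies the richer models listed below.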

Areas of interest to us include language models (including topic models and word embeddings), matrix and tensor factorization, deep nets, sparse coding, etc.

From the viewpoint of theoretical CS and traditional algorithm design, the novelty in this project is the need to make modeling assumptions (often statistical in nature) about the structure of the data, since solving these problems on worst-case instances is NP-hard. We need to go beyond worst-case analysis.

From the viewpoint of machine learning theory, the novelty in this project is the focus on unsupervised settings (by contrast, many traditional frameworks in learning theory, such as PAC learning, SVMs, and online learning, focus on supervised settings). Methodologically, learning theory often proceeds by convexifying the problem, whereas our group wishes to stay with the nonconvex formulation. Today there is overwhelming empirical evidence that nonconvex approaches work at large scale in machine learning, and it is an important challenge for theory to explain why.
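
To make the contrast concrete, here is a minimal sketch (assuming numpy; the function name, problem instance, and constants are illustrative, not from the project): low-rank matrix factorization handled in the nonconvex style. Rather than relaxing to a convex surrogate such as nuclear-norm minimization, it runs plain gradient descent directly on the factors U and V.

    import numpy as np

    def factorize(M, rank, lr=0.005, n_steps=2000, seed=0):
        # Nonconvex formulation: minimize ||M - U V^T||_F^2 directly over the
        # factors U, V by gradient descent (no convex relaxation).
        rng = np.random.default_rng(seed)
        n, m = M.shape
        U = rng.normal(scale=0.1, size=(n, rank))
        V = rng.normal(scale=0.1, size=(m, rank))
        for _ in range(n_steps):
            R = U @ V.T - M          # residual
            gU = 2 * R @ V           # gradient w.r.t. U
            gV = 2 * R.T @ U         # gradient w.r.t. V
            U -= lr * gU
            V -= lr * gV
        return U, V

    # Synthetic planted low-rank matrix to recover.
    rng = np.random.default_rng(1)
    M = rng.normal(size=(30, 2)) @ rng.normal(size=(2, 20))
    U, V = factorize(M, rank=2)
    print(np.linalg.norm(M - U @ V.T) / np.linalg.norm(M))   # small relative error

Explaining when and why such direct nonconvex procedures succeed, despite worst-case hardness, is the kind of question described above.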