## On the Computational and Statistical Interface and "Big Data"

- Michael Jordan (University of California, Berkeley)
- 11:00 - 12:00

The rapid growth in the size and scope of datasets in science and technology has created a need for novel foundational perspectives on data analysis that blend the computational and statistical sciences. That classical perspectives from these fields are not adequate to address emerging problems in "Big Data" is apparent from their sharply divergent nature at an elementary level: in computer science, the growth of the number of data points is a source of "complexity" that must be tamed via algorithms or hardware, whereas in statistics, the growth of the number of data points is a source of "simplicity" in that inferences are generally stronger and asymptotic results can be invoked. Indeed, if data are a statistician's principal resource, why should more data be burdensome in some sense? Shouldn't it be possible to exploit the increasing inferential strength of data at scale to keep computational complexity at bay? I present three research vignettes that pursue this theme, the first involving the deployment of resampling methods such as the bootstrap on parallel and distributed computing platforms, the second involving large-scale matrix completion, and the third introducing a methodology of "algorithmic weakening," whereby hierarchies of convex relaxations are used to control statistical risk as data accrue. [Joint work with Venkat Chandrasekaran, Ariel Kleiner, Lester Mackey, Purna Sarkar, and Ameet Talwalkar].

## Learning with matrix and tensor based models using low-rank penalties

- Johan Suykens (KU Leuven, ESAT-SCD)
- 14:00 - 15:00

In many problems of machine learning, statistics and data mining it is common to estimate models that operate on input data vectors. When employing parametric models this leads to the estimation of an unknown parameter vector. However, in many applications it is more natural to consider input data matrices and, more generally, input data tensors. Typical examples arise in web mining, computer vision, mass spectroscopy and biomedical signal processing. We therefore outline a more general framework for learning with matrix and tensor-based models, beyond matrix and tensor completion. It is applicable to both inductive and transductive learning. For low-rank penalties it results in a class of convex optimization problems. A scalable template algorithm is presented, based on an inexact operator splitting method. [Joint work with Marco Signoretto, Quoc Tran Dinh and Lieven De Lathauwer].

## Large-scale convex optimization for machine learning

- Francis Bach (INRIA - Ecole Normale Superieure)
- 15:30 - 16:30

Many machine learning and signal processing problems are traditionally cast as convex optimization problems. A common difficulty in solving these problems is the size of the data, where there are many observations ("large n") and each of these is large ("large p"). In this setting, online algorithms, which pass over the data only once, are usually preferred over batch algorithms, which require multiple passes over the data. In this talk, I will present several recent results showing that in the ideal infinite-data setting, online learning algorithms based on stochastic approximation should be preferred, but that in the practical finite-data setting, an appropriate combination of batch and online algorithms leads to unexpected behaviors, such as a linear convergence rate with an iteration cost similar to stochastic gradient descent. [Joint work with Nicolas Le Roux, Eric Moulines and Mark Schmidt].

## Important:

- If you plan to attend one or more of these lectures, please register before March 01, 14:00. Registration is free but mandatory.