Research

I am interested in statistical learning with a particular focus on online learning. I carried out my PhD thesis with Yannig Goude and Gilles Stoltz on individual sequences. I am currently working as a junior researcher at INRIA. You can find below a list of my publications.

Articles

Submitted articles and preprints

Structured prediction in online learning. Pierre Boudard, Pierre Gaillard, and Alessandro Rudi.
Adaptive approximation of monotone functions. Pierre Gaillard, Sébastien Gerchinovitz, Etienne de Montbrun.
Online Markov Decision Processes with Terminal Law Constraints. Bianca Marin Moreno, Margaux Brégère, Pierre Gaillard, Nadia Oudjane.
Minimax Adaptive Online Nonparametric Regression over Besov spaces. Paul Liautaud, Pierre Gaillard, Olivier Wintenberger.
Enjoying Non-linearity in Multinomial Logistic Bandits. Pierre Boudart, Pierre Gaillard, Alessandro Rudi.
Optimised projection-free algorithms for online learning: construction and worst-case analysis. Julien Weibel, Pierre Gaillard, Wouter M Koolen, Adrien Taylor.

Articles in proceedings of international conferences

Online Episodic Convex Reinforcement Learning. Bianca Marin Moreno, Khaled Eldowa, Margaux Brégère, Pierre Gaillard, and Nadia Oudjane. ICML, 2025.
Counterfactual Learning of Stochastic Policies with Continuous Actions. Houssam Zenati, Alberto Bietti, Matthieu Martin, Eustache Diemert, Pierre Gaillard, Julien Mairal. TMLR, 2025
Finally Rank-Breaking Conquers MNL Bandits: Optimal and Efficient Algorithms for MNL Assortment. Aadirupa Saha and Pierre Gaillard. ICLR, 2025.
Minimax-optimal and Locally-adaptive Online Nonparametric Regression. Paul Liautaud, Pierre Gaillard, Olivier Wintenberger. ALT, 2025.
Logarithmic Regret for Unconstrained Submodular Maximization Stochastic Bandit. Julien Zhou, Pierre Gaillard, Thibaud Rahier, and Julyan Arbel. ALT, 2025.
MetaCURL: Non-stationary Concave Utility Reinforcement Learning. Bianca Marin Moreno, Margaux Brégère, Pierre Gaillard, and Nadia Oudjane. NeurIPS, 2024.
Covariance-Adaptive Least-Squares Algorithm for Stochastic Combinatorial Semi-Bandits. Julien Zhou, Pierre Gaillard, Thibaud Rahier, Houssam Zenati, and Julyan Arbel. NeurIPS, 2024.
Efficient Model-Based Concave Utility Reinforcement Learning through Greedy Mirror Descent. Bianca Marin Moreno, Margaux Brégère, Pierre Gaillard, Nadia Oudjane. AISTATS, 2024.
Sequential Counterfactual Risk Minimization. Houssam Zenati, Eustache Diemert, Matthieu Martin, Julien Mairal, Pierre Gaillard. ICML, 2023.
One Arrow, Two Kills: An Unified Framework for Achieving Optimal Regret Guarantees in Sleeping Bandits. Pierre Gaillard, Aadirupa Saha and Soham Dan. AISTATS, 2023.
Versatile Dueling Bandits: Best-of-both-World Analyses for Online Learning from Preferences. Aadirupa Saha, Pierre Gaillard. ICML, 2022.
Efficient Kernel UCB for Contextual Bandits. Houssam Zenati, Alberto Bietti, Eustache Diemert, Julien Mairal, Matthieu Martin, Pierre Gaillard. AISTATS, 2022.
Dueling Bandits with Adversarial Sleeping. Aadirupa Saha, Pierre Gaillard. NeurIPS, 2021.
A Continuized View on Nesterov Acceleration for Stochastic Gradient Descent and Randomized Gossip. Mathieu Even, Raphaël Berthier, Francis Bach, Nicolas Flammarion, Pierre Gaillard, Hadrien Hendrikx, Laurent Massoulié, Adrien Taylor. NeurIPS (Outstanding paper award, <1‰ of submitted papers), 2021.
Mixability made efficient: Fast online multiclass logistic regression. Rémi Jézéquel, Pierre Gaillard, and Alessandro Rudi. NeurIPS (Spotlight, <3% of submitted papers), 2021.
Online Sign Identification: Minimization of the Number of Errors in Thresholding Bandits. Reda Ouhamma, Rémy Degenne, Vianney Perchet, and Pierre Gaillard. NeurIPS (Spotlight, <3% of submitted papers), 2021.
Tight Nonparametric Convergence Rates for Stochastic Gradient Descent under the Noiseless Linear Model. Raphaël Berthier, Francis Bach, Pierre Gaillard. NeurIPS, 2020.
Efficient improper learning for online logistic regression. Rémi Jézéquel, Pierre Gaillard, Alessandro Rudi. COLT, 2020.
Improved Sleeping Bandits with Stochastic Actions Sets and Adversarial Rewards. Aadirupa Saha, Pierre Gaillard, Michal Valko. ICML, 2020.
- HAL
- arXiv
- ShowHide abstract
- ShowHide bibtex
In this paper, we consider the problem of sleeping bandits with stochastic action sets and adversarial rewards. In this setting, in contrast to most work in bandits, the actions may not be available at all times. For instance, some products might be out of stock in item recommendation. The best existing efficient (i.e., polynomial-time) algorithms for this problem only guarantee an $O(T^{2/3})$ upper-bound on the regret. Yet, inefficient algorithms based on EXP4 can achieve $O(\sqrt{T})$ . In this paper, we provide a new computationally efficient algorithm inspired by EXP3 satisfying a regret of order $O(\sqrt{T})$ when the availabilities of each action $i \in \mathcal{A}$ are independent. We then study the most general version of the problem where at each round available sets are generated from some unknown arbitrary distribution (i.e., without the independence assumption) and propose an efficient algorithm with $O(\sqrt{2^KT})$ regret guarantee. Our theoretical results are corroborated with experimental evaluations.
```
@inproceedings{saha2020improved,
      title={Improved sleeping bandits with stochastic action sets and adversarial rewards},
      author={Saha, Aadirupa and Gaillard, Pierre and Valko, Michal},
      booktitle={International Conference on Machine Learning},
      pages={8357--8366},
      year={2020},
      organization={PMLR}
    }
```
Efficient online learning with Kernels for adversarial large scale problems. Rémi Jézéquel, Pierre Gaillard, Alessandro Rudi. NeurIPS, 2019.
- HAL
- arXiv
- Slides
- ShowHide abstract
- ShowHide bibtex
We are interested in a framework of online learning with kernels for low-dimensional, but large-scale and potentially adversarial datasets. We study the computational and theoretical performance of online variations of kernel Ridge regression. Despite its simplicity, the algorithm we study is the first to achieve the optimal regret for a wide range of kernels with a per-round complexity of order $n^\alpha$ with $\alpha < 2$. The algorithm we consider is based on approximating the kernel with the linear span of basis functions. Our contributions are twofold: 1) For the Gaussian kernel, we propose to build the basis beforehand (independently of the data) through Taylor expansion. For $d$-dimensional inputs, we provide a (close to) optimal regret of order $O((\log n)^{d+1})$ with per-round time complexity and space complexity $O((\log n)^{2d})$. This makes the algorithm a suitable choice as soon as $n \gg e^d$ which is likely to happen in a scenario with small dimensional and large-scale dataset; 2) For general kernels with low effective dimension, the basis functions are updated sequentially, adapting to the data, by sampling Nyström points. In this case, our algorithm improves the computational trade-off known for online kernel regression.
```
@incollection{JezequelEtAl2019,
              title = {Efficient online learning with kernels for adversarial large scale problems},
              author = {J\'{e}z\'{e}quel, R\'{e}mi and Gaillard, Pierre and Rudi, Alessandro},
              booktitle = {Advances in Neural Information Processing Systems 32},
              editor = {H. Wallach and H. Larochelle and A. Beygelzimer and F. d\textquotesingle Alch\'{e}-Buc and E. Fox and R. Garnett},
              pages = {9432--9441},
              year = {2019},
              publisher = {Curran Associates, Inc.},
              url = {http://papers.nips.cc/paper/9140-efficient-online-learning-with-kernels-for-adversarial-large-scale-problems.pdf}
            }
```

Target Tracking for Contextual Bandits: Application to Demand Side Management. Margaux Brégère, Pierre Gaillard, Yannig Goude, Gilles Stoltz. ICML, 2019.

HAL
arXiv
Slides
Poster
ShowHide abstract
ShowHide bibtex

We propose a contextual-bandit approach for demand side management by offering price incentives. More precisely, a target mean consumption is set at each round and the mean consumption is modeled as a complex function of the distribution of prices sent and of some contextual variables such as the temperature, weather, and so on. The performance of our strategies is measured in quadratic losses through a regret criterion. We offer $\sqrt{T}$ upper bounds on this regret (up to poly-logarithmic terms), for strategies inspired by standard strategies for contextual bandits. Simulations on a real data set gathered by UK Power Networks, in which price incentives were offered, show that our strategies are effective and may indeed manage demand response by suitably picking the price levels.

@InProceedings{pmlr-v97-bregere19a,
              title =    {Target Tracking for Contextual Bandits: Application to Demand Side Management},
              author =   {Br{\'e}g{\`e}re, Margaux and Gaillard, Pierre and Goude, Yannig and Stoltz, Gilles},
              booktitle =    {Proceedings of the 36th International Conference on Machine Learning},
              pages =    {754--763},
              year =   {2019},
              editor =   {Chaudhuri, Kamalika and Salakhutdinov, Ruslan},
              volume =   {97},
              series =   {Proceedings of Machine Learning Research},
              address =    {Long Beach, California, USA},
              month =    {09--15 Jun},
              publisher =    {PMLR},
              pdf =    {http://proceedings.mlr.press/v97/bregere19a/bregere19a.pdf},
              url =    {http://proceedings.mlr.press/v97/bregere19a.html},
              abstract =   {We propose a contextual-bandit approach for demand side management by offering price incentives. More precisely, a target mean consumption is set at each round and the mean consumption is modeled as a complex function of the distribution of prices sent and of some contextual variables such as the temperature, weather, and so on. The performance of our strategies is measured in quadratic losses through a regret criterion. We offer $T^{2/3}$ upper bounds on this regret (up to poly-logarithmic terms)—and even faster rates under stronger assumptions—for strategies inspired by standard strategies for contextual bandits (like LinUCB, see Li et al., 2010). Simulations on a real data set gathered by UK Power Networks, in which price incentives were offered, show that our strategies are effective and may indeed manage demand response by suitably picking the price levels.}
            }

Uniform regret bounds over R^d for the sequential linear regression problem with the square loss. Pierre Gaillard, Sébastien Gerchinovitz, Malo Huard, Gilles Stoltz. ALT, 2019.

HAL
arXiv
ShowHide abstract
ShowHide bibtex

We consider the setting of online linear regression for arbitrary deterministic sequences, with the square loss. We are interested in obtaining regret bounds that hold uniformly over all competitor vectors. When the feature sequence is known at the beginning of the game, they provided closed-form regret bounds of $2d B^2 \ln T + O_T(1)$, where $T$ is the number of rounds and $B$ is a bound on the observations. Instead, we derive bounds with an optimal constant of $1$ in front of the $d B^2 \ln T$ term. In the case of sequentially revealed features, we also derive an asymptotic regret bound of $d B^2 \ln T$ for any individual sequence of features and bounded observations. All our algorithms are variants of the online non-linear ridge regression forecaster, either with a data-dependent regularization or with almost no regularization.

@InProceedings{pmlr-v98-gaillard19a,
          title =    {Uniform regret bounds over $\mathbb{R}^d$ for the sequential linear regression problem with the square loss},
          author =   {Gaillard, Pierre and Gerchinovitz, S{\'e}bastien and Huard, Malo and Stoltz, Gilles},
          booktitle =    {Proceedings of the 30th International Conference on Algorithmic Learning Theory},
          pages =    {404--432},
          year =   {2019},
          editor =   {Garivier, Aur\'elien and Kale, Satyen},
          volume =   {98},
          series =   {Proceedings of Machine Learning Research},
          address =    {Chicago, Illinois},
          month =    {22--24 Mar},
          publisher =    {PMLR},
          pdf =    {http://proceedings.mlr.press/v98/gaillard19a/gaillard19a.pdf},
          url =    {http://proceedings.mlr.press/v98/gaillard19a.html},
          abstract =   {We consider the setting of online linear regression for arbitrary deterministic sequences,
         with the square loss. We are interested in the aim set by Bartlett et al. (2015):
         obtain regret bounds that hold uniformly over all competitor vectors.
         When the feature sequence is known at the beginning of the game, they provided closed-form
         regret bounds of $2d B^2 \ln T + \mathcal{O}_T(1)$, where $T$ is the number of rounds and $B$ is a
         bound on the observations. Instead, we derive bounds with an optimal constant of $1$ in front of
         the $d B^2 \ln T$ term. In the case of sequentially revealed features, we also derive an asymptotic
         regret bound of $d B^2 \ln T$ for any individual sequence of features and bounded observations.
         All our algorithms are variants of the online non-linear ridge regression forecaster, either with a
         data-dependent regularization or with almost no regularization.}
        }

Efficient online algorithms for fast-rate regret bounds under sparsity. Pierre Gaillard, Olivier Wintenberger. NeurIPS, 2018.
- HAL
- arXiv
- Poster
- ShowHide abstract
- ShowHide bibtex
We consider the problem of online convex optimization in two different settings: arbitrary and i.i.d. sequence of convex loss functions. In both settings, we provide efficient algorithms whose cumulative excess risks are controlled with fast-rate sparse bounds. First, the excess risks bounds depend on the sparsity of the objective rather than on the dimension of the parameters space. Second, their rates are faster than the slow-rate $\smash{1/\sqrt{T}}$ under additional convexity assumptions on the loss functions. In the adversarial setting, we develop an algorithm BOA+ whose cumulative excess risks is controlled by several bounds with different trade-offs between sparsity and rate for strongly convex loss functions. In the i.i.d. setting under the Łojasiewicz's assumption, we establish new risk bounds that are sparse with a rate adaptive to the convexity of the risk (ranging from a rate $1/\sqrt{T}$ for general convex risk to ${1}/{T}$ for strongly convex risk). These results generalize previous works on sparse online learning under weak assumptions on the risk.
```
@incollection{GaillardEtAl2018,
            title = {Efficient online algorithms for fast-rate regret bounds under sparsity},
            author = {Gaillard, Pierre and Wintenberger, Olivier},
            booktitle = {Advances in Neural Information Processing Systems 31},
            editor = {S. Bengio and H. Wallach and H. Larochelle and K. Grauman and N. Cesa-Bianchi and R. Garnett},
            pages = {7026--7036},
            year = {2018},
            publisher = {Curran Associates, Inc.},
            url = {http://papers.nips.cc/paper/7934-efficient-online-algorithms-for-fast-rate-regret-bounds-under-sparsity.pdf}
            }
```
Online nonparametric learning, chaining, and the role of partial feedback. Nicolò Cesa-Bianchi, Pierre Gaillard, Claudio Gentile, Sébastien Gerchinovitz. COLT, 2017.
- HAL
- arXiv
- Poster
- ShowHide abstract
- ShowHide bibtex
We investigate contextual online learning with nonparametric (Lipschitz) comparison classes un- der different assumptions on losses and feedback information. For full information feedback and Lipschitz losses, we characterize the minimax regret up to log factors by proving an upper bound matching a previously known lower bound. In a partial feedback model motivated by second-price auctions, we prove upper bounds for Lipschitz and semi-Lipschitz losses that improve on the known bounds for standard bandit feedback. Our analysis combines novel results for contextual second-price auctions with a novel algorithmic approach based on chaining. When the context space is Euclidean, our chaining approach is efficient and delivers an even better regret bound.
```
@unpublished{cesabianchi:hal-01476771,
                TITLE = {{Algorithmic Chaining and the Role of Partial Feedback in Online Nonparametric Learning}},
                AUTHOR = {Cesa-Bianchi, Nicol{\`o} and Gaillard, Pierre and Gentile, Claudio and Gerchinovitz, S{\'e}bastien},
                URL = {https://hal.archives-ouvertes.fr/hal-01476771},
                NOTE = {This document is the full version of an extended abstract accepted for presentation at COLT 2017.},
                YEAR = {2017},
                MONTH = Jun,
                KEYWORDS = {bandits ; online learning ; nonparametric ; chaining},
                PDF = {https://hal.archives-ouvertes.fr/hal-01476771/file/NonparametricLearning.pdf},
                HAL_ID = {hal-01476771}
              }
```
Sparse Accelerated Exponential Weights. Pierre Gaillard, Olivier Wintenberger. AISTATS, 2017.
- HAL
- arXiv
- Slides
- ShowHide abstract
- ShowHide bibtex
We consider the stochastic optimization problem where a convex function is minimized observing recursively the gradients. We introduce SAEW, a new procedure that accelerates exponential weights procedures with the slow rate $1/\sqrt{T}$ to procedures achieving the fast rate $1/T$. Under the strong convexity of the risk, we achieve the optimal rate of convergence for approximating sparse parameters in $\mathbb{R}^d$. The acceleration is achieved by using successive averaging steps in an online fashion. The procedure also produces sparse estimators thanks to additional hard threshold steps.
```
@inproceedings{gaillard:hal-01376808,
              TITLE = {{Sparse Accelerated Exponential Weights}},
              AUTHOR = {Gaillard, Pierre and Wintenberger, Olivier},
              URL = {https://hal.archives-ouvertes.fr/hal-01376808},
              BOOKTITLE = {{20th International Conference on Artificial Intelligence and Statistics (AISTATS)}},
              ADDRESS = {Fort Lauderdale, United States},
              YEAR = {2017},
              MONTH = Apr,
              PDF = {https://hal.archives-ouvertes.fr/hal-01376808/file/GaillardWintenberger2016-SAEW.pdf},
              HAL_ID = {hal-01376808}
            }
```
A Chaining Algorithm for Online Nonparametric Regression. Pierre Gaillard, Sébastien Gerchinovitz. COLT, 2015.
- HAL
- arXiv
- Poster
- Slides
- Proceedings
- Video lecture
- ShowHide abstract
- ShowHide bibtex
We consider the problem of online nonparametric regression with arbitrary deterministic sequences. Using ideas from the chaining technique, we design an algorithm that achieves a Dudley-type regret bound similar to the one obtained in a non-constructive fashion by Rakhlin and Sridharan (2014). Our regret bound is expressed in terms of the metric entropy in the sup norm, which yields optimal guarantees when the metric and sequential entropies are of the same order of magnitude. In particular our algorithm is the first one that achieves optimal rates for online regression over Hölder balls. In addition we show for this example how to adapt our chaining algorithm to get a reasonable computational efficiency with similar regret guarantees (up to a log factor).
```
@Inproceedings{GaillardGerchinovitz2015,
              author = {Gaillard, Pierre and Gerchinovitz, Sébastien},
              title = {A Chaining Algorithm for Online Nonparametric Regression},
              year = {2015},
              booktitle = {Proceedings of COLT'15},
              publisher = {JMLR: Workshop and Conference Proceedings},
              volume = {40},
              pages = {764--796}
            }
```
A second-order bound with excess losses. Pierre Gaillard, Gilles Stoltz, Tim van Erven. COLT, 2014.
- HAL
- arXiv
- Slides
- Video lecture
- ShowHide abstract
- ShowHide bibtex
We study online aggregation of the predictions of experts, and first show new second-order regret bounds in the standard setting, which are obtained via a version of the Prod algorithm (and also a version of the polynomially weighted average algorithm) with multiple learning rates. These bounds are in terms of excess losses, the differences between the instantaneous losses suffered by the algorithm and the ones of a given expert. We then demonstrate the interest of these bounds in the context of experts that report their confidences as a number in the interval $[0,1]$ using a generic reduction to the standard setting. We conclude by two other applications in the standard setting, which improve the known bounds in case of small excess losses and show a bounded regret against i.i.d. sequences of losses.
```
@INPROCEEDINGS{GaillardStoltzEtAl2014,
              author = {Gaillard, Pierre and Stoltz, Gilles and van Erven, Tim},
              title = {A Second-order Bound with Excess Losses},
              booktitle = {Proceedings of COLT'14},
              year = {2014},
              volume = {35},
              pages = {176--196},
              publisher = {JMLR: Workshop and Conference Proceedings}
            }
```
Mirror descend meets fixed share (and feels no regret). Nicolò Cesa-Bianchi, Pierre Gaillard, Gàbor Lugosi, and Gilles Stoltz. NIPS, 2012.
- HAL
- arXiv
- Poster
- Slides (fr)
- ShowHide abstract
- ShowHide bibtex
Mirror descent with an entropic regularizer is known to achieve shifting regret bounds that are logarithmic in the dimension. This is done using either a carefully designed projection or by a weight sharing technique. Via a novel unified analysis, we show that these two approaches deliver essentially equivalent bounds on a notion of regret generalizing shifting, adaptive, discounted, and other related regrets. Our analysis also captures and extends the generalized weight sharing technique of Bousquet and Warmuth, and can be refined in several ways, including improvements for small losses and adaptive tuning of parameters.
```
@Inproceedings{Cesa-BianchiGaillardLugosiStoltz2012,
              author = {Cesa-Bianchi, Nicolò and Gaillard, Pierre and Lugosi, Gábor and Stoltz, Gilles},
              title = {Mirror Descent Meets Fixed Share (and feels no regret)},
              booktitle = {Proceedings of NIPS},
              pages = {989-997},
              year = 2012
            }
```

Articles in international journals

Efficient and Near-Optimal Online Portfolio Selection. Rémi Jézéquel, Dmitrii M. Ostrovskii, Pierre Gaillard. Mathematics of Operations Research. 2025.
- arXiv
- Slides
(Online) Convex Optimization for Demand-Side Management: Application to Thermostatically Controlled Loads. Bianca Marin Moreno, Margaux Brégère, Pierre Gaillard, Nadia Oudjane. Journal of Optimization Theory and Applications. Volume 205, article number 43, 2025.
- arXiv
- Slides
Online Convex Optimization for Survival Analysis: An Adaptive and Stochastic Approach. Camila Fernandez, Pierre Gaillard, Joseph de Vilmarest, and Olivier Wintenberger. Statistical Papers. Volume 66, article number 86, 2025.
Aggregation methods and comparative study in time-to-event analysis models. Camila Fernandez, Chung Shue Chen, Pierre Gaillard, and Alonso Silva. International Journal of Data Science and Analytics. 2024.
Accelerated Gossip in Networks of Given Dimension using Jacobi Polynomial Iterations. Raphaël Berthier, Francis Bach, Pierre Gaillard, SIAM Journal on Mathematics of Data Science. Volume 2, Issue 1, 2020.
Consider a network of agents connected by communication links, where each agent holds a real value. The gossip problem consists in estimating the average of the values diffused in the network in a distributed manner. We develop a method solving the gossip problem that depends only on the spectral dimension of the network, that is, in the communication network set-up, the dimension of the space in which the agents live. This contrasts with previous work that required the spectral gap of the network as a parameter, or suffered from slow mixing. Our method shows an important improvement over existing algorithms in the non-asymptotic regime, i.e., when the values are far from being fully mixed in the network. Our approach stems from a polynomial-based point of view on gossip algorithms, as well as an approximation of the spectral measure of the graphs with a Jacobi measure. We show the power of the approach with simulations on various graphs, and with performance guarantees on graphs of known spectral dimension, such as grids and random percolation bonds. An extension of this work to distributed Laplacian solvers is discussed. As a side result, we also use the polynomial-based point of view to show the convergence of the message passing algorithm for gossip of Moallemi & Van Roy on regular graphs. The explicit computation of the rate of the convergence shows that message passing has a slow rate of convergence on graphs with small spectral gap.
```
@article{berthier2020accelerated,
        title={Accelerated Gossip in Networks of Given Dimension using Jacobi Polynomial Iterations},
        author={Berthier, Rapha{\"e}l and Bach, Francis and Gaillard, Pierre},
        journal={SIAM Journal on Mathematics of Data Science},
        volume={2},
        number={1},
        pages={24--47},
        year={2020},
        publisher={SIAM}
      }
```
Additive models and robust aggregation for GEFCom2014 probabilistic electric load and electricity price forecasting. Pierre Gaillard, Yannig Goude, and Raphaël Nedellec. International Journal of Forecasting, Volume 32, Issue 3, pages 1038–1050, 2016.
We sum up the methodology of the team Tololo on the Global Energy Forecasting Competition 2014 (GEFCom2014) for the electric load and electricity price forecasting tracks. During the competition, we used and tested many statistical and machine learning methods such as random forests, gradient boosting machines, or generalized additive models. In this paper, we only present the methods that have shown the best performance. For electric load forecasting, our strategy consisted first in producing a probabilistic forecast of the temperature and then plugging the obtained temperature scenario to produce a probabilistic forecast of the load. Both steps are performed by fitting a quantile generalized additive model (quantGAM). Concerning the electricity price forecasting, we investigate three methods that we used during the competition. The first method follows the spirit of the one used for electric load. The second one is based on combining a set of individual predictors and the last one fit a sparse linear regression on a large set of covariates. We chose to present in this paper these three methods, because they all exhibit good performance and present a nice potential of improvements for future research.
```
@article{GaillardGoudeNedellec2015,
            title={Additive models and robust aggregation for GEFCom2014 probabilistic electric load and electricity price forecasting},
            author={Gaillard, Pierre and Goude, Yannig and Nedellec, Rapha{\"e}l},
            journal={International Journal of Forecasting},
            volume={32},
            number={3},
            pages={1038--1050},
            year={2016},
            publisher={Elsevier}
          }
```
Forecasting electricity consumption by aggregating specialized experts. Marie Devaine, Pierre Gaillard, Yannig Goude, and Gilles Stoltz. Machine Learning, Volume 90, Issue 2, pages 231–260, 2013.
We consider the setting of sequential prediction of arbitrary sequences based on specialized experts. We first provide a review of the relevant literature and present two theoretical contributions: a general analysis of the specialist aggregation rule of Freund et al. (1997) and an adaptation of fixed-share rules of Herbster and Warmuth (1998) in this setting. We then apply these rules to the sequential short-term (one-day-ahead) forecasting of electricity consumption; to do so, we consider two data sets, a Slovakian one and a French one, respectively concerned with hourly and half-hourly predictions. We follow a general methodology to perform the stated empirical studies and detail in particular tuning issues of the learning parameters. The introduced aggregation rules demonstrate an improved accuracy on the data sets at hand; the improvements lie in a reduced mean squared error but also in a more robust behavior with respect to large occasional errors.
```
@article{DevaineGaillardGoudeStoltz2013,
            author = {Devaine, Marie and Gaillard, Pierre and Goude, Yannig and Stoltz, Gilles},
            title = {Forecasting electricity consumption by aggregating specialized experts - A review of the sequential aggregation of specialized experts, with an application to Slovakian and French country-wide one-day-ahead (half-)hourly predictions},
            journal = {Machine Learning},
            number = 2,
            pages = {231--260},
            volume = 90,
            year = 2013
          }
```

Book chapters

Online learning and game theory. A quick overview with recent results and applications. Mathieu Faure, Pierre Gaillard, Bruno Gaujal, and Vianney Perchet. ESAIM: Proceedings and Surveys, 51, 246-271, 2015.
We study one of the main concept of online learning and sequential decision problem known as regret minimization. We investigate three di erent frameworks, whether data are generated accordingly to some i.i.d. process, or when no assumption whatsoever are made on their generation and, finally, when they are the consequences of some sequential interactions between players. The overall objective is to provide a comprehensive introduction to this domain. In each of these main setups, we define and analyze classical algorithms and we analyze their performances. Finally, we also show that some concepts of equilibria that emerged in game theory are learnable by players using online learning schemes while some other concepts are not learnable.
```
@article{FaureGaillardGaujalPerchet2015,
          title={Online Learning and Game Theory. A Quick Overview with recent results and applications},
          author={Faure, Mathieu and Gaillard, Pierre and Gaujal, Bruno and Perchet, Vianney},
          journal={ESAIM: Proceedings and Surveys},
          volume={51},
          pages={246--271},
          year={2015},
          publisher={EDP Sciences}
        }
      
```
Forecasting electricity consumption by aggregating experts; how to design a good set of experts. Pierre Gaillard, Yannig Goude. In A. Antoniadis et al. editors, Modeling and Stochastic Learning for Forecasting in High Dimensions, volume 217 of Lecture Notes in Statistics, pages 95–115. Springer, 2015.
Short-term electricity forecasting has been studied for years at EDF and different forecasting models were developed from various fields of statistics or machine learning (functional data analysis, time series, semi-parametric regression, boosting, bagging). We are interested in the forecasting of France's daily electricity load consumption based on these different approaches. We investigate in this empirical study how to use them to improve prediction accuracy. First, we show how combining members of the original set of forecasts can lead to a significant improvement. Second, we explore how to build various and heterogeneous forecasts from these models and analyze how we can aggregate them to get even better predictions.
```
@INCOLLECTION{GaillardGoude2014,
        author = {Gaillard, Pierre and Goude, Yannig},
        title = {Forecasting electricity consumption by aggregating experts; 
        how to design a good set of experts},
        booktitle = {Modeling and Stochastic Learning for Forecasting in High Dimensions},
        publisher = {Springer},
        year = {2015},
        editor = {Antoniadis, Antoniadis and Brossat, Xavier and Poggi, Jean-Michel},
        volume = {217},
        series = {Lecture Notes in Statistics},
        pages = {95--115},
        doi = {10.1007/978-3-319-18732-7},
        url = {www.springer.com/us/book/9783319187310}
      }
    
```

Others

A consistent deterministic regression tree for non-parametric prediction of time series. Pierre Gaillard, Paul Baudin.
We study online prediction of bounded stationary ergodic processes. To do so, we consider the setting of prediction of individual sequences and build a deterministic regression tree that performs asymptotically as well as the best L-Lipschitz constant predictors. Then, we show why the obtained regret bound entails the asymptotical optimality with respect to the class of bounded stationary ergodic processes.
```
@Unpublished{GaillardBaudin2014,
      author = {Gaillard, Pierre and Baudin, Paul},
      title = {A consistent deterministic regression tree for non-parametric prediction of time series},
      year = 2014
    }
```
Contributions to online robust aggregation: work on the approximation error and on probabilistic forecasting. Applications to forecasting for energy markets. Pierre Gaillard. Université Paris-Sud 11, 2015.
We are interested in online forecasting of an arbitrary sequence of observations. At each time step, some experts provide predictions of the next observation. Then, we form our prediction by combining the expert forecasts. This is the setting of online robust aggregation of experts. The goal is to ensure a small cumulative regret. In other words, we want that our cumulative loss does not exceed too much the one of the best expert. We are looking for worst-case guarantees: no stochastic assumption on the data to be predicted is made. The sequence of observations is arbitrary. A first objective of this work is to improve the prediction accuracy. We investigate several possibilities. An example is to design fully automatic procedures that can exploit simplicity of the data whenever it is present. Another example relies on working on the expert set so as to improve its diversity. A second objective of this work is to produce probabilistic predictions. We are interested in coupling the point prediction with a measure of uncertainty (i.e., interval forecasts,\dots). The real world applications of the above setting are multiple. Indeed, very few assumptions are made on the data. Besides, online learning that deals with data sequentially is crucial to process big data sets in real time. In this thesis, we carry out for EDF several empirical studies of energy data sets and we achieve good forecasting performance.
```
@PHDTHESIS{Gaillard2015,
      author = {Gaillard, Pierre},
      title = {Contributions à l'agrégation séquentielle robuste d'experts~: travaux sur l'erreur d'approximation et la prévision en loi. Applications à la prévision pour les marchés de l'énergie},
      school = {Université Paris-Sud 11}, 
      year = {2015}
    }
  
```
Online nonparametric regression with Sobolev kernels. Oleksandr Zadorozhnyi, Pierre Gaillard, Sebastien Gerschinovitz, Alessandro Rudi.

Software

Opera: Online Prediction by ExpeRt Aggregation. Pierre Gaillard, Yannig Goude. R package, 2016.
Opera is an R package for prediction of time series based on online robust aggregation of a finite set of forecasts (machine learning method, statistical model, physical model, human expertise…). More formally, we consider a sequence of observation y(1),…,y(t) to be predicted element by element. At each time instance t, a finite set of experts provide prediction x(k,t) of the next observation y(t). Several methods are implemented to combine these expert forecasts according to their past performance (several loss functions are implemented to measure it). These combining methods satisfy robust finite time theoretical performance guarantees. We demonstrate on different examples from energy markets (electricity demand, electricity prices, solar and wind power time series) the interest of this approach both in terms of forecasting performance and time series analysis.

PhD Students and postdocs

PhD students currently supervised

Pierre Boudart, co-advised with Alessandro Rudi (Inria), 2023-...
Julien Zhou, industrial PhD with Criteo, co-advised with Julyan Arbel (Inria) and Thibaud Rahier (Criteo), 2022-...
Bianca Moreno, industrial PhD with EDF R&D, co-advised with Nadia Oudjane (EDF R&D) and Margaux Brégère (EDF R&D), 2022-...
Paul Liautaud, co-advised with Olivier Wintenberger, 2022-...

Former PhD students and postdocs

Camila Fernandez, industrial PhD with Nokia Bell labs, co-advised with Olivier Wintenberger, Chung Shue Chen (Nokia Bell labs), and Alonso Silva (Nokia Bell Labs). 2020-2024.
Houssam Zenati, industrial PhD with Criteo, co-advised with Julien Mairal and Eustach Ziemert (Criteo). 2019-2023. Now postdoc at Inria Paris-Saclay.
Rémi Jézéquel, co-advised with Alessandro Rudi, 2019-2023. Now research scientist at CFM.
Rémy Degenne, postdoc, january-august 2020. Now researcher at Inria Lille.
Raphaël Berthier, co-advised with Francis Bach, 2018-2021.
Margaux Brégère, industrial PhD with EDF R&D, co-advised with Gilles Stoltz and Yannig Goude (EDF R&D), 2017-2020. Now researcher at EDF R&D.

Talks

Forecasting for Renewable Energy Production workshop , EDF Lab Saclay, February 2018.
Smile in Paris seminar, Paris, January 2018.
Statistics seminar of Marseille University , Marseille, December 2017.
Statistics Seminar of Cambridge University, Cambridge, October 2017.
CWI-INRIA workshop, Amsterdam, September 2017.
An overview of Machine Learning , Talk at HSBC, Paris, September, 2017
Journées des statistiques, Avignon, June 2017.
- Scientific talk
- Testimony of a yound researcher
Testimony at EDF fellows day, Paris, March 30, 2017.
SEQUEL Seminar, Lille, February 10, 2017.
- Slides
- ShowHide abstract
In this talk, I will consider the online learning problem where a convex function is minimized observing recursively the gradients. I will introduce SAEW, a new procedure that accelerates exponential weights procedures with the slow rate $1/\sqrt{T}$ to procedures achieving the fast rate $1/T$. Under the strong convexity of the risk, it achieves the optimal rate of convergence for approximating sparse parameters in $\mathbb{R}^d$. The acceleration is achieved by using successive averaging steps in an online fashion. The procedure also produces sparse estimators thanks to additional hard threshold steps.
Machine Learning : enjeux et opportunités. Paris Sciences & Data, event organized by Cap Digital, Inria, and PSL, February 2, 2017
Prix de thèse AMIES 2016, December 14, 2016.
- Slides
- ShowHide abstract
This talk is a short summary of my PhD thesis that was given for the AMIES ceremony.
Journées des doctorants, EDF R&D, December 8, 2016.
Local seminar of the statistics group, Toulouse Mathematics Institute, Université Paul-Sabatier, Toulouse, November 22, 2016.
- ShowHide abstract
In this talk, I will consider the online learning problem where a convex function is minimized observing recursively the gradients. I will introduce SAEW, a new procedure that accelerates exponential weights procedures with the slow rate $1/\sqrt{T}$ to procedures achieving the fast rate $1/T$. Under the strong convexity of the risk, it achieves the optimal rate of convergence for approximating sparse parameters in $\mathbb{R}^d$. The acceleration is achieved by using successive averaging steps in an online fashion. The procedure also produces sparse estimators thanks to additional hard threshold steps.
Prix de thèse Paul Caseau 2016, October 11, 2016.
- Slides
- ShowHide abstract
This talk is a short summary of my PhD thesis.
Lunch Seminar of the Statistics Group, Copenhagen University, April 18, 2016.
- Slides
- ShowHide abstract
This talk is an introduction to the research area in which I carried my PhD out. More specifically, I present robust online aggregation of forecasts with applications to electricity consumption.
SIERRA Seminar, Paris, December 2, 2015.
- Slides
- ShowHide abstract
We consider the problem of online nonparametric regression with arbitrary deterministic sequences. Using ideas from the chaining technique, we design an algorithm that achieves a Dudley-type regret bound similar to the one obtained in a non-constructive fashion by Rakhlin and Sridharan (2014). Our regret bound is expressed in terms of the metric entropy in the sup norm, which yields optimal guarantees when the metric and sequential entropies are of the same order of magnitude. In particular our algorithm is the first one that achieves optimal rates for online regression over Hölder balls. In addition we show for this example how to adapt our chaining algorithm to get a reasonable computational efficiency with similar regret guarantees (up to a log factor).
Local Seminar of the Insurance and Economy group, Copenhagen University, October 21, 2015.
- Slides
Journées MAS, Toulouse, August 27, 2014.
- Slides (fr)
- ShowHide abstract (fr)
Je m'intéresse dans cet exposé à prévoir séquentiellement une suite d'observations. À chaque instant, un nombre fini d'experts me proposent des prévisions de la prochaine observation. Je forme alors ma prévision en mélangeant celles des experts. C'est le cadre de l'agrégation séquentielle d'experts. L'objectif classique est d'assurer un faible regret cumulé. En d'autres mots, je souhaite que ma perte cumulée ne dépasse pas trop celle du meilleur expert. Je donnerai une stratégie garantissant une borne déterministe du second ordre sur le regret cumulé. J'expliquerai l'intérêt de cette borne, par exemple dans le contexte d'experts spécialisés ou dans un cadre stochastique. Je finirai par une application à la prévision de consommation électrique.
COLT, Barcelona, June 14, 2014.
- Slides
- ShowHide abstract
We study online aggregation of the predictions of experts, and first show new second-order regret bounds in the standard setting, which are obtained via a version of the Prod algorithm (and also a version of the polynomially weighted average algorithm) with multiple learning rates. These bounds are in terms of excess losses, the differences between the instantaneous losses suffered by the algorithm and the ones of a given expert. We then demonstrate the interest of these bounds in the context of experts that report their confidences as a number in the interval $[0,1]$ using a generic reduction to the standard setting. We conclude by two other applications in the standard setting, which improve the known bounds in case of small excess losses and show a bounded regret against i.i.d. sequences of losses.
46èmes Journées de Statistique, Rennes, June 05, 2014.
- Extended abstract
- ShowHide abstract
Dans cette étude empirique, nous nous intéressons à la prévision à court terme (horizon journalier) de la consommation électrique française. L'électricité se stockant difficilement, il s'agit d'un enjeu important pour une entreprise comme EDF qui doit équilibrer la production avec la demande d'électricité. EDF R&D a ainsi développé ces dernières années plusieurs modèles de prévisions basés sur des approches statistiques très différentes (analyse de données fonctionnelles, de séries temporelles, régression semi-paramétrique). Nous explorons ici comment utiliser ces modèles afin d'améliorer la qualité de prévision. Nous commençons par agréger les prévisions initialement à disposition. Afin de réduire encore l'erreur de prévision du mélange, nous proposons ensuite différentes approches pour construire de nouvelles prévisions à partir des méthodes initiales.
Colloque JPS, Forges-les-Eaux, April 9, 2014.
- Slides (fr)
- ShowHide abstract (fr)
Je m'intéresse dans cet exposé à prévoir séquentiellement une suite d'observations. À chaque instant, un nombre fini d'experts me proposent des prévisions de la prochaine observation. Je forme alors ma prévision en mélangeant celles des experts. C'est le cadre de l'agrégation séquentielle d'experts. L'objectif classique est d'assurer un faible regret cumulé. En d'autres mots, je souhaite que ma perte cumulée ne dépasse pas trop celle du meilleur expert. Je donnerai une stratégie garantissant une borne déterministe du second ordre sur le regret cumulé. J'expliquerai l'intérêt de cette borne, par exemple dans le contexte d'experts spécialisés ou dans un cadre stochastique. Je finirai par une application à la prévision de consommation électrique.
STAT Seminar, Luminy, Marseille, June 17, 2013.
- Slides
- ShowHide abstract
We consider the empirical goal of sequential short-term (one-day ahead) forecasting of electricity consumption. For this purpose, we look at a recent data set used by EDF R&D to forecast the French market. It contains the observed electricity load as well as some side information (temperature, date,...). We intend to work in real operating conditions and we restrict ourselves to operational constraints which come with it (delays for data acquisition). First of all we build in this study a set of base forecasters, that aim to be as heterogenous as possible and to exhibit varied enough behaviors. They aspire also to be optimized for specific situations depending the side information (weather, special events). To do so, we first test on our dataset several recent machine learning algorithms, such as Boosting or Random-Forests. We furthermore consider statistical models used by EDF R&D that come from three main categories: parametric, semi-parametric, and non-parametric models. So as to specialize our forecasters, we focus their training sets on subsets depending on the specialization they aim to acquire. In a second step, we apply the framework of prediction with experts advice. We combine in a sequential fashion all these base forecasts, so as to optimize our forecasting of electricity consumption.
The Wipfor workshop, EDF R&D, Clamart, June 6, 2013.
- Slides
- Extended abstract
45èmes Journées de Statistique, Toulouse, May 28, 2013.
- Slides
- Extended abstract
- ShowHide abstract
Online convex optimization consists in forecasting in a sequential fashion the values of an unknown sequence of a convex set. The player’s goal is to ensure in worst case a performance slightly worse than that of the best strategy (the oracle) of a set of reference strategies. The bigger the reference set, the lower is the oracle’s error, but the harder it is for the player to get closer and the greater the regret will be. This is the famous stimation-approximation trade-oﬀ. In the literature, the initial target is not always the minimization of the cumulative loss, the bias-variance tradeoﬀ is furthermore managed in various ways and diﬀerent sets of references are considered. This has led to various notions of regret, such as shifting regret, adaptive regret, or discounted regret. This talk suggests a new notion of regret more powerful, which tends to unify the previous three. We additionally take the opportunity to show that algorithms such as ﬁxed-share or Mirror-descent obtain satisfactory bounds on this generalized regret.

School

Here, you can find some reports I wrote during my studies.

Prévision de la consommation électrique (à court terme) par agrégation séquentielle d'experts spécialisés, 2011. Under the supervision of Yannig Goude and Gilles Stoltz
- Slides (fr)
- Technical report
Invert sparse matrices with gaussian belief propagation algorithm, 2010. Under the supervision of Devavrat Shah
The Lasso, or how to choose among a large set of variables with few observations, 2009. Pierre Gaillard and Anisse Ismaili. Under the supervision of Sylvain Arlot