Share

Publications

Publications

CMAP Theses  are available by following this link:
Discover CMAP theses

Listed below, are sorted by year, the publications appearing in the HAL open archive.

2021

  • GECCO 2021 Tutorial on Benchmarking Multiobjective Optimizers 2.0
    • Brockhoff Dimo
    • Tušar Tea
    , 2021, pp.664-668. (10.1145/3449726.3461421)
    DOI : 10.1145/3449726.3461421
  • Augmented Lagrangian, penalty techniques and surrogate modeling for constrained optimization with CMA-ES
    • Dufossé Paul
    • Hansen Nikolaus
    , 2021. In this paper, we investigate a non-elitist Evolution Strategy designed to handle black-box constraints by an adaptive Augmented Lagrangian penalty approach, AL-(µ/µw, λ)-CMA-ES, on problems with up to 28 constraints. Based on stability and performance observations, we propose an improved default parameter setting. We exhibit failure cases of the Augmented Lagrangian technique and show how surrogate modeling of the constraints can overcome some difficulties. Several variants of AL-CMA-ES are compared on a set of nonlinear constrained problems from the literature. Simple adaptive penalty techniques serve as a baseline for comparison. (10.1145/3449639.3459340)
    DOI : 10.1145/3449639.3459340
  • Hypervolume in Biobjective Optimization Cannot Converge Faster Than $\Omega(1/p)$
    • Marescaux Eugénie
    • Hansen Nikolaus
    , 2021. The hypervolume indicator is widely used by multi-objective optimization algorithms and for assessing their performance. We investigate a set of $p$ vectors in the biobjective space that maximizes the hypervolume indicator with respect to some reference point, referred to as $p$-<I>optimal distribution</I>. We prove explicit lower and upper bounds on the gap between the hypervolumes of the $p$-optimal distribution and the $\infty$-optimal distribution (the Pareto front) as a function of $p$, of the reference point, and of some Lipschitz constants. On a wide class of functions, this optimality gap can not be smaller than $\Omega(1/p)$, thereby establishing a bound on the optimal convergence speed of any algorithm. For functions with either bilipschitz or convex Pareto fronts, we also establish an upper bound and the gap is hence $\Theta(1/p)$. The presented bounds are not only asymptotic. In particular, functions with a linear Pareto front have the normalized exact gap of $1/(p + 1)$ for any reference point dominating the nadir point. We empirically investigate on a small set of Pareto fronts the exact optimality gap for values of $p$ up to 1000 and find in all cases a dependency resembling $1/(p + CONST)$. (10.1145/3449639.3459371)
    DOI : 10.1145/3449639.3459371
  • Benchmarking: state-of-the-art and beyond
    • Auger Anne
    • Hansen Nikolaus
    , 2021, pp.339-340. (10.1145/3449726.3461424)
    DOI : 10.1145/3449726.3461424
  • Master equation for the finite state space planning problem
    • Bertucci Charles
    • Lasry Jean-Michel
    • Lions Pierre-Louis
    Archive for Rational Mechanics and Analysis, Springer Verlag, 2021, 242 (1), pp.327-342. We present results of existence, regularity and uniqueness of solutions of the master equation associated with the mean field planning problem in the finite state space case, in the presence of a common noise. The results hold under monotonicity assumptions, which are used crucially in the different proofs of the paper. We also make a link with the trajectories induced by the solution of the master equation and start a discussion on the case of boundary conditions. (10.1007/s00205-021-01687-8)
    DOI : 10.1007/s00205-021-01687-8
  • CMA-ES and advanced adaptation mechanisms
    • Akimoto Youhei
    • Hansen Nikolaus
    , 2021, pp.636-663. (10.1145/3449726.3462748)
    DOI : 10.1145/3449726.3462748
  • Revisiting the framework for intermittency in Lagrangian stochastic models for turbulent flows: a way to an original and versatile numerical approach
    • Letournel Roxane
    • Goudenège Ludovic
    • Zamansky Rémi
    • Vié Aymeric
    • Massot Marc
    Physical Review E : Statistical, Nonlinear, and Soft Matter Physics [2001-2015], American Physical Society, 2021, 104, pp.015104. The characterization of intermittency in turbulence has its roots in the K62 theory, and if no proper definition is to be found in the literature, statistical properties of intermittency were studied and models were developed in attempt to reproduce it. The first contribution of this work is to propose a requirement list to be satisfied by models designed within the Lagrangian framework. Multifractal stochastic processes are a natural choice to retrieve multifractal properties of the dissipation. Among them, following the proposition of \cite{Mandelbrot1968}, we investigate the Gaussian Multiplicative Chaos formalism, which requires the construction of a log-correlated stochastic process $X_t$. The fractional Gaussian noise of Hurst parameter $H = 0$ is of great interest because it leads to a log-correlation for the logarithm of the process. Inspired by the approximation of fractional Brownian motion by an infinite weighted sum of correlated Ornstein-Uhlenbeck processes, our second contribution is to propose a new stochastic model: $X_t = \int_0^\infty Y_t^x k(x) d x$, where $Y_t^x$ is an Ornstein-Uhlenbeck process with speed of mean reversion $x$ and $k$ is a kernel. A regularization of $k(x)$ is required to ensure stationarity, finite variance and logarithmic auto-correlation. A variety of regularizations are conceivable, and we show that they lead to the aforementioned multifractal models. To simulate the process, we eventually design a new approach relying on a limited number of modes for approximating the integral through a quadrature $X_t^N = \sum_{i=1}^N \omega_i Y_t^{x_i}$, using a conventional quadrature method. This method can retrieve the expected behavior with only one mode per decade, making this strategy versatile and computationally attractive for simulating such processes, while remaining within the proposed framework for a proper description of intermittency. (10.1103/PhysRevE.104.015104)
    DOI : 10.1103/PhysRevE.104.015104
  • Porting the Mathematical Components library to Hierarchy Builder
    • Affeldt Reynald
    • Allamigeon Xavier
    • Bertot Yves
    • Canu Quentin
    • Cohen Cyril
    • Roux Pierre
    • Sakaguchi Kazuhiko
    • Tassi Enrico
    • Théry Laurent
    • Trunov Anton
    , 2021. We report on the porting of the Mathematical Components library (hereafter MC) to the Hierarchy Builder [2] tool (hereafter HB).
  • A New Class of Stochastic EM Algorithms. Escaping Local Maxima and Handling Intractable Sampling
    • Allassonnière Stéphanie
    • Chevallier Juliette
    Computational Statistics and Data Analysis, Elsevier, 2021, 159, pp.107159. The expectation-maximization (EM) algorithm is a powerful computational technique for maximum likelihood estimation in incomplete data models. When the expectation step cannot be performed in closed form, a stochastic approximation of EM (SAEM) can be used. The convergence of the SAEM toward critical points of the observed likelihood has been proved and its numerical efficiency has been demonstrated. However, sampling from the posterior distribution may be intractable or have a high computational cost. Moreover, despite appealing features, the limit position of this algorithm can strongly depend on its starting one. To cope with this two issues, we propose here a new stochastic approximation version of the EM in which we do not sample from the exact distribution in the expectation phase of the procedure. We first prove the convergence of this algorithm toward critical points of the observed likelihood. Then, we propose an instantiation of this general procedure to favor convergence toward global maxima. Experiments on synthetic and real data highlight the performance of this algorithm in comparison to the SAEM and the EM when feasible. (10.1016/j.csda.2020.107159)
    DOI : 10.1016/j.csda.2020.107159
  • AMF: Aggregated Mondrian forests for online learning
    • Mourtada Jaouad
    • Gaïffas Stéphane
    • Scornet Erwan
    Journal of the Royal Statistical Society: Series B, Royal Statistical Society, 2021, 83 (3), pp.505-533. Random Forests (RF) is one of the algorithms of choice in many supervised learning applications, be it classification or regression. The appeal of such tree-ensemble methods comes from a combination of several characteristics: a remarkable accuracy in a variety of tasks, a small number of parameters to tune, robustness with respect to features scaling, a reasonable computational cost for training and prediction, and their suitability in high-dimensional settings. The most commonly used RF variants however are "offline" algorithms, which require the availability of the whole dataset at once. In this paper, we introduce AMF, an online random forest algorithm based on Mondrian Forests. Using a variant of the Context Tree Weighting algorithm, we show that it is possible to efficiently perform an exact aggregation over all prunings of the trees; in particular, this enables to obtain a truly online parameter-free algorithm which is competitive with the optimal pruning of the Mondrian tree, and thus adaptive to the unknown regularity of the regression function. Numerical experiments show that AMF is competitive with respect to several strong baselines on a large number of datasets for multi-class classification. (10.1111/rssb.12425)
    DOI : 10.1111/rssb.12425
  • Strong approximation in h-mass of rectifiable currents under homological constraint
    • Chambolle Antonin
    • Ferrari Luca Alberto Davide
    • Merlet Benoît
    Advances in Calculus of Variation, Walter de Gruyter GmbH, 2021, 14 (3), pp.343--363. Let h : R → R+ be a lower semi-continuous subbadditive and even function such that h(0) = 0 and h(θ) ≥ α|θ| for some α > 0. The h-mass of a k-polyhedral chain P =∑j θjσj in R n (0 ≤ k ≤ n) is defined as M h (P) := j h(θj) H k (σj). If T = τ (M, θ, ξ) is a k-rectifiable chain, the definition extends to M h (T) := M h(θ) dH k. Given such a rectifiable flat chain T with M h (T) < ∞ and ∂T polyhedral, we prove that for every η > 0, it decomposes as T = P + ∂V with P polyhedral, V rectifiable, M h (V) < η and M h (P) < M h (T) + η. In short, we have a polyhedral chain P which strongly approximates T in h-mass and preserves the homological constraint ∂P = ∂T. These results are motivated by the study of approximations of M h by smoother functionals but they also provide explicit formulas for the lower semicontinuous envelope of T → M h (T) + I ∂S (∂T) with respect to the topology of the flat norm. (10.1515/acv-2018-0079)
    DOI : 10.1515/acv-2018-0079
  • Orlicz norm and concentration inequalities for beta-heavy tailed distributions
    • Chamakh Linda
    • Gobet Emmanuel
    • Liu Wenjun
    , 2021.
  • Sampling scheme for intractable copula function, application to the computation of tail events in factor copula model
    • Bénézet Cyril
    • Gobet Emmanuel
    • Targino Rodrigo
    , 2021.
  • Orlicz norms and concentration inequalities for β-heavy tailed random variables
    • Chamakh Linda
    • Gobet Emmanuel
    • Liu Wenjun
    , 2021. We establish a new concentration-of-measure inequality for the sum of independent random variables with β- heavy tail. This includes exponential of Gaussian distributions (a.k.a. log-normal distributions), or exponential of Weibull distributions, among others. These distributions have finite polynomial moments at any order but may not have finite α-exponential moments. We exhibit a Orlicz norm adapted to this setting of β-heavy tails, we prove a new Talagrand inequality for the sum and a new maximal inequality. As consequence, a bound on the deviation probability of the sum from its mean is obtained, as well as a bound on uniform deviation probability.
  • On the approximation of extreme quantiles with ReLU neural networks
    • Allouche Michaël
    • Girard Stéphane
    • Gobet Emmanuel
    , 2021. Feedforward neural networks based on rectified linear units (ReLU) cannot efficiently approximate quantile functions which are not bounded in the Fréchet maximum domain of attraction. We thus propose a new parametrization for the generator of a generative adversarial network (GAN) adapted to this framework of heavy-tailed distributions. We provide an analysis of the uniform error between the extreme quantile and its GAN approximation. It appears that the rate of convergence of the error is mainly driven by the second-order parameter of the data distribution. The above results are illustrated on simulated data and real financial data.
  • 5′ Region Large Genomic Rearrangements in the BRCA1 Gene in French Families: Identification of a Tandem Triplication and Nine Distinct Deletions with Five Recurrent Breakpoints
    • Caputo Sandrine M
    • Telly Dominique
    • Briaux Adrien
    • Sesen Julie
    • Ceppi Maurizio
    • Bonnet Françoise
    • Bourdon Violaine
    • Coulet Florence
    • Castera Laurent
    • Delnatte Capucine
    • Hardouin Agnès
    • Mazoyer Sylvie
    • Schultz Inès
    • Sevenet Nicolas
    • Uhrhammer Nancy
    • Bonnet Céline
    • Tilkin-Mariamé Anne-Françoise
    • Houdayer Claude
    • Moncoutier Virginie
    • Andrieu Catherine
    • Bièche Ivan
    • Stern Marc-Henri
    • Stoppa-Lyonnet Dominique
    • Lidereau Rosette
    • Toulas Christine
    • Rouleau Etienne
    Cancers, MDPI, 2021, 13 (13), pp.3171. Background: Large genomic rearrangements (LGR) in BRCA1 consisting of deletions/duplications of one or several exons have been found throughout the gene with a large proportion occurring in the 5′ region from the promoter to exon 2. The aim of this study was to better characterize those LGR in French high-risk breast/ovarian cancer families. Methods: DNA from 20 families with one apparent duplication and nine deletions was analyzed with a dedicated comparative genomic hybridization (CGH) array, high-resolution BRCA1 Genomic Morse Codes analysis and Sanger sequencing. Results: The apparent duplication was in fact a tandem triplication of exons 1 and 2 and part of intron 2 of BRCA1, fully characterized here for the first time. We calculated a causality score with the multifactorial model from data obtained from six families, classifying this variant as benign. Among the nine deletions detected in this region, eight have never been identified. The breakpoints fell in six recurrent regions and could confirm some specific conformation of the chromatin. Conclusions: Taken together, our results firmly establish that the BRCA1 5′ region is a frequent site of different LGRs and highlight the importance of the segmental duplication and Alu sequences, particularly the very high homologous region, in the mechanism of a recombination event. This also confirmed that those events are not systematically deleterious. (10.3390/cancers13133171)
    DOI : 10.3390/cancers13133171
  • Statistical modelling of medical data and theoretical analysis of estimation algorithms
    • Debavelaere Vianney
    , 2021. In the medical field, the use of features extracted from images is increasingly common to perform diagnostics or measure the effectiveness of a treatment over time. These measures can for example be real numbers (volume, cognitive scores), meshes of an organ or even the image itself. In the latter two cases, a Euclidean space cannot describe the space of measurements and it is necessary to use Riemannian manifolds. Using this Riemannian framework and mixed effects models, it is then possible to estimate a representative object of the population as well as the inter-individual variability.In the longitudinal case (subjects observed repeatedly over time), these models allow to create an average trajectory, representative of the global evolution of the population. In this thesis, we propose to generalize these models in the case of a mixture of populations. Each sub-population can follow different dynamics over time and their representative trajectory can branch or join from one time interval to another. This new model allows, for example, to model the onset of a disease as a deviation from a normal aging.In a second step, we are also interested in the detection of anomalies (e.g. tumours) in a population. Given an object representing a control population, we define an anomaly as a structure that cannot be reconstructed by a diffeomorphic deformation of this representative object. Our method has the advantage of requiring neither a large data set nor annotation by physicians. Moreover, it can be easily applied to any organ.Finally, we are interested in different theoretical properties of the previously used estimation algorithms. In the context of non-linear mixed effects models, the MCMC-SAEM algorithm is used. In this thesis, we will discuss two theoretical limitations. Firstly, we will lift the geometric ergodicity assumption by replacing it with a sub-geometric ergodicity assumption. Furthermore, we will look at a method, often used in practice, allowing to apply the SAEM algorithm when the joint distribution is not exponentially curved. We will show that this method introduces a bias in the estimation that we will measure. We will also propose a new algorithm to reduce it.
  • SOME RECENT ADVANCES ON THE METHOD OF MOMENTS IN KINETIC THEORY
    • Pichard Teddy
    , 2023, 75, pp.86-95. This review presents some recent works on the construction of closure relations for moment systems extracted from a kinetic equation. A rough construction of those closures together with the main properties of the resulting systems are described. Especially, based on the underlying kinetic equation, the main properties desired for such moment systems are the realizability, i.e. the positivity of an underlying kinetic solution, global strong hyperbolicity and entropy dissipation. (10.1051/proc/202375086)
    DOI : 10.1051/proc/202375086
  • New preconditioners for the Laplace and Helmholtz integral equations on open curves: analytical framework and numerical results
    • Alouges François
    • Averseng Martin
    Numerische Mathematik, Springer Verlag, 2021, 148 (2), pp.255-292. Helmholtz wave scattering by open screens in 2D can be formulated as first-kind integral equations which lead to ill-conditioned linear systems after discretization. We introduce two new preconditioners in the form of square-roots of on-curve differential operators both for the Dirichlet and Neumann boundary conditions on the screen. They generalize the so-called “analytical” preconditioners available for Lipschitz scatterers. We introduce a functional setting adapted to the singularity of the problem and enabling the analysis of those preconditioners. The efficiency of the method is demonstrated on several numerical examples. (10.1007/s00211-021-01189-5)
    DOI : 10.1007/s00211-021-01189-5
  • Une classe de schémas conservatifs sur grilles décalées pour les systèmes hyperboliques
    • Ndjinga Michael
    • Ait-Ameur Katia
    • Mounier Coraline
    , 2021. Finite volume schemes on staggered grids are popular among thermalhydraulics engineers for their good low Mach number asymptotic expansion and the absence of checkerboard type spurious oscillations [5]. However, they are generally non conservative, and their stability analysis has historically been performed with a heuristic approach and the tuning of numerical parameters ([4]). In this talk, we first investigate the L 2 -stability of staggered schemes by analysing their numerical diffusion operator. This analysis of the numerical diffusion operator gives new insights into the schemes numerical properties and is a step towards a proof of linear stability or stability for almost constant initial data. For most classical staggered schemes ([5, 1, 2, 3]), our analysis shows that the numerical diffusion is highly nonlinear and we are able to prove its positivity only in the case of constant sign velocities. In order to deal with compressible flows, possibly at high Mach number, we then propose a new class of conservative linearly L 2 -stable staggered schemes for the compressible Euler equations on staggered grids. The schemes are based on a carefully chosen numerical diffusion operator and the proof of stability follows from the energy decay in the basis that symetrises the Euler equations. An important remark is that unlike Godunov type schemes on colocated grids, the numerical diffusion operator of a symmetric system is not symmetric. This property seems important to avoid spurious checkerboard modes oscillations. We give an example of such a conservative staggered scheme and present some numerical results showing the good behaviour of the method for low and high Mach number flows, in 1D and 2D.
  • Techniques de spatialisation binaurale pour le guidage de sportifs non-voyants
    • Ferrand Sylvain
    , 2021. Dans la continuité d’autres travaux de la communauté du son binaural, nous pensons que le son spatialisé peut constituer un outil efficace pour guider des personnes aveugles, y compris pour la pratique sportive.Un système destiné au guidage par audio binaural doit être suffisamment précis et réactif pour pouvoir prendre en compte chaque mouvement du sujet à guider, ce qui a impliqué le développement et la mise en œuvre d’un système de localisation temps réel et d'un logiciel de spatialisation binaural à faible latence. Enfin nous avons intégré l’ensemble dans un dispositif embarqué.Les techniques les plus avancées de navigation globale par satellite (augmenté et multibande) ne sont pas toujours disponibles en intérieur ou en environnement urbain. C’est pourquoi nous avons travaillé sur des méthodes alternatives de localisation et de suivi temps réel pour l’intérieur. Tout d’abord nous avons développé une méthode de calibration et de latération temps réel robuste par réseau de balises Ultra WideBand utilisant un filtre de Kalman (UKF). Nous avons également développé une méthode originale de localisation par réseau de radars Doppler à onde continue non modulée. Nous avons montré qu’il était possible d’utiliser l’amplitude du signal Doppler pour estimer la distance à un objet mobile. Nous avons alors implémenté un filtre particulaire qui permet la localisation en temps réel par hybridation des données de distance, des mesures de vitesse radiale Doppler et du cap fourni par une centrale inertielle.Dans le domaine de l’acoustique et de l’audio binaural, nous avons cherché à mieux comprendre les capacités des personnes à localiser et à suivre un son en mouvement. Pour cela nous avons mené des expériences en utilisant des sons naturels et des sons spatialisés par audio binaural. Nous avons pu montrer, que sur le plan azimutal, les stimuli audio spatialisés permettaient une localisation comparable aux sons naturels, y compris avec les HRTFs (head-related transfer function) non individualisées et interpolées. Par ailleurs, nous avons pu montrer que même sur le plan azimutal, les stimuli obtenus par convolution de HRTFs étaient supérieurs au panning (ITD+ILD) pour les sons fixes et pour les sons en mouvement. En nous appuyant sur les travaux antérieurs de l’équipe, nous avons implémenté des algorithmes efficaces pour la spatialisation sonore temps-réel sur des plateformes embarquées disposant de peu de ressources. Pour une mise en œuvre efficace, cette approche temps-réel a impliqué une compréhension approfondie des sources de latence qu’elles soient liées au head-tracking ou au sous-système audio des systèmes d’exploitation modernes.Finalement nous avons mis en œuvre ces méthodes de localisation et ces techniques audio pour construire un dispositif de guidage où la source sonore précède continuellement la personne pour lui indiquer le chemin à suivre.Il a été conçu en lien avec des personnes déficientes visuelles, dans une démarche itérative et avec une approche centrée sur les besoins utilisateurs. Nous avons alors mené des expériences de guidage avec des personnes aveugles, en lien avec nos partenaires associatifs, qui ont permis d’évaluer différentes stratégies de contrôle. Nous avons ainsi pu confirmer que le son spatialisé pouvait constituer un outil efficace pour guider des personnes aveugles sans induire de charge cognitive pénalisante pour des pratiques sportives comme la marche, la course à pied ou le roller en autonomie partielle, y compris dans un contexte de recherche de performances.
  • Best $k$-layer neural network approximations
    • Lim Lek-Heng
    • Michałek Mateusz
    • Qi Yang
    Constructive Approximation, Springer Verlag, 2021, 55 (1), pp.583-604. (10.1007/s00365-021-09545-2)
    DOI : 10.1007/s00365-021-09545-2
  • On the approximation of extreme quantiles with neural networks
    • Allouche Michaël
    • Girard Stéphane
    • Gobet Emmanuel
    , 2021, pp.1-5. In this study, we propose a new parametrization for the generator of a Generative adversarial network (GAN) adapted to data from heavy-tailed distributions. We provide an analysis of the uniform error between an extreme quantile and its GAN approximation. Numerical experiments are conducted both on real and simulated data.
  • Geom-SPIDER-EM: Faster Variance Reduced Stochastic Expectation Maximization for Nonconvex Finite-Sum Optimization
    • Fort Gersende
    • Moulines Eric
    • Wai Hoi-To
    , 2021, pp.3135-3139. The Expectation Maximization (EM) algorithm is a key reference for inference in latent variable models; unfortunately, its computational cost is prohibitive in the large scale learning setting. In this paper, we propose an extension of the Stochastic Path-Integrated Differential EstimatoR EM (SPIDER-EM) and derive complexity bounds for this novel algorithm, designed to solve smooth nonconvex finite-sum optimization problems. We show that it reaches the same state of the art complexity bounds as SPIDER-EM; and provide conditions for a linear rate of convergence. Numerical results support our findings. (10.1109/ICASSP39728.2021.9414271)
    DOI : 10.1109/ICASSP39728.2021.9414271
  • Interferometric Graph Transform for Community Labeling
    • Grinsztajn Nathan
    • Leconte Louis
    • Preux Philippe
    • Oyallon Edouard
    , 2021. We present a new approach for learning unsupervised node representations in community graphs. We significantly extend the Interferometric Graph Transform (IGT) to community labeling: this non-linear operator iteratively extracts features that take advantage of the graph topology through demodulation operations. An unsupervised feature extraction step cascades modulus non-linearity with linear operators that aim at building relevant invariants for community labeling. Via a simplified model, we show that the IGT concentrates around the E-IGT: those two representations are related through some ergodicity properties. Experiments on community labeling tasks show that this unsupervised representation achieves performances at the level of the state of the art on the standard and challenging datasets Cora, Citeseer, Pubmed and WikiCS.