
Publications


Theses defended at CMAP are available at the following link:
Discover the CMAP theses

Listed below, by year, are the publications available in the HAL open archive.

2021

  • Formalizing the Face Lattice of Polyhedra
    • Allamigeon Xavier
    • Katz Ricardo D.
    • Strub Pierre-Yves
    , 2021. Faces play a central role in the combinatorial and computational aspects of polyhedra. In this paper, we present the first formalization of faces of polyhedra in the proof assistant Coq. This builds on the formalization of a library providing the basic constructions and operations over polyhedra, including projections, convex hulls and images under linear maps. Moreover, we design a special mechanism which automatically introduces an appropriate representation of a polyhedron or a face, depending on the context of the proof. We demonstrate the usability of this approach by establishing some of the most important combinatorial properties of faces, namely that they constitute a family of graded atomistic and coatomistic lattices closed under interval sublattices. We also prove a theorem due to Balinski on the d-connectedness of the adjacency graph of polytopes of dimension d.
  • Experimental methodology for the accurate stochastic calibration of catalytic recombination affecting reusable spacecraft thermal protection systems
    • del Val Anabel
    • Luís Diana
    • Chazot Olivier
    , 2021. This work focuses on the development of a dedicated experimental methodology that allows for a better stochastic characterization of catalytic recombination parameters for reusable ceramic matrix composite materials when dealing with uncertain measurements and model parameters. As one of the critical factors affecting the performance of such materials, the contribution to the heat flux of the exothermic recombination reactions at the vehicle surface must be carefully assessed. In this work, we first use synthetic data to test whether the proposed experimental methodology brings any advantages in terms of uncertainty reduction on the sought-after parameters compared to more traditional experimental approaches in the literature. The evaluation is done through the use of a Bayesian framework developed in a previous work, with the advantage of being able to fully and objectively characterize the uncertainty on the calibrated parameters. The synthetic dataset is adapted for testing ceramic matrix composites by carefully choosing adequate auxiliary materials whose heat flux measurements have the capability of reducing the resulting uncertainty on the catalytic parameter of the thermal protection material itself when tested under the same flow conditions. We then propose a comprehensive set of real wind tunnel testing cases for which stochastic analyses are carried out. The physical model used for the estimations consists of a 1D boundary layer solver along the stagnation line in which the chemical production term included in the surface mass balance depends on the catalytic recombination efficiency. All catalytic parameters of the auxiliary and thermal protection materials are calibrated jointly with the boundary conditions of the experiments. The testing methodology proves to be a reliable experimental approach for characterizing these materials while reducing the uncertainty on the calibrated catalytic efficiencies by more than 50%. An account of the posterior summary statistics is provided to enrich the current state-of-the-art experimental databases.
  • Reduced order model approach for imaging with waves
    • Borcea Liliana
    • Garnier Josselin
    • Mamonov Alexander
    • Zimmerling Jörn
    Inverse Problems, IOP Publishing, 2021, 38 (2), pp.025004. We introduce a novel, computationally inexpensive approach for imaging with an active array of sensors, which probe an unknown medium with a pulse and measure the resulting waves. The imaging function is based on the principle of time reversal in non-attenuating media and uses a data driven estimate of the ‘internal wave’ originating from the vicinity of the imaging point and propagating to the sensors through the unknown medium. We explain how this estimate can be obtained using a reduced order model (ROM) for the wave propagation. We analyze the imaging function, connect it to the time reversal process and describe how its resolution depends on the aperture of the array, the bandwidth of the probing pulse and the medium through which the waves propagate. We also show how the internal wave can be used for selective focusing of waves at points in the imaging region. This can be implemented experimentally and can be used for pixel scanning imaging. We assess the performance of the imaging methods with numerical simulations and compare them to the conventional reverse-time migration method and the ‘backprojection’ method introduced recently as an application of the same ROM. (10.1088/1361-6420/ac41d0)
    DOI: 10.1088/1361-6420/ac41d0
  • Trees, forests, and impurity-based variable importance
    • Scornet Erwan
    , 2021. Tree ensemble methods such as random forests [Breiman, 2001] are very popular to handle high-dimensional tabular data sets, notably because of their good predictive accuracy. However, when machine learning is used for decision-making problems, settling for the best predictive procedures may not be reasonable since enlightened decisions require an in-depth comprehension of the algorithm prediction process. Unfortunately, random forests are not intrinsically interpretable since their prediction results from averaging several hundreds of decision trees. A classic approach to gain knowledge on this so-called black-box algorithm is to compute variable importances, which are employed to assess the predictive impact of each input variable. Variable importances are then used to rank or select variables and thus play a great role in data analysis. Nevertheless, there is no justification to use random forest variable importances in such a way: we do not even know what these quantities estimate. In this paper, we analyze one of the two well-known random forest variable importances, the Mean Decrease Impurity (MDI). We prove that if input variables are independent and in absence of interactions, MDI provides a variance decomposition of the output, where the contribution of each variable is clearly identified. We also study models exhibiting dependence between input variables or interaction, for which the variable importance is intrinsically ill-defined. Our analysis shows that there may be some benefit to using a forest compared to a single tree. (A minimal code sketch follows below.)
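    The MDI analyzed in this entry is what scikit-learn exposes as feature_importances_ on its random forests. The minimal sketch below (synthetic additive data with independent inputs, the favorable regime identified above; sizes and seeds are illustrative) displays the variance-decomposition behavior:

        import numpy as np
        from sklearn.ensemble import RandomForestRegressor

        rng = np.random.default_rng(0)
        n = 2000
        X = rng.uniform(size=(n, 5))
        # Additive model, independent inputs, no interactions: the regime in
        # which MDI is shown to recover a variance decomposition of the output.
        y = 2.0 * X[:, 0] + X[:, 1] + 0.1 * rng.normal(size=n)

        forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
        # feature_importances_ is the normalized Mean Decrease Impurity (MDI).
        for j, imp in enumerate(forest.feature_importances_):
            print(f"feature {j}: MDI = {imp:.3f}")  # features 0 and 1 dominate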
  • Differentially Private Federated Learning on Heterogeneous Data
    • Noble Maxence
    • Bellet Aurélien
    • Dieuleveut Aymeric
    , 2021. Federated Learning (FL) is a paradigm for large-scale distributed learning which faces two key challenges: (i) efficient training from highly heterogeneous user data, and (ii) protecting the privacy of participating users. In this work, we propose a novel FL approach (DP-SCAFFOLD) to tackle these two challenges together by incorporating Differential Privacy (DP) constraints into the popular SCAFFOLD algorithm. We focus on the challenging setting where users communicate with an 'honest-but-curious' server without any trusted intermediary, which requires ensuring privacy not only towards a third party with access to the final model but also towards the server who observes all user communications. Using advanced results from DP theory, we establish the convergence of our algorithm for convex and non-convex objectives. Our analysis clearly highlights the privacy-utility trade-off under data heterogeneity, and demonstrates the superiority of DP-SCAFFOLD over the state-of-the-art algorithm DP-FedAvg when the number of local updates and the level of heterogeneity grow. Our numerical results confirm our analysis and show that DP-SCAFFOLD provides significant gains in practice. (A minimal code sketch follows below.)
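    To illustrate the privacy side of the entry above, the sketch below applies the standard Gaussian mechanism (clip, then add calibrated noise) to a local update before it is sent to the server. This is a generic building block and not the DP-SCAFFOLD algorithm itself; control variates and privacy accounting are omitted, and all names and constants are illustrative:

        import numpy as np

        def privatize(update, clip_norm, noise_mult, rng):
            """Gaussian mechanism: clip the update to L2 norm clip_norm, then
            add isotropic Gaussian noise of scale noise_mult * clip_norm."""
            scale = min(1.0, clip_norm / max(np.linalg.norm(update), 1e-12))
            noise = rng.normal(scale=noise_mult * clip_norm, size=update.shape)
            return update * scale + noise

        rng = np.random.default_rng(0)
        local_update = np.array([3.0, -4.0])   # norm 5, clipped down to norm 1
        print(privatize(local_update, clip_norm=1.0, noise_mult=1.0, rng=rng))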
  • A surrogate-based optimal likelihood function for the Bayesian calibration of catalytic recombination in atmospheric entry protection materials
    • del Val Anabel
    • Le Maître Olivier
    • Magin Thierry E
    • Chazot Olivier
    • Congedo Pietro Marco
    Applied Mathematical Modelling, Elsevier, 2021, 101, pp.791-810. This work deals with the inference of catalytic recombination parameters from plasma wind tunnel experiments for reusable thermal protection materials. One of the critical factors affecting the performance of such materials is the contribution to the heat flux of the exothermic recombination reactions at the vehicle surface. The main objective of this work is to develop a dedicated Bayesian framework that allows us to compare uncertain measurements with model predictions which depend on the catalytic parameter values. Our framework accounts for uncertainties involved in the model definition and incorporates all measured variables with their respective uncertainties. The physical model used for the estimation consists of a 1D boundary layer solver along the stagnation line. The chemical production term included in the surface mass balance depends on the catalytic recombination efficiency. As not all the different quantities needed to simulate a reacting boundary layer can be measured or known (such as the flow enthalpy at the inlet boundary), we propose an optimization procedure built on the construction of the likelihood function to determine their most likely values based on the available experimental data. This procedure avoids the need to introduce any a priori estimates on the nuisance quantities, namely, the boundary layer edge enthalpy, wall temperatures, static and dynamic pressures, which would entail the use of very wide priors. Furthermore, we substitute the optimal likelihood of the experimental measurements with a surrogate model to make the inference procedure both faster and more robust. We show that the resulting Bayesian formulation yields meaningful and accurate posterior probability distributions of the catalytic parameters with a reduction of more than 20% of the standard deviation with respect to previous works. We also study the implications of an extension of the experimental procedure on the improvement of the quality of the inference. (10.1016/j.apm.2021.07.019) (A minimal code sketch follows below.)
    DOI: 10.1016/j.apm.2021.07.019
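    The calibration above ultimately samples a posterior over the catalytic parameter. The sketch below is a generic random-walk Metropolis sampler on a toy one-parameter problem; the linear "heat-flux" model, the synthetic data and the prior bounds are stand-ins for the paper's boundary-layer solver and surrogate likelihood:

        import numpy as np

        rng = np.random.default_rng(1)

        def log_post(gamma, data, sigma=0.1):
            """Toy log-posterior: uniform prior on [0, 1] for the catalytic
            efficiency gamma, Gaussian likelihood around a stand-in model."""
            if not 0.0 <= gamma <= 1.0:
                return -np.inf
            return -0.5 * np.sum((data - 10.0 * gamma) ** 2) / sigma ** 2

        data = np.array([3.02, 2.95, 3.10])  # synthetic observations, gamma ~ 0.3
        gamma, chain = 0.5, []
        for _ in range(5000):
            prop = gamma + 0.05 * rng.normal()            # random-walk proposal
            if np.log(rng.uniform()) < log_post(prop, data) - log_post(gamma, data):
                gamma = prop                              # accept
            chain.append(gamma)
        post = np.array(chain[1000:])                     # discard burn-in
        print(post.mean(), post.std())     # posterior mean and spread of gamma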
  • Estimation of the largest tail-index and extreme quantiles from a mixture of heavy-tailed distributions
    • Girard Stéphane
    • Gobet Emmanuel
    , 2021. The estimation of extreme quantiles requires adapted methods to extrapolate beyond the largest observation of the sample. Extreme-value theory provides a mathematical framework to tackle this problem together with statistical procedures based on the estimation of the so-called tail-index describing the distribution tail. We focus on heavy-tailed distributions and consider the case where the observations at hand are related to statistical models with different tail-indices, i.e., a mixture of heavy-tailed models; for conservative risk management reasons, we are interested in the largest tail-index. In such a mixture situation, usual extreme-value estimators suffer from a strong bias, which may in turn induce a strong bias when quantifying tail risk in this mixture model. We propose several methods to mitigate this bias under mild assumptions on the mixture distribution. Their asymptotic properties are established and their finite sample performance is illustrated both on simulated and real financial data. (A minimal code sketch follows below.)
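    The tail-index estimators the entry builds on are typified by the Hill estimator, sketched below in plain numpy; note that this vanilla version carries exactly the kind of bias the paper sets out to mitigate in mixtures, and the Pareto sample is illustrative:

        import numpy as np

        def hill(sample, k):
            """Hill estimator of the tail-index from the k largest order
            statistics (a larger value means a heavier tail)."""
            x = np.sort(sample)[::-1]          # descending order statistics
            return np.mean(np.log(x[:k])) - np.log(x[k])

        rng = np.random.default_rng(0)
        # Classical Pareto sample with tail-index 1/2 (survival function x^-2).
        sample = rng.pareto(2.0, size=10_000) + 1.0
        print(hill(sample, k=200))             # should be close to 0.5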
  • Zero-sum repeated games: accelerated algorithms and tropical best-approximation
    • Saadi Omar
    , 2021. In this thesis, we develop accelerated algorithms for Markov decision processes (MDP) and more generally for zero-sum stochastic games (SG). We also address best approximation problems arising in tropical geometry. Dynamic programming is one of the main approaches used to solve MDP and SG problems. It allows one to transform a game into a fixed point problem involving an operator called the Shapley operator (or Bellman operator in the case of MDP). Value iteration (VI) and policy iteration (PI) are the two main algorithms allowing one to solve these fixed point problems. However, for large scale instances, or when we want to solve a mean payoff problem (where there is no discount factor for the payments received in the future), classical methods become slow. In the first part of this thesis, we develop two refinements of the classical value or policy iteration algorithms. We first propose an accelerated version of value iteration (AVI) allowing one to solve affine fixed point problems with non self-adjoint matrices, along with an accelerated version of policy iteration (API) for MDP, building on AVI. This acceleration extends Nesterov's accelerated gradient algorithm to a class of fixed point problems which cannot be interpreted in terms of convex programming. We characterize the spectra of matrices for which this algorithm does converge with an accelerated asymptotic rate. We also introduce an accelerated algorithm of degree d, and show that it yields a multiply accelerated rate of convergence under more demanding conditions on the spectrum of the matrices. Another contribution is a deflated version of value iteration (DVI) to solve the mean payoff version of stochastic games. This method allows one to transform a mean payoff problem into a discounted one under the hypothesis of existence of a distinguished state that is accessible from all other states and under all policies. Combining this deflation method with variance reduction techniques, we derive a sublinear algorithm solving mean payoff stochastic games. In the second part of this thesis, we study tropical best approximation problems. We first solve a tropical linear regression problem consisting in finding the best approximation of a set of points by a tropical hyperplane. We show that the value of this regression problem coincides with the maximal radius of a Hilbert ball included in a tropical polyhedron, and that this problem is polynomial-time equivalent to mean payoff games. We apply these results to an inverse problem from auction theory. We also study a tropical analogue of low-rank approximation for matrices. This is motivated by approximate methods in dynamic programming, in which the value function is approximated by a supremum of elementary functions. We establish general properties of tropical low-rank approximation, and identify classes of low-rank approximation problems that are polynomial-time solvable. In particular, we show that the best tropical rank-one matrix approximation is equivalent to finding the minimal radius of a Hilbert ball containing a tropical polyhedron. (A minimal code sketch follows below.)
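    For reference, the value iteration baseline that the thesis accelerates takes a few lines for a finite MDP. The sketch below runs plain VI and a Nesterov-type extrapolated variant; the momentum schedule k/(k+3) is a generic illustrative choice, not the thesis' scheme, and acceleration is only guaranteed under spectral conditions such as those the abstract mentions:

        import numpy as np

        def bellman(V, P, r, beta):
            """Bellman operator of a finite MDP. P: (A, S, S) transition
            probabilities, r: (A, S) rewards, beta: discount factor."""
            return np.max(r + beta * (P @ V), axis=0)

        rng = np.random.default_rng(0)
        A, S, beta = 3, 20, 0.95
        P = rng.uniform(size=(A, S, S))
        P /= P.sum(axis=2, keepdims=True)      # normalize transition rows
        r = rng.uniform(size=(A, S))

        V = np.zeros(S)                        # plain value iteration
        for _ in range(1000):
            V = bellman(V, P, r, beta)

        V_acc = V_prev = np.zeros(S)           # extrapolated (momentum) variant
        for k in range(1000):
            y = V_acc + (k / (k + 3.0)) * (V_acc - V_prev)
            V_prev, V_acc = V_acc, bellman(y, P, r, beta)

        print(np.max(np.abs(V - V_acc)))       # both end near the same fixed point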
  • Multistage hematopoietic stem cell regulation in the mouse: A combined biological and mathematical approach
    • Bonnet Celine
    • Gou Panhong
    • Girel Simon
    • Bansaye Vincent
    • Lacout Catherine
    • Bailly Karine
    • Schlagetter Marie-Hélène
    • Lauret Evelyne
    • Méléard Sylvie
    • Giraudier Stéphane
    iScience, Elsevier, 2021. We have reconciled steady-state and stress hematopoiesis in a single mathematical model based on murine in vivo experiments, with a focus on hematopoietic stem and progenitor cells. A phenylhydrazine stress was first applied to mice. A reduced cell number in each progenitor compartment was evidenced during the next 7 days through a drastic level of differentiation without proliferation, followed by a huge proliferative response in all compartments including long-term hematopoietic stem cells, before a return to normal levels. Data analysis led to the addition, to the 6-compartment model, of time-dependent regulation that depended indirectly on the compartment sizes. The resulting model was finely calibrated using a stochastic optimization algorithm and could reproduce biological data in silico when applied to different stress conditions (bleeding, chemotherapy, HSC depletion). In conclusion, our multi-step and time-dependent model of immature hematopoiesis provides new avenues to a better understanding of both normal and pathological hematopoiesis. (10.1016/j.isci.2021.103399)
    DOI: 10.1016/j.isci.2021.103399
  • Schrödinger problem, functional inequalities and optimal transport
    • Conforti Giovanni
    , 2021. This document contains an overview of a large part of my work after the completion of my PhD thesis. I have conducted most of my research in probability theory and the common ground it shares with Riemannian geometry and some branches of analysis, optimal transport in particular. All these research fields interact naturally when looking at the family of Schrödinger problems, whose study has been at the heart of my scientific interests over the past few years. There are two central themes in this manuscript. The first one is that of showing that Schrödinger bridges solve a second order equation, i.e. an equation involving an acceleration term. Developing this theme requires addressing the old question of how to properly define the acceleration of a stochastic process, and makes it possible to enrich the already strong connections between optimal transport, in particular Otto calculus, and stochastic analysis, in particular Itô calculus. The second recurrent theme is that of quantifying, by means of functional inequalities and entropy dissipation estimates, the trend to equilibrium and the ergodic behavior of Markov processes, with a particular emphasis on optimally controlled diffusion processes and Markov chains on countable state spaces. I have divided this manuscript into three parts and six chapters. The first four chapters constitute the first part and are devoted to the Schrödinger problem. The second part is about convex Sobolev inequalities for Markov chains and is made of a single chapter, as is the third part, which concentrates on the problem of defining spline curves to interpolate between probability measures.
  • Collections of solids in interaction: suspensions, granular media and micro-swimmers.
    • Lefebvre-Lepot Aline
    , 2021. In this manuscript, we are interested in the study and numerical simulation of mechanical systems composed of interacting solids: passive or active suspensions, or granular media. The first chapter describes some of the problems raised by the study of these systems. We then focus on the numerical simulation of suspensions. This requires the resolution of a coupled problem between the Stokes fluid and the rigid structures. In the second chapter, we aim to solve the problem accurately when the particles are close. The method we propose is based on an explicit asymptotic expansion of the solution when the inter-particle distance goes to zero. In the third chapter, we focus on the use of boundary element methods to solve the fluid-structure problem. We deal, in the case of the Stokes equations, with the two classical difficulties for these methods: on the one hand, the development of fast algorithms to solve full systems and, on the other hand, the computation of singular integrals involved in the problem. In the fourth chapter, we are interested in designing algorithms to deal with contacts (with or without friction) in the systems we consider. The algorithms described are Contact Dynamics algorithms. At each time step, the contact forces are computed in an implicit way, as the solution to a convex optimization problem. Rheological studies of granular materials based on these algorithms are presented. Finally, in the fifth chapter, we study micro-swimmers evolving in a Stokes fluid. We investigate whether these swimmers can swim, that is, perform cyclic deformations (a stroke) generating a given displacement. We propose a general framework to answer this question by rewriting the problem as a control problem.
  • Spatial dynamics of interfaces in ecology: deterministic and stochastic models
    • Tourniaire Julie
    , 2021. Traveling fronts arising from reaction diffusion equations model various phenomena observed in physics and biology. From a biological standpoint, a traveling front can be seen as the invasion of an uninhabited environment by a species. Since biological systems are finite and thus undergo demographic fluctuations, these deterministic wavefronts only represent an approximation of the population dynamics, in which we presuppose that the local density of individuals is infinite so that the fluctuations self-average. In this sense, reaction diffusion equations can be seen as hydrodynamic limits of some individual based models. In this thesis, we investigate the long time behaviour of some finite microscopic systems modeling such front propagations and compare it to that of their large population asymptotics. The first part of this thesis is dedicated to the study of the dynamics of a population colonising a slowly varying environment. This question has been widely studied from the PDE point of view. However, the results given by the viscosity solutions theory turn out to be biologically unsatisfactory in some situations. We thus suggest studying an individual based model for front propagation in the limit when the scale of heterogeneity of the environment tends to infinity. In this framework, we show that the spreading speed of the population may be much slower than the speed of the front in the PDE describing the large population asymptotics of the system. This qualitative disagreement between the two behaviours is related to the so-called tail problem observed in PDE theory, due to the absence of local extinction in FKPP-type equations. In a second part, we study the impact of the type of the deterministic limit waves on the related stochastic models to explain this drastic slow-down in the particle system. Indeed, wavefronts arising from monostable reaction diffusion PDEs are classified into two types: pulled and pushed waves. It is well-known that small perturbations have a huge impact on pulled waves. In sharp contrast, pushed waves are expected to be less sensitive. Nevertheless, some recent numerical experiments have suggested the existence of a third class of waves in stochastic fronts. It is a subclass of pushed fronts very sensitive to fluctuations. In this thesis, we investigate the internal mechanisms of such fronts to explain the transition between these three regimes.
  • Topology Optimization of the Part/Support Pair for Powder Bed Additive Manufacturing
    • Bihr Martin
    , 2021. This thesis is devoted to the shape and topology optimization of parts built by additive manufacturing. These new manufacturing processes have attracted a great deal of interest in recent years for their ability to build complex parts. In particular, metal powder bed additive manufacturing (AM) frees industry from conventional manufacturing constraints (milling, casting, ...). More specifically, the most widely used process is SLM (Selective Laser Melting), in which a laser melts the powder bed very locally before a recoater spreads a new layer of powder after solidification. Nevertheless, even though this process offers great freedom in the design of the part, it has its own manufacturing constraints. The strong temperature gradients induced by the heat of the laser, layer after layer, put the part under high stress and can even deform it during manufacturing, making it no longer conform to the intended design. One solution is then to add material around the part, known as manufacturing supports, to hold the regions prone to deformation. These supports can be optimized to minimize the material used, their build time, or the time needed to remove them. However, it may be preferable to modify the geometry of the part rather than add this costly extra material. The objective is then to design a different part that does not lead to manufacturing defects while degrading its performance in its final use as little as possible. Shape optimization is then a good way to obtain shapes that can be manufactured by this process while handling these various manufacturing constraints. Finally, topology optimization of the part/support pair consists in finding the optimal shape of a part together with that of its supports, which are often necessary for its manufacture. The final structure is optimized in terms of both its manufacturing cost and its performance in a given end-use problem. To this end, several manufacturing constraints related to the SLM technology must be taken into account.
  • Federated Expectation Maximization with heterogeneity mitigation and variance reduction
    • Dieuleveut Aymeric
    • Fort Gersende
    • Moulines Eric
    • Robin Geneviève
    , 2021. The Expectation Maximization (EM) algorithm is the default algorithm for inference in latent variable models. As in any other field of machine learning, applications of latent variable models to very large datasets make the use of advanced parallel and distributed architectures mandatory. This paper introduces FedEM, which is the first extension of the EM algorithm to the federated learning context. FedEM is a new communication efficient method, which handles partial participation of local devices, and is robust to heterogeneous distributions of the datasets. To alleviate the communication bottleneck, FedEM compresses appropriately defined complete data sufficient statistics. We also develop and analyze an extension of FedEM to further incorporate a variance reduction scheme. In all cases, we derive finite-time complexity bounds for smooth non-convex problems. Numerical results are presented to support our theoretical findings, as well as an application to federated missing values imputation for biodiversity monitoring. (10.48550/arXiv.2111.02083)
    DOI: 10.48550/arXiv.2111.02083
  • Unbalanced Optimal Transport through Non-negative Penalized Linear Regression
    • Chapel Laetitia
    • Flamary Rémi
    • Wu Haoran
    • Févotte Cédric
    • Gasso Gilles
    , 2021, 34. This paper addresses the problem of Unbalanced Optimal Transport (UOT), in which the marginal conditions are relaxed (using weighted penalties in lieu of equality) and no additional regularization is enforced on the OT plan. In this context, we show that the corresponding optimization problem can be reformulated as a non-negative penalized linear regression problem. This reformulation allows us to propose novel algorithms inspired by inverse problems and nonnegative matrix factorization. In particular, we consider majorization-minimization, which in our setting leads to efficient multiplicative updates for a variety of penalties. Furthermore, we derive for the first time an efficient algorithm to compute the regularization path of UOT with quadratic penalties. The proposed algorithm provides a continuum of piecewise-linear OT plans converging to the solution of balanced OT (corresponding to infinite penalty weights). We perform several numerical experiments on simulated and real data illustrating the new algorithms, and provide a detailed discussion about more sophisticated optimization tools that can further be used to solve OT problems thanks to our reformulation. (10.48550/arXiv.2106.04145) (A minimal code sketch follows below.)
    DOI: 10.48550/arXiv.2106.04145
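    The reformulation in this entry is what makes NMF-style multiplicative updates applicable to UOT. The sketch below shows a classical Lee-Seung-type multiplicative update on a generic non-negative least-squares problem, to illustrate the family of updates; it is not the paper's exact UOT update and assumes a non-negative design matrix:

        import numpy as np

        def nnls_mult(A, b, n_iter=2000):
            """Minimize ||A x - b||^2 over x >= 0 with multiplicative updates
            (Lee-Seung style); positivity is preserved at every iteration."""
            x = np.ones(A.shape[1])
            AtA, Atb = A.T @ A, A.T @ b
            for _ in range(n_iter):
                x *= Atb / np.maximum(AtA @ x, 1e-12)
            return x

        rng = np.random.default_rng(0)
        A = rng.uniform(size=(40, 8))          # non-negative design matrix
        x_true = rng.uniform(size=8)
        x_hat = nnls_mult(A, A @ x_true)       # noiseless observations
        print(np.max(np.abs(x_hat - x_true)))  # small recovery error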
  • Tight High Probability Bounds for Linear Stochastic Approximation with Fixed Stepsize
    • Durmus Alain
    • Moulines Eric
    • Naumov Alexey
    • Samsonov Sergey
    • Scaman Kevin
    • Wai Hoi-To
    , 2021, pp. 30063-30074. This paper provides a non-asymptotic analysis of linear stochastic approximation (LSA) algorithms with fixed stepsize. This family of methods arises in many machine learning tasks and is used to obtain approximate solutions of a linear system $\bar{A}\theta = \bar{b}$ for which $\bar{A}$ and $\bar{b}$ can only be accessed through random estimates $\{({\bf A}_n, {\bf b}_n): n \in \mathbb{N}^*\}$. Our analysis is based on new results regarding moments and high probability bounds for products of matrices which are shown to be tight. We derive high probability bounds on the performance of LSA under weaker conditions on the sequence $\{({\bf A}_n, {\bf b}_n): n \in \mathbb{N}^*\}$ than previous works. However, in contrast, we establish polynomial concentration bounds with order depending on the stepsize. We show that our conclusions cannot be improved without additional assumptions on the sequence of random matrices $\{{\bf A}_n: n \in \mathbb{N}^*\}$, and in particular that no Gaussian or exponential high probability bounds can hold. Finally, we pay particular attention to establishing bounds with sharp order with respect to the number of iterations and the stepsize, and whose leading terms contain the covariance matrices appearing in the central limit theorems. (10.5555/3540261.3542562) (A minimal code sketch follows below.)
    DOI: 10.5555/3540261.3542562
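    The LSA recursion analyzed in this entry is a one-line update. The toy instance below (a diagonal $\bar{A}$ and i.i.d. Gaussian perturbations are illustrative choices) shows the fixed-stepsize behavior: the iterate settles in a noise-dominated neighborhood of the solution instead of converging exactly:

        import numpy as np

        rng = np.random.default_rng(0)
        d, gamma = 5, 0.05

        A_bar = np.diag(np.linspace(1.0, 2.0, d))  # target system A_bar theta = b_bar
        theta_star = rng.normal(size=d)
        b_bar = A_bar @ theta_star

        theta = np.zeros(d)
        for _ in range(20_000):
            A_n = A_bar + 0.1 * rng.normal(size=(d, d))  # random estimate of A_bar
            b_n = b_bar + 0.1 * rng.normal(size=d)       # random estimate of b_bar
            theta += gamma * (b_n - A_n @ theta)         # LSA step, fixed stepsize

        print(np.linalg.norm(theta - theta_star))  # stationary error, not zero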
  • What's a good imputation to predict with missing values?
    • Le Morvan Marine
    • Josse Julie
    • Scornet Erwan
    • Varoquaux Gaël
    , 2021. How to learn a good predictor on data with missing values? Most efforts focus on first imputing as well as possible and second learning on the completed data to predict the outcome. Yet, this widespread practice has no theoretical grounding. Here we show that for almost all imputation functions, an impute-then-regress procedure with a powerful learner is Bayes optimal. This result holds for all missing-values mechanisms, in contrast with the classic statistical results that require missing-at-random settings to use imputation in probabilistic modeling. Moreover, it implies that perfect conditional imputation is not needed for good prediction asymptotically. In fact, we show that on perfectly imputed data the best regression function will generally be discontinuous, which makes it hard to learn. Crafting instead the imputation so as to leave the regression function unchanged simply shifts the problem to learning discontinuous imputations. Rather, we suggest that it is easier to learn imputation and regression jointly. We propose such a procedure, adapting NeuMiss, a neural network capturing the conditional links across observed and unobserved variables whatever the missing-value pattern. Our experiments with a finite number of samples confirm that joint imputation and regression through NeuMiss outperforms various two-step procedures. (10.48550/arXiv.2106.00311) (A minimal code sketch follows below.)
    DOI: 10.48550/arXiv.2106.00311
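    The impute-then-regress baseline discussed above takes a few lines with scikit-learn: a constant (mean) imputation chained with a powerful learner, the kind of pipeline the paper shows to be asymptotically sound for almost any imputation. The synthetic data and the 30% MCAR missingness are illustrative:

        import numpy as np
        from sklearn.pipeline import make_pipeline
        from sklearn.impute import SimpleImputer
        from sklearn.ensemble import HistGradientBoostingRegressor
        from sklearn.model_selection import cross_val_score

        rng = np.random.default_rng(0)
        n, d = 2000, 4
        X = rng.normal(size=(n, d))
        y = X[:, 0] + X[:, 1] ** 2 + 0.1 * rng.normal(size=n)
        X[rng.uniform(size=X.shape) < 0.3] = np.nan  # 30% entries missing (MCAR)

        # Impute-then-regress: mean imputation, then a flexible learner.
        model = make_pipeline(SimpleImputer(strategy="mean"),
                              HistGradientBoostingRegressor(random_state=0))
        print(cross_val_score(model, X, y, cv=5).mean())  # cross-validated R^2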
  • Preserved central model for faster bidirectional compression in distributed settings
    • Philippenko Constantin
    • Dieuleveut Aymeric
    , 2021, 34, pp.2387-2399. We develop a new approach to tackle communication constraints in a distributed learning problem with a central server. We propose and analyze a new algorithm that performs bidirectional compression and achieves the same convergence rate as algorithms using only uplink (from the local workers to the central server) compression. To obtain this improvement, we design MCM, an algorithm such that the downlink compression only impacts local models, while the global model is preserved. As a result, and contrary to previous works, the gradients on local servers are computed on perturbed models. Consequently, convergence proofs are more challenging and require a precise control of this perturbation. To ensure this, MCM additionally combines model compression with a memory mechanism. This analysis opens new doors, e.g. incorporating worker-dependent randomized models and partial participation. (10.48550/arXiv.2102.12528) (A minimal code sketch follows below.)
    DOI: 10.48550/arXiv.2102.12528
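    Two ingredients of the entry above, an unbiased compressor and a memory term that keeps compression errors from accumulating, can be shown generically. The sketch below combines a rand-k sparsifier with a memory update in the style of this line of work; it illustrates the mechanism rather than the MCM algorithm, and the memory rate 0.5 is an arbitrary illustrative choice:

        import numpy as np

        def rand_k(v, k, rng):
            """Unbiased rand-k compressor: keep k random coordinates, rescaled
            by d/k so the compressed vector has the right expectation."""
            out = np.zeros_like(v)
            idx = rng.choice(v.size, size=k, replace=False)
            out[idx] = v[idx] * (v.size / k)
            return out

        rng = np.random.default_rng(0)
        d, k = 100, 10
        memory = np.zeros(d)
        for _ in range(50):
            delta = rng.normal(size=d)             # stand-in for a model update
            sent = rand_k(delta - memory, k, rng)  # compress difference to memory
            received = memory + sent               # receiver's reconstruction
            memory += 0.5 * sent                   # memory update (rate 0.5)
        print(np.linalg.norm(received - delta))    # reconstruction error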
  • NEO: Non Equilibrium Sampling on the Orbit of a Deterministic Transform
    • Thin Achille
    • Janati Yazid
    • Le Corff Sylvain
    • Ollion Charles
    • Doucet Arnaud
    • Durmus Alain
    • Moulines Eric
    • Robert Christian
    , 2021, 34. Sampling from a complex distribution $\pi$ and approximating its intractable normalizing constant $Z$ are challenging problems. In this paper, a novel family of importance samplers (IS) and Markov chain Monte Carlo (MCMC) samplers is derived. Given an invertible map $T$, these schemes combine (with weights) elements from the forward and backward orbits through points sampled from a proposal distribution $\rho$. The map $T$ does not leave the target $\pi$ invariant, hence the name NEO, standing for Non-Equilibrium Orbits. NEO-IS provides unbiased estimators of the normalizing constant and self-normalized IS estimators of expectations under $\pi$, while NEO-MCMC combines multiple NEO-IS estimates of the normalizing constant and an iterated sampling-importance resampling mechanism to sample from $\pi$. For $T$ chosen as a discrete-time integrator of a conformal Hamiltonian system, NEO-IS achieves state-of-the-art performance on difficult benchmarks and NEO-MCMC is able to explore highly multimodal targets. Additionally, we provide detailed theoretical results for both methods. In particular, we show that NEO-MCMC is uniformly geometrically ergodic and establish explicit mixing time estimates under mild conditions. (10.5555/3540261.3541565) (A minimal code sketch follows below.)
    DOI: 10.5555/3540261.3541565
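    NEO builds on importance sampling estimates of the normalizing constant $Z$. The sketch below is plain importance sampling of $Z$ for a small unnormalized target; NEO's specific ingredient, weighting points along the orbits of a deterministic transform $T$, is omitted, and the bimodal target and Gaussian proposal are illustrative:

        import numpy as np

        rng = np.random.default_rng(0)

        def target_unnorm(x):
            """Unnormalized bimodal target; true constant Z = 2*sqrt(2*pi)."""
            return np.exp(-0.5 * (x - 2.0) ** 2) + np.exp(-0.5 * (x + 2.0) ** 2)

        sigma = 3.0                              # proposal rho = N(0, sigma^2)
        x = rng.normal(scale=sigma, size=100_000)
        rho = np.exp(-0.5 * (x / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))
        weights = target_unnorm(x) / rho         # importance weights
        print(weights.mean(), 2.0 * np.sqrt(2.0 * np.pi))  # estimate vs. true Z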
  • Mean field games: numerical methods and the case of risk-averse agents
    • Lavigne Pierre
    , 2021. Mean field games (abbreviated MFGs) are both a mathematical theory and a modeling tool. Developed in 2006 independently by Jean-Michel Lasry and Pierre-Louis Lions and by Minyi Huang, Roland P. Malhamé, and Peter E. Caines, MFGs provide a framework to analyze interactions among a large number of rational and anonymous agents. In this thesis we provide several developments to this theory: 1) Using the concept of composite risk measure, we study a discrete-time MFG model involving risk-averse agents. We show the existence of a solution via a fixed point approach. We show that an optimal policy of the MFG is ε(N)-optimal for a certain N-player game. The sequence ε(N) converges to zero as the number of players tends to infinity. 2) We study discrete time and finite state space potential (also called variational) MFGs with hard constraints, that is, with convex potentials, possibly non-differentiable and with bounded domain. We study a primal and a dual problem, and we show a duality result and the existence and uniqueness (in the differentiable case) of a solution to the MFG system. We then implement two families of numerical methods: primal-dual proximal methods (Chambolle-Pock and Chambolle-Pock-Bregman) and augmented Lagrangian based methods (ADMM and ADM-G). We propose a congestion model and a price model that we solve with these methods. We compare the empirical performance of each method on each problem. 3) We apply the generalized conditional gradient algorithm to potential MFGs in a PDE framework. We highlight the connection between this algorithm and a learning method called the fictitious play algorithm. We show that for the learning rate δ_k = 2/(k + 2), the potential cost converges in O(1/k), and the exploitability and the variables of the problem converge in O(1/k^{1/2}), for specific norms.
  • Support optimization for additive manufacturing
    • Vallade Alexis
    , 2021.
  • Bayesian calibration of a methane-air global scheme and uncertainty propagation to flame-vortex interactions
    • Armengol Jan M
    • Le Maitre Olivier
    • Vicquelin Ronan
    Combustion and Flame, Elsevier, 2021, 234, pp.111642. Simplified chemistry models are commonly used in reactive computational fluid dynamics (CFD) simulations to alleviate the computational cost. Uncertainties associated with the calibration of such simplified models have been characterized in some works, but to our knowledge, there is a lack of studies analyzing the subsequent propagation through CFD simulation of combustion processes. This work propagates the uncertainties arising in the calibration of a global chemistry model through direct numerical simulations (DNS) of flame-vortex interactions. Calibration uncertainties are derived by inferring the parameters of a two-step reaction mechanism for methane, using synthetic observations of one-dimensional laminar premixed flames based on a detailed mechanism. To assist the inference, independent surrogate models for estimating flame speed and thermal thickness are built, taking advantage of Principal Component Analysis (PCA) and the Polynomial Chaos (PC) expansion. Using the Markov Chain Monte Carlo (MCMC) sampling method, a discussion on how push-forward posterior densities behave with respect to the detailed mechanism is provided, based on three different calibrations relying (i) only on flame speed, (ii) only on thermal thickness, and (iii) on both quantities simultaneously. The model parameter uncertainties characterized in the latter calibration are propagated to two-dimensional flame-vortex interactions using 60 independent samples. Posterior predictive densities for the time evolution of the heat release and flame surface are consistent, in that the confidence intervals contain the reference simulation. However, the two-step mechanism fails to reproduce the flame response to stretch, as it was not considered in the calibration. This study highlights the capabilities and limitations of propagating chemistry-model uncertainties to CFD applications to fully quantify posterior uncertainties on target quantities. (10.1016/J.COMBUSTFLAME.2021.111642)
    DOI: 10.1016/J.COMBUSTFLAME.2021.111642
  • Identification of glucocorticoid-related molecular signature by whole blood methylome analysis
    • Armignacco Roberta
    • Jouinot Anne
    • Bouys Lucas
    • Septier Amandine
    • Lartigue Thomas
    • Neou Mario
    • Gaspar Cassandra
    • Perlemoine Karine
    • Braun Leah
    • Riester Anna
    • Bonnet-Serrano Fidéline
    • Blanchard Anne
    • Amar Laurence
    • Scaroni Carla
    • Ceccato Filippo
    • Rossi Gian Paolo
    • Williams Tracy Ann
    • Larsen Casper K.
    • Allassonnière Stéphanie
    • Zennaro Maria-Christina
    • Beuschlein Felix
    • Reincke Martin
    • Bertherat Jerome
    • Assié Guillaume
    European Journal of Endocrinology, Oxford Univ. Press, 2021, pp.EJE-21-0907.R1. Objective: Cushing’s syndrome represents a state of excessive glucocorticoids related to glucocorticoid treatments or to endogenous hypercortisolism. Cushing’s syndrome is associated with high morbidity, with significant inter-individual variability. Likewise, adrenal insufficiency is a life-threatening condition of cortisol deprivation. Currently, hormone assays contribute to identifying Cushing’s syndrome or adrenal insufficiency. However, no biomarker directly quantifies the biological glucocorticoid action. The aim of this study was to identify such markers. Design: We evaluated the whole blood DNA methylome in 94 samples obtained from patients with different glucocorticoid states (Cushing’s syndrome, eucortisolism, adrenal insufficiency). We used an independent cohort of 91 samples for validation. Methods: Leukocyte DNA was obtained from whole blood samples. The methylome was determined using the Illumina methylation chip array (~850,000 CpG sites). Both unsupervised (Principal Component Analysis) and supervised (Limma) methods were used to explore methylome profiles. A Lasso-penalized regression was used to select optimal discriminating features. Results: The whole blood methylation profile was able to discriminate samples by their glucocorticoid status: glucocorticoid excess was associated with DNA hypomethylation, recovering within months after Cushing’s syndrome correction. In Cushing’s syndrome, an enrichment in hypomethylated CpG sites was observed in the region of the FKBP5 gene locus. A methylation predictor of glucocorticoid excess was built on a training cohort and validated on two independent cohorts. Potential CpG sites associated with the risk of specific complications, such as glucocorticoid-related hypertension or osteoporosis, were identified and now need to be confirmed in independent cohorts. Conclusions: The whole blood DNA methylome is dynamically impacted by glucocorticoids. This biomarker could contribute to better assessing glucocorticoid action beyond hormone assays. (10.1530/EJE-21-0907) (A minimal code sketch follows below.)
    DOI: 10.1530/EJE-21-0907
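    The Lasso-penalized selection step in the methods above is standard; below is a minimal scikit-learn analogue on synthetic data. The real design matrix has ~850,000 methylation features, so the dimensions, sparsity and regularization strength here are purely illustrative:

        import numpy as np
        from sklearn.linear_model import LogisticRegression

        rng = np.random.default_rng(0)
        n, d = 180, 1000                     # samples x synthetic "CpG sites"
        X = rng.normal(size=(n, d))
        # Binary status driven by 5 informative sites only.
        y = (X[:, :5].sum(axis=1) + 0.5 * rng.normal(size=n) > 0).astype(int)

        # The L1 penalty drives most coefficients to exactly zero,
        # keeping a small set of discriminating features.
        clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.1).fit(X, y)
        selected = np.flatnonzero(clf.coef_[0])
        print(len(selected), selected[:10])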
  • Collaborative Learning in the Jungle (Decentralized, Byzantine, Heterogeneous, Asynchronous and Nonconvex Learning)
    • El-Mhamdi El-Mahdi
    • Farhadkhani Sadegh
    • Guerraoui Rachid
    • Guirguis Arsany
    • Hoang Lê Nguyên
    • Rouault Sébastien
    , 2021. We study Byzantine collaborative learning, where n nodes seek to collectively learn from each others' local data. The data distribution may vary from one node to another. No node is trusted, and f < n nodes can behave arbitrarily. We prove that collaborative learning is equivalent to a new form of agreement, which we call averaging agreement. In this problem, nodes each start with an initial vector and seek to approximately agree on a common vector, which is close to the average of honest nodes' initial vectors. We present two asynchronous solutions to averaging agreement, each of which we prove optimal along some dimension. The first, based on minimum-diameter averaging, requires n ≥ 6f + 1, but achieves asymptotically the best-possible averaging constant up to a multiplicative factor. The second, based on reliable broadcast and coordinate-wise trimmed mean, achieves optimal Byzantine resilience, i.e., n ≥ 3f + 1. Each of these algorithms induces an optimal Byzantine collaborative learning protocol. In particular, our equivalence yields new impossibility theorems on what any collaborative learning algorithm can achieve in adversarial and heterogeneous environments. (10.48550/arXiv.2008.00742) (A minimal code sketch follows below.)
    DOI: 10.48550/arXiv.2008.00742
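    The coordinate-wise trimmed mean used by the second solution above is easy to state: in each coordinate, drop the f smallest and f largest values and average the rest. A minimal numpy version, with one illustrative Byzantine input so that n = 4 >= 3f + 1:

        import numpy as np

        def trimmed_mean(vectors, f):
            """Coordinate-wise trimmed mean: per coordinate, discard the f
            smallest and f largest values, then average the remaining ones."""
            v = np.sort(np.asarray(vectors), axis=0)  # sort each coordinate
            return v[f:len(vectors) - f].mean(axis=0)

        honest = [np.array([1.0, 2.0]), np.array([1.1, 2.1]), np.array([0.9, 1.9])]
        byzantine = [np.array([100.0, -100.0])]       # arbitrary (f = 1) input
        print(trimmed_mean(honest + byzantine, f=1))  # stays close to [1, 2]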
  • A quantitative approach to climate-related credit risk, using Shared Socioeconomic Pathways
    • Bourgey Florian
    • Gobet Emmanuel
    • Jiao Ying
    , 2021.