Publications

CMAP theses are available via the following link:
Discover CMAP theses

The publications appearing in the HAL open archive are listed below, sorted by year.

2019

  • Quantification d'incertitude pour l'Approximation Stochastique
    • Crépey Stéphane
    • Fort Gersende
    • Gobet Emmanuel
    • Stazhynski Uladzislau
    , 2019, pp.537-540. Stochastic Approximation is an iterative procedure for the computation of a root θ of a non-explicit function defined as an expectation. It is, for example, a numerical tool for the computation of the Maximum Likelihood in "regular" latent variable models. When the definition of the statistical model is uncertain, depending on a quantity τ for which only a prior π(dτ) is known, then the roots also depend on τ; a natural question is to explore their distribution when τ ∼ dπ. In this paper, we propose a Stochastic Approximation-based algorithm which, in its limiting behavior, provides a computation of θ(τ) for any τ and a characterization of its distribution; we also state sufficient conditions for the convergence of this algorithm.
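    The algorithm above targets roots of functions defined only through expectations, with the root θ(τ) explored over draws of the uncertain parameter τ. As a purely illustrative aid (not the authors' algorithm), the following sketch runs a plain Robbins-Monro iteration on a toy mean field and looks at the resulting distribution of roots over τ; all names, the toy function H and the uniform prior are hypothetical choices made for this example.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    def H(theta, x, tau):
        """Noisy observation of the mean field h(theta, tau) = E[H(theta, X, tau)].
        Here h(theta, tau) = theta - tau, so the root is theta*(tau) = tau (toy choice)."""
        return theta - tau + 0.1 * x          # x ~ N(0, 1) plays the role of the simulation noise

    def robbins_monro(tau, n_iter=5_000, theta0=0.0):
        """Plain Robbins-Monro iteration: theta_{k+1} = theta_k - gamma_k * H(theta_k, X_k, tau)."""
        theta = theta0
        for k in range(1, n_iter + 1):
            gamma = 1.0 / k                   # step sizes with infinite sum and summable squares
            theta -= gamma * H(theta, rng.standard_normal(), tau)
        return theta

    # Explore the distribution of the root theta*(tau) when tau is drawn from a prior
    # (here a Uniform(0, 1) prior, again purely for illustration).
    taus = rng.uniform(0.0, 1.0, size=200)
    roots = np.array([robbins_monro(tau) for tau in taus])
    print("mean of roots:", roots.mean(), " std of roots:", roots.std())
    ```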
  • Uncertainty analysis methodology for multi-physics coupled rod ejection accident
    • Delipei G.-K
    • Garnier J.
    • Le Pallec J-C
    • Normand B.
    , 2019. Computational modeling of nuclear reactor transients under an uncertainty framework creates many challenges related to the potentially large number of inputs and outputs to be considered, and to their interactions and dependencies. In the particular case of the Rod Ejection Accident (REA) in Pressurized Water Reactors (PWR), strong multi-physics coupling effects occur between neutronics, fuel-thermal and thermal-hydraulics. The APOLLO3 R neutronic code, using two-group diffusion modeling, and the FLICA4 thermal-hydraulic code, using axial multi-channel 1D modeling, are coupled in the framework of the CORPUS Best Estimate multi-physics tool to model the REA. CORPUS, APOLLO3 R and FLICA4 are developed at CEA and are used for the first time in an REA uncertainty analysis. Different statistical tools are explored and combined in an uncertainty analysis methodology using the R language. The methodology is developed and tested on a small-scale geometry representative of a PWR core. A total of 22 inputs are considered, spanning neutronics, fuel-thermal and thermal-hydraulics. Three scalar outputs and one functional output are studied. The methodology consists of different steps. First, a screening process based on dependence measures is performed in order to identify an important reduced input subspace. Second, a design of experiments is created by preserving good space-filling properties in both the original and reduced input spaces. This design is used to train kriging surrogate models only on the reduced subspaces. The kriging models are then used for brute-force Monte Carlo (MC) uncertainty propagation and global sensitivity analysis by estimating Shapley indices. Concerning the functional output, Principal Component Analysis (PCA) was used to reduce its dimension. The results show that the methodology manages to identify two subsets of important inputs and estimates the histograms and Shapley indices for both scalar and functional outputs. This will motivate the application of the derived methodology to a full core design for transient analysis purposes.
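    The workflow above chains input screening, a space-filling design, kriging surrogates, brute-force Monte Carlo propagation and Shapley indices; the paper implements it in R around APOLLO3 R and FLICA4. The sketch below only illustrates the surrogate-plus-Monte-Carlo step on a toy function, with scikit-learn's Gaussian process regressor standing in for a kriging model; the function, dimensions and sample sizes are arbitrary.

    ```python
    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import Matern

    rng = np.random.default_rng(1)

    # Toy stand-in for an expensive coupled solver: 3 "important" inputs out of 5.
    def solver(x):
        return np.sin(x[:, 0]) + 2.0 * x[:, 1] ** 2 + 0.5 * x[:, 2]

    # Design of experiments on the input space (here: plain uniform draws).
    X_train = rng.uniform(-1.0, 1.0, size=(80, 5))
    y_train = solver(X_train)

    # Kriging (Gaussian process) surrogate trained on the design.
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True).fit(X_train, y_train)

    # Brute-force Monte Carlo propagation through the cheap surrogate.
    X_mc = rng.uniform(-1.0, 1.0, size=(100_000, 5))
    y_mc = gp.predict(X_mc)
    print("mean =", y_mc.mean(), " 95% quantile =", np.quantile(y_mc, 0.95))
    ```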
  • Invisible floating objects
    • Chesnel Lucas
    • Rihani Mahran
    , 2019. We consider a time-harmonic water-wave problem in a 2D waveguide. The geometry is symmetric with respect to an axis orthogonal to the direction of propagation of the waves. Moreover, the waveguide contains two floating obstacles separated by a distance L. We study the behaviours of R (the reflection coefficient) and T (the transmission coefficient) as L tends to +∞. From this analysis, we exhibit situations of non-reflectivity (R = 0, |T| = 1) or perfect invisibility (R = 0, T = 1). (10.34726/waves2019)
    DOI : 10.34726/waves2019
  • Directed topological complexity
    • Goubault Eric
    • Sagnier Aurélien
    • Färber Michael
    Journal of Applied and Computational Topology, Springer, 2019. It has been observed that the motion planning problem of robotics, a problem of great practical importance, boils down mathematically to finding a section of the path-space fibration, which gives rise to the notion of topological complexity introduced by M. Farber. This notion fits the motion planning problem of robotics when there are no constraints on the actual control that can be applied to the physical apparatus. In many applications, however, a physical apparatus may have constrained controls, leading to constraints on its potential future dynamics. In this paper we adapt the notion of topological complexity to the case of directed topological spaces, which encompass such controlled systems as well as systems appearing in concurrency theory. We study its first properties, carry out calculations for some interesting classes of spaces, and show applications to a form of directed homotopy equivalence.
  • Estimation of the environment distribution of a random walk in random environment
    • Havet Antoine
    , 2019. Introduced in the 1960s, the model of random walk in an i.i.d. environment on the integers (or RWRE) has only recently raised interest in the statistical community. Various works have in particular focused on the estimation of the environment distribution from a single trajectory of the RWRE. This thesis extends the advances made in those works and offers new approaches to the problem. First, we consider the estimation problem from a frequentist point of view. When the RWRE is transient to the right or recurrent, we build the first non-parametric estimator of the density of the environment distribution and obtain an upper bound on the associated risk in infinity norm. Then, we consider the estimation problem from a Bayesian perspective. When the RWRE is transient to the right, we prove the posterior consistency of the Bayesian estimator of the environment distribution. The main difficulty of the thesis was to develop the tools necessary for the proof of Bayesian consistency. For this purpose, we demonstrate a quantitative version of a McDiarmid-type concentration inequality for Markov chains. We also study the return time to 0 of a branching process with immigration in random environment (or BPIRE). We show the existence of a finite exponential moment uniformly valid on a class of BPIRE. The BPIRE being a Markov chain, this result then enables us to make explicit the dependence of the constants in the concentration inequality on the characteristics of the BPIRE.
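    For readers unfamiliar with the model, the toy simulation below generates a single trajectory of a nearest-neighbour random walk in an i.i.d. Beta environment, i.e. the object whose environment distribution the thesis estimates; the Beta(3, 1) choice and all names are illustrative, and none of the thesis's estimators are reproduced here.

    ```python
    import numpy as np

    rng = np.random.default_rng(2)

    def rwre_trajectory(n_steps, alpha=3.0, beta=1.0):
        """One trajectory of a nearest-neighbour random walk in an i.i.d. Beta(alpha, beta)
        environment: env[x] is the probability of stepping from site x to x + 1.
        For Beta(3, 1), E[log((1 - omega) / omega)] < 0, so the walk is transient to the right."""
        env = {}                      # environment drawn lazily, site by site
        x, path = 0, [0]
        for _ in range(n_steps):
            if x not in env:
                env[x] = rng.beta(alpha, beta)
            x += 1 if rng.random() < env[x] else -1
            path.append(x)
        return np.array(path), env

    path, env = rwre_trajectory(10_000)
    print("final position:", path[-1], " distinct sites visited:", len(env))
    ```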
  • R-miss-tastic: a unified platform for missing values methods and workflows
    • Mayer Imke
    • Josse Julie
    • Tierney Nicholas
    • Vialaneix Nathalie
    , 2019. Missing values are unavoidable when working with data. Their occurrence is exacerbated as more data from different sources become available. However, most statistical models and visualization methods require complete data, and improper handling of missing data results in information loss or biased analyses. Since the seminal work of Rubin (1976), there has been a burgeoning literature on missing values with heterogeneous aims and motivations. This has resulted in the development of various methods, formalizations, and tools (including a large number of R packages). However, for practitioners it is challenging to decide which method is most suited to their problem, partially because handling missing data is still not a topic systematically covered in statistics or data science curricula. To help address this challenge, we have launched a unified platform, "R-miss-tastic", which aims to provide an overview of standard missing values problems, methods, how to handle them in analyses, and relevant implementations of methodologies. The objective is not only to collect materials, but also to organize them comprehensively, to create standard analysis workflows, and to unify the community. These overviews are suited for beginners, students, more advanced analysts, and researchers.
  • Extension of AK-MCS for the efficient computation of very small failure probabilities
    • Razaaly Nassim
    • Congedo Pietro Marco
    , 2019.
  • A Stochastic Analysis of a Network with Two Levels of Service
    • Boeuf Vianney
    • Robert Philippe
    Queueing Systems, Springer Verlag, 2019, 92 (3-4), pp.30. In this paper a stochastic model of a call center with a two-level architecture is analyzed. A first-level pool of operators answers calls and identifies and handles non-urgent calls. A call classified as urgent has to be transferred to specialized operators at the second level. When the operators of the second level are all busy, the operator of the first level handling the urgent call is blocked until an operator at the second level is available. Under a scaling assumption, the evolution of the number of urgent calls blocked at level $1$ is investigated. It is shown that if the ratio of the numbers of operators at levels $2$ and $1$ is greater than some threshold, then, essentially, the system operates without congestion: with probability close to $1$, no urgent call is blocked after some finite time. Otherwise, we prove that a positive fraction of the operators of the first level are blocked due to the congestion of the second level. Stochastic calculus with Poisson processes, coupling arguments and formulations in terms of Skorokhod problems are the main mathematical tools used to establish these convergence results. (10.1007/s11134-019-09617-y)
    DOI : 10.1007/s11134-019-09617-y
  • Approximation of stochastic processes by non-expansive flows and coming down from infinity
    • Bansaye Vincent
    The Annals of Applied Probability, Institute of Mathematical Statistics (IMS), 2019, 29 (4). We approximate stochastic processes in finite dimension by dynamical systems. We provide trajectorial estimates which are uniform with respect to the initial condition for a well-chosen distance. This relies on a non-expansivity property of the flow, which allows us to deal with non-Lipschitz vector fields. We use stochastic calculus and follow the martingale techniques initiated in Berestycki et al. [5] to control the fluctuations. Our main applications deal with the short-time behavior of stochastic processes starting from large initial values. We state general properties on the coming down from infinity of one-dimensional SDEs, with a focus on stochastically monotone processes. In particular, we recover and complement known results on Lambda-coalescents and birth and death processes. Moreover, using Poincaré's compactification techniques for dynamical systems close to infinity, we develop this approach in two dimensions for competitive stochastic models. We classify the coming down from infinity of Lotka-Volterra diffusions and provide uniform estimates for the scaling limits of competitive birth and death processes. (10.1214/18-AAP1456)
    DOI : 10.1214/18-AAP1456
  • Model-uncertain value-at-risk, expected shortfall and sharpe ratio, using Stochastic Approximation
    • Crépey Stéphane
    • Fort Gersende
    • Gobet Emmanuel
    • Stazhynski Uladzislau
    , 2019.
  • Taylor expansions of the value function associated with a bilinear optimal control problem
    • Breiten Tobias
    • Kunisch Karl
    • Pfeiffer Laurent
    Annales de l'Institut Henri Poincaré (C), Analyse non linéaire, EMS, 2019, 36 (5), pp.1361-1399. A general bilinear optimal control problem subject to an infinite-dimensional state equation is considered. Polynomial approximations of the associated value function are derived around the steady state by repeated formal differentiation of the Hamilton-Jacobi-Bellman equation. The terms of the approximations are described by multilinear forms, which can be obtained as solutions to generalized Lyapunov equations with recursively defined right-hand sides. They form the basis for defining a sub-optimal feedback law. The approximation properties of this feedback law are investigated. An application to the optimal control of a Fokker-Planck equation is also provided. (10.1016/j.anihpc.2019.01.001)
    DOI : 10.1016/j.anihpc.2019.01.001
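    Schematically, and with purely illustrative notation that does not reproduce the paper's precise setting, the expansion discussed above has the shape below: the value function is approximated around the steady state by a Taylor polynomial whose terms are multilinear forms, and the truncated polynomial is plugged into the pointwise minimization of the Hamilton-Jacobi-Bellman equation to define a sub-optimal feedback (LaTeX sketch, assuming amsmath).

    ```latex
    % V_p: degree-p Taylor polynomial of the value function around the steady state \bar{y};
    % the multilinear forms \mathcal{T}_k solve generalized Lyapunov equations with
    % recursively defined right-hand sides (schematic notation only).
    \[
      V(y) \;\approx\; V_p(y)
      \;=\; \sum_{k=2}^{p} \frac{1}{k!}\,
      \mathcal{T}_k\bigl(\underbrace{y-\bar{y},\,\dots,\,y-\bar{y}}_{k\ \text{arguments}}\bigr),
    \]
    % Sub-optimal feedback obtained by minimizing the Hamiltonian built from V_p,
    % for a generic running cost \ell and dynamics f:
    \[
      u_p(y) \;\in\; \operatorname*{arg\,min}_{u}\,
      \bigl\{\, \ell(y,u) + DV_p(y)\, f(y,u) \,\bigr\}.
    \]
    ```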
  • A birth–death model of ageing: from individual-based dynamics to evolutive differential inclusions
    • Méléard Sylvie
    • Rera Michael
    • Roget Tristan
    Journal of Mathematical Biology, Springer, 2019, 79 (3), pp.901-939. (10.1007/s00285-019-01382-z)
    DOI : 10.1007/s00285-019-01382-z
  • A Formalization of Convex Polyhedra Based on the Simplex Method
    • Allamigeon Xavier
    • Katz Ricardo David
    Journal of Automated Reasoning, Springer Verlag, 2019, 63 (2), pp.323–345. We present a formalization of convex polyhedra in the proof assistant Coq. The cornerstone of our work is a complete implementation of the simplex method, together with the proof of its correctness and termination. This allows us to define the basic predicates over polyhedra in an effective way (i.e. as programs), and relate them with the corresponding usual logical counterparts. To this end, we make an extensive use of the Boolean reflection methodology. The benefit of this approach is that we can easily derive the proof of several fundamental results on polyhedra, such as Farkas’ Lemma, the duality theorem of linear programming, and Minkowski’s Theorem. (10.1007/s10817-018-9477-1)
    DOI : 10.1007/s10817-018-9477-1
  • Systems of Gaussian process models for directed chains of solvers
    • Sanson Francois
    • Le Maitre Olivier
    • Congedo Pietro Marco
    Computer Methods in Applied Mechanics and Engineering, Elsevier, 2019, 352, pp.32-55. The simulation of complex multi-physics phenomena often relies on Systems of Solvers (SoS), which we define here as a set of interdependent solvers where the output of an upstream solver is the input of downstream solvers. Performing Uncertainty Quantification (UQ) analyses in SoS is challenging as they generally feature a large number of uncertain input parameters, so that classical UQ methods, such as spectral expansions or Gaussian process models, are affected by the curse of dimensionality. In this work, we develop an original mathematical framework, based on Gaussian Process (GP) models, to construct a global surrogate model of the uncertain directed SoS (i.e. merely featuring one-way dependences between solvers). The key ideas of the proposed approach are i) to determine a local GP model for each solver constituting the SoS and ii) to define the prediction as the composition of the individual GP models, constituting a system of GP models (SoGP). We further propose different adaptive sampling strategies for the construction of the SoGP. These strategies are based on the decomposition of the SoGP prediction variance into individual contributions of the constitutive GP models and on extensions of the Maximum Mean Square Predictive Error criterion to systems of GP models. The performance of the SoGP framework is then assessed on several SoS involving different numbers of solvers and structures of input dependencies. The results show that the SoGP framework is very flexible and can handle different types of SoS, with a significantly reduced construction cost (measured by the number of training samples) compared to the direct GP model approximation of the SoS. (10.1016/j.cma.2019.04.013)
    DOI : 10.1016/j.cma.2019.04.013
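    As a toy illustration of the composition idea above (not the paper's SoGP framework, and without its adaptive sampling or variance decomposition), the sketch below trains one scikit-learn Gaussian process per solver in a two-solver chain and composes their predictions; the solver definitions and sample sizes are arbitrary.

    ```python
    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF

    rng = np.random.default_rng(3)

    def solver_1(x):          # upstream solver
        return np.sin(3.0 * x)

    def solver_2(y):          # downstream solver, fed by solver_1's output
        return y ** 2 + 0.1 * y

    # One local GP model per solver, each trained only on its own input/output pairs.
    x1 = rng.uniform(-1.0, 1.0, size=(30, 1))
    gp1 = GaussianProcessRegressor(kernel=RBF(), normalize_y=True).fit(x1, solver_1(x1).ravel())

    x2 = rng.uniform(-1.0, 1.0, size=(30, 1))
    gp2 = GaussianProcessRegressor(kernel=RBF(), normalize_y=True).fit(x2, solver_2(x2).ravel())

    # The system-of-GP prediction is the composition of the individual GP predictions.
    def sogp_predict(x):
        z = gp1.predict(np.atleast_2d(x).T).reshape(-1, 1)
        return gp2.predict(z)

    x_test = np.linspace(-1.0, 1.0, 5)
    print(sogp_predict(x_test))
    print(solver_2(solver_1(x_test)))   # reference values from the true chained solvers
    ```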
  • Gaussian Mixture Penalty for Trajectory Optimization Problems
    • Rommel Cédric
    • Bonnans Frédéric
    • Martinon Pierre
    • Gregorutti Baptiste
    Journal of Guidance, Control, and Dynamics, American Institute of Aeronautics and Astronautics, 2019, 42 (8), pp.1857--1862. We consider the task of solving an aircraft trajectory optimization problem where the system dynamics have been estimated from recorded data. Additionally, we want to avoid optimized trajectories that go too far away from the domain occupied by the data, since the model validity is not guaranteed outside this region. This motivates the need for a proximity indicator between a given trajectory and a set of reference trajectories. In this presentation, we propose such an indicator based on a parametric estimator of the training set density. We then introduce it as a penalty term in the optimal control problem. Our approach is illustrated with an aircraft minimal consumption problem and recorded data from real flights. We observe in our numerical results the expected trade-off between the consumption and the penalty term. (10.2514/1.G003996)
    DOI : 10.2514/1.G003996
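    The penalty described above is built from a parametric estimate of the density of the recorded trajectories. The hedged sketch below illustrates that idea with a scikit-learn Gaussian mixture and entirely made-up "flight" data; it is not the authors' implementation, and the variables, weight and component count are arbitrary.

    ```python
    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(4)

    # Stand-in for recorded trajectory points: columns = (altitude, speed), purely synthetic.
    data = rng.normal(loc=[10_000.0, 250.0], scale=[500.0, 10.0], size=(2_000, 2))

    # Parametric density estimator of the training set: a Gaussian mixture model.
    gmm = GaussianMixture(n_components=3, random_state=0).fit(data)

    def penalty(trajectory, weight=1e-3):
        """Penalty term to be added to the optimal-control cost: the average negative
        log-likelihood of the candidate trajectory under the fitted mixture.
        Trajectories far from the data-occupied region receive a large penalty."""
        return -weight * gmm.score_samples(trajectory).mean()

    # A candidate close to the data vs. one far outside the data-occupied region.
    close = rng.normal(loc=[10_050.0, 251.0], scale=[400.0, 8.0], size=(100, 2))
    far = rng.normal(loc=[20_000.0, 400.0], scale=[400.0, 8.0], size=(100, 2))
    print(penalty(close), penalty(far))   # the second value is noticeably larger
    ```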
  • f-SAEM: A fast Stochastic Approximation of the EM algorithm for nonlinear mixed effects models
    • Karimi Belhal
    • Lavielle Marc
    • Moulines Éric
    Computational Statistics and Data Analysis, Elsevier, 2019. The ability to generate samples of the random effects from their conditional distributions is fundamental for inference in mixed effects models. Random walk Metropolis is widely used to perform such sampling, but this method is known to converge slowly for high dimensional problems, or when the joint structure of the distributions to sample is spatially heterogeneous. We propose an independent Metropolis-Hastings (MH) algorithm based on a multidimensional Gaussian proposal that takes into account the joint conditional distribution of the random effects and does not require any tuning. Indeed, this distribution is automatically obtained thanks to a Laplace approximation of the incomplete data model. We show that such approximation is equivalent to linearizing the structural model in the case of continuous data. Numerical experiments based on simulated and real data illustrate the performance of the proposed methods. In particular, we show that the suggested MH algorithm can be efficiently combined with a stochastic approximation version of the EM algorithm for maximum likelihood estimation in nonlinear mixed effects models. (10.1016/j.csda.2019.07.001)
    DOI : 10.1016/j.csda.2019.07.001
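    To make the sampling step concrete, here is a generic independent Metropolis-Hastings iteration with a fixed Gaussian proposal, the kind of proposal a Laplace approximation would supply; the toy target, the proposal moments and all names are illustrative, and this is not the f-SAEM code.

    ```python
    import numpy as np

    rng = np.random.default_rng(5)

    # Toy "conditional distribution of the random effects": a correlated bivariate Gaussian.
    target_prec = np.linalg.inv(np.array([[1.0, 0.6], [0.6, 1.0]]))

    def log_target(x):
        return -0.5 * x @ target_prec @ x       # unnormalized log-density

    # Fixed Gaussian proposal (mean and covariance as a Laplace approximation would provide).
    mu, cov = np.zeros(2), np.eye(2)
    cov_inv, chol = np.linalg.inv(cov), np.linalg.cholesky(cov)

    def log_proposal(x):
        d = x - mu
        return -0.5 * d @ cov_inv @ d

    def independent_mh(n_iter=20_000):
        x, samples = mu.copy(), []
        for _ in range(n_iter):
            y = mu + chol @ rng.standard_normal(2)   # candidate drawn independently of the state
            log_alpha = (log_target(y) - log_target(x)) - (log_proposal(y) - log_proposal(x))
            if np.log(rng.random()) < log_alpha:
                x = y
            samples.append(x)
        return np.array(samples)

    samples = independent_mh()
    print(samples.mean(axis=0))                     # should be close to the target mean (0, 0)
    ```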
  • High-frequency trading : statistical analysis, modelling and regulation
    • Saliba Pamela
    , 2019. This thesis is made of two related parts. In the first one, we study the empirical behaviour of high-frequency traders on European financial markets. We use the obtained results to build, in the second part, new agent-based models for market dynamics. The main purpose of these models is to provide innovative tools for regulators and exchanges, allowing them to design suitable rules at the microstructure level and to assess the impact of the various participants on market quality. In the first part, we conduct two empirical studies on unique data sets provided by the French regulator. They cover the trades and orders of the CAC 40 securities, with microsecond accuracy and labelled with the market participants' identities. We begin by investigating the behaviour of high-frequency traders compared to the rest of the market, notably during periods of stress, in terms of liquidity provision and trading activity. We work both at the day-to-day scale and at the intra-day level. We then deepen our analysis by focusing on liquidity-consuming orders. We give some evidence concerning their impact on the price formation process and their information content according to the different order flow categories: high-frequency traders, agency participants and proprietary participants. In the second part, we propose three different agent-based models. Using a Glosten-Milgrom type approach, the first model enables us to deduce the whole limit order book (bid-ask spread and volume available at each price) from the interactions between three kinds of agents: an informed trader, a noise trader and several market makers. It also allows us to build a spread forecasting methodology in case of a tick size change and to quantify the queue priority value. To work at the individual agent level, we propose a second approach where market participants' specific dynamics are modelled by non-linear and state-dependent Hawkes-type processes. In this setting, we are able to compute several relevant microstructural indicators in terms of the individual flows. It is notably possible to rank market makers according to their own contribution to volatility. Finally, we introduce a model where market makers optimise their best bid and ask according to the profit they can generate from them and the inventory risk they face. We then establish, theoretically and empirically, a new important relationship between inventory and volatility.
  • Weighted Radon transforms and their applications
    • Goncharov Fedor
    , 2019. This thesis is devoted to studies of inverse problems for weighted Radon transforms in Euclidean spaces. On one hand, our studies are motivated by applications of weighted Radon transforms in different tomographies, for example in emission tomographies (PET, SPECT), fluorescence tomography and optical tomography. In particular, we develop a new reconstruction approach for tomographies in 3D, where data are modelled by weighted ray transforms along rays parallel to some fixed plane. In this connection our results include: formulas for the reduction of the aforementioned weighted ray transforms to weighted Radon transforms along planes in 3D; an analog of Chang's approximate inversion formula and an analog of a Kunyansky-type iterative inversion algorithm for weighted Radon transforms in multidimensions; numerical reconstructions from simulated and real data. On the other hand, our studies are motivated by mathematical problems related to the aforementioned transforms. More precisely, we continue studies of injectivity and non-injectivity of weighted ray and Radon transforms in multidimensions and we construct a series of counterexamples to injectivity for the latter. These counterexamples are interesting and in some sense unexpected because they are close to the setting in which the corresponding weighted ray and Radon transforms become injective. In particular, with one of our constructions we give counterexamples to well-known injectivity theorems for weighted ray transforms (Quinto (1983), Markoe, Quinto (1985), Finch (1986), Ilmavirta (2016)) when the regularity assumptions on the weights are slightly relaxed. This result shows, in particular, that the regularity assumptions on the weights are crucial for injectivity, and that injectivity breaks down if the assumptions are slightly relaxed.
  • Model-based clustering with missing not at random data. Missing mechanism
    • Laporte Fabien
    • Biernacki Christophe
    • Celeux Gilles
    • Josse Julie
    , 2019. Since the 90s, model-based clustering has been widely used to classify data. Nowadays, with the increase of available data, missing values are more frequent. We defend the need to embed the missingness mechanism directly within the clustering modeling step. There exist three types of missing data: missing completely at random (MCAR), missing at random (MAR) and missing not at random (MNAR). In all situations, logistic regression is proposed as a natural and flexible candidate model. In this unified context, standard model selection criteria can be used to select between such different missing data mechanisms, simultaneously with the number of clusters. The practical interest of our proposal is illustrated on data derived from medical studies suffering from many missing data.
  • Rates of Convergence of Perturbed FISTA-based algorithms
    • Aujol Jean-François
    • Dossal Charles
    • Fort Gersende
    • Moulines Éric
    , 2019.
  • Benchmarking Algorithms from the platypus Framework on the Biobjective bbob-biobj Testbed
    • Brockhoff Dimo
    • Tušar Tea
    , 2019, 7. One of the main goals of the COCO platform is to produce, collect, and make available benchmarking performance data sets of optimization algorithms and, more concretely, algorithm implementations. For the recently proposed biobjective bbob-biobj test suite, fewer than 20 algorithms have been benchmarked so far, but many more are available to the public. We therefore aim in this paper to benchmark several available multiobjective optimization algorithms on the bbob-biobj test suite and discuss their performance. We focus here on algorithms implemented in the platypus framework (in Python), whose main advantage is its ease of use without the need to set up many algorithm parameters. (10.1145/3319619.3326896)
    DOI : 10.1145/3319619.3326896
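    For readers who want to reproduce this kind of experiment, the canonical platypus usage pattern is sketched below on the built-in DTLZ2 problem; connecting the algorithms to the bbob-biobj suite through the COCO platform, as done in the paper, is not shown.

    ```python
    # Minimal platypus example (standard usage from the platypus documentation);
    # it does not interface with the COCO/bbob-biobj machinery used in the paper.
    from platypus import NSGAII, DTLZ2

    problem = DTLZ2()                 # built-in bi-objective test problem
    algorithm = NSGAII(problem)       # default parameters, in line with the abstract's emphasis
    algorithm.run(10_000)             # budget of 10,000 function evaluations

    for solution in algorithm.result[:5]:
        print(solution.objectives)    # a few non-dominated objective vectors
    ```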
  • A Global Surrogate Assisted CMA-ES
    • Hansen Nikolaus
    , 2019, pp.664-672. (10.1145/3321707.3321842)
    DOI : 10.1145/3321707.3321842
  • Benchmarking MO-CMA-ES and COMO-CMA-ES on the Bi-objective bbob-biobj Testbed
    • Dufossé Paul
    • Touré Cheikh
    , 2019. In this paper, we propose a comparative benchmark of MO-CMA-ES, COMO-CMA-ES (recently introduced in [12]) and NSGA-II, using the COCO framework for performance assessment and the bi-objective test suite bbob-biobj. For a fixed number of points p, COMO-CMA-ES approximates an optimal p-distribution of the Hypervolume Indicator. While not designed to perform on archive-based assessment, i.e. with respect to all points evaluated so far by the algorithm, COMO-CMA-ES behaves well on the COCO platform. The experiments are done in a true black-box spirit by using a minimal setting relative to the information shared by the 55 problems of the bbob-biobj testbed. (10.1145/3319619.3326892)
    DOI : 10.1145/3319619.3326892
  • The Impact of Sample Volume in Random Search on the bbob Test Suite
    • Brockhoff Dimo
    • Hansen Nikolaus
    , 2019. Uniform Random Search is considered the simplest of all randomized search strategies and thus a natural baseline in benchmarking. Yet, in continuous domain it has its search domain width as a parameter that potentially has a strong effect on its performance. In this paper, we investigate this effect on the well-known 24 functions from the bbob test suite by varying the sample domain of the algorithm ([−α, α]^n for α ∈ {0.5, 1, 2, 3, 4, 5, 6, 10, 20} and n the search space dimension). Though the optima of the bbob testbed are randomly chosen in [−4, 4]^n (with the exception of the linear function f5), the best strategy depends on the search space dimension and the chosen budget. Small budgets and larger dimensions favor smaller domain widths. (10.1145/3319619.3326894)
    DOI : 10.1145/3319619.3326894
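    A minimal stand-alone sketch of the setup above, with a toy sphere function replacing the bbob suite and an optimum drawn uniformly in [−4, 4]^n, is given below; the budget, dimension and set of widths are arbitrary choices for illustration.

    ```python
    import numpy as np

    rng = np.random.default_rng(6)

    def random_search(f, dim, alpha, budget):
        """Uniform random search in [-alpha, alpha]^dim; returns the best f-value found."""
        X = rng.uniform(-alpha, alpha, size=(budget, dim))
        return min(f(x) for x in X)

    # Stand-in for a bbob function: a sphere whose optimum is drawn uniformly in [-4, 4]^dim,
    # mimicking how bbob optima are placed (except for the linear function f5).
    dim = 10
    x_opt = rng.uniform(-4.0, 4.0, size=dim)

    def sphere(x):
        return float(np.sum((x - x_opt) ** 2))

    for alpha in (0.5, 1, 2, 3, 4, 5, 6, 10, 20):
        best = random_search(sphere, dim, alpha, budget=1_000)
        print(f"alpha = {alpha:>4}: best f-value = {best:8.2f}")
    ```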
  • Benchmarking Multivariate Solvers of SciPy on the Noiseless Testbed
    • Varelas Konstantinos
    • Dahito Marie-Ange
    , 2019. In this article we benchmark eight multivariate local solvers as well as the global Differential Evolution algorithm from the Python SciPy library on the BBOB noiseless testbed. We experiment with different parameter settings and termination conditions of the solvers. More focus is given to the L-BFGS-B and Nelder-Mead algorithms. For the first we investigate the effect of the maximum number of variable metric corrections used for the Hessian approximation and show that larger values than the default are of advantage. For the second we investigate the effect of adaptation of parameters, which is proved crucial for the performance of the method with increasing dimensionality. (10.1145/3319619.3326891)
    DOI : 10.1145/3319619.3326891
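    The two settings highlighted above correspond directly to SciPy options: 'maxcor' controls the number of variable metric corrections kept by L-BFGS-B, and 'adaptive' switches Nelder-Mead to dimension-dependent parameters. The sketch below shows these calls on a toy Rosenbrock problem; the chosen values and budgets are illustrative and not the paper's settings.

    ```python
    import numpy as np
    from scipy.optimize import minimize, rosen

    x0 = np.full(20, 3.0)             # 20-dimensional starting point

    # L-BFGS-B: 'maxcor' is the number of variable metric corrections kept for the
    # limited-memory Hessian approximation (default 10); the paper finds larger values helpful.
    res_lbfgsb = minimize(rosen, x0, method="L-BFGS-B",
                          options={"maxcor": 50, "maxiter": 10_000})

    # Nelder-Mead: 'adaptive=True' scales the reflection/expansion/contraction coefficients
    # with the dimension, which matters as the dimensionality grows.
    res_nm = minimize(rosen, x0, method="Nelder-Mead",
                      options={"adaptive": True, "maxfev": 100_000})

    print("L-BFGS-B   :", res_lbfgsb.fun)
    print("Nelder-Mead:", res_nm.fun)
    ```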