Publications

CMAP theses are available by following this link:
Discover CMAP theses

Listed below, sorted by year, are the publications appearing in the HAL open archive.

2019

  • Statistical estimation in a randomly structured branching population
    • Hoffmann Marc
    • Marguet Aline
    Stochastic Processes and their Applications, Elsevier, 2019, 129 (12), pp.5236-5277. We consider a binary branching process structured by a stochastic trait that evolves according to a diffusion process that triggers the branching events, in the spirit of Kimmel's model of cell division with parasite infection. Based on the observation of the trait at birth of the first n generations of the process, we construct a nonparametric estimator of the transition of the associated bifurcating chain and study the parametric estimation of the branching rate. In the limit $n → ∞$, we obtain asymptotic efficiency in the parametric case and minimax optimality in the nonparametric case. (10.1016/j.spa.2019.02.015)
    DOI : 10.1016/j.spa.2019.02.015
  • New preconditioners for Laplace and Helmholtz integral equations on open curves
    • Alouges François
    • Averseng Martin
    2019. The numerical resolution of wave scattering problems by open curves leads to ill-conditioned linear systems which are difficult to precondition due to the geometrical singularities at the edges. We introduce two new preconditioners to tackle this problem, for Dirichlet and Neumann boundary data respectively, which take the form of square roots of local operators. We describe an adapted analytical setting to analyze them and demonstrate the efficiency of this method on several numerical examples. A complete new pseudo-differential calculus suited to the study of such operators is postponed to the second part of this work.
  • C-mix: a high dimensional mixture model for censored durations, with applications to genetic data
    • Bussy Simon
    • Guilloux Agathe
    • Gaïffas Stéphane
    • Jannot Anne-Sophie
    Statistical Methods in Medical Research, SAGE Publications, 2019, 28 (5), pp.1523-1539. We introduce a supervised learning mixture model for censored durations (C-mix) to simultaneously detect subgroups of patients with different prognoses and order them based on their risk. Our method is applicable in a high-dimensional setting, i.e. with a large number of biomedical covariates. Indeed, we penalize the negative log-likelihood by the Elastic-Net, which leads to a sparse parameterization of the model and automatically pinpoints the relevant covariates for the survival prediction. Inference is achieved using an efficient Quasi-Newton Expectation Maximization (QNEM) algorithm, for which we provide convergence properties. The statistical performance of the method is examined on an extensive Monte Carlo simulation study, and finally illustrated on three publicly available genetic cancer datasets with high-dimensional covariates. We show that our approach outperforms the state-of-the-art survival models in this context, namely both the CURE and Cox proportional hazards models penalized by the Elastic-Net, in terms of C-index, AUC(t) and survival prediction. Thus, we propose a powerful tool for personalized medicine in cancerology. (10.1177/0962280218766389)
    DOI : 10.1177/0962280218766389
  • A stochastic data-based traffic model applied to vehicles energy consumption estimation
    • Le Rhun Arthur
    • Bonnans Frédéric
    • de Nunzio Giovanni
    • Leroy Thomas
    • Martinon Pierre
    IEEE Transactions on Intelligent Transportation Systems, IEEE, 2019. A new approach to estimate traffic energy consumption via traffic data aggregation in (speed,acceleration) probability distributions is proposed. The aggregation is done on each segment composing the road network. In order to reduce data occupancy, clustering techniques are used to obtain meaningful classes of traffic conditions. Different times of the day with similar speed patterns and traffic behavior are thus grouped together in a single cluster. Different energy consumption models based on the aggregated data are proposed to estimate the energy consumption of the vehicles in the road network. For validation purposes, a microscopic traffic simulator is used to generate the data and compare the estimated energy consumption to the reference one. A thorough sensitivity analysis with respect to the parameters of the proposed method (i.e. number of clusters, size of the distributions support, etc.) is also conducted in simulation. Finally, a real-life scenario using floating car data is analyzed to evaluate the applicability and the robustness of the proposed method. (10.1109/TITS.2019.2923292)
    DOI : 10.1109/TITS.2019.2923292
  • Imputation of mixed data with multilevel singular value decomposition
    • Husson François
    • Josse Julie
    • Narasimhan Balasubramanian
    • Robin Geneviève
    Journal of Computational and Graphical Statistics, Taylor & Francis, 2019, 28 (3), pp.552-566. Statistical analysis of large data sets offers new opportunities to better understand many processes. Yet, data accumulation often implies relaxing acquisition procedures or compounding diverse sources. As a consequence, such data sets often contain mixed data, i.e. both quantitative and qualitative, and many missing values. Furthermore, aggregated data present a natural multilevel structure, where individuals or samples are nested within different sites, such as countries or hospitals. Imputation of multilevel data has therefore drawn some attention recently, but current solutions are not designed to handle mixed data, and suffer from important drawbacks such as their computational cost. In this article, we propose a single imputation method for multilevel data, which can be used to complete either quantitative, categorical or mixed data. The method is based on multilevel singular value decomposition (SVD), which consists in decomposing the variability of the data into two components, the between and within groups variability, and performing SVD on both parts. We show on a simulation study that in comparison to competitors, the method has the great advantages of handling data sets of various sizes, and being computationally faster. Furthermore, it is the first so far to handle mixed data. We apply the method to impute a medical data set resulting from the aggregation of several data sets coming from different hospitals. This application falls in the framework of a larger project on Trauma patients. To overcome obstacles associated with the aggregation of medical data, we turn to distributed computation. The method is implemented in an R package. (10.1080/10618600.2019.1585261)
    DOI : 10.1080/10618600.2019.1585261
  • Existence of strong solutions to the Dirichlet problem for the Griffith energy
    • Chambolle Antonin
    • Crismale Vito
    Calculus of Variations and Partial Differential Equations, Springer Verlag, 2019, 58 (136). In this paper we continue the study of the Griffith brittle fracture energy minimisation under Dirichlet boundary conditions, suggested by Francfort and Marigo in 1998. In a recent paper, we proved the existence of weak minimisers of the problem. Now we show that these minimisers are indeed strong solutions, namely their jump set is closed and they are smooth away from the jump set and continuous up to the Dirichlet boundary. This is obtained by extending up to the boundary the recent regularity results of Conti, Focardi, Iurlano, and of Chambolle, Conti, Iurlano. (10.1007/s00526-019-1571-7)
    DOI : 10.1007/s00526-019-1571-7
  • Incomplete graphical model inference via latent tree aggregation
    • Robin Geneviève
    • Ambroise Christophe
    • Robin Stephane S.
    Statistical Modelling, SAGE Publications, 2019, 19 (5), pp.545-568. Graphical network inference is used in many fields such as genomics or ecology to infer the conditional independence structure between variables, from measurements of gene expression or species abundances for instance. In many practical cases, not all variables involved in the network have been observed, and the samples are actually drawn from a distribution where some variables have been marginalized out. This challenges the sparsity assumption commonly made in graphical model inference, since marginalization yields locally dense structures, even when the original network is sparse. We present a procedure for inferring Gaussian graphical models when some variables are unobserved, that accounts both for the influence of missing variables and the low density of the original network. Our model is based on the aggregation of spanning trees, and the estimation procedure on the Expectation-Maximization algorithm. We treat the graph structure and the unobserved nodes as missing variables and compute posterior probabilities of edge appearance. To provide a complete methodology, we also propose several model selection criteria to estimate the number of missing nodes. A simulation study and an illustration on flow cytometry data reveal that our method has favorable edge detection properties compared to existing graph inference techniques. The methods are implemented in an R package. (10.1177/1471082X18786289)
    DOI : 10.1177/1471082X18786289
  • A Scaling Analysis of a Star Network with Logarithmic Weights
    • Robert Philippe
    • Véber Amandine
    Stochastic Processes and their Applications, Elsevier, 2019. The paper investigates the properties of a class of resource allocation algorithms for communication networks: if a node of this network has L requests to transmit and is idle, it tries to access the channel at a rate proportional to log(1+L). A stochastic model of such an algorithm is investigated in the case of the star network, in which J nodes can transmit simultaneously, but interfere with a central node 0 in such a way that node 0 cannot transmit while one of the other nodes does. One studies the impact of the log policy on these J+1 interacting communication nodes. A fluid scaling analysis of the network is derived with the scaling parameter N being the norm of the initial state. It is shown that the asymptotic fluid behavior of the system is a consequence of the evolution of the state of the network on a specific time scale $(N t , t∈(0, 1))$. The main result is that, on this time scale and under appropriate conditions, the state of a node with index $j≥1$ is of the order of $N^{a_j(t)}$ , with $0≤a_j(t)<1$, where $t →a_j(t)$ is a piecewise linear function. Convergence results on the fluid time scale and a stability property are derived as a consequence of this study. (10.1016/j.spa.2018.06.002)
    DOI : 10.1016/j.spa.2018.06.002
  • Mean field model for collective motion bistability
    • Garnier Josselin
    • Papanicolaou George
    • Yang Tzu-Wei
    Discrete and Continuous Dynamical Systems - Series B, American Institute of Mathematical Sciences, 2019, 24 (2), pp.851-879. (10.3934/dcdsb.2018210)
    DOI : 10.3934/dcdsb.2018210
  • On the Essential Self-Adjointness of Singular Sub-Laplacians
    • Franceschi Valentina
    • Prandi Dario
    • Rizzi Luca
    Potential Analysis, Springer Verlag, 2019, 53, pp.89-112. (10.1007/s11118-018-09760-w)
    DOI : 10.1007/s11118-018-09760-w
  • The asymptotic geometry of the Teichmüller metric
    • Walsh Cormac
    Geometriae Dedicata, Springer Verlag, 2019, 200 (1), pp.115-152. We determine the asymptotic behaviour of extremal length along arbitrary Teichmüller rays. This allows us to calculate the endpoint in the Gardiner-Masur boundary of any Teichmüller ray. We give a proof that this compactification is the same as the horofunction compactification. An important subset of the latter is the set of Busemann points. We show that the Busemann points are exactly the limits of the Teichmüller rays, and we give a necessary and sufficient condition for a sequence of Busemann points to converge to a Busemann point. Finally, we determine the detour metric on the boundary. (10.1007/s10711-018-0364-z)
    DOI : 10.1007/s10711-018-0364-z
  • The operator approach to entropy games
    • Akian Marianne
    • Gaubert Stéphane
    • Grand-Clément Julien
    • Guillaud Jérémie
    Theory of Computing Systems, Springer Verlag, 2019, 63, pp.1089-1130. Entropy games and matrix multiplication games have been recently introduced by Asarin et al. They model the situation in which one player (Despot) wishes to minimize the growth rate of a matrix product, whereas the other player (Tribune) wishes to maximize it. We develop an operator approach to entropy games. This allows us to show that entropy games can be cast as stochastic mean payoff games in which some action spaces are simplices and payments are given by a relative entropy (Kullback-Leibler divergence). In this way, we show that entropy games with a fixed number of states belonging to Despot can be solved in polynomial time. This approach also allows us to solve these games by a policy iteration algorithm, which we compare with the spectral simplex algorithm developed by Protasov. (10.1007/s00224-019-09925-z)
    DOI : 10.1007/s00224-019-09925-z
  • Quantifying uncertainties in signal position in non-resolved object images: application to space object observation
    • Sanson Francois
    • Frueh Carolin
    Advances in Space Research, Elsevier, 2019. Charge-Coupled Devices (CCDs) and subsequently Complementary metal-oxide-semiconductor (CMOS) detectors revolutionized scientific imaging. On both the CCD and CMOS detector, the generated images are degraded by inevitable noise. In many applications, such as in astronomy or for satellite tracking, only unresolved object images are available. Strategies exist to estimate the center of the non-resolved image, but their results are affected by the detector noise. The uncertainty in the center is classically estimated by running prohibitively costly Monte Carlo simulations, but in this paper, we propose analytic uncertainty estimates of the center position. The expressions that depend on the pixel size, the signal to noise ratio and the extension of the object signal relative to the pixel size are validated against rigorous Monte Carlo simulations with very satisfying results. Numerical tests show that our analytic expression is an efficient substitute for the Monte Carlo simulation, thereby reducing computational cost. (10.1016/j.asr.2018.12.040)
    DOI : 10.1016/j.asr.2018.12.040
  • Optimal inventory management and order book modeling
    • Baradel Nicolas
    • Bouchard Bruno
    • Evangelista David
    • Mounjid Othmane
    ESAIM: Proceedings and Surveys, EDP Sciences, 2019, 65, pp.145-181. We model the behavior of three agent classes acting dynamically in a limit order book of a financial asset. Namely, we consider market makers (MM), high-frequency trading (HFT) firms, and institutional brokers (IB). Given a prior dynamic of the order book, similar to the one considered in the Queue-Reactive models [14, 20, 21], the MM and the HFT define their trading strategy by optimizing the expected utility of terminal wealth, while the IB has a prescheduled task to sell or buy many shares of the considered asset. We derive the variational partial differential equations that characterize the value functions of the MM and HFT and explain how almost optimal control can be deduced from them. We then provide a first illustration of the interactions that can take place between these different market participants by simulating the dynamic of an order book in which each of them plays his own (optimal) strategy.
  • A degenerate Cahn‐Hilliard model as constrained Wasserstein gradient flow
    • Matthes Daniel
    • Cancès Clément
    • Nabet Flore
    2019, 19 (1). (10.1002/pamm.201900158)
    DOI : 10.1002/pamm.201900158
  • Self-Exclusion among Online Poker Gamblers: Effects on Expenditure in Time and Money as Compared to Matched Controls
    • Luquiens Amandine
    • Dugravot Aline
    • Panjo Henri
    • Benyamina Amine
    • Gaïffas Stéphane
    • Bacry Emmanuel
    International Journal of Environmental Research and Public Health, MDPI, 2019, 16 (22), pp.4399. Background: No comparative data is available to report on the effect of online self-exclusion. The aim of this study was to assess the effect of self-exclusion in online poker gambling as compared to matched controls, after the end of the self-exclusion period. Methods: We included all gamblers who were first-time self-excluders over a 7-year period (n = 4887) on a poker website, and gamblers matched for gender, age and account duration (n = 4451). We report the effects over time of self-exclusion after it ended, on money (net losses) and time spent (session duration) using an analysis of variance procedure between mixed models with and without the interaction of time and self-exclusion. Analyses were performed on the whole sample, on the sub-groups that were the most heavily involved in terms of time or money (higher quartiles) and among short-duration self-excluders (<3 months). Results: Significant effects of self-exclusion and short-duration self-exclusion were found for money and time spent over 12 months. Among the gamblers that were the most heavily involved financially, no significant effect on the amount spent was found. Among the gamblers who were the most heavily involved in terms of time, a significant effect was found on time spent. Short-duration self-exclusions showed no significant effect on the most heavily involved gamblers. Conclusions: Self-exclusion seems efficient in the long term. However, the effect on money spent of self-exclusions and of short-duration self-exclusions should be further explored among the most heavily involved gamblers. (10.3390/ijerph16224399)
    DOI : 10.3390/ijerph16224399
  • Approximation of functions with small jump sets and existence of strong minimizers of Griffith's energy
    • Chambolle Antonin
    • Conti Sergio
    • Iurlano Flaviana
    Journal de Mathématiques Pures et Appliquées, Elsevier, 2019, 128 (9), pp.119-139. We prove that special functions of bounded deformation with small jump set are close in energy to functions which are smooth in a slightly smaller domain. This allows us to generalize the decay estimate by De Giorgi, Carriero, and Leaci to the linearized context in dimension n and to establish the closedness of the jump set for local minimizers of the Griffith energy. (10.1016/j.matpur.2019.02.001)
    DOI : 10.1016/j.matpur.2019.02.001
  • A breakdown of injectivity for weighted ray transforms in multidimensions
    • Goncharov Fedor O
    • Novikov Roman G
    Arkiv för Matematik, Royal Swedish Academy of Sciences, Institut Mittag-Leffler, 2019, 57, pp.333-371. We consider weighted ray-transforms $P_W$ (weighted Radon transforms along straight lines) in $\mathbb{R}^d, \, d\geq 2,$ with strictly positive weights $W$. We construct an example of such a transform with non-trivial kernel in the space of infinitely smooth compactly supported functions on $\mathbb{R}^d$. In addition, the constructed weight $W$ is rotation-invariant, continuous, and infinitely smooth almost everywhere on $\mathbb{R}^d \times \mathbb{S}^{d-1}$. In particular, by this construction we give counterexamples to some well-known injectivity results for weighted ray transforms for the case when the regularity of $W$ is slightly relaxed. We also give examples of continuous strictly positive $W$ such that $\dim \ker P_W \geq n$ in the space of infinitely smooth compactly supported functions on $\mathbb{R}^d$ for arbitrary $n\in \mathbb{N}\cup \{\infty\}$, where $W$ are infinitely smooth for $d=2$ and infinitely smooth almost everywhere for $d\geq 3$. (10.4310/ARKIV.2019.v57.n2.a5)
    DOI : 10.4310/ARKIV.2019.v57.n2.a5
  • A microscopic view on the Fourier law
    • Bodineau Thierry
    • Gallagher Isabelle
    • Saint-Raymond Laure
    Comptes Rendus. Physique, Académie des sciences (Paris), 2019. The Fourier law of heat conduction describes heat diffusion in macroscopic systems. This physical law has been experimentally tested for a large class of physical systems. A natural question is to know whether it can be derived from the microscopic models using the fundamental laws of mechanics.
  • ConvSCCS: convolutional self-controlled case series model for lagged adverse event detection
    • Morel Maryan
    • Bacry Emmanuel
    • Gaïffas Stéphane
    • Guilloux Agathe
    • Leroy Fanny
    Biostatistics, Oxford University Press (OUP), 2019. With the increased availability of large electronic health records databases comes the chance of enhancing health risks screening. Most post-marketing detection of adverse drug reaction (ADR) relies on physicians' spontaneous reports, leading to under-reporting. To take up this challenge, we develop a scalable model to estimate the effect of multiple longitudinal features (drug exposures) on a rare longitudinal outcome. Our procedure is based on a conditional Poisson regression model also known as self-controlled case series (SCCS). To overcome the need for precise specification of risk periods, we model the intensity of outcomes using a convolution between exposures and step functions, which are penalized using a combination of group-Lasso and total-variation. To our knowledge, this is the first SCCS model with flexible intensity able to handle multiple longitudinal features in a single model. We show that this approach improves the state-of-the-art in terms of mean absolute error and computation time for the estimation of relative risks on simulated data. We apply this method to an ADR detection problem, using a cohort of diabetic patients extracted from the large French national health insurance database (SNIIRAM), a claims database containing medical reimbursements of more than 53 million people. This work has been done in the context of a research partnership between Ecole Polytechnique and CNAMTS (in charge of SNIIRAM). (10.1093/biostatistics/kxz003)
    DOI : 10.1093/biostatistics/kxz003
  • Option pricing under fast-varying long-memory stochastic volatility
    • Garnier Josselin
    • Solna Knut
    Mathematical Finance, Wiley, 2019, 29 (1), pp.39-83. (10.1111/mafi.12186)
    DOI : 10.1111/mafi.12186
  • Kinetic model of adsorption on crystal surfaces
    • Aoki Kazuo
    • Giovangigli Vincent
    Physical Review E, American Physical Society (APS), 2019, 99. A kinetic theory model describing physisorption and chemisorption of gas particles on a crystal surface is introduced. A single kinetic equation is used to model gas and physisorbed particles interacting with a crystal potential and colliding with phonons. The phonons are assumed to be at equilibrium and the physisorbate-gas equation is coupled to similar kinetic equations describing chemisorbed particles and crystal atoms on the surface. A kinetic entropy is introduced for the coupled system and the H theorem is established. Using the Chapman-Enskog method with a fluid scaling, the asymptotic structure of the adsorbate is investigated and fluid boundary conditions are derived from the kinetic model. (10.1103/PhysRevE.99.052137)
    DOI : 10.1103/PhysRevE.99.052137
  • Scaling limits of population and evolution processes in random environment
    • Bansaye Vincent
    • Caballero Maria-Emilia
    • Méléard Sylvie
    Electronic Journal of Probability, Institute of Mathematical Statistics (IMS), 2019, 95 (5), pp.749-784. Our motivation comes from the large population approximation of individual based models in population dynamics and population genetics. We propose a general method to investigate scaling limits of finite dimensional population size Markov chains to diffusion with jumps. The statements of tightness, identification and convergence in law are based on the convergence of suitable characteristics of the transition of the chain and strongly exploit the structure of the population processes defined recursively as sums of independent random variables. These results allow one to reduce the convergence of characteristics of semimartingales to analytically tractable functional spaces. We develop two main applications. First, we extend the classical Wright-Fisher diffusion approximation to independent and identically distributed random environment. Second, we obtain the convergence in law of generalized Galton-Watson processes with interactions and random environment to the solution of stochastic differential equations with jumps. (10.1214/19-EJP262)
    DOI : 10.1214/19-EJP262
  • New preconditioners for Laplace and Helmholtz integral equations on open curves
    • Averseng Martin
    2019. This paper is the second part of a work on Laplace and Helmholtz integral equations in 2 space dimensions on open curves. A new Galerkin method in weighted $L^2$ spaces together with new preconditioners for the weighted layer potentials are studied. This second part provides the theoretical analysis needed to establish the results announced in the first part. The main novelty is the introduction of a pseudo-differential calculus on open curves that allows one to build parametrices for the weighted layer potentials. Contrary to more classical approaches where the Mellin transform is used, this new approach is well-suited to the specific singularities that appear in the problem.
  • Curvature: a variational approach
    • Agrachev Andrei
    • Barilari Davide
    • Rizzi Luca
    Memoirs of the American Mathematical Society, American Mathematical Society, 2019, 256 (1225). The curvature discussed in this paper is a far-reaching generalization of the Riemann sectional curvature. We define it for a wide class of optimal control problems: a unified framework including geometric structures such as Riemannian, sub-Riemannian, Finsler and sub-Finsler structures; special attention is paid to the sub-Riemannian (or Carnot-Carathéodory) metric spaces. Our construction of the curvature is direct and naive, and it is similar to the original approach of Riemann. Surprisingly, it works in a very general setting and, in particular, for all sub-Riemannian spaces. (10.1090/memo/1225)
    DOI : 10.1090/memo/1225