Theses

An Iterative Regularized Method for Segmentation with Applications to Statistics

Abstract : This thesis develops regularized methods based on penalized maximum likelihood estimation. More specifically, I use a sparsity-inducing iterative method called the adaptive ridge. It is competitive with other approaches, notably in terms of ease of implementation and computational cost. My work applies this method to a wide range of problems: survival analysis, spline regression, and spatial segmentation. These applications show that the adaptive ridge's good selection performance, ease of implementation, and low computational cost make it a good starting point for penalization-based variable selection. In survival analysis, data are often collected by following a cohort, in which case the events are widely spread over time and the sample is suspected to be heterogeneous. I first develop a method for inferring the incidence that can detect heterogeneity with respect to the date of birth (or cohort). A closely related problem is the study of the evolution of the incidence as a joint function of age, date of birth (cohort), and calendar time (period). Epidemiologists have long resorted to the age-period-cohort model or its submodels. These models assume linear effects of each variable, which is deemed too simplistic to capture potentially important features of the incidence. In this framework, I develop a model that jointly estimates the effects of two variables and their interaction. Spline regression is known to be a competitive method for non-parametric regression. However, the estimated spline depends strongly on the initial choice of knots, and choosing the best knots is a computationally hard problem. I propose an approach that estimates the best knots jointly with the spline function. By starting from a large number of knots and successively removing the least relevant ones, my method makes a slightly restrictive assumption that removes much of the computational burden. In spatial statistics, the spatial domain is often divided into "units" and data are gathered at the unit level. The spatial effect is estimated on each unit, and its representation depends on the arbitrary choice of unit division, which makes it difficult to interpret. This can be resolved by regularization, which reduces the variance and improves interpretability. I present a model for the segmentation of spatial data based on the adjacency structure of the units.
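
As a rough illustration of the iterative scheme the abstract refers to, the sketch below applies an adaptive ridge to ordinary least squares. It is not taken from the thesis: the linear-regression setting, the function name `adaptive_ridge`, the parameters `lam`, `delta`, `n_iter` and `tol`, and the 0.99 selection threshold are all assumptions chosen for the example. The weight update `w_j = 1 / (beta_j^2 + delta^2)` is the form in which the adaptive ridge is commonly described, where each step is a weighted ridge fit and the weighted penalty approximates a count of non-zero coefficients.

```python
import numpy as np

def adaptive_ridge(X, y, lam=1.0, delta=1e-5, n_iter=100, tol=1e-8):
    """Illustrative adaptive ridge for least squares (hypothetical helper).

    Repeats a weighted ridge fit, then updates each weight as
    w_j = 1 / (beta_j**2 + delta**2), so that w_j * beta_j**2 tends
    towards 1 for retained coefficients and towards 0 for removed ones.
    """
    n_features = X.shape[1]
    weights = np.ones(n_features)   # first pass is a plain ridge fit
    beta = np.zeros(n_features)
    XtX = X.T @ X
    Xty = X.T @ y
    for _ in range(n_iter):
        # Weighted ridge step: solve (X'X + lam * diag(w)) beta = X'y
        beta_new = np.linalg.solve(XtX + lam * np.diag(weights), Xty)
        # Reweighting step: small coefficients get a very large penalty
        weights = 1.0 / (beta_new**2 + delta**2)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    # Assumed selection rule: keep coefficient j when w_j * beta_j**2 is near 1
    selected = weights * beta**2 > 0.99
    return beta, selected
```

Calling `adaptive_ridge(X, y, lam=1.0)` on a design matrix `X` and response `y` returns the coefficient vector and a boolean mask of retained variables. Larger values of `lam` remove more variables. In the segmentation settings mentioned in the abstract, the same reweighting idea would be applied to differences between neighbouring parameters rather than to the coefficients themselves, which this sketch does not cover.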

Cited literature [243 references]

https://hal.archives-ouvertes.fr/tel-02473848
Contributor : Vivien Goepp
Submitted on : Tuesday, February 11, 2020 - 8:34:28 AM
Last modification on : Friday, December 4, 2020 - 11:58:02 AM
Long-term archiving on : Tuesday, May 12, 2020 - 12:43:41 PM

File

phd_manuscript.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : tel-02473848, version 1

Citation

Vivien Goepp. An Iterative Regularized Method for Segmentation with Applications to Statistics. Computation [stat.CO]. Université de Paris / Université Paris Descartes (Paris 5), 2019. English. ⟨tel-02473848⟩

Metrics

Record views : 134

File downloads : 257