Constrained clustering by constraint programming

Abstract : Cluster analysis is an important task in Data Mining, with hundreds of different approaches in the literature. Over the last decade, cluster analysis has been extended to constrained clustering, also called semi-supervised clustering, which integrates prior knowledge about the data into clustering algorithms. In this dissertation, we explore Constraint Programming (CP) for solving constrained clustering tasks. The main principles of CP are: (1) users specify the problem declaratively as a Constraint Satisfaction Problem; (2) solvers find solutions by combining constraint propagation and search. Relying on CP has two main advantages: declarativity, which makes it easy to add new constraints, and the ability to find an optimal solution satisfying all the constraints (when one exists). We propose two CP-based models to address constrained clustering tasks. The models are flexible and general: they support instance-level constraints and a variety of cluster-level constraints, and they allow users to choose among different optimization criteria. To improve efficiency, several aspects are studied in the dissertation. Experiments on various classical datasets show that our models are competitive with other exact approaches. We also show that our models can easily be embedded in a more general process, and we illustrate this on the problem of finding the Pareto front of a bi-criterion optimization process.
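
To make the idea of exact constrained clustering concrete, the sketch below is a toy illustration only. It is not one of the two CP models proposed in the dissertation and it does not use a CP solver; it is a plain Python backtracking search that mimics the "check constraints, prune, then branch" pattern in a very simplified way. The choice of must-link and cannot-link pairs as instance-level constraints, the maximum cluster diameter as the optimization criterion, and all function names (`constrained_clustering`, `violates`, `max_diameter`) are assumptions made for this example. The search allows at most `k` clusters rather than exactly `k`.

```python
from itertools import combinations
from math import dist, inf


def max_diameter(points, labels):
    """Largest distance between two assigned points sharing a cluster."""
    return max(
        (dist(points[i], points[j])
         for i, j in combinations(range(len(labels)), 2)
         if labels[i] == labels[j]),
        default=0.0,
    )


def violates(labels, must_link, cannot_link):
    """Check the instance-level constraints on the points assigned so far."""
    n = len(labels)
    for i, j in must_link:
        if i < n and j < n and labels[i] != labels[j]:
            return True
    for i, j in cannot_link:
        if i < n and j < n and labels[i] == labels[j]:
            return True
    return False


def constrained_clustering(points, k, must_link=(), cannot_link=()):
    """Exhaustive search with pruning that minimises the maximum diameter.

    Returns (labels, cost); labels is None if no assignment is feasible.
    """
    best = {"cost": inf, "labels": None}

    def extend(labels):
        if violates(labels, must_link, cannot_link):
            return  # a constraint is already broken: abandon this branch
        cost = max_diameter(points, labels)
        if cost >= best["cost"]:
            return  # bound: the diameter never shrinks as points are added
        if len(labels) == len(points):
            best["cost"], best["labels"] = cost, list(labels)
            return
        # Symmetry breaking: a point may join a used cluster or open the next one.
        used = max(labels, default=-1) + 1
        for cluster in range(min(used + 1, k)):
            labels.append(cluster)
            extend(labels)
            labels.pop()

    extend([])
    return best["labels"], best["cost"]


points = [(0, 0), (0, 1), (5, 0), (5, 1)]
print(constrained_clustering(points, k=2))
print(constrained_clustering(points, k=2, cannot_link=[(0, 1)]))
```

On these four points, the unconstrained call groups the two left points and the two right points. Adding a cannot-link constraint between the first two points forces them apart, so the best achievable maximum diameter grows. A CP formulation would state the same constraints declaratively and leave propagation and search to the solver; this sketch hard-codes both and only scales to very small inputs.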
Document type :
Theses

Cited literature [46 references]

https://tel.archives-ouvertes.fr/tel-01202674
Contributor : Abes Star
Submitted on : Monday, September 21, 2015 - 3:22:06 PM
Last modification on : Thursday, May 5, 2022 - 3:06:11 PM
Long-term archiving on : Tuesday, December 29, 2015 - 9:01:25 AM

File

khanhchuong-duong_3694.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-01202674, version 1

Citation

Khanh-Chuong Duong. Constrained clustering by constraint programming. Computers and Society [cs.CY]. Université d'Orléans, 2014. English. ⟨NNT : 2014ORLE2049⟩. ⟨tel-01202674⟩

Metrics

Record views

247

Files downloads

639