Integrating pairwise constraints into clustering algorithms: optimization-based approaches
Abstract
In this paper we introduce new models for the semi-supervised clustering problem; in particular, we address this problem from the representation-space point of view. Given a dataset enhanced with constraints (typically must-link and cannot-link constraints) and any clustering algorithm, the proposed approach aims at learning a projection space for the dataset that satisfies not only the constraints but also the objective that the clustering algorithm optimizes on the unenhanced data. We propose a boosting framework that weights the constraints and infers successive projection spaces in such a way that the algorithm's performance improves. We evaluate this approach on standard UCI datasets and show the effectiveness of our algorithm.
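To make the two ingredients named in the abstract concrete, the following minimal sketch shows how must-link and cannot-link constraints might be checked against a clustering, and how a boosting step might reweight them. It is an illustration under stated assumptions, not the method of this paper: the function names `violated` and `reweight`, the `(i, j, kind)` tuple encoding, and the AdaBoost-style update are all hypothetical choices, and the projection-learning step is omitted entirely.

```python
import math

def violated(labels, constraints):
    """Return one boolean per constraint: True if the clustering violates it.

    Each constraint is a hypothetical (i, j, kind) tuple with kind in
    {"must", "cannot"}; labels[i] is the cluster assigned to point i.
    """
    flags = []
    for i, j, kind in constraints:
        same = labels[i] == labels[j]
        # A must-link is violated when the pair is split across clusters;
        # a cannot-link is violated when the pair shares a cluster.
        flags.append((kind == "must" and not same) or (kind == "cannot" and same))
    return flags

def reweight(weights, flags):
    """AdaBoost-style update (an assumption, not the paper's rule)."""
    # Weighted error: total weight carried by the violated constraints.
    err = sum(w for w, f in zip(weights, flags) if f)
    err = min(max(err, 1e-12), 1 - 1e-12)  # keep the logarithm finite
    alpha = 0.5 * math.log((1 - err) / err)
    # Scale violated constraints by exp(alpha), satisfied ones by exp(-alpha).
    new = [w * math.exp(alpha if f else -alpha) for w, f in zip(weights, flags)]
    total = sum(new)
    return [w / total for w in new]  # renormalize so the weights sum to 1

# Toy example: four points in two clusters, four constraints.
labels = [0, 0, 1, 1]
constraints = [(0, 1, "must"), (1, 2, "must"), (0, 3, "cannot"), (0, 2, "cannot")]
weights = [0.25] * len(constraints)

flags = violated(labels, constraints)
new_weights = reweight(weights, flags)
```

In this toy case only the second constraint is violated, so the update shifts weight toward it. In a full pipeline, a subsequent round would presumably use such weights when inferring the next projection space.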