Sample Compression Schemes for Balls in Graphs
Abstract
One of the open problems in machine learning is whether any set-family of VC-dimension $d$ admits a sample compression scheme of size $O(d)$. In this paper, we study this problem for balls in graphs. For a ball $B=B_r(x)$ of a graph $G=(V,E)$, a realizable sample for $B$ is a signed subset $X=(X^+,X^-)$ of $V$ such that $B$ contains $X^+$ and is disjoint from $X^-$. A proper sample compression scheme of size $k$ consists of a compressor and a reconstructor. The compressor maps any realizable sample $X$ to a subsample $X'$ of size at most $k$. The reconstructor maps each such subsample $X'$ to a ball $B'$ of $G$ such that $B'$ includes $X^+$ and is disjoint from $X^-$. For balls of arbitrary radius $r$, we design proper labeled sample compression schemes of size 2 for trees, of size 3 for cycles, of size 4 for interval graphs, of size 6 for trees of cycles, and of size 22 for cube-free median graphs. For balls of a given radius, we design proper labeled sample compression schemes of size 2 for trees and of size 4 for interval graphs. We also design approximate sample compression schemes of size 2 for balls of $\delta$-hyperbolic graphs.
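To make the definitions above concrete, the following is a minimal brute-force sketch, not one of the compression schemes designed in the paper. It assumes a finite graph given as an adjacency dictionary; the names `ball`, `is_consistent`, and `realizing_balls` are hypothetical helpers introduced only for illustration. It computes $B_r(x)$ by breadth-first search and lists every ball that contains $X^+$ and is disjoint from $X^-$, so a sample $X=(X^+,X^-)$ is realizable exactly when the returned list is non-empty.

```python
from collections import deque


def ball(adj, center, radius):
    """Return B_r(x): the vertices at distance <= radius from center."""
    dist = {center: 0}
    queue = deque([center])
    while queue:
        u = queue.popleft()
        if dist[u] == radius:
            continue  # vertices on the boundary are not expanded further
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return frozenset(dist)


def is_consistent(b, pos, neg):
    """True if the vertex set b contains X^+ and is disjoint from X^-."""
    return set(pos) <= b and not (b & set(neg))


def realizing_balls(adj, pos, neg):
    """List all (center, radius) pairs whose ball realizes the sample.

    Radii 0..n-1 suffice: a larger radius gives the same ball as n-1.
    """
    n = len(adj)
    return [
        (x, r)
        for x in adj
        for r in range(n)
        if is_consistent(ball(adj, x, r), pos, neg)
    ]


# Illustrative graph (an assumption, not from the paper): the path 0-1-2-3-4.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}

realizable = realizing_balls(path, pos={1, 3}, neg={0, 4})
not_realizable = realizing_balls(path, pos={1, 2}, neg={0, 3})
```

On this path, the sample with $X^+=\{1,3\}$ and $X^-=\{0,4\}$ is realized only by $B_1(2)$, whereas $X^+=\{1,2\}$, $X^-=\{0,3\}$ is not realizable because no ball of the path equals $\{1,2\}$. A proper sample compression scheme must, for every realizable sample, recover some consistent ball from a subsample of bounded size; the sketch only checks consistency and does not perform that compression.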