Skip to main content

Lund University Publications

LUND UNIVERSITY LIBRARIES

Exact clustering of weighted graphs via semidefinite programming

Pirinen, Aleksis LU and Ames, Brendan (2019) In Journal of Machine Learning Research 20. p.1-34
Abstract

As a model problem for clustering, we consider the densest k-disjoint-clique problem of partitioning a weighted complete graph into k disjoint subgraphs such that the sum of the densities of these subgraphs is maximized. We establish that such subgraphs can be recovered from the solution of a particular semidefinite relaxation with high probability if the input graph is sampled from a distribution of clusterable graphs. Specifically, the semidefinite relaxation is exact if the graph consists of k large disjoint subgraphs, corresponding to clusters, with weight concentrated within these subgraphs, plus a moderate number of nodes not belonging to any cluster. Further, we establish that if noise is weakly obscuring these clusters, i.e, the... (More)

As a model problem for clustering, we consider the densest k-disjoint-clique problem of partitioning a weighted complete graph into k disjoint subgraphs such that the sum of the densities of these subgraphs is maximized. We establish that such subgraphs can be recovered from the solution of a particular semidefinite relaxation with high probability if the input graph is sampled from a distribution of clusterable graphs. Specifically, the semidefinite relaxation is exact if the graph consists of k large disjoint subgraphs, corresponding to clusters, with weight concentrated within these subgraphs, plus a moderate number of nodes not belonging to any cluster. Further, we establish that if noise is weakly obscuring these clusters, i.e, the between-cluster edges are assigned very small weights, then we can recover significantly smaller clusters. For example, we show that in approximately sparse graphs, where the between-cluster weights tend to zero as the size n of the graph tends to infinity, we can recover clusters of size polylogarithmic in n under certain conditions on the distribution of edge weights. Empirical evidence from numerical simulations is also provided to support these theoretical phase transitions to perfect recovery of the cluster structure.

(Less)
Please use this url to cite or link to this publication:
author
and
organization
publishing date
type
Contribution to journal
publication status
published
subject
keywords
Clustering, Densest subgraph, Semidefinite programming, Sparse graphs, Stochastic block models
in
Journal of Machine Learning Research
volume
20
pages
1 - 34
publisher
Microtome Publishing
external identifiers
  • scopus:85072637509
ISSN
1532-4435
language
English
LU publication?
yes
id
0d20fe28-6914-4ad9-8eba-65a0620181f8
alternative location
http://jmlr.org/papers/v20/16-128.html
date added to LUP
2019-10-07 13:51:31
date last changed
2022-04-18 18:24:46
@article{0d20fe28-6914-4ad9-8eba-65a0620181f8,
  abstract     = {{<p>As a model problem for clustering, we consider the densest k-disjoint-clique problem of partitioning a weighted complete graph into k disjoint subgraphs such that the sum of the densities of these subgraphs is maximized. We establish that such subgraphs can be recovered from the solution of a particular semidefinite relaxation with high probability if the input graph is sampled from a distribution of clusterable graphs. Specifically, the semidefinite relaxation is exact if the graph consists of k large disjoint subgraphs, corresponding to clusters, with weight concentrated within these subgraphs, plus a moderate number of nodes not belonging to any cluster. Further, we establish that if noise is weakly obscuring these clusters, i.e, the between-cluster edges are assigned very small weights, then we can recover significantly smaller clusters. For example, we show that in approximately sparse graphs, where the between-cluster weights tend to zero as the size n of the graph tends to infinity, we can recover clusters of size polylogarithmic in n under certain conditions on the distribution of edge weights. Empirical evidence from numerical simulations is also provided to support these theoretical phase transitions to perfect recovery of the cluster structure.</p>}},
  author       = {{Pirinen, Aleksis and Ames, Brendan}},
  issn         = {{1532-4435}},
  keywords     = {{Clustering; Densest subgraph; Semidefinite programming; Sparse graphs; Stochastic block models}},
  language     = {{eng}},
  pages        = {{1--34}},
  publisher    = {{Microtome Publishing}},
  series       = {{Journal of Machine Learning Research}},
  title        = {{Exact clustering of weighted graphs via semidefinite programming}},
  url          = {{http://jmlr.org/papers/v20/16-128.html}},
  volume       = {{20}},
  year         = {{2019}},
}