dc.contributor.author |
Chhaya, Rachit |
|
dc.contributor.author |
Choudhari, Jayesh |
|
dc.contributor.author |
Dasgupta, Anirban |
|
dc.contributor.author |
Shit, Supratim |
|
dc.date.accessioned |
2020-12-26T13:48:45Z |
|
dc.date.available |
2020-12-26T13:48:45Z |
|
dc.date.issued |
2020-12 |
|
dc.identifier.citation |
Chhaya, Rachit; Choudhari, Jayesh; Dasgupta, Anirban and Shit, Supratim, "Online coresets for clustering with Bregman divergences", arXiv, Cornell University Library, DOI: arXiv:2012.06522, Dec. 2020. |
en_US |
dc.identifier.uri |
http://arxiv.org/abs/2012.06522 |
|
dc.identifier.uri |
https://repository.iitgn.ac.in/handle/123456789/6152 |
|
dc.description.abstract |
We present algorithms that create coresets in an online setting for clustering problems according to a wide subset of Bregman divergences. Notably, our coresets have a small additive error, similar in magnitude to the lightweight coresets Bachem et. al. 2018, and take update time O(d) for every incoming point where d is dimension of the point. Our first algorithm gives online coresets of size O~(poly(k,d,?,?)) for k-clusterings according to any ?-similar Bregman divergence. We further extend this algorithm to show existence of a non-parametric coresets, where the coreset size is independent of k, the number of clusters, for the same subclass of Bregman divergences. Our non-parametric coresets are larger by a factor of O(logn) (n is number of points) and have similar (small) additive guarantee. At the same time our coresets also function as lightweight coresets for non-parametric versions of the Bregman clustering like DP-Means. While these coresets provide additive error guarantees, they are also significantly smaller (scaling with O(logn) as opposed to O(dd) for points in Rd) than the (relative-error) coresets obtained in Bachem et. al. 2015 for DP-Means. While our non-parametric coresets are existential, we give an algorithmic version under certain assumptions. |
|
dc.description.statementofresponsibility |
by Rachit Chhaya, Jayesh Choudhari, Anirban Dasgupta and Supratim Shit |
|
dc.language.iso |
en_US |
en_US |
dc.publisher |
Cornell University Library |
en_US |
dc.subject |
Data Structures |
en_US |
dc.subject |
Algorithms (cs.DS) |
en_US |
dc.subject |
Machine Learning (cs.LG) |
en_US |
dc.title |
Online coresets for clustering with Bregman divergences |
en_US |
dc.type |
Pre-Print |
en_US |
dc.relation.journal |
arXiv |
|