dc.contributor.author |
Doshi, Ishita |
|
dc.contributor.author |
Sajjalla, Sreekalyan |
|
dc.contributor.author |
Choudhari, Jayesh |
|
dc.contributor.author |
Bhatt, Rushi |
|
dc.contributor.author |
Dasgupta, Anirban |
|
dc.date.accessioned |
2020-09-03T06:25:09Z |
|
dc.date.available |
2020-09-03T06:25:09Z |
|
dc.date.issued |
2020-08 |
|
dc.identifier.citation |
Doshi, Ishita; Sajjalla, Sreekalyan; Choudhari, Jayesh; Bhatt, Rushi and Dasgupta, Anirban, "Efficient hierarchical clustering for classification and anomaly detection", arXiv, Cornell University Library, DOI: arXiv:2008.10828, Aug. 2020. |
en_US |
dc.identifier.uri |
http://arxiv.org/abs/2008.10828 |
|
dc.identifier.uri |
https://repository.iitgn.ac.in/handle/123456789/5687 |
|
dc.description.abstract |
We address the problem of large-scale, real-time classification of content posted on social networks, along with the need to rapidly identify novel spam types. Obtaining manual labels for user-generated content through editorial labeling and taxonomy development lags behind the rate at which new content types need to be classified. We propose a class of hierarchical clustering algorithms that can be used both for efficient, scalable real-time multiclass classification and for detecting new anomalies in user-generated content. Our methods have low query time and linear space usage, and come with theoretical guarantees with respect to a specific hierarchical clustering cost function [1] (Dasgupta, 2016). We compare our solutions against a range of classification techniques and demonstrate excellent empirical performance. |
|
dc.description.statementofresponsibility |
by Ishita Doshi, Sreekalyan Sajjalla, Jayesh Choudhari, Rushi Bhatt and Anirban Dasgupta |
|
dc.language.iso |
en_US |
en_US |
dc.publisher |
Cornell University Library |
en_US |
dc.title |
Efficient hierarchical clustering for classification and anomaly detection |
en_US |
dc.type |
Pre-Print |
en_US |