UNITYAI-GUARD: pioneering toxicity detection across low-resource Indian languages

dc.contributor.author Beniwal, Himanshu
dc.contributor.author Venkat, Reddybathuni
dc.contributor.author Kumar, Rohit
dc.contributor.author Srivibhav, Birudugadda
dc.contributor.author Jain, Daksh
dc.contributor.author Doddi, Pavan
dc.contributor.author Dhande, Eshwar
dc.contributor.author Ananth, Adithya
dc.contributor.author Kuldeep
dc.contributor.author Kubadia, Heer
dc.contributor.author Sharda, Pratham
dc.contributor.author Singh, Mayank
dc.coverage.spatial United States of America
dc.date.accessioned 2025-04-11T08:07:19Z
dc.date.available 2025-04-11T08:07:19Z
dc.date.issued 2025-03
dc.identifier.citation Beniwal, Himanshu; Venkat, Reddybathuni; Kumar, Rohit; Srivibhav, Birudugadda; Jain, Daksh; Doddi, Pavan; Dhande, Eshwar; Ananth, Adithya; Kuldeep; Kubadia, Heer; Sharda, Pratham and Singh, Mayank, "UNITYAI-GUARD: pioneering toxicity detection across low-resource Indian languages", arXiv, Cornell University Library, DOI: arXiv:2503.23088, Mar. 2025.
dc.identifier.uri http://arxiv.org/abs/2503.23088
dc.identifier.uri https://repository.iitgn.ac.in/handle/123456789/11191
dc.description.abstract This work introduces UnityAI-Guard, a framework for binary toxicity classification targeting low-resource Indian languages. While existing systems predominantly cater to high-resource languages, UnityAI-Guard addresses this gap by developing state-of-the-art models for identifying toxic content across diverse Brahmic/Indic scripts. Our approach achieves an average F1-score of 84.23% across seven languages, leveraging a dataset of 888k training instances and 35k manually verified test instances. To advance multilingual content moderation for linguistically diverse regions, UnityAI-Guard also provides public API access to foster broader adoption and application.
dc.description.statementofresponsibility by Himanshu Beniwal, Reddybathuni Venkat, Rohit Kumar, Birudugadda Srivibhav, Daksh Jain, Pavan Doddi, Eshwar Dhande, Adithya Ananth, Kuldeep, Heer Kubadia, Pratham Sharda and Mayank Singh
dc.language.iso en_US
dc.publisher Cornell University Library
dc.title UNITYAI-GUARD: pioneering toxicity detection across low-resource Indian languages
dc.type Article
dc.relation.journal arXiv
