AIRDELHI: fine-grained spatio-temporal particulate matter dataset from Delhi for ML based modeling

dc.contributor.author Chauhan, Sachin Kumar
dc.contributor.author Ranu, Sayan
dc.contributor.author Sen, Rijurekha
dc.contributor.author Patel, Zeel B.
dc.contributor.author Batra, Nipun
dc.contributor.other 37th Conference on Neural Information Processing Systems (NeurIPS 2023)
dc.coverage.spatial United States of America
dc.date.accessioned 2023-11-23T09:51:55Z
dc.date.available 2023-11-23T09:51:55Z
dc.date.issued 2023-12-10
dc.identifier.citation Chauhan, Sachin Kumar; Ranu, Sayan; Sen, Rijurekha; Patel, Zeel B. and Batra, Nipun, "AIRDELHI: fine-grained spatio-temporal particulate matter dataset from Delhi for ML based modeling", in the 37th Conference on Neural Information Processing Systems (NeurIPS 2023), New Orleans, US, Dec. 10-16, 2023.
dc.identifier.uri https://openreview.net/pdf?id=n2wW7goGky
dc.identifier.uri https://repository.iitgn.ac.in/handle/123456789/9485
dc.description.abstract Air pollution poses serious health concerns in developing countries, such as India, necessitating large-scale measurement for correlation analysis, policy recommendations, and informed decision-making. However, fine-grained data collection is costly. Specifically, static sensors for pollution measurement cost several thousand dollars per unit, leading to inadequate deployment and coverage. To complement the existing sparse static sensor network, we propose a mobile sensor network utilizing lower-cost PM2.5 sensors mounted on public buses in the Delhi-NCR region of India. Through this exercise, we introduce a novel dataset AIRDELHI comprising PM2.5 and PM10 measurements. This dataset is made publicly available at https://www.cse.iitd.ac.in/pollutiondata, serving as a valuable resource for machine learning (ML) researchers and environmentalists. We present three key contributions with the release of this dataset. Firstly, through in-depth statistical analysis, we demonstrate that the released dataset significantly differs from existing pollution datasets, highlighting its uniqueness and potential for new insights. Secondly, the dataset quality has been validated against existing expensive sensors. Thirdly, we conduct a benchmarking exercise (https://github.com/sachin-iitd/DelhiPMDatasetBenchmark), evaluating state-of-the-art methods for interpolation, feature imputation, and forecasting on this dataset, which is the largest publicly available PM dataset to date. The results of the benchmarking exercise underscore the substantial disparities in accuracy between the proposed dataset and other publicly available datasets. This finding highlights the complexity and richness of our dataset, emphasizing its value for advancing research in the field of air pollution.
dc.description.statementofresponsibility by Sachin Kumar Chauhan, Sayan Ranu, Rijurekha Sen, Zeel B. Patel and Nipun Batra
dc.language.iso en_US
dc.title AIRDELHI: fine-grained spatio-temporal particulate matter dataset from Delhi for ML based modeling
dc.type Conference Paper

