Abstract:
Continuous hydrological observations are essential for accurate modelling and informed water resource management. However, significant data gaps in streamflow and water level observations, compounded by extreme hydroclimatic events and quality control issues, impede robust hydrological analyses. We combined geomorphological, meteorological, and hydrological parameters with machine learning to fill gaps in streamflow and water level observations at 343 stations across Peninsular India. To improve model performance and to fill data at ungauged locations, we grouped stations into classes of similar behaviour using K-means clustering on catchment characteristics. The machine learning approach achieved a Nash-Sutcliffe Efficiency (NSE) above 0.90 for water level and streamflow at 78% and 91% of stations, respectively. Model performance decreased as the duration of missing data increased, but the average NSE remained above 0.85. The gap-filled streamflow record for 1961-2021 highlights a spatial decoupling between rainfall and streamflow trends, indicating a considerable influence of anthropogenic activities in India. Overall, the gap-filled streamflow and water level observations for 1961-2021 are valuable for hydrological assessment and water resources planning.
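
For readers unfamiliar with the skill metric reported above, the following is a minimal sketch of the standard Nash-Sutcliffe Efficiency, NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2). It is not the authors' code: the function name `nash_sutcliffe_efficiency`, the handling of missing values, and the sample numbers are assumptions made purely for illustration.

```python
import math


def nash_sutcliffe_efficiency(observed, simulated):
    """Standard NSE between two equal-length series.

    Hypothetical helper for illustration; pairs in which either value
    is NaN are skipped (an assumption, not taken from the paper).
    """
    if len(observed) != len(simulated):
        raise ValueError("observed and simulated must have the same length")

    # Keep only time steps where both values are present.
    pairs = [
        (float(o), float(s))
        for o, s in zip(observed, simulated)
        if not (math.isnan(float(o)) or math.isnan(float(s)))
    ]
    if not pairs:
        raise ValueError("no overlapping valid values")

    mean_obs = sum(o for o, _ in pairs) / len(pairs)
    residual_ss = sum((o - s) ** 2 for o, s in pairs)
    variance_ss = sum((o - mean_obs) ** 2 for o, _ in pairs)
    if variance_ss == 0:
        raise ValueError("observed series is constant; NSE is undefined")

    return 1.0 - residual_ss / variance_ss


if __name__ == "__main__":
    # Made-up values, not data from the study.
    obs = [1.0, 2.0, 3.0, 4.0]
    sim = [1.0, 2.0, 3.0, 5.0]
    print(nash_sutcliffe_efficiency(obs, sim))
```

An NSE of 1 indicates a perfect match, while 0 means the prediction is no better than the mean of the observations, which is why values above 0.90 and 0.85 indicate close agreement between gap-filled and observed series.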