Abstract:
We present a machine learning methodology for predicting key photovoltaic parameters of organic solar cells (OSCs). Our approach starts from a curated quantum mechanical database covering 300 experimentally validated OSC devices with distinct donor/acceptor combinations. Through a two-step screening process, we identify descriptors correlated with the short-circuit current (JSC), open-circuit voltage (VOC), fill factor (FF), and power conversion efficiency (PCEmax). Using a LASSO model for feature selection and four supervised machine learning techniques for prediction, we obtain high accuracy, with gradient boosting performing especially well for JSC, VOC, and PCEmax. Shapley additive explanations (SHAP) analysis identifies the most influential features and reveals the nonlinear relationships governing OSC performance. We then extend the model to a larger data set of 4680 donor–acceptor combinations, formed by pairing 120 newly designed nonfullerene acceptors with 39 experimentally known donor polymers (120 × 39 = 4680). The predictions highlight 18 donor–acceptor combinations with a predicted power conversion efficiency exceeding 15%. This work provides a tool for the virtual screening of promising donor/acceptor pairs, which can accelerate the development of high-performance OSC materials and devices.
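The virtual-screening step described above can be illustrated with a minimal sketch: enumerate every pairing of 39 donors with 120 acceptors (4680 pairs) and keep those whose predicted efficiency exceeds 15%. The identifiers `D01`, `A001`, and so on, the function `predict_pce`, and the toy scoring formula are all assumptions made for illustration. They stand in for the trained regressor and the real molecules, and the numbers produced have no physical meaning.

```python
from itertools import product

PCE_THRESHOLD = 15.0  # percent, the cutoff quoted in the abstract

# Hypothetical labels: placeholders for the real donors and acceptors.
donors = [f"D{i:02d}" for i in range(1, 40)]       # 39 donor polymers
acceptors = [f"A{i:03d}" for i in range(1, 121)]   # 120 nonfullerene acceptors


def predict_pce(donor: str, acceptor: str) -> float:
    """Stand-in for a trained model (assumption, not the authors' model).

    Returns a deterministic toy score in percent, derived only from the
    numeric part of the labels. It is NOT a physical prediction.
    """
    d = int(donor[1:])
    a = int(acceptor[1:])
    return 5.0 + ((d * 7 + a * 13) % 120) / 10.0


# Full screening grid: every donor paired with every acceptor.
pairs = list(product(donors, acceptors))

# Score each pair, keep those above the threshold, rank best first.
scored = [(d, a, predict_pce(d, a)) for d, a in pairs]
hits = sorted(
    (s for s in scored if s[2] > PCE_THRESHOLD),
    key=lambda s: s[2],
    reverse=True,
)

print(f"{len(pairs)} pairs screened, {len(hits)} above {PCE_THRESHOLD}%")
```

In a real workflow, `predict_pce` would be replaced by a fitted model applied to the computed descriptors of each pair. The enumeration and filtering logic would stay the same.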