Climate Change Data Portal
DOI | 10.1016/j.atmosenv.2020.118125 |
Enhancing the Evaluation and Interpretability of Data-Driven Air Quality Models | |
Gu J.; Yang B.; Brauer M.; Zhang K.M. | |
Publication date | 2021 |
ISSN | 1352-2310 |
Volume | 246 |
Abstract | Resolving spatial variability in ambient air pollutants and quantifying contributing factors are critical to human exposure assessment and effective pollution control. Data-driven techniques have been employed in air quality modeling because they can capture complex relationships in data and are fast and easy to implement. In this study, we addressed two issues, model evaluation and interpretability, by applying two common data-driven approaches, linear regression (LR) and random forest (RF), with land-use predictor variables to predict spatial variations of air pollution in an urban setting. The data came from measurements of ambient nitrogen dioxide (NO2) concentrations in the Greater Vancouver Regional District in Canada. First, we showed that model performance was sensitive to the division of training and test sets. Applying a limited number of hold-out validations or cross-validations and reporting the mean model metrics cannot capture this variability or fairly evaluate model performance. We proposed repeated cross-validations (RCVs) as a reliable evaluation method that accounts for both mean and variance. Second, there is no consistent approach to measuring the importance of predictor variables and quantifying their contributions across different types of data-driven models. Traditional approaches only reflect the relative importance of predictor variables in terms of predictive power, without quantifying their contribution to the model output. We proposed applying SHapley Additive exPlanations (SHAP), a Shapley-value-based explanation method grounded in coalitional game theory, as a unifying framework to interpret and compare different types of data-driven methods. We showed that SHAP is capable of 1) calculating each predictor variable's contribution to each data point; and 2) ranking the importance of predictor variables in terms of their contributions to the model output. The results indicated that different models may favor different predictor variables and lead to different interpretations. © 2020 The Authors |
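The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It uses synthetic stand-in data rather than the Vancouver NO2 measurements, fits only an ordinary least-squares model (no random forest), and does not call the `shap` library. The function names `repeated_cv_r2` and `linear_shap` are hypothetical. The SHAP step relies on the standard closed form for a linear model under the assumption of independent predictors: each contribution is the coefficient times the predictor's deviation from its mean.

```python
import numpy as np

def repeated_cv_r2(X, y, n_splits=5, n_repeats=20, seed=0):
    """Return one test-fold R^2 per fold per repeat for a least-squares fit."""
    rng = np.random.default_rng(seed)
    n = len(y)
    scores = []
    for _ in range(n_repeats):
        # A fresh shuffle per repeat yields a different train/test division.
        folds = np.array_split(rng.permutation(n), n_splits)
        for test_idx in folds:
            train = np.ones(n, dtype=bool)
            train[test_idx] = False
            A_train = np.column_stack([np.ones(train.sum()), X[train]])
            coef, *_ = np.linalg.lstsq(A_train, y[train], rcond=None)
            A_test = np.column_stack([np.ones(len(test_idx)), X[test_idx]])
            pred = A_test @ coef
            ss_res = np.sum((y[test_idx] - pred) ** 2)
            ss_tot = np.sum((y[test_idx] - y[test_idx].mean()) ** 2)
            scores.append(1.0 - ss_res / ss_tot)
    return np.array(scores)

def linear_shap(coef, X, background):
    """Linear-model SHAP values, ASSUMING independent predictors:
    phi[i, j] = coef[j] * (X[i, j] - mean_j(background))."""
    return (X - background.mean(axis=0)) * coef

# Synthetic stand-in data (assumption): three predictors, the third irrelevant.
rng = np.random.default_rng(42)
X = rng.normal(size=(120, 3))
y = 10.0 + X @ np.array([3.0, -1.5, 0.0]) + rng.normal(scale=1.0, size=120)

# Evaluation: report the spread of scores, not only their mean.
scores = repeated_cv_r2(X, y)
print(f"R^2 over {scores.size} folds: mean={scores.mean():.3f}, "
      f"std={scores.std(ddof=1):.3f}")

# Interpretation: per-point contributions, then a global ranking.
A = np.column_stack([np.ones(len(y)), X])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
phi = linear_shap(beta[1:], X, X)          # shape (n_points, n_predictors)
importance = np.abs(phi).mean(axis=0)      # mean |contribution| per predictor
ranking = np.argsort(importance)[::-1]     # most important first
print("Predictor ranking (indices):", ranking.tolist())
```

Each row of `phi` sums, together with the mean prediction, to the model's prediction for that point. This additive property is what allows contributions to be compared across model types.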
Keywords | Air quality modeling; Data-driven model; Linear regression; Model evaluation; Model interpretability; Random forest; Repeated cross-validations; SHAP |
Language | English |
Scopus keywords | Air pollution control; Air quality; Decision trees; Game theory; Land use; Nitrogen oxides; Quality assurance; Air quality modeling; Ambient nitrogen dioxide; Coalitional game theory; Complex relationships; Data driven technique; Data-driven methods; Reliable evaluation method; Traditional approaches; Quality control; nitrogen dioxide; air quality; ambient air; atmospheric modeling; atmospheric pollution; implementation process; model validation; pollution control; ranking; spatial variation; training; air quality; ambient air; Article; Canada; cross validation; model; priority journal; repeated cross validation; British Columbia; Canada; Vancouver [British Columbia] |
Source journal | ATMOSPHERIC ENVIRONMENT |
Document type | Journal article |
Item identifier | http://gcip.llas.ac.cn/handle/2XKMVOVA/248691 |
Author affiliations | Sibley School of Mechanical and Aerospace Engineering, Cornell University, Ithaca, NY 14853, United States; School of Population and Public Health, The University of British Columbia, Vancouver, BC V6T 1Z3, Canada |
Recommended citation (GB/T 7714) | Gu J.,Yang B.,Brauer M.,et al. Enhancing the Evaluation and Interpretability of Data-Driven Air Quality Models[J],2021,246. |
APA | Gu J.,Yang B.,Brauer M.,&Zhang K.M..(2021).Enhancing the Evaluation and Interpretability of Data-Driven Air Quality Models.ATMOSPHERIC ENVIRONMENT,246. |
MLA | Gu J.,et al."Enhancing the Evaluation and Interpretability of Data-Driven Air Quality Models".ATMOSPHERIC ENVIRONMENT 246(2021). |