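Sponsor: Institute of Atmospheric Environment, China Meteorological Administration, Shenyang
International serial number: ISSN 1673-503X
Domestic serial number: CN 21-1531/P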

Journal of Meteorology and Environment ›› 2023, Vol. 39 ›› Issue (1): 44-54. doi: 10.3969/j.issn.1673-503X.2023.01.006

• 论文 •

基于极端梯度提升算法的西安市逐小时PM2.5浓度预报研究

张煦庭1,2,刘慧2,3,刘瑞芳2,4,*,巨菲2,5,刘嘉慧敏2,3,高星星2,3,黄少妮2,3,王楠2,5

    1. 陕西省农业遥感与经济作物气象服务中心, 陕西西安 710016
    2. 陕西省气象局秦岭和黄土高原生态环境气象重点实验室, 陕西西安 710016
    3. 陕西省气象台, 陕西西安 710014
    4. 西安市气象局, 陕西西安 710016
    5. 陕西省气象科学研究所, 陕西西安 710016
  • 收稿日期:2022-07-28 出版日期:2023-02-28 发布日期:2023-03-27
  • 通讯作者: 刘瑞芳 E-mail:zhang_xuting26@126.com;26886083@qq.com
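  • About the author: Xu-ting ZHANG, male, born in 1989, engineer, mainly engaged in environmental meteorology research. E-mail: zhang_xuting26@126.com
  • Supported by:
    Natural Science Basic Research Program of Shaanxi Province (2021JQ-965); Natural Science Basic Research Program of Shaanxi Province (2022JQ-249); Natural Science Basic Research Program of Shaanxi Province (2022JQ-294); Open Research Fund of the Key Laboratory of Eco-Environment and Meteorology for the Qinling Mountains and Loess Plateau, Shaanxi Meteorological Service (2020G-6)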

Study on hourly PM2.5 concentration forecast based on XGBoost method in Xi'an city

Xu-ting ZHANG1,2, Hui LIU2,3, Rui-fang LIU2,4,*, Fei JU2,5, Jia-hui-min LIU2,3, Xing-xing GAO2,3, Shao-ni HUANG2,3, Nan WANG2,5

    1. Shaanxi Meteorological Service Center of Agricultural Remote Sensing and Economic Crops, Xi'an 710016, China
    2. Key Laboratory of Eco-Environment and Meteorology for the Qinling Mountains and Loess Plateau, Shaanxi Meteorological Service, Xi'an 710016, China
    3. Shaanxi Meteorological Observatory, Xi'an 710014, China
    4. Xi'an Meteorological Service, Xi'an 710016, China
    5. Meteorological Institute of Shaanxi Province, Xi'an 710016, China
  • Received: 2022-07-28 Online: 2023-02-28 Published: 2023-03-27
  • Contact: Rui-fang LIU E-mail: zhang_xuting26@126.com; 26886083@qq.com

摘要:

利用西安市2016—2021年逐小时PM2.5浓度监测数据和气象观测数据, 基于极端梯度提升机器学习算法模型(extreme Gradient Boosting, XGBoost), 选择气象因子和时间因子作为特征变量, 对西安市逐小时PM2.5浓度进行预报试验。结果表明: 西安市PM2.5浓度与平均气温和能见度显著负相关, 冬季PM2.5浓度与相对湿度和露点温度显著正相关, 偏东风更易诱发重污染天气。西安市12月底至翌年1月初空气污染频发, 但PM2.5浓度总体逐年降低。冬季PM2.5浓度的双峰形日变化最明显, 最高值分别出现在凌晨和11时。西安市PM2.5浓度变化存在“周末效应”。模型能够较为真实地反映PM2.5浓度量级和演变趋势的变化, 预报值与实况值之间的决定系数为0.77、平均绝对误差为12.79 μg·m⁻³、均方根误差为18.68 μg·m⁻³。模型秋冬季表现较为稳定, 预报效果优于春夏季, 但对极端峰值存在低估。模型具有较好的可解释性, 能见度特征变量的影响最大, 露点温度、相对湿度、平均气温和海平面气压等特征变量的重要性依次减弱, 时间因子特征变量对模型也有一定影响。与其他统计模型及机器学习模型相比, 模型有更高的预报精度和效率。

关键词: PM2.5浓度, 极端梯度提升算法, 机器学习, 气象因子, 预报

Abstract:

Using hourly PM2.5 concentration monitoring data and meteorological observations in Xi'an from 2016 to 2021, an hourly PM2.5 concentration forecast experiment was carried out with the eXtreme Gradient Boosting (XGBoost) machine learning algorithm, with meteorological and time factors selected as input features. The results show that PM2.5 concentration in Xi'an is significantly negatively correlated with average temperature and visibility, and that in winter it is significantly positively correlated with relative humidity and dew point temperature. Easterly winds are more likely to induce heavy pollution. Air pollution occurs frequently from late December to early January of the following year, but PM2.5 concentration has generally decreased year by year. The bimodal diurnal variation of PM2.5 concentration is most pronounced in winter, with peaks in the early morning and at around 11:00. PM2.5 concentration in Xi'an also shows a "weekend effect". The model reproduces the magnitude and evolution of PM2.5 concentration reasonably well: between forecast and observed values, the coefficient of determination is 0.77, the mean absolute error is 12.79 μg·m⁻³, and the root mean square error is 18.68 μg·m⁻³. The model performs more stably in autumn and winter, with better forecast skill than in spring and summer, but it underestimates extreme peaks. The model is also readily interpretable: visibility is the most influential feature, followed in decreasing order of importance by dew point temperature, relative humidity, average temperature, and sea level pressure, and the time factors also have some influence. Compared with other statistical and machine learning models, the model achieves higher forecast accuracy and efficiency.

Key words: PM2.5 concentration, eXtreme Gradient Boosting, Machine learning, Meteorological factors, Forecast
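
The abstract reports three verification scores: the coefficient of determination, the mean absolute error (MAE), and the root mean square error (RMSE). The sketch below shows one common way to compute them from paired observed and forecast concentrations. It is an illustration only, not the authors' code: the function name `evaluate_forecast` is hypothetical, and the coefficient of determination is assumed to be defined as 1 − SS_res/SS_tot, whereas the paper may instead use the squared correlation coefficient.

```python
import math


def evaluate_forecast(observed, forecast):
    """Return (r2, mae, rmse) for paired observed/forecast values.

    Hypothetical helper for illustration. R^2 is assumed to be
    1 - SS_res / SS_tot; the paper may define it differently.
    """
    if len(observed) != len(forecast) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(observed)
    # Forecast error for each hour (forecast minus observation).
    errors = [f - o for o, f in zip(observed, forecast)]

    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)

    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observations are equal")
    r2 = 1.0 - ss_res / ss_tot

    return r2, mae, rmse


if __name__ == "__main__":
    # Made-up hourly PM2.5 values (μg·m⁻³), not data from the paper.
    obs = [35.0, 50.0, 80.0, 120.0]
    fcst = [40.0, 45.0, 90.0, 100.0]
    r2, mae, rmse = evaluate_forecast(obs, fcst)
    print(f"R2={r2:.3f}  MAE={mae:.2f}  RMSE={rmse:.2f}")
```

Because RMSE squares each error before averaging, it penalizes large misses more heavily than MAE does. That is consistent with the reported RMSE (18.68 μg·m⁻³) exceeding the MAE (12.79 μg·m⁻³) for a model that underestimates extreme peaks.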

中图分类号: