Regional source apportionment of trace metals in fine particulate matter using an observation

npj Climate and Atmospheric Science volume 6, Article number: 65 (2023) Cite this article

1 Altmetric

Metrics details

Trace metals in fine particulate matter (PM2.5) are of significant concern in environmental chemistry due to their toxicity and catalytic capability. An observation-constrained hybrid model is developed to resolve regional source contributions to trace metals and other primary species in PM2.5. In this method, source contributions to primary PM2.5 (PPM2.5) from the Community Multiscale Air Quality (CMAQ) Model at each monitoring location are improved to align better with the observation data by applying source-specific scaling factors estimated from a unique regression model. The adjusted PPM2.5 predictions and chemical speciation data are then used to generate observation-constrained source profiles of primary species. Finally, spatial distributions of their source contributions are produced by multiplying the improved CMAQ PPM2.5 contributions with the deduced source profiles. The model is applied to the Pearl River Delta (PRD) region, China using daily observation data collected at multiple stations in 2015 to resolve source contributions to 8 trace metals, elemental carbon, primary organic carbon, and 10 other primary species. Compared to three previous methods, the new model predicts 13 species with smaller model errors and 16 species with less model biases in 10-fold cross validation analysis. The source profiles determined in this study also show good agreement with those collected from the literature. The new model shows that during 2015 in the PRD region, Cu is mainly from the area sources (31%), industry sector (27%), and power generation (20%), with an annual average concentration as high as 50 ng m−3 in some districts. Meanwhile, major contributors to Mn are area sources (40%), emission from outside PRD (23%) and power generation (17%), leading to a mean level of around 10 ng m−3. Such information is essential in assessing the epidemiological impacts of trace metals as well as formulating effective control measures to protect public health.

Fine particulate matter (PM2.5) has drawn much attention in China in the past decades, mainly because of its adverse impacts on ambient visibility and public health, such as the increase in mortality and morbidity related to acute respiratory diseases and chronic cardiovascular ailments among the population1,2,3,4,5,6. The toxicity of PM2.5 could be attributed to its specific chemical compositions, among which some transition metals (e.g., Cu, Mn, and Fe) deserve more attention for their catalytical nature. Once inhaled into human lungs, they play significant roles in the production of reactive oxygen species (ROS) and the consequential oxidative stress is one probable cause for PM2.5-induced organ injuries7,8,9,10,11,12. Besides their adverse health impacts, airborne transition metals are also recognized for their involvement in the secondary formation of sulfate in the atmosphere13,14. As a result, it is of great scientific interest to investigate the spatial distributions as well as major sources of trace metals. Such knowledge will lay a foundation for formulating cost-effective emission control policies.

Despite the environmental importance of trace metals, there is still a lack of precise and comprehensive source apportionment techniques designated for trace metals in PM2.5. In the calculation of source contributions to trace metals, there exist two distinct groups of methods, namely receptor-based models and chemical transport models (CTMs). Chemical mass balance and positive matrix factorization (PMF) are the two most extensively used receptor-based models in resolving the source contributions to PM2.5 and its components15,16,17,18,19. These models depend on the detailed measurement data of particle chemical composition and offer relatively reliable source contribution estimations. However, due to the scarcity of observation data, the receptor-based methods are inadequate in providing spatially resolved source apportionment results.

In contrast, CTMs equipped with source contribution modules can effectively generate regional distributions of pollutant concentrations and determine the contributions from different source sectors20,21,22,23. This advantage comes with the demand of a large number of spatially and temporally resolved input data, including the meteorological conditions and emission inventories. Another point to note is that the accuracy of source apportionment results is greatly influenced by the rigorousness of emission inventories. While metal emission inventories in China have been reported in some literature24,25,26,27, they have not been carefully evaluated and are expected to have large uncertainties. For instance, emission inventories of several heavy metals in PM2.5 in Guangdong Province, China for 2015 are estimated in two parallel studies by Liu et al. (2018)25 and Liu et al. (2021)27, as shown in Supplementary Fig. 1. However, the overall emission amounts of Mn, Cu, and Pb differ significantly by as much as a factor of three. Recently, hybrid models have been developed to overcome the disadvantages of these two types of methods28,29,30,31,32,33 (Supplementary Note 1). In hybrid models, observation data are used to reconcile CTM outputs or even infused into CTM results for new information. In Hu et al.28 and Ivey et al.29,30, CTM source apportionment results of PM2.5 species were adjusted to better match the observation data by introducing the adjustment factors, whereas Ying et al.32 used CTM source contributions to primary PM2.5 (PPM2.5) and measurement data of trace metals to deduce the nominal source profiles and consequentially the source apportionment information of those species. These works shed light on the possibility of producing more accurate regional source apportionment results.

This study aims to establish a protocol for regional source apportionment of trace metals using observation-constrained source-specific mass fractions. Here we consider a case in which observation data from multiple sites with different pollution characteristics are available and incorporated into model development. A comprehensive comparison of our method with those previously reported hybrid models28,29,30,31,32 is described in Supplementary Note 1, with details of the new model described in Section M3. We use the observation data collected for 10 monitoring sites in the Pearl River Delta (PRD) region, China during 2015 (see Table 1 and Supplementary Fig. 2 for information of sampling locations), together with the source apportionment results of PPM2.5 from a source-oriented Community Multiscale Air Quality Modeling System (CMAQ) in this region (Supplementary Fig. 3). The PRD region, home to several mega cosmopolises such as Hong Kong, Shenzhen, Guangzhou, and Macao, is one of the most economically advanced and densely populated areas on the southern coastal line in China34, where an extensive air quality monitoring network has been installed35. With the ongoing establishment of more monitoring stations across China36,37, the hybrid model proposed in this study can be readily applied to other areas to generate region-specific source profiles as well as source contributions for primary species in PM2.5.

To first demonstrate the improved prediction accuracy of the new hybrid model, Fig. 1 compares the model predictions against ambient measurements using three existing methods as well as the new hybrid model. Method A uses original PPM2.5 source contributions from CMAQ together with source profiles mapped from US EPA SPECIATE database38; method B utilizes original PPM2.5 contributions with profiles mapped from Liu et al. (2018)25. Supplementary Note 2 describes the synthesis of such source profiles in both methods A and B. Meanwhile, results from using the hybrid model in Ying et al.32 and the new model established in this study are denoted as methods C and D, respectively. Note that 10-fold cross validation, instead of the all-inclusive modelling, is used in methods C and D for model evaluation (Section M3).

a Seasonal averages of modelled concentrations of trace metals, POC, and EC vs. ambient measurements using four methods. (b) Mean fractional error (MFE) and mean fractional bias (MFB) for predicted daily concentrations from the four methods. Dash lines in panel (a) are one-to-one-ratio lines. Note that many dots from methods C and D are overlapped because of their close numbers.

Panel (a) in Fig. 1 gives the scatter plots of seasonally averaged predictions vs. measurements of each primary species. Data points from methods A and B tend to deviate notably from the one-to-one-ratio lines in all seasons, whereas data of methods C and D align well along the dash lines. For instance, the measured concentration of Cu is 21 ng/m3 in winter, and the predictions are 29 ng/m3 from method A, 6 ng/m3 from method B, 21 ng/m3 from method C, and 20 ng/m3 from method D (i.e., our new hybrid model). All seasonal measurements and predictions can be found in Supplementary Table 1.

Figure 1b compares the model errors (mean fractional error, MFE) and model bias (mean fractional bias, MFB). Both methods A and B give conspicuously worse predictions than the others. Between C and D, even though their model errors are very close to each other, results from the new hybrid model generally yield smaller model biases. The improved model performance of method D is a result of benefits from the revised regression model. To sum up, out of the 20 primary species analysed in this section, method D predicts 13 species with smaller model errors than method C, 16 species with less biases, and 14 species with higher correlation coefficients with the measurements. Supplementary Table 2 provides thorough information on the model evaluation metrics. Above results demonstrate the improved prediction accuracy of the new observation-constrained hybrid model.

With the refined source contributions and observation data of primary species, we deduce their mass fractions in emitted PPM2.5. Averages and relative standard deviations (RSDs) from the Monte Carlo simulation are shown in Table 2. For validation, the deduced source profiles are first compared to profiles from four published literature sources, i.e., Liu et al. (2018)25, Ying et al.32, US EPA source profiles38, and Liu et al. (2017)39 in Supplementary Tables 3 to 9. Note that the non-PRD source sector is excluded in comparison since it is a geographically lumped source.

For the area source sector, the deduced mass fractions of POC and EC are 40.1% and 12.1%, respectively. Both are within the selected area source profiles from the US EPA (8.41–49.0 % for POC and 1.52–77.1% for EC) in Supplementary Table 3. Nor do the heavy metals’ mass fractions show much peculiarity compared to the US EPA profiles38 or those presented in Liu et al. (2018)25 (Supplementary Table 4). In terms of the deduced mobile source profile in Supplementary Table 5, both POC mass fraction (49.5%) and EC fraction (42.6%) are comparable to numbers in the light-duty diesel vehicle (LDDV) exhaust source profile in the US EPA database, while Fe (2.22%) is highly compatible to the mobile source in Ying et al.32 (1.99% for Fe), and the vehicular exhaust source in Liu et al. (2017)39 (1.89% for Fe). Comparisons of other source sectors (Supplementary Table 6 to Supplementary Table 9) are not described one-by-one here to avoid verboseness. However, we would like to comment on the notable differences between the deduced sea salt source profile and US EPA sea salt data38 in Supplementary Table 9. The discrepancies in Cl and Na mass fractions mainly come from the fact that in this study, the mass fractions are derived from filter-based daily observation data, which are affected by chlorine depletion40,41. In contrast, the US EPA sea salt data38 are compiled for fresh sea salt particles, hence cannot be directly compared with our results.

With each species, the relative uncertainties of estimated mass fractions are generally acceptable for their major sources but rather high for less significant sources. For example, results in the next section show that the major contributors to ambient Fe are area sources, power generation, and mobile sources, and the relative uncertainties of their mass fractions are all between 5% and 10%. In comparison, Mn has area sources, non-PRD emission, and power generation as its principal sources, whose mass fractions of Mn have relative uncertainties between 7% and 27%.

Additionally, the derived source profiles show satisfactory mass closures for power generation (102%), mobile (118%), marine (119%), and area (85%) sources (Supplementary Table 10). Only industry source has a rather poor mass closure. This is noteworthy considering that no additional constraints are imposed in deriving these source profiles as the mass fractions are calculated independently for every single species. Given this study's emphasis on the source apportionment of trace metals, it is less important to guarantee mass closure in source profiles.

Seasonal averages of source contributions to EC, Fe, Cu, and Mn in 6 site groups are illustrated in Fig. 2. EC is included in the discussion for model verification, since it is a major primary species in PM2.5, whose measurement data are relatively reliable with well explored emission sources. The three metals are well known for their capability to induce the generation of ROS inside lung fluid and hence of epidemiological importance. Supplementary Fig. 5a to Supplementary Fig. 5m show the detailed source apportionment results of 13 primary species at the monitoring sites; each figure includes the comparisons between hybrid model predicted concentrations vs. observation data on a daily basis, seasonal averages of source contributions, and model evaluation metrics. The number of valid observations for each species varies because outliers and data below minimal detectable limits (MDLs) are excluded (Supplementary Fig. 4). Our source apportionment results and comparison with previous source apportionment studies in the same region42,43,44,45,46,47,48,49,50,51,52,53 are described below.

(a) EC, (b) Fe, (c) Cu, and (d) Mn.

Although POC and EC are not trace elements, they provide an additional evaluation of the method used in our study. For POC, the overall MFE is 0.34 (0.30–0.44 for different site groups). Mobile (32%) and area (31%) sectors are the two major local sources, while the non-PRD contribution is approximately 12% (Supplementary Fig. 5a). For EC (overall MFE = 0.38), predictions at sites in Hong Kong are satisfactory. In comparison, predictions at non-Hong Kong sites, i.e., MC (MFE = 0.55) and NS (MFE = 0.69), are notably more deviated from the observations (Fig. 2a and Supplementary Fig. 5b). This is probably because PPM2.5 apportionment results at non-Hong Kong sites are less optimal, and the mass fractions of EC might be location-dependent among these monitoring stations. For instance, the two Macao sites lie in the vicinity of a heliport (~80 m) and are close to the Macao International Airport (~1 km), both of which are important EC sources. Averaged across all monitoring sites, mobile (43%) and marine (27%) emissions are the top two local sources of EC, while the contribution from the non-PRD region is 15%. Since both POC and EC are the major components in PPM2.5, the non-PRD contributions resolved here correspond well with the study by Lu et al.54. The high contribution of mobile and marine emissions to EC is generally consistent with those reported by Huang et al.45 (42% vehicle and 18% marine) based on daily samples across six monitoring sites in the PRD region in 2015. Wang et al.52 used the PMF analysis for sampling data at Macao and concluded that 64% of EC can be attributed to a mixed source of vehicle and helicopter emissions.

The origins of Fe in PM2.5 are rather diverse in the PRD region. Area, power generation, and mobile sources are the most important, with contributions of 41%, 26%, and 26%, followed by contributions from marine and industry sources of 7% and 3% (Fig. 2b and Supplementary Fig. 5i). Higher contributions from the mobile source are expected at the near-road site MK. While mobile source contributions are relatively low at NS, higher impacts from the area and power generation sources still cause a high Fe level at this site. The hybrid model-derived source apportionment results for Fe are in good agreement with those derived from the PMF receptor modeling, which are available for Hong Kong from the study by Chow et al.43, with 35% of Fe from vehicular emission, and 32% from industrial/coal combustion.

These two transition metal species play important catalytical roles in atmospheric aqueous chemistry7,8,9,10,11. The model-predicted Cu is in good agreement with observations, with an overall Pearson's R equal to 0.66 and an MFE of 0.52, while the predictions at NS are appreciably worse (MFE = 0.82). On average, area (31%), industry (27%), and power plant (20%) sources constitute more than 75% of its concentration (Fig. 2c and Supplementary Fig. 5k). On the other hand, the reconstruction of Mn is acceptable in most locations with an overall MFE of 0.42, whereas the prediction error at NS (MFE = 0.61) is somewhat higher. Local sources of Mn account for about 75% of the total Mn, with major contributions from area (40%), power (17%), and marine emission (13%) (Fig. 2d and Supplementary Fig. 5h). Only a small selection of source apportionment studies in the PRD includes these two species. Fu et al.44, Huang et al.46, and Tao et al.49 all conclude that Cu and Mn are mostly attributed to traffic emission (60–74 % of Cu and 33–52 % of Mn). On the other hand, Tan et al.48 suggest that 61% of Cu and 49% of Mn are dust-related. Note that these receptor-based source apportionment results are inconsistent with the emission inventories, which have large variations among themselves. For example, emission inventories of heavy metals in Guangdong developed by Liu et al. (2018)25 imply that power (23%) and industry (45%) are the major sources for Cu, and industry (73%) for Mn. In Liu et al. (2021)27, industry (39%) and brake/tire wire (36%) are recognized as important sources for Cu, while industry (53%) is the dominant source for Mn. These disparities among different methodologies raise questions about the applicability of emission inventories of these trace metals, as well as urging the development of region-specific measurement-based source profiles. In comparison, the new model not only ensures model performance by validating results against ambient measurements (Supplementary Figs. 5k and 5h), and also provides uncertainty information in estimated source profiles (Table 2).

V and Ni, mainly emitted from residual oil combustion, are exclusively used as the tracers for marine ship emission in many receptor-based model studies in the PRD region42,43,44,45,46,47,49,50,51,52. According to a study on ship emissions in the PRD region, 88–91% of V and 54–63% of Ni is associated with this source49. This is consistent with our hybrid model results illustrated in Supplementary Fig. 5g and Supplementary Fig. 5j, where marine emission alone could contribute to 93–97% of total V and 74–91 % of Ni at these monitoring stations.

These two trace metals are often used as tracers for industrial sources in receptor-based model studies in the PRD region43,44,45,46,47,48,49,50,51,52,53, thus discussed together here (Supplementary Fig. 5l and Supplementary Fig. 5m). Like many other primary species, model performance at NS and MC are generally worse than others, although their overall model errors (0.44 for Zn and 0.37 for Pb) are low. Our results suggest that the industry sector is the top contributor to both Zn (33%) and Pb (42%). Additionally, non-PRD emissions also greatly influence Zn (25%) and Pb (33%) concentrations. Other source apportionment studies found that 46–92% of Zn and 38–98% of Pb are related to industrial sources in the PRD region43,45,48,50,51, whereas emission inventories estimate the industrial emissions of Zn and Pb are 59–76% and 68–74%, respectively25,27.

As for the seasonal variations, species such as Cu, Mn, Zn, and Pb have their highest concentrations in winter and lowest in summer, in consistent with their main sources being on-land sources from the north. On the contrary, elements contributed largely by marine vessels or sea salt emissions from the south reach their lowest concentrations in winter, as observed in V, Ni, and Na. Species from both sources, such as EC, do not show obvious seasonal patterns. This phenomenon has been examined in the past and linked with the monsoons in the PRD region: the north-northeast monsoons prevail from October to March, and south-southwest monsoons from April to September, leading to different seasonal patterns in concentration and source contribution variations55,56,57.

The source contributions to PPM2.5 apportioned by CMAQ in D3 with the original emission inventory (Supplementary Fig. 4) are improved using gridded scaling factors interpolated from those determined at the monitoring sites by inverse distance weighting (see Section M3). Notably, area and industrial sources are the most prominent contributors to PPM2.5 in the PRD region (Supplementary Fig. 6). For example, in Guangzhou, Foshan, and Zhuhai, the annual average area source contribution to PPM2.5 can exceed 10 μg/m3, while industrial contribution can surpass 6 μg/m3 in Foshan. Consequentially, in 2015 the annual PPM2.5 concentration is more than 15 μg/m3 in a large area covering southern Guangzhou, Foshan, Dongguan, Zhuhai, and Shenzhen, with concentrations as high as 30 μg/m3 in some small districts.

Annually averaged source contributions to Cu and Mn from different source sectors as well as their annual concentrations in the D3 domain are depicted in Fig. 3 and Fig. 4, respectively. Similar to the results at the monitoring stations, the most important sources for Cu and Mn in the PRD region are area, power generation, and industry sources, whereas non-PRD emission also contributes a substantial portion for Mn. Figure 3 implies that for some places in Guangzhou, Foshan, Dongguan, and Zhongshan, Cu concentration exceeds 50 ng/m3. For Mn, ambient concentrations of more than 30 ng/m3 are predicted in these areas. Since Cu and Mn are recognized to induce the generation of ROS, such high concentrations of Cu and Mn raise concerns about the potential of elevated aerosol toxicity in these districts. Other species in the source profiles can be readily analyzed in the same manner.

(a) Area sources, (b) industrial emission sector, (c) power generation emission, and (d) ambient concentration.

(a) Area sources, (b) non-PRD emission, (c) power generation emission, (d) industrial emission sector, (e) marine vessel emission, and (f) ambient concentration.

Although the hybrid model results agree well with the observation data for many species at most monitoring stations in this PRD case study, the model was unable to accurately reconstruct the observation data for NS and MC groups. One plausible explanation is that the emission inventory outside Hong Kong is problematic and cannot be remedied simply with scaling factors. Even so, when the combined Hong Kong and Nansha data are used to deduce mass fractions, the results are reliable enough to show excellent performance at other sites, further demonstrating our model's robustness. Herein, the implication for future model refinement is that an accurate emission inventory will always be a vital requirement in CTMs and any CTM-based hybrid model for source apportionment. One possible way to improve the emission inventories is to further examine the scaling factors and mass fractions constrained by the measurement data. For instance, an observation-constrained scaling factor larger than 1 is likely caused by the underestimation of its emission sources.

Finally, this case study has shown the advantages of our statistically tailored regression model33 in dealing with atmospheric data. Compared to the hybrid model by Ying et al.32, our method shows improved prediction accuracy in 10-fold cross validation. More trials could be done to test this regression method for other linear models in atmospheric sciences.

Table 1 and Supplementary Fig. 2 show the geographical locations of the monitoring stations where daily observation data are collected. Briefly speaking, 24-h filter-based sampling of PM2.5 is conducted every 6 days throughout 2015 at 10 monitoring stations spread in three cities in the PRD region, including 1 site in Guangzhou (Nansha; hereafter NS), 2 sites in Macao (Taipa Grande and Porto Exterior; MC), and 7 sites in Hong Kong (Central/West, Tung Chung, Tsuen Wan, Yuen Long, Kwai Chung, Mong Kok, and Clear Water Bay). Stations in Hong Kong are divided into four groups: HKc (denoting combined HK general urban sites, including Central/West, Tung Chung, Tsuen Wan, and Yuen Long), KC (Kwai Chung; near the City's shipping container terminal), MK (Mong Kok; downtown roadside and traffic impacted), and WB (Clear Water Bay; suburban). All data are categorized into 6 groups as per Table 1 in modeling and further analysis.

Sampling protocol and chemical analysis details have been described in the annual reports of PM2.5 speciation by the Hong Kong Environmental Protection Department (HKEPD) [e.g., ref. 58] and some previous studies43,59. In short, PM2.5 total mass concentration is determined gravimetrically using a digital balance (Sartorius AG, Model MC 5-0CE, Göttingen, Germany); water-soluble major ions (e.g., sulfate and nitrate) are extracted from the filters using double de-ionized water and measured using an Ion Chromatograph (Dionex DX-500, Thermo Fisher Scientific, MA, USA); OC and EC were analyzed using thermal/optical transmittance method on an Aerosol OCEC analyzer (Sunset Laboratory, OR, USA); and trace elements (e.g., Fe, Cu, Mn, Br, and Pb) using an energy dispersive X-ray fluorescence (ED-XRF, Epsilon 5, PANalytical, The Netherlands). The sampling program generates ca 60 sets of daily samples at each site, resulting in a total of 594 sets of daily PM2.5 speciation data (Table 1). Analytical specifications such as MDLs of each species are shown in Supplementary Table 11, and the distributions of all daily concentrations are plotted in Supplementary Fig. 4.

The Weather Research and Forecasting model (WRF) v3.5 is applied for the meteorological simulation for the entire year of 2015 using the same model schemes and domain settings as the two previous studies54,60. In this project, the smallest D3 domain covers the entire PRD region with a horizontal resolution of 3 km, as shown in Fig. 4. A bottom-up 3-km resolution emission inventory of PPM2.5 and other major pollutants provided by the HKEPD is used for the D3 domain54. Outside the PRD region, the Regional Emission inventory in ASia v3.1 (REAS3)61 is used to generate anthropogenic emissions. Emissions of non-methane hydrocarbons from REAS3 are used to estimate the emissions of primary VOCs needed for the Carbon Bond 6 mechanism using selected speciation profiles from the SPECAITE database developed by the US EPA62. The biogenic emissions are generated using the Model for Emissions of Gaseous and Aerosols from Nature (MEGAN) v2.1063. Windblown dust and sea salt emissions are generated using the inline module in the CMAQ model64.

The simulation of PPM2.5 and its source apportionment are carried out via a source-oriented version of CMAQ (ver. 5.2.1) with non-reactive source-tagged tracers32,65 to determine the contributions of the area, mobile, industry, marine, and power generation sectors to PPM2.5 in the PRD region (Fig. 5). In this study, the area source sector includes emissions from residential, non-point industrial, off-road engine exhaust, fugitive road dust, and agricultural sources (Supplementary Note 2). PPM2.5 due to emissions from other regions in the D3 domain and upwind sources outside D3 is lumped into the non-PRD sector. In addition, sea salt and windblown dust contributions to PPM2.5 are also tracked as separate groups. An illustration of the spatial distributions of the annual average sector contributions to PPM2.5 is shown in Supplementary Fig. 3.

City acronyms in alphabetical order are DG Dongguan, FS Foshan, GZ Guangzhou, HK Hong Kong, HZ Huizhou, JM Jiangmen, MC Macao, SZ Shenzhen, ZH Zhuhai, ZQ Zhaoqing, and ZS Zhongshan.

The procedures in the new observation-constrained hybrid model are illustrated in Panel (a) in Fig. 6, while panel (b) shows details of the 10-fold cross validation for model verification. The steps are described below:

(a) flowchart of the base run, and (b) flowchart of the 10-fold cross validation to examine the prediction accuracy of the hybrid model.

Step 1. A Bayesian inference approach is employed to deduce the primary organic carbon (POC) and secondary organic carbon (SOC) levels in each observation, following the method described by Liao et al.66

Step 2. Similar to Ying et al.32, the measurement-based primary PM2.5 (PPM2.5) concentration is estimated via subtracting the secondary inorganic and organic components from PM2.5.

Step 3. CMAQ-deduced source contributions to PPM2.5 are adjusted with source-specific scaling factors (\({\beta }_{j},{j}=1,\,2,\,\ldots ,{m}\)) as per Eq. 1 to improve the agreement between the predicted total PPM2.5 and the measurement-based PPM2.532,

where \({c}_{i}\) represents the measurement-based PPM2.5 concentration for the \({i}^{{th}}\) observation, and \({b}_{i,j}\) denotes the source contribution from the \({j}^{{th}}\) source sector for the \({i}^{{th}}\) observation estimated by CMAQ. The scaling factor calculation is performed individually for each site group to account for the spatial variability of the emission estimation biases. Data points with the fractional error (FE) between CMAQ-predicted PPM2.5 and measurement-based PPM2.5 larger than 0.8 are excluded.

The adjustment factors are determined via minimizing the objective function \(\widetilde{Q}\) in Eq. 2.

The first part of the equation, \({Q}_{0}\), is a unique function with log-transformation, representing the difference between CMAQ-predicted and measurement-based PPM2.5. As explained by Liao et al.33, using log-transformed objective function better estimates the regression coefficients for log-normally distributed atmospheric data. \({Q}_{p}\) in Eq. 2 is the penalty term (also known as the regularization), included to overcome potential overfitting problems. Otherwise, unrealistically large or small β values are likely to merge (Supplementary Note 3). The L-curve method is utilized to find the optimal \(\lambda\) value in Eq. 267(Supplementary Note 3).

The robustness of calculated scaling factors is evaluated using a Monte Carlo simulation technique. The relative uncertainties of \({c}_{i}\) (i.e., measurement-based PPM2.5) and \({b}_{i,j}\) (i.e., CMAQ source contributions to PM2.5) in Eq. 2 are set to 20% and 40%, respectively32. In each trial, randomly generated errors are added onto \({c}_{i}\) and \({b}_{i,j}\), whence one set of scaling factors (\({\beta }_{j}\)) is determined. In Supplementary Note 3, we analyze the uncertainties of \({\beta }_{j}\) from such Monte Carlo simulation trials. Eventually, the arithmetic means of scaling factors (\({\bar{\beta }}_{j}\)) are used to produce adjusted source contributions (\({{b\text{'}}}_{i,j}\)) for ensuing computation (Eq. 3).

Step 4. The deduction of mass fraction (\({\varphi }_{j,k}\)) of a give primary species (\(k\)) in the \({j}^{{th}}\) source emitted PPM2.5 follows Eq. 4,

where \({c}_{i,k}\) denotes its ambient level in the \({i}^{{th}}\) observation, and \({{b\text{'}}}_{i,j}\) is the scaled source contribution to PPM2.5 from Eq. 3. Step 4 differs from Step 3 in two aspects: first, contrary to the site-dependent scaling factors (\({\beta }_{j}\)), we assume that the mass fraction (\({\varphi }_{j,k}\)) of a species from the same source sector is uniform within the region; and second, the penalty term is no longer applicable, whereas upper and lower bounds of \({\varphi }_{j,k}\) are set to [0,1] in Eq. 4. The objective function in Eq. 5 is minimized to determine the optimal set of \({\varphi }_{j,k}\) values for one species at a time. Similar to Step 3, Monte Carlo simulation is carried out to estimate the uncertainties of deduced mass factions. The arithmetic means (\({\bar{\varphi }}_{j,k}\)) from simulations is used in Step 5.

Step 5. The product (\({{SC}}_{i,j,k}\)) in Eq. 6 is deemed as the source contribution from the \({j}^{{th}}\) source sector to a primary pollutant (\(k\)) in the \({i}^{{th}}\) observation.

By applying the PPM2.5 scaling factors derived in Step 3 and source profiles of primary species in Step 4, regional concentrations of the metal species and their source contributions in the modeling domain can be determined. Since the PPM2.5 scaling factors determined in Step 3 are location dependent, an inverse distance weighting interpolation scheme (Eq. 7), is used to estimate the scaling factors at all grid cells in the entire domain68,69.

where \(\vec{x}\) and \({\vec{x}}_{l}\) are the position vector of a grid cell and the \({l}^{{th}}\) monitor site, respectively. \({\beta }_{j}\) is the interpolated scaling factor for the \({j}^{{th}}\) PPM2.5 source.

10-fold cross validation is conducted to examine the prediction accuracy of this hybrid model (Fig. 6b). Since daily observation data are measured at precisely 10 monitoring stations, each trial uses data at nine sites as the training set and the remaining one as the testing set. That is, CMAQ-modeled PPM2.5 contributions and ambient measurement data in the training set are utilized as model inputs to generate location-dependent scaling factors as well as regional source profiles. These scaling factors are later extrapolated to the testing site using Kriging (Eq. 7). With adjusted CMAQ PPM2.5 source contributions and source profiles, predictions of primary species are made possible for testing set, and hereafter compared to daily measurements.

All data used in this study are available from the authors upon reasonable request.

A sample R script for hybrid modeling is publicly available on a GitHub repository and can be accessed via this link (https://github.com/CoCobalt27/loglm_hybridmodel).

Pui, D. Y. H., Chen, S.-C. & Zuo, Z. PM2.5 in China: measurements, sources, visibility and health effects, and mitigation. Particuology 13, 1–26 (2014).

Article Google Scholar

Luan, T., Guo, X., Guo, L. & Zhang, T. Quantifying the relationship between PM2.5 concentration, visibility and planetary boundary layer height for long-lasting haze and fog–haze mixed events in Beijing. Atmos. Chem. Phys. 18, 203–225 (2018).

Article Google Scholar

Cohen, A. J. et al. Estimates and 25-year trends of the global burden of disease attributable to ambient air pollution: an analysis of data from the global burden of diseases Study 2015. Lancet 389, 1907–1918 (2017).

Article Google Scholar

Wang, Q., Wang, J., He, M. Z., Kinney, P. L. & Li, T. A county-level estimate of PM2.5 related chronic mortality risk in China based on multi-model exposure data. Environ. Int. 110, 105–112 (2018).

Article Google Scholar

Hoek, G. et al. Long-term air pollution exposure and cardio- respiratory mortality: a review. Environ. Health 12, 43 (2013).

Article Google Scholar

Li, T. et al. Estimating mortality burden attributable to short-term PM2.5 exposure: a national observational study in China. Environ. Int. 125, 245–251 (2019).

Article Google Scholar

Lin, M. & Yu, J. Z. Effect of metal-organic interactions on the oxidative potential of mixtures of atmospheric humic-like substances and copper/manganese as investigated by the dithiothreitol assay. Sci. Total Environ. 697, 134012 (2019).

Article Google Scholar

Lin, M. & Yu, J. Z. Assessment of interactions between transition metals and atmospheric organics: ascorbic acid depletion and Hydroxyl radical formation in organic-metal mixtures. Environ. Sci. Technol. 54, 1431–1442 (2020).

Article Google Scholar

Rodrigo-Moreno, A., Poschenrieder, C. & Shabala, S. Transition metals: a double edge sward in ROS generation and signaling. Plant Signal. Behav. 8, e23425 (2013).

Article Google Scholar

See, S. W., Wang, Y. H. & Balasubramanian, R. Contrasting reactive oxygen species and transition metal concentrations in combustion aerosols. Environ. Res. 103, 317–324 (2007).

Article Google Scholar

Shafer, M. M. et al. Reactive oxygen species activity and chemical speciation of size-fractionated atmospheric particulate matter from Lahore, Pakistan: an important role for transition metals. J. Environ. Monit. 12, 704–715 (2010).

Article Google Scholar

Valavanidis, A., Vlachogianni, T., Fiotakis, K. & Loridas, S. Pulmonary oxidative stress, inflammation and cancer: respirable particulate matter, fibrous dusts and ozone as major causes of lung carcinogenesis through reactive oxygen species mechanisms. Int. J. Environ. Res. Public. Health 10, 3886–3907 (2013).

Article Google Scholar

Fu, X. et al. Modeling analysis of secondary inorganic aerosols over China: pollution characteristics, and meteorological and dust impacts. Sci. Rep. 6, 35992 (2016).

Article Google Scholar

Martin, L. R. & Good, T. W. Catalyzed oxidation of sulfur dioxide in solution: the iron-manganese synergism. Atmos. Environ. Part Gen. Top. 25, 2395–2399 (1991).

Article Google Scholar

Paatero, P. & Tapper, U. Positive matrix factorization: a non-negative factor model with optimal utilization of error estimates of data values. Environmetrics 5, 111–126 (1994).

Article Google Scholar

Hopke, P. K. Review of receptor modeling methods for source apportionment. J. Air Waste Manag. Assoc. 66, 237–259 (2016).

Article Google Scholar

Chow, J. C. & Watson, J. G. Review of PM2.5 and PM10 apportionment for fossil fuel combustion and other sources by the chemical mass balance receptor model. Energy Fuels 16, 222–260 (2002).

Article Google Scholar

Belis, C. A., Karagulian, F., Larsen, B. R. & Hopke, P. K. Critical review and meta-analysis of ambient particulate matter source apportionment using receptor models in Europe. Atmos. Environ. 69, 94–108 (2013).

Article Google Scholar

Zhang, Y., Cai, J., Wang, S., He, K. & Zheng, M. Review of receptor-based source apportionment research of fine particulate matter and its challenges in China. Sci. Total Environ. 586, 917–929 (2017).

Article Google Scholar

Ying, Q. & Kleeman, M. J. Source contributions to the regional distribution of secondary particulate matter in California. Atmos. Environ. 40, 736–752 (2006).

Article Google Scholar

Wagstrom, K. M., Pandis, S. N., Yarwood, G., Wilson, G. M. & Morris, R. E. Development and application of a computationally efficient particulate matter apportionment algorithm in a three-dimensional chemical transport model. Atmos. Environ. 42, 5650–5659 (2008).

Article Google Scholar

Wang, Z. S., Chien, C.-J. & Tonnesen, G. S. Development of a tagged species source apportionment algorithm to characterize three-dimensional transport and transformation of precursors and secondary pollutants. J. Geophys. Res. Atmos. 114, D21206 (2009).

Article Google Scholar

Kwok, R. H. F., Napelenok, S. L. & Baker, K. R. Implementation and evaluation of PM2.5 source contribution analysis in a photochemical model. Atmos. Environ. 80, 398–407 (2013).

Article Google Scholar

Tian, H. Z. et al. Quantitative assessment of atmospheric emissions of toxic heavy metals from anthropogenic sources in China: historical trend, spatial distribution, uncertainties, and control policies. Atmos. Chem. Phys. 15, 10127–10147 (2015).

Article Google Scholar

Liu, Y., Xing, J., Wang, S., Fu, X. & Zheng, H. Source-specific speciation profiles of PM2.5 for heavy metals and their anthropogenic emissions in China. Environ. Pollut. 239, 544–553 (2018).

Article Google Scholar

Sha, Q. et al. Anthropogenic atmospheric toxic metals emission inventory and its spatial characteristics in Guangdong province, China. Sci. Total Environ. 670, 1146–1158 (2019).

Article Google Scholar

Liu, S. et al. Significant but spatiotemporal-heterogeneous health risks caused by airborne exposure to multiple toxic trace elements in China. Environ. Sci. Technol. 55, 12818–12830 (2021).

Google Scholar

Hu, Y. et al. Fine particulate matter source apportionment using a hybrid chemical transport and receptor model approach. Atmos. Chem. Phys. 14, 5415–5431 (2014).

Article Google Scholar

Ivey, C. E., Holmes, H. A., Hu, Y. T., Mulholland, J. A. & Russell, A. G. Development of PM2.5 source impact spatial fields using a hybrid source apportionment air quality model. Geosci. Model Dev. 8, 2153–2165 (2015).

Article Google Scholar

Ivey, C. et al. Development of PM2.5 source profiles using a hybrid chemical transport-receptor modeling approach. Environ. Sci. Technol. 51, 13788–13796 (2017).

Article Google Scholar

Bates, J. T. et al. Source impact modeling of spatiotemporal trends in PM2.5 oxidative potential across the eastern United States. Atmos. Environ. 193, 158–167 (2018).

Article Google Scholar

Ying, Q. et al. Improve regional distribution and source apportionment of PM2.5 trace elements in China using inventory-observation constrained emission factors. Sci. Total Environ. 624, 355–365 (2018).

Article Google Scholar

Liao, K. et al. A multiple linear regression model with multiplicative log-normal error term for atmospheric concentration data. Sci. Total Environ. 767, 144282 (2021).

Article Google Scholar

Yeh, A. G. O. & Li, X. Economic development and agricultural land loss in the Pearl River Delta, China. Habitat Int. 23, 373–390 (1999).

Article Google Scholar

Zhong, L. et al. The Pearl River Delta regional air quality monitoring network - regional collaborative efforts on joint air quality management. Aerosol Air Qual. Res. 13, 1582–1597 (2013).

Article Google Scholar

Bai, K., Li, K., Wu, C., Chang, N.-B. & Guo, J. A homogenized daily in situ PM2.5 concentration dataset from the national air quality monitoring network in China. Earth Syst. Sci. Data 12, 3067–3080 (2020).

Article Google Scholar

Liu, G., Dong, X., Kong, Z. & Dong, K. Does national air quality monitoring reduce local air pollution? The case of PM2.5 for China. J. Environ. Manag. 296, 113232 (2021).

Article Google Scholar

Hsu, Y. K. & Divita, F. SPECIATE 4.2 - Speciation Database Development Documentation. https://swap.stanford.edu/was/20110202080108/http://www.epa.gov/nrmrl/pubs/600r09038/600r09038.pdf (2009).

Liu, Y. et al. China source profile shared service (CSPSS): The Chinese PM2.5 database for source profiles. Aerosol Air Qual. Res. 17, 1501–1514 (2017).

Article Google Scholar

Laskin, A. et al. Tropospheric chemistry of internally mixed sea salt and organic particles: Surprising reactivity of NaCl with weak organic acids. J. Geophys. Res. Atmos. 117, D15302 (2012).

Article Google Scholar

Adachi, K. & Buseck, P. R. Changes in shape and composition of sea-salt particles upon aging in an urban atmosphere. Atmos. Environ. 100, 1–9 (2015).

Article Google Scholar

Cheng, Y., Ma, Y. & Hu, D. Tracer-based source apportioning of atmospheric organic carbon and the influence of anthropogenic emissions on secondary organic aerosol formation in Hong Kong. Atmos. Chem. Phys. 21, 10589–10608 (2021).

Article Google Scholar

Chow, W. S. et al. Molecular and elemental marker-based source apportionment of fine particulate matter at six sites in Hong Kong, China. Sci. Total Environ. 813, 152652 (2022).

Article Google Scholar

Fu, S. et al. Insights into the source-specific health risk of ambient particle-bound metals in the Pearl River Delta region, China. Ecotoxicol. Environ. Saf. 224, 112642 (2021).

Article Google Scholar

Huang, X.-F. et al. Exploration of PM2.5 sources on the regional scale in the Pearl River Delta based on ME-2 modeling. Atmos. Chem. Phys. 18, 11563–11580 (2018).

Article Google Scholar

Huang, J. et al. Source apportionment of carbonaceous aerosols using hourly data and implications for reducing PM2.5 in the Pearl River Delta region of South China. Environ. Res. 210, 112960 (2022).

Article Google Scholar

Kuang, B. Y., Lin, P., Huang, X. H. H. & Yu, J. Z. Sources of humic-like substances in the Pearl River Delta, China: positive matrix factorization analysis of PM2.5 major components and source markers. Atmos. Chem. Phys. 15, 1995–2008 (2015).

Article Google Scholar

Tan, J. et al. Long-term trends of chemical characteristics and sources of fine particle in Foshan City, Pearl River Delta: 2008–2014. Sci. Total Environ. 565, 519–528 (2016).

Article Google Scholar

Tao, J. et al. Source apportionment of PM2.5 at urban and suburban areas of the Pearl River Delta region, south China - With emphasis on ship emissions. Sci. Total Environ. 574, 1559–1570 (2017).

Article Google Scholar

Wang, Q. Q. et al. Organic tracer-based source analysis of PM2.5 organic and elemental carbon: A case study at Dongguan in the Pearl River Delta, China. Atmos. Environ. 118, 164–175 (2015).

Article Google Scholar

Wang, Q. et al. Impact of Secondary Organic Aerosol Tracers on Tracer-Based Source Apportionment of Organic Carbon and PM2.5: A Case Study in the Pearl River Delta, China. ACS Earth Space Chem. 1, 562–571 (2017).

Article Google Scholar

Wang, Q. et al. Source apportionment of fine particulate matter in Macao, China with and without organic tracers: A comparative study using positive matrix factorization. Atmos. Environ. 198, 183–193 (2019).

Article Google Scholar

Zou, B.-B. et al. Source apportionment of PM2.5 pollution in an industrial city in southern China. Atmos. Pollut. Res. 8, 1193–1202 (2017).

Article Google Scholar

Lu, X. et al. Differences in concentration and source apportionment of PM2.5 between 2006 and 2015 over the PRD region in southern China. Sci. Total Environ. 673, 708–718 (2019).

Article Google Scholar

Qin, Y., Chan, C. K. & Chan, L. Y. Characteristics of chemical compositions of atmospheric aerosols in Hong Kong: spatial and seasonal distributions. Sci. Total Environ. 206, 25–37 (1997).

Article Google Scholar

Yu, J. Z. et al. Abundance and seasonal characteristics of elemental and organic carbon in Hong Kong PM10. Atmos. Environ. 38, 1511–1521 (2004).

Article Google Scholar

Kwok, R. H. F., Fung, J. C. H., Lau, A. K. H. & Fu, J. S. Numerical study on seasonal variations of gaseous pollutants and particulate matters in Hong Kong and Pearl River Delta Region. J. Geophys. Res. Atmos. 115, D16308 (2010).

Article Google Scholar

Chow, J. C., Watson, J. G., Cropper, P. M., Wang, X. & Kohl, S. D. Fine Particulate Matter (PM2.5) Sample Chemical Analysis January 1, 2015 – December 31, 2015, Final Report. https://www.epd.gov.hk/epd/sites/default/files/epd/english/environmentinhk/air/studyrpts/files/final_report_mvtmpms_2015.pdf (2016).

Huang, X. H. H. et al. Characterization of PM2.5 major components and source investigation in suburban hong kong: a one year monitoring study. Aerosol Air Qual. Res. 14, 237–250 (2014).

Article Google Scholar

Lu, X., Yao, T., Li, Y., Fung, J. C. H. & Lau, A. K. H. Source apportionment and health effect of NOx over the Pearl River Delta region in southern China. Environ. Pollut. 212, 135–146 (2016).

Article Google Scholar

Ohara, T. et al. An Asian emission inventory of anthropogenic emission sources for the period 1980–2020. Atmos. Chem. Phys. 7, 4419–4444 (2007).

Article Google Scholar

Wang, P. et al. Source apportionment of secondary organic aerosol in China using a regional source-oriented chemical transport model and two emission inventories. Environ. Pollut. 237, 756–766 (2018).

Article Google Scholar

Guenther, A. et al. Estimates of global terrestrial isoprene emissions using MEGAN (Model of emissions of gases and aerosols from nature). Atmos. Chem. Phys. 6, 3181–3210 (2006).

Article Google Scholar

Foroutan, H. et al. Development and evaluation of a physics-based windblown dust emission scheme implemented in the CMAQ modeling system. J. Adv. Model. Earth Syst. 9, 585–608 (2017).

Article Google Scholar

Hu, J. et al. Source contributions and regional transport of primary particulate matter in China. Environ. Pollut. 207, 31–42 (2015).

Article Google Scholar

Liao, K., Wang, Q., Wang, S. & Yu, J. Z. Bayesian inference approach to quantify primary and secondary organic carbon in fine particulate matter using major species measurements. Environ. Sci. Technol. 57, 5169–5179 (2023).

Article Google Scholar

Hansen, P. C. & O’Leary, D. P. The use of the L-Curve in the regularization of discrete Ill-posed problems. SIAM J. Sci. Comput. 14, 1487–1503 (1993).

Article Google Scholar

Cressie, N. The origins of kriging. Math. Geol. 22, 239–252 (1990).

Article Google Scholar

Oliver, M. A. & Webster, R. Kriging: a method of interpolation for geographical information systems. Int. J. Geogr. Inf. Syst. 4, 313–332 (1990).

Article Google Scholar

Download references

This work was partially supported by the Hong Kong Research Grant Council (R6011-18) to J.Z.Y. and a Hong Kong Ph.D. Fellowship to K.L.; Q.Y. and J.Z. are partially supported by a grant from the National Institutes of Health (R01 ES029509). The CMAQ model simulations were performed using the computer clusters at the Texas A&M High Performance Research Computing (https://hprc.tamu.edu/).

Department of Chemistry, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong, 999077, China

Kezheng Liao & Jian Zhen Yu

Zachry Department of Civil and Environmental Engineering, Texas A&M University, College Station, TX, 77843-3136, USA

Jie Zhang & Qi Ying

Division of Environment and Sustainability, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong, 999077, China

Yiang Chen, Jimmy C. H. Fung & Jian Zhen Yu

Department of Geography and Resource Management, The Chinese University of Hong Kong, Shatin, Hong Kong, 999077, China

Xingcheng Lu

Department of Mathematics, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong, 999077, China

Jimmy C. H. Fung

You can also search for this author in PubMed Google Scholar

K.L. designed the hybrid model, analyzed the results and wrote the manuscript. J.Z. ran the CMAQ simulation in this study. Y.C., X.L. and J.C.H. Fung prepared emission inventory for CMAQ and WRF simulations. Q.Y. and J.Z.Y. revised and finalized the manuscript. All authors read and approved the final manuscript.

Correspondence to Qi Ying or Jian Zhen Yu.

The authors declare no competing interests.

Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and Permissions

Liao, K., Zhang, J., Chen, Y. et al. Regional source apportionment of trace metals in fine particulate matter using an observation-constrained hybrid model. npj Clim Atmos Sci 6, 65 (2023). https://doi.org/10.1038/s41612-023-00393-4

Download citation

Received: 01 January 2023

Accepted: 01 June 2023

Published: 09 June 2023

DOI: https://doi.org/10.1038/s41612-023-00393-4

Anyone you share the following link with will be able to read this content:

Sorry, a shareable link is not currently available for this article.

Provided by the Springer Nature SharedIt content-sharing initiative