Wind may either import air pollution, or it may dilute and disperse it. Plots of mean concentrations of particulate matter (PM) fractions measured at three types of monitoring sites show common trends as well as notable differences:
Marylebone Rd and N. Kensington are urban roadside and urban background sites, respectively. Plots for the two sites show negative correlations between wind speed and fine PM. Levels at N. Kensington are more sensitive to wind speed, declining steeply until they reach background levels. Wind shows no significant effect on the coarse fraction.
At Middlesbrough, an urban industrial site, levels of fine PM follow a similar trend to N. Kensington. For coarse PM, there is no significant correlation until wind speeds exceed 14 m/s, above which there is a steep linear increase. This may reflect the resuspension of dust from a nearby steelworks site, which is an important local source of PM10.
Analysis in Python
The FacetGrid class from the Seaborn library is useful for visualising the relationship between two variables where the relationship is conditioned by some other variable(s). It takes in a pandas DataFrame as the data source and draws multiple plots of the same relationship for different levels of a third variable. Different colours may be used to represent levels of another variable, as shown in a previous post:
To visualise the relationship between wind speed and the PM fractions, a DataFrame containing mean PM concentration values for each 1 m/s increment in wind speed can be created from the values returned by the groupby() method along with mean().
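As a rough sketch of this step, the snippet below bins wind speed into 1 m/s increments and averages each PM column within each bin. The DataFrame `hourly`, its column names (`ws`, `roadside_fine`, and so on) and the randomly generated values are all assumptions made for illustration; they are not the data used in this post.

```python
import numpy as np
import pandas as pd

# Hypothetical hourly observations: names and values are illustrative only.
rng = np.random.default_rng(0)
n = 500
hourly = pd.DataFrame({
    "ws": rng.uniform(0, 10, size=n),              # wind speed in m/s
    "roadside_fine": rng.uniform(10, 40, size=n),
    "roadside_coarse": rng.uniform(5, 20, size=n),
    "background_fine": rng.uniform(5, 30, size=n),
    "background_coarse": rng.uniform(5, 15, size=n),
})

# Assign each observation to a 1 m/s bin (0 means 0 to <1 m/s, and so on).
hourly["ws_bin"] = np.floor(hourly["ws"]).astype(int)

# Mean concentration of each PM column within each wind-speed bin.
pm_cols = ["roadside_fine", "roadside_coarse",
           "background_fine", "background_coarse"]
mean_pm = hourly.groupby("ws_bin")[pm_cols].mean()
print(mean_pm.head())
```

The result is a wide table with one row per wind-speed bin and one column per site and fraction, which is the shape that the next step rearranges.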
To obtain a data structure which can be plotted using FacetGrid, a column containing values for the conditioning variable(s) is needed. In the DataFrame shown above the values for the conditioning variable are contained within the column labels. It therefore needs rearranging to create a table structure where there is a column providing the corresponding site type and PM fraction for each numerical value.
This can be achieved with the melt() method from the pandas library, which converts a DataFrame from a wide format into a long format. The columns set as measured variables (value_vars), or any columns not set as identification variables (id_vars), are “unpivoted” to the row axis. This leaves just two non-identifier columns, ‘variable’ and ‘value’.
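A minimal sketch of the reshaping, assuming a small wide table whose column labels encode the site type and PM fraction as `site_fraction`. The labels, the values and the extra step of splitting the label into two columns are illustrative assumptions rather than the post's own data or code.

```python
import pandas as pd

# Hypothetical wide-format table: one column per site/fraction combination.
wide = pd.DataFrame({
    "ws_bin": [0, 1, 2],
    "roadside_fine": [30.0, 25.0, 21.0],
    "roadside_coarse": [12.0, 12.5, 11.8],
    "background_fine": [22.0, 15.0, 11.0],
    "background_coarse": [8.0, 8.2, 7.9],
})

# Keep the wind-speed bin as the identifier; every other column is unpivoted
# into the default 'variable' (old column label) and 'value' columns.
long = wide.melt(id_vars="ws_bin")

# Split the old column label into separate conditioning columns.
long[["site", "fraction"]] = long["variable"].str.split("_", expand=True)
print(long.head())
```

Each row of `long` now holds one mean value together with the wind-speed bin, site type and PM fraction it belongs to.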
This DataFrame can now be plotted. Plot titles are set automatically but can be specified, as shown in lines 3 to 6 of the following code: