import os
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle
from matplotlib.collections import PatchCollection
from matplotlib.colors import ListedColormap
infile = "https://cdn.knmi.nl/knmi/map/page/klimatologie/gegevens/daggegevens/etmgeg_260.zip"
rawdata = pd.read_csv(infile, skiprows=51, skipinitialspace=True)
data = pd.DataFrame({
    "date": pd.to_datetime(rawdata['YYYYMMDD'], format='%Y%m%d'),
    "Tmean_degC": rawdata['TG'] * 0.1,  # TG is stored in units of 0.1 degC
})
data = data.set_index(data['date'])
# the earlier parts of the dataframe have lots of missing fields, let's limit our dataset to the last 100-ish years
data = data.loc[data['date'].dt.year.between(1920, 2023)]
Climate Stripes
In this tutorial we will continue with the De Bilt meteorological data. You have already studied this dataset; now we will use it to recreate the famous climate stripes, a really nice example of accessible storytelling with data.
For quick reference on how to create and customize plots, you can use the Matplotlib Cheatsheets. These provide a handy overview of various plotting functions and customization options that will be useful throughout this tutorial. Feel free to check them out whenever you need a quick guide to Matplotlib’s features!
Begin with loading the data with your (or our) code from last week.
Explore rawdata and data in the variable explorer and by calling rawdata.info() and data.info().
List all changes made from the raw data up until now.
Calculate annual temperature anomalies
To create the climate stripes, we first need to calculate the annual temperature anomalies. These anomalies represent the difference between each year’s average temperature and the overall mean temperature across the dataset. By visualizing the anomalies, we can illustrate the warming (or cooling) trends over time.
In this step, we will:
1. Resample the daily data into annual averages.
2. Calculate the temperature anomaly for each year by subtracting the long-term average temperature from each year’s mean temperature.
The anomalies form the basis for the climate stripes, where each stripe will be colored based on whether the temperature anomaly is above or below the average. This allows us to visually represent how much warmer or cooler a year was compared to the baseline.
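The two steps can be tried out first on a toy series where the answer is easy to check by hand. This is only a sketch: the temperatures are made up (not the De Bilt data), the names toy and toy_annual are hypothetical, and the yearly averaging uses groupby on the calendar year as a stand-in for resample so that it does not depend on a particular pandas frequency alias.

```python
import numpy as np
import pandas as pd

# Made-up daily temperatures for three years (NOT the De Bilt data):
# each year is constant, so the annual means are easy to check by hand.
dates = pd.date_range("2001-01-01", "2003-12-31", freq="D")
temps = np.where(dates.year == 2001, 9.0,
                 np.where(dates.year == 2002, 10.0, 11.0))
toy = pd.DataFrame({"Tmean_degC": temps}, index=dates)

# Step 1: average the daily values per calendar year.
toy_annual = toy.groupby(toy.index.year).mean()

# Step 2: subtract the long-term mean of the annual values.
long_term_mean = toy_annual["Tmean_degC"].mean()
toy_annual["Tanomaly"] = toy_annual["Tmean_degC"] - long_term_mean

print(toy_annual)
```

Because the baseline is the mean of the annual values themselves, the anomalies always sum to zero: cooler-than-average years are negative and warmer-than-average years are positive.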
#create annual averages
annualdata = data.resample("YE").mean()
Calculate the temperature anomaly and add it to the annualdata dataframe by completing the code below. You can calculate the long-term mean using annualdata['Tmean_degC'].mean().
annualdata['Tanomaly'] =
Solution
Q2:
annualdata['Tanomaly'] = annualdata['Tmean_degC'] - annualdata['Tmean_degC'].mean()
Use the plot() function to create a line plot of the temperature anomalies over time. Specifically, use the Tanomaly column of the annualdata dataframe to generate the plot. What trends do you observe? Are there noticeable periods of warming or cooling? Tip: check the Matplotlib Cheatsheets.
Solution
Q3:
annualdata['Tanomaly'].plot()
A warming trend is visible, although some cooler years still occur.
At this point, you’ve created a simple line plot of the temperature anomalies over time. This gives us a good first look at how temperatures have deviated from the average, allowing us to see trends such as periods of warming or cooling.
However, this is still a basic plot. It shows the overall trend, but it lacks the visual storytelling impact of the famous climate stripes. In the next steps, we will transform this data into a more striking, minimalistic visualization where each year’s anomaly is represented as a colored stripe.
Creating the Climate Stripes
Now, we are ready to move beyond the basic line plot and start building the climate stripes visualization. In this section of the code, we use PatchCollection from Matplotlib.
#we need the start and end year to use in plotting
FIRST = annualdata['date'].dt.year.min()
LAST = annualdata['date'].dt.year.max()
#define the plot
fig = plt.figure(figsize=(10, 3))
ax = fig.add_axes([0, 0, 1, 1])
#generate the collection of patches
col = PatchCollection([
Rectangle((y, 0), 1, 1)
for y in range(FIRST, LAST + 1)
])
# set data, colormap and color limits
col.set_array(annualdata['Tanomaly'])
ax.add_collection(col)
ax.set_ylim(0, 1)
ax.set_xlim(FIRST, LAST + 1)
Why do we define FIRST and LAST? What is the purpose of using PatchCollection in this code? What does the col.set_array(annualdata['Tanomaly']) line do in this code? How does it link the temperature anomaly data to the colors of the stripes in the plot?
Solution
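Q4: We define FIRST and LAST to capture the range of years in the dataset; they determine how many stripes are drawn and where the x-axis starts and ends. PatchCollection groups the rectangles (one “stripe” per year) into a single object, so they can be added to the axes and colored together. col.set_array(annualdata['Tanomaly']) attaches one anomaly value to each rectangle; the colormap then converts each value into the color of that stripe.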
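The link between set_array and the stripe colors can be inspected directly on a tiny example. The sketch below uses three made-up anomalies and hypothetical names (toy_years, toy_anomalies, stripes); no colormap is chosen, so Matplotlib's default is used.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.patches import Rectangle
from matplotlib.collections import PatchCollection

# Three made-up anomalies (illustrative values, not from the dataset).
toy_years = [2001, 2002, 2003]
toy_anomalies = np.array([-1.0, 0.0, 1.0])

# One unit-wide rectangle per year, as in the tutorial code.
stripes = PatchCollection([Rectangle((year, 0), 1, 1) for year in toy_years])

# Attach one value per rectangle; the colormap turns each value into a color.
stripes.set_array(toy_anomalies)

fig, ax = plt.subplots(figsize=(4, 1))
ax.add_collection(stripes)
ax.set_xlim(2001, 2004)
ax.set_ylim(0, 1)

# One RGBA row per stripe: the colors that the values map to.
rgba = stripes.to_rgba(stripes.get_array())
print(rgba)
```

The values are matched to the rectangles by position, so the array passed to set_array needs one entry per rectangle, in the same order.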
At this stage, we’ve built the structure for the climate stripes, but we still need to fine-tune it by adding color. Choose a colormap from the following Matplotlib colormap documentation or use the Matplotlib Cheatsheets.
What type of colormap (e.g., uniform, sequential, diverging, qualitative, cyclic, etc.) do you think is most suited for displaying temperature anomalies? Why?
Choose a colormap that you would apply to this figure and explain why you think it’s effective for visualizing temperature anomalies.
Solution
Q5: A diverging colormap is ideal for (temperature) anomalies because it highlights deviations from a central baseline with distinct colors for both positive and negative values, making it easier to visualize and interpret periods of warming and cooling.
Q6: The coolwarm, bwr or seismic colormaps are effective for visualizing temperature anomalies because they use a diverging color scheme with blue shades representing cooler anomalies and red shades representing warmer anomalies.
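A quick way to see why a diverging colormap suits anomalies is to look up a few colors by hand. The sketch below assumes symmetric limits of ±2 °C (an illustrative choice, not a value from the tutorial) so that an anomaly of 0 lands on the neutral midpoint of coolwarm.

```python
import matplotlib.pyplot as plt
from matplotlib.colors import Normalize

cmap = plt.get_cmap("coolwarm")

# Symmetric limits (illustrative +/-2 degC) put an anomaly of 0 at the center.
norm = Normalize(vmin=-2.0, vmax=2.0)

for anomaly in (-2.0, 0.0, 2.0):
    r, g, b, _ = cmap(norm(anomaly))
    print(f"{anomaly:+.1f} degC -> r={r:.2f}, g={g:.2f}, b={b:.2f}")
```

The coldest value comes out mostly blue, the warmest mostly red, and zero comes out close to neutral grey. With asymmetric limits, the neutral color would no longer correspond to the baseline.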
Add your chosen colormap to the climate stripes visualization to show the temperature differences more clearly.
Hint 1
Use col.set_cmap() to set a colormap for your stripes.
Solution
Q4: A diverging colormap such as bwr or coolwarm is the obvious choice.
col.set_cmap('coolwarm')
We’re enhancing our climate stripes plot to make it more visually striking and informative. To do this, we apply a custom color scale that smoothly transitions from blue for cooler temperatures to red for warmer temperatures. We use the YRANGE variable to adjust the color limits, which helps to clearly define the range of temperature anomalies and make the differences more noticeable.
# this is a very basic version, but we can do better with a custom color scale, adding a line and some more scaling
# we need a standard field with the years
annualdata['year'] = annualdata['date'].dt.year
FIRST = annualdata['year'].min()
LAST = annualdata['year'].max()
#and here we can set a custom maximum range of our data, the colors are balanced around 0, you can play with this range to get milder or darker colors
YRANGE = 3
#the custom colors
cmap = ListedColormap([
'#08306b', '#08519c', '#2171b5', '#4292c6',
'#6baed6', '#9ecae1', '#c6dbef', '#deebf7',
'#fee0d2', '#fcbba1', '#fc9272', '#fb6a4a',
'#ef3b2c', '#cb181d', '#a50f15', '#67000d',
])
fig = plt.figure(figsize=(10, 3))
ax = fig.add_axes([0, 0, 1, 1])
col = PatchCollection([
Rectangle((x, -YRANGE), 1, 2*YRANGE)
for x in range(FIRST, LAST+1)
])
# set data, colormap and color limits
col.set_array(annualdata['Tanomaly'])
ax.add_collection(col)
col.set_cmap(cmap)
col.set_clim(-YRANGE, +YRANGE)
ax.set_xlim(FIRST, LAST + 1)
ax.set_ylim(-YRANGE, +YRANGE)  # use a fixed range (YRANGE) or annualdata['Tanomaly'].min() and .max()
annualdata.plot(x='year', y='Tanomaly', lw=2, color='w',ax=ax, legend=False)
#add ax.set_axis_off() to get just the pretty stripes without the axes
Try experimenting with the YRANGE value. How does changing this parameter affect the visual intensity of the stripes? Does it make the trends more or less pronounced? What value of YRANGE do you think best represents the data?
Customize the colormap by adjusting the colors in ListedColormap(). Do you think any other colors would work better? Explain your choice.
Solution
Q7: Something like 2 or 3 works well.
Q8: Red and blue work quite well for temperature anomalies.
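The effect of YRANGE on the colors can also be checked without drawing anything. The sketch below reuses the 16 tutorial colors and mimics what col.set_clim(-YRANGE, +YRANGE) does with a linear Normalize; stripe_color is a hypothetical helper introduced only for this illustration.

```python
from matplotlib.colors import ListedColormap, Normalize, to_hex

# The 16 tutorial colors: 8 blues followed by 8 reds.
cmap = ListedColormap([
    '#08306b', '#08519c', '#2171b5', '#4292c6',
    '#6baed6', '#9ecae1', '#c6dbef', '#deebf7',
    '#fee0d2', '#fcbba1', '#fc9272', '#fb6a4a',
    '#ef3b2c', '#cb181d', '#a50f15', '#67000d',
])

def stripe_color(anomaly, yrange):
    """Hypothetical helper: hex color of a stripe for a given YRANGE."""
    norm = Normalize(vmin=-yrange, vmax=yrange)
    return to_hex(cmap(norm(anomaly)))

# The same +1.0 degC anomaly under two different color limits.
for yrange in (3, 1.5):
    print(f"YRANGE={yrange}: {stripe_color(1.0, yrange)}")

# An anomaly beyond the limits simply gets the outermost color.
print(f"out of range: {stripe_color(5.0, 3)}")
```

A smaller YRANGE pushes the same anomaly further along the color list, so the stripes look more intense, while anomalies beyond the limits all receive the darkest color and can no longer be told apart.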
You’ve learned how to analyze temperature anomalies and create striking climate stripes plots. Of course, this approach isn’t limited to temperature alone—you can also apply it to visualize precipitation anomalies. By adapting the techniques we’ve covered, you can uncover and communicate trends in precipitation just as effectively.
Using the skills and techniques you’ve learned, create a climate stripes plot for precipitation anomalies. Choose the colormap to effectively show the range of precipitation values, adjust the YRANGE for clear visualization, and add any extra touches to make your plot stand out. How does your plot help in understanding precipitation trends over time? Share your thoughts and any challenges you faced!
Hint
Check tutorial 5 again for the code that extracts the precipitation data from the raw data. Remember: very low precipitation amounts are denoted with -1 in the raw data; set those to 0 so they do not interfere when summarizing or plotting the data.
Solution
Q9:
import os
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle
from matplotlib.collections import PatchCollection
from matplotlib.colors import ListedColormap
infile = "https://cdn.knmi.nl/knmi/map/page/klimatologie/gegevens/daggegevens/etmgeg_260.zip"
rawdata = pd.read_csv(infile, skiprows=51, skipinitialspace=True)
data = pd.DataFrame({
    "date": pd.to_datetime(rawdata['YYYYMMDD'], format='%Y%m%d'),
    "precip_mm": rawdata['RH'] * 0.1,  # RH is stored in units of 0.1 mm
})
data = data.set_index(data['date'])
# fix negative values in precip (the raw data uses -1 to flag amounts below 0.05 mm instead of 0)
data.loc[data["precip_mm"] < 0.0, "precip_mm"] = 0.0
# the earlier parts of the dataframe have lots of missing fields, let's limit our dataset to the last 100-ish years
data = data.loc[data['date'].dt.year.between(1920, 2023)]
#create annual means of the daily precipitation (mm per day)
annualdata = data.resample("YE").mean()
annualdata['precip_mm'].mean()
annualdata['Panomaly'] = annualdata['precip_mm'] - annualdata['precip_mm'].mean()
annualdata['Panomaly'].plot()
#we need the start and end year to use in plotting
annualdata['year'] = annualdata['date'].dt.year
FIRST = annualdata['date'].dt.year.min()
LAST = annualdata['date'].dt.year.max()
YRANGE = 1.5
#define the plot
fig = plt.figure(figsize=(10, 3))
ax = fig.add_axes([0, 0, 1, 1])
#generate the collection of patches
col = PatchCollection([
Rectangle((x, -YRANGE), 1, 2*YRANGE)
for x in range(FIRST, LAST+1)
])
# set data, colormap and color limits
col.set_array(annualdata['Panomaly'])
col.set_cmap('BrBG')
ax.add_collection(col)
ax.set_ylim(-YRANGE, +YRANGE)
ax.set_xlim(FIRST, LAST + 1)
annualdata.plot(x='year', y='Panomaly', lw=2, color='k',ax=ax, legend=False)