With the changing climate, California is witnessing more pronounced wildfires with every passing year. The 10 largest fires in California since 1932 have all occurred since 2000. 2021 was yet another year of intense and frequent wildfires for the United States. For this analysis, we will explore the California and Oregon fires using Blue Sky’s Zuri datasets.

Introducing Shape Explorer

Before we can query anything from the Blue Sky APIs, we need the shape IDs for the relevant regions. For this, we are introducing Shape Explorer, which can be found in SpaceTime. It can be accessed from the More dropdown in the right corner!

The goal behind Shape Explorer is to enable the use of consistent shape IDs, each of which is unique and represents a specific boundary at a given administrative level. This avoids the inconsistencies that arise when place names vary from person to person, making the system more robust and objective.

The Shape Explorer can be accessed from the More dropdown in SpaceTime as seen at the right corner!

This is the Shape Explorer. To get the Shape ID for a region of interest, search for the region and copy its ID, as demonstrated above!

This tutorial assumes that the user has Python and Jupyter Notebook installed. For detailed instructions on installing or setting up Python or Jupyter Notebook, please refer to our previous tutorial.

Setting up the environment

Fire up Jupyter Notebook in the terminal by typing jupyter notebook. Note: because we will be downloading a lot of data, Jupyter Notebook may throw the following error:

💡 IOPub data rate exceeded error

To avoid this, it is better to launch it with the command below from your terminal right at the start!

💡 jupyter notebook --NotebookApp.iopub_data_rate_limit=1.0e10

Create a new Python 3 notebook and run the following commands in jupyter notebook

pip install requests
pip install pandas
pip install matplotlib
pip install plotly
pip install seaborn
# json is part of the Python standard library and does not need to be installed

To ensure all the plots from this analysis are correctly reproduced, import the following packages.

# This package helps you reach out to the API
import requests 
# This package is used to pretty-print the JSON responses
import json  
# Required to reproduce the plots included in this tutorial
import pandas as pd 
import matplotlib.pyplot as plt 
import seaborn as sns
import plotly.express as px
import plotly.graph_objects as go
import calendar
from matplotlib.dates import DateFormatter
%matplotlib inline

Fetching the data

After the packages have been imported, the next step is to determine the endpoint that will fetch the GHG emissions data. It can be found in the Dev Portal docs.

https://gateway.blueskyhq.io/api/zuri/fires?api-key={INSERT YOUR KEY HERE}
&shapeId={shapeId}
&startDate={startDate}
&endDate={endDate}
&raw={raw}
&keys={keys}
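To see how the placeholders in the template above turn into an actual request URL, here is a minimal sketch that fills them in without sending anything. The helper name build_fires_url is hypothetical, and the key is a placeholder. It uses only the standard library; when a params dictionary is passed to requests.get, requests encodes the query string in much the same way.

```python
from urllib.parse import urlencode

# Base endpoint taken from the template above
BASE_URL = "https://gateway.blueskyhq.io/api/zuri/fires"

def build_fires_url(api_key, shape_id, start_date, end_date, keys, raw="false"):
    """Return the encoded request URL without sending the request (hypothetical helper)."""
    params = {
        "api-key": api_key,
        "shapeId": shape_id,
        "startDate": start_date,
        "endDate": end_date,
        "raw": raw,
        "keys": keys,
    }
    # urlencode percent-encodes reserved characters such as ":" and ","
    return BASE_URL + "?" + urlencode(params)

url = build_fires_url(
    api_key="YOUR_KEY",  # placeholder, not a real key
    shape_id="1cca7a4d-7f32-4de8-a5d1-23281bf21753",
    start_date="2018-01-01T00:00:00.000Z",
    end_date="2021-12-31T00:00:00.000Z",
    keys="frp,co2_emission_estimate",
)
print(url)
```

Printing the URL before making the call is a cheap way to check that the dates and keys are spelled as intended.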

California and Oregon fires from 2018 to 2021

params_california = {
    "api-key": "#INSERT YOUR KEY HERE",
    "shapeId": "1cca7a4d-7f32-4de8-a5d1-23281bf21753",  # INSERT SHAPE ID OF INTEREST
    "startDate": "2018-01-01T00:00:00.000Z",
    "endDate": "2021-12-31T00:00:00.000Z",
    "raw": "false",
    "adminLevelBucket": "3",  # "3" buckets the data at state level (here, California)
    "keys": "frp,co2_emission_estimate",
}

The above parameters will fetch the frp (average) and co2_emission_estimate data of fires for the Shape ID (here California, 1cca7a4d-7f32-4de8-a5d1-23281bf21753) between the 1st of January 2018 and the 31st of December 2021. By default, the API uses VIIRS data, and we will continue this tutorial with the same.

adminLevelBucket helps you bucket the data spatially. For example, to find GHG emissions for all the states in the US, we would use the Shape ID of the USA and adminLevelBucket equal to 3. The API would then return the data, spatially bucketed for each state in the US. The table below helps explain how to use the adminLevelBucket parameter.

Level   Admin Region
0       World
1       Continent
2       Country
3       State
4       District/County
5       Electoral/Sub-District

california_data = requests.get("https://gateway.blueskyhq.in/api/zuri/fires?", params = params_california).json()
california = pd.DataFrame(california_data['data'])

Similarly, fire emissions data can be downloaded for Oregon using Oregon’s Shape ID.
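The plots that follow rely on year, month and States columns, so each response needs a little preparation before the two datasets are combined. Below is a minimal sketch of that step. The helper name to_fires_frame and the sample payload are hypothetical, and the response fields (datetime, fires, co2_emission_estimate) are assumed from the column names used elsewhere in this tutorial. The Oregon Shape ID is not listed here, so it has to be looked up in Shape Explorer.

```python
import pandas as pd

def to_fires_frame(payload, state):
    """Turn an API response dict into a DataFrame with the columns used in the plots."""
    df = pd.DataFrame(payload["data"])
    # Values may arrive as strings, so coerce them to numbers
    for col in ("fires", "co2_emission_estimate"):
        df[col] = pd.to_numeric(df[col])
    timestamps = pd.to_datetime(df["datetime"])
    df["year"] = timestamps.dt.year
    df["month"] = timestamps.dt.month
    df["States"] = state  # label used as the hue in the bar plots
    return df

# Illustrative payload only; real responses come from requests.get(...).json()
sample = {"data": [
    {"datetime": "2021-07-01T00:00:00.000Z", "fires": "12", "co2_emission_estimate": "3400.5"},
    {"datetime": "2021-08-01T00:00:00.000Z", "fires": "30", "co2_emission_estimate": "9100.0"},
]}
california_demo = to_fires_frame(sample, "California")
oregon_demo = to_fires_frame(sample, "Oregon")
print(pd.concat([california_demo, oregon_demo], ignore_index=True))
```

With real data, the same helper would be applied to california_data and to the Oregon response before calling pd.concat.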

# Combining the two datasets to create a new one
combined = pd.concat([california, oregon])

The combined data would look like this!

sns.set()
sns.barplot(x = 'year', y = 'fires', hue = 'States', data = combined, estimator= sum)
plt.xlabel('Year')
plt.ylabel('Fires')
plt.title('California and Oregon fires between 2018 and 2021')
plt.tight_layout()
plt.show()

sns.set()
sns.barplot(x = 'year', y = 'co2_emission_estimate', hue = 'States', data = combined, estimator = sum)
plt.yticks([50000000, 100000000, 150000000, 200000000, 250000000, 300000000],['50M','100M','150M', '200M', '250M', '300M'])
plt.xlabel('Year')
plt.ylabel('$CO_{2}$ Emissions Estimate (in tonnes)')
plt.title('California and Oregon $CO_{2}$ emissions (in tonnes) from fires \n between 2018 and 2021')
plt.tight_layout()
plt.show()

The bar plot shows the total number of fires in California and Oregon for each year between 2018 and 2021. The bars show the yearly sums, and the error bars show the confidence intervals (an indication of the uncertainty).

The bar plot shows the total CO2 emissions (in million tonnes) from fires in California and Oregon for each year between 2018 and 2021, with the confidence intervals shown as error bars.

GHG Emissions from California and Oregon for 2021

For the rest of the tutorial, we will focus on analysing the 2021 fires in California and Oregon.

Let’s get 2021 fires data for California and Oregon from our combined dataframe.

comb_2021 = combined[combined["year"]==2021]
comb_2021 = comb_2021.groupby(["States", "month"]).agg({
"fires": "sum",
"co2_emission_estimate": "sum"
}).reset_index()
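To make the filter-and-aggregate step above easier to follow, here is a small self-contained sketch on made-up numbers. The demo dataframe is a hypothetical stand-in for combined, and the month_name column is an optional extra (not part of the original analysis) that uses the calendar module to produce readable labels.

```python
import calendar
import pandas as pd

# Illustrative stand-in for the combined dataframe (values are made up)
demo = pd.DataFrame({
    "States": ["California", "California", "California", "Oregon", "Oregon"],
    "year": [2021, 2021, 2021, 2021, 2020],
    "month": [7, 7, 8, 7, 7],
    "fires": [100, 50, 300, 80, 999],
    "co2_emission_estimate": [1000.0, 500.0, 4000.0, 700.0, 9999.0],
})

# Same pattern as above: keep 2021 rows, then sum per state and month
demo_2021 = (
    demo[demo["year"] == 2021]
    .groupby(["States", "month"])
    .agg({"fires": "sum", "co2_emission_estimate": "sum"})
    .reset_index()
)

# Optional: month abbreviations (English in the default locale)
demo_2021["month_name"] = demo_2021["month"].map(lambda m: calendar.month_abbr[m])
print(demo_2021)
```

The 2020 row is dropped by the filter, and the two California rows for July collapse into a single monthly total.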

Creating barplots using seaborn to visualize 2021 fires and emissions.

sns.set()
sns.barplot(x = 'month', y = 'fires', hue = 'States', data = comb_2021, estimator= sum)
plt.xlabel('Month')
plt.ylabel('Fires')
plt.title('California and Oregon fires in 2021')
plt.tight_layout()
plt.show()

sns.set()
sns.barplot(x = 'month', y = 'co2_emission_estimate', hue = 'States', data = comb_2021, estimator= sum)
plt.yticks([25000000,50000000, 75000000, 100000000, 125000000, 150000000, 175000000],['25M','50M','75M','100M','125M', '150M', '175M'])
plt.xlabel('Month')
plt.ylabel('$CO_{2}$ Emissions Estimate (in tonnes)')
plt.title('California and Oregon $CO_{2}$ emissions (in tonnes) from fires \n in 2021')
plt.show()

Fires in California and Oregon in 2021

GHG emissions in million tonnes from fires in California and Oregon in 2021

Further Analysis for the United States of America

Fires in the United States from 2018 to 2021

The procedure for getting the data using the Shape ID is the same as for California and Oregon.

The dataframe after downloading the USA country level data will look like below!

usa['fires'] = pd.to_numeric(usa['fires'])
usa['co2_emission_estimate'] = pd.to_numeric(usa['co2_emission_estimate'])

usa["year"] = pd.to_datetime(usa["datetime"]).dt.year
usa['date'] = pd.to_datetime(usa["datetime"]).dt.date

usa.index = pd.to_datetime(usa["datetime"])  # index the rows by their timestamp
usa_18_21 = usa.groupby("year").agg({"fires": "sum", "co2_emission_estimate": "sum"}).reset_index()

The resulting dataframe gives us the total fires and emissions for each of the 4 years!

💡 We are looking forward to including timeBucket as a parameter as it will enable getting the data in monthly, yearly and many more temporal formats!
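Until such a parameter is available, similar temporal bucketing can be approximated on the client side with pandas. This is a minimal sketch, assuming the records carry a datetime column as in the usa dataframe; the sample values are made up.

```python
import pandas as pd

# Illustrative daily records shaped like the API output (values are made up)
daily = pd.DataFrame({
    "datetime": ["2021-07-30", "2021-07-31", "2021-08-01", "2021-08-02"],
    "fires": [10, 20, 5, 15],
    "co2_emission_estimate": [100.0, 250.0, 40.0, 60.0],
})
daily["datetime"] = pd.to_datetime(daily["datetime"])

# Bucket by calendar month; a monthly Period keeps year and month together
monthly = (
    daily.groupby(daily["datetime"].dt.to_period("M"))[["fires", "co2_emission_estimate"]]
    .sum()
)
print(monthly)
```

Replacing "M" with "Y" would give yearly buckets, which is equivalent to the groupby on year used above.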

Let’s create a simple line plot using plotly to visualize how the number of fires evolved from 2018 to 2021, using the yearly totals for the 4 years of interest!

fig = go.Figure(go.Scatter(
    x = usa_18_21["year"],
    y = usa_18_21["fires"], ))

fig.update_layout(
    xaxis = dict(
        tickmode = 'array',
        tickvals = [2018, 2019, 2020, 2021],
        ticktext = ['2018', '2019', '2020', '2021']
    )
)

fig.update_layout(
    title="Fires in USA from 2018 - 2021",
    xaxis_title="Year",
    yaxis_title="Fire Count",
    font=dict(
        family="Times New Roman"    
    )
)

fig.show()

Aggregated Greenhouse Gas emissions associated with biomass burning, Zuri, Blue Sky Analytics

Fires distribution for all the US states

params_states = {
    "api-key": "#INSERT YOUR KEY HERE", 
    "shapeId": "b96447fb-f7af-4bfa-89d7-13b84eb18b87", #INSERT SHAPE ID OF INTEREST
    "startDate": "2018-01-01T00:00:00.000Z",
    "endDate": "2021-12-31T00:00:00.000Z",
    "raw": "false",
    "adminLevelBucket": "3",  # 3 will fetch the data for all states
    "keys": "co2_emission_estimate",
}
usastates = requests.get("https://gateway.blueskyhq.in/api/zuri/fires?", params = params_states).json()
print(json.dumps(usastates, indent=2))
us_states = pd.DataFrame(usastates['data'])

Now we will follow the same steps as we did in the previous case to add more columns like date, month etc. The resulting dataframe would look like this!

Let us now see which states contribute the most to GHG emissions by filtering for co2_emission_estimate values of at least 5 million tonnes.

fig2 = px.scatter(us_states.query("co2_emission_estimate >= 5000000"), x="fires", y="co2_emission_estimate",
                 size="fires", color="name",
                 hover_name="name", log_x=True, size_max=60, title="Top US States contributing to GHG emissions from Fires from 2018 - 2021")
fig2.update_layout(
    font_family="Times New Roman"
)
fig2.show()

The size of the bubbles represents the number of fires, and the vertical position represents CO2 emissions.

From the figure above, we can conclude that the states that contributed most to the GHG emissions between 2018 and 2021 were California, Oregon, Alaska and Colorado.
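The scatter plot filters individual rows of the dataframe. If the response holds several records per state (an assumption; the text does not confirm the granularity), summing per state first gives a direct ranking. Here is a minimal sketch on made-up values, using the name column from the plot above.

```python
import pandas as pd

# Illustrative rows with several records per state (values are made up)
rows = pd.DataFrame({
    "name": ["California", "California", "Oregon", "Alaska", "Texas"],
    "fires": [100, 150, 90, 40, 10],
    "co2_emission_estimate": [6e6, 7e6, 5.5e6, 3e6, 0.2e6],
})

# Sum fires and emissions per state, then keep the three largest emitters
totals = rows.groupby("name", as_index=False)[["fires", "co2_emission_estimate"]].sum()
top = totals.nlargest(3, "co2_emission_estimate")
print(top)
```

The resulting table can be passed to px.scatter or px.bar in place of the row-level query.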

Key Takeaways

  • In the USA, between 2018 and 2021, the total number of fires increased by 20% and CO2 emissions by 28%.

  • California and Oregon contribute enormously to the GHG emissions resulting from fires. In 2021, the USA emitted 423.6 million tonnes of CO2, of which 225 million tonnes (around 53%) came from California and Oregon alone.

  • In 2021, California and Oregon combined witnessed almost 250K fires resulting in a whopping 250M tonnes of CO2 emissions.

  • Most fires were seen from July to September, when temperatures are already very high.
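The shares and percentage changes quoted in the takeaways come down to simple arithmetic on the yearly totals. Here is a minimal sketch: the share uses the 2021 figures quoted above, while the inputs to percent_change are hypothetical round numbers chosen only to show the calculation. Both function names are illustrative.

```python
def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) * 100 / old

def share(part, whole):
    """Part as a percentage of the whole."""
    return part * 100 / whole

# 2021 figures quoted above: 225 of 423.6 million tonnes of CO2
ca_or_share = share(225, 423.6)
print(f"California and Oregon share: {ca_or_share:.1f}%")

# Hypothetical totals, only to illustrate a 20% increase
demo_change = percent_change(100, 120)
print(f"Change: {demo_change:.0f}%")
```

With the real dataframe, the inputs would be the 2018 and 2021 rows of usa_18_21.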