Imagine this. You live in Patna. With winter arriving, you notice that pollution levels seem to go up in the city. It is only a hunch, and you cannot confirm it because there are no Air Quality Monitors around your neighbourhood. You want to find out how bad air pollution is in the city right now and whether it warrants more attention. How do you find this out?
In this tutorial, you will learn how to fetch PM2.5 data from our Developer Portal and analyze it.
Prerequisites
They are as follows:
- Python
- Jupyter Notebook
- Knowing how to use our Developer Portal
If you need help with any of these, you can refer to this tutorial, where we have explained all of this in detail.
Introducing SpatialAQ
If you live in India, you know air pollution is a big problem in parts of the country. However, what you probably do not know is how bad it is. The reason is quite simple - as a country, we lack sufficient ground monitors to truly understand the situation. (Read here for more details)
To address this gap, we developed SpatialAQ, an air quality dataset generated in-house using satellite data and proprietary ML models. It gives us high-resolution air quality data for all of India!
Setting up API calls
- Start your Jupyter notebook. You do this by going to the Terminal and typing jupyter notebook
- Once the notebook opens up, you start by importing a few Python packages:
- Once the packages are imported, review the documentation to determine the endpoint for getting SpatialAQ data. According to our documentation, the data you are looking for is at the below-mentioned endpoint.
- Enter the following parameters in the endpoint.
- Now make your API call.
- You will see the following output. This is the result of the API call you just made.
If you receive the data in the format above, it means you have been successful in passing the appropriate parameters. You get the PM2.5 values of pollution for today. Do you want to expand the scope of the data? Then, read on.
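The steps above can be sketched roughly as follows. The endpoint URL here is a hypothetical placeholder (copy the real one from the Developer Portal documentation), and the helper names `build_params` and `fetch_pm25` are ours, introduced only for illustration. The parameter names match the ones used in the rest of this tutorial, and reading `1d` as "one day" is our assumption based on the text.

```python
import json

import requests

# ASSUMPTION: placeholder URL, not the real endpoint. Copy the actual
# SpatialAQ endpoint from the Developer Portal documentation.
ENDPOINT = "https://api.example.com/spatial-aq"


def build_params(api_key, region="patna", region_type="district", duration="1d"):
    """Return the query parameters for a PM2.5 request (hypothetical helper)."""
    return {
        "api-key": api_key,
        "product": "pm25",
        "region": region,
        "regionType": region_type,
        "duration": duration,  # assumed: '1d' = one day, '1m' = one month
    }


def fetch_pm25(endpoint, params, timeout=30):
    """Send the GET request and return the decoded JSON body (hypothetical helper)."""
    response = requests.get(endpoint, params=params, timeout=timeout)
    response.raise_for_status()  # fail loudly on HTTP errors such as a bad key
    return response.json()


params = build_params("INSERT YOUR KEY HERE")
print(json.dumps(params, indent=2))

# Uncomment once ENDPOINT and the API key are real:
# patna_request1d = fetch_pm25(ENDPOINT, params)
# print(json.dumps(patna_request1d, indent=2))
```

Keeping the parameters in a small function makes it easy to change only the region or duration later without retyping the whole dictionary.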
Making the API call for one month of air quality data
Let us repeat the above process to get data for Patna for a month.
# Feed the parameters to the endpoint and request one month ('1m') of data
mydict = {
    'api-key': 'INSERT YOUR KEY HERE',
    'product': 'pm25',
    'region': 'patna',
    'regionType': 'district',
    'duration': '1m'
}
patna_request1m = requests.get(endpoint, params=mydict).json()
print(json.dumps(patna_request1m, indent=2))
Notice that we have replaced 1d with 1m, which represents a duration of 1 month. The above code returns the PM2.5 levels for Patna for a whole month.
{
"data": [
{
"datetime": "2021-12-31T00:00:00.000Z",
"pm25": 95.47887185534591
},
{
"datetime": "2022-01-01T00:00:00.000Z",
"pm25": 94.78063853948288
},
...
...
...
{
"datetime": "2022-01-29T00:00:00.000Z",
"pm25": 117.39931647449336
}
],
"meta": {
"duration": "1m",
"region": "patna",
"regionType": "district"
}
}
Analyzing the air quality data
The above output is in JSON. You can do basic analysis by converting it from JSON format to a table. For this, you will need to import another Python package called pandas.
import pandas as pd
You will now store the above JSON data in a pandas dataframe/table.
df = pd.DataFrame(patna_request1m['data'])
Once this data is loaded onto a pandas dataframe, you can query it to find insights, as demonstrated below.
- Worst and best days for pollution in the above duration
- Cleanest air day
- Average pollution level in this duration
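Here is a minimal sketch of these queries. It uses only the three records visible in the sample output above, so the numbers describe that excerpt and not the full month. In your notebook, build `df` from `patna_request1m['data']` instead of the `sample` dictionary, which exists here only to keep the example self-contained.

```python
import pandas as pd

# Excerpt of the response shown above; the real response has more records.
sample = {
    "data": [
        {"datetime": "2021-12-31T00:00:00.000Z", "pm25": 95.47887185534591},
        {"datetime": "2022-01-01T00:00:00.000Z", "pm25": 94.78063853948288},
        {"datetime": "2022-01-29T00:00:00.000Z", "pm25": 117.39931647449336},
    ]
}

df = pd.DataFrame(sample["data"])
df["datetime"] = pd.to_datetime(df["datetime"])  # parse the ISO 8601 strings

worst_day = df.loc[df["pm25"].idxmax()]     # row with the highest PM2.5
cleanest_day = df.loc[df["pm25"].idxmin()]  # row with the lowest PM2.5
average_pm25 = df["pm25"].mean()            # mean over the whole duration

print("Worst day:   ", worst_day["datetime"].date(), round(worst_day["pm25"], 1))
print("Cleanest day:", cleanest_day["datetime"].date(), round(cleanest_day["pm25"], 1))
print("Average PM2.5:", round(average_pm25, 1))
```

`idxmax` and `idxmin` return the index label of the extreme value, so `df.loc[...]` gives you the full row, including the date.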
Plotting the data
Summary statistics can give us good insight into the data, but sometimes graphs do a better job. Follow these steps to generate a graph.
- Import the following packages
- Send the above data frame to the matplotlib library to plot a chart.
This nifty piece of code quickly generates a chart showing how PM2.5 levels have varied in this duration.
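One possible way to draw such a chart is sketched below. It is an illustration rather than the exact code used for this tutorial, and it again uses only the three records visible in the sample output. In your notebook, reuse the `df` you built from `patna_request1m['data']`.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Excerpt of the sample response; in the notebook, reuse the df built earlier.
records = [
    {"datetime": "2021-12-31T00:00:00.000Z", "pm25": 95.47887185534591},
    {"datetime": "2022-01-01T00:00:00.000Z", "pm25": 94.78063853948288},
    {"datetime": "2022-01-29T00:00:00.000Z", "pm25": 117.39931647449336},
]
df = pd.DataFrame(records)
df["datetime"] = pd.to_datetime(df["datetime"])  # dates on the x-axis

fig, ax = plt.subplots(figsize=(10, 4))
ax.plot(df["datetime"], df["pm25"], marker="o")
ax.set_title("Daily PM2.5 in Patna")
ax.set_xlabel("Date")
ax.set_ylabel("PM2.5")
fig.autofmt_xdate()  # tilt the date labels so they do not overlap

# In Jupyter the figure renders inline; in a plain script, call plt.show().
```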
As you have noticed, you were able to get data for the city of Patna in fewer than ten lines of code. You can extend this analysis to other regions and other durations with the same amount of code.