In this Python tutorial, we will learn how to read and write JSON files using Python. In the first part, we are going to use the Python package json to create and read a JSON file as well as write a JSON file. After that, we are going to use Pandas read_json method to load JSON files into Pandas dataframe. Here, we will learn how to read from a JSON file locally and from an URL as well as how to read a nested JSON file using Pandas.

Finally, as a bonus, we will also learn how to manipulate data in Pandas dataframes, rename columns, and plot the data using Seaborn.

What is a JSON File?

JSON, short for JavaScript Object Notation, is a compact, text based format used to exchange data. This format that is common for downloading, and storing, information from web servers via so-called Web APIs. JSON is a text-based format and  when opening up a JSON file, we will recognize the structure. That is, it is not so different from Python’s structure for a dictionary.

  • Save
Example JSON file

In the first Python parsing json example, we are going to use the Python module json to create a JSON file. After we’ve done that we are going to load the JSON file. In this Python JSON tutorial, we start by create a dictionary for our data:

import json # Creating a Python Dictionary data = {"Sub_ID":["1","2","3","4","5","6","7","8" ], "Name":["Erik", "Daniel", "Michael", "Sven", "Gary", "Carol","Lisa", "Elisabeth" ], "Salary":["723.3", "515.2", "621", "731", "844.15","558", "642.8", "732.5" ], "StartDate":[ "1/1/2011", "7/23/2013", "12/15/2011", "6/11/2013", "3/27/2011","5/21/2012", "7/30/2013", "6/17/2014"], "Department":[ "IT", "Manegement", "IT", "HR", "Finance", "IT", "Manegement", "IT"], "Sex":[ "M", "M", "M", "M", "M", "F", "F", "F"]} print(data)
Code language: Python (python)
  • Save
Python dictionary

Parsing JSON in Python

In the next Python read a JSON file example, we are going to read the JSON file, that we created above. Reading a JSON file in Python is pretty easy, we open the file using open(). After this is done, we read the JSON file using the load() method.

# Python Parsing JSON with open('data.json') as json_file: data = json.load(json_file) print(data)
Code language: Python (python)

Now, it was pretty easy to use Python to read a ,json file, right? Now, before going on learning how to save a JSON file in Python, it is worth mentioning that we can also parse Excel files. For example, see the post about reading xlsx files in Python for more information-

Saving to a JSON file

In this section, we will learn how to write a JSON file in Python. In Python, this can be done using the module json . This module enables us both to read and write content to and from a JSON file. This module converts the JSON format to Python’s internal format for Data Structures. So we can work with JSON structures just as we do in the usual way with Python’s own data structures.

Python Read JSON File Example:

In the example code below, we will learn how to make a JSON file in Python. First, we start by importing the json module. After we’ve done that, we open up a new file and use the dump method to write a json file using Python.

import json # Parse JSON with open('data.json', 'w') as outfile: json.dump(data, outfile)
Code language: Python (python)

It is, of course, possible to save JSON to other formats, such as xlsx, and CSV, and we will learn how to export a Pandas dataframe to CSV, later in this blog post.

How to Use Pandas to Load a JSON File

Now, if we are going to work with the data we might want to use Pandas to load the JSON file into a Pandas dataframe. This will enable us to manipulate data, do summary statistics, and data visualization using Pandas built-in methods. Note, we will cover this briefly later in this post also.

Pandas Read Json Example:

In the next example we are going to use Pandas read_json method to read the JSON file we wrote earlier (i.e., data.json). It’s fairly simple we start by importing pandas as pd:

import pandas as pd # Read JSON as a dataframe with Pandas: df = pd.read_json('data.json') df
Code language: Python (python)

The output, when working with Jupyter Notebooks, will look like this:

  • Save

It’s also possible to convert a dictionary to a Pandas dataframe. Make sure to check that post out for more information.

Data Manipulation using Pandas

Now that we have loaded the JSON file into a Pandas dataframe we are going use Pandas inplace method to modify our dataframe. We start by setting the Sub_ID column as index.

df.set_index('Sub_ID', inplace=True) df
Code language: Python (python)
  • Save

Pandas JSON to CSV Example

Now when we have loaded a JSON file into a dataframe we may want to save it in another format. For instance, we may want to save it as a CSV file and we can do that using Pandas to_csv method. It may be useful to store it in a CSV, if we prefer to browse through the data in a text editor or Excel.

In the Pandas JSON to CSV example below, we carry out the same data manipulation method.

# Saving .json to .csv in Python df.to_csv("data.csv")
Code language: Python (python)
  • Save
Pandas json to CSV example file

In a more recent post, you will learn how to convert JSON to Excel (.xlsx) in Python with Pandas, among other Python packages.

Pandas to JSON Example

In this section, we are going to learn how to save Pandas dataframe to JSON. If we, for instance, have our data stored in a CSV file, locally, but want to enable the functionality of the JSON files we will use Pandas to_json method:

df = pd.read_csv("data.csv") # Save dataframe to JSON format df.to_json("data.json")
Code language: Python (python)

Learn more about working with CSV files using Pandas in the  Pandas Read CSV Tutorial

How to Load JSON from an URL

We have now seen how easy it is to create a JSON file, write it to our hard drive using Python Pandas, and, finally, how to read it using Pandas. However, as previously mentioned, many times the data in stored in the JSON format are on the web.

Thus, in this section of the Python json guide, we are going to learn how to use Pandas read_json method to read a JSON file from a URL. Most often, it’s fairly simple we just create a string variable pointing to the URL:

url = "" # Parsing JSON from URL: df = pd.read_json(url) df.head()
Code language: Python (python)
  • Save

Load JSON from an URL Second Example

When loading some data, using Pandas read_json seems to create a dataframe with dictionaries within each cell. One way to deal with these dictionaries, nested within dictionaries, is to work with the Python module request. This module also has a method for parsing JSON files. After we have parsed the JSON file we will use the method json_normalize to convert the JSON file to a dataframe.

  • Save
Pandas Dataframe from JSON
import requests from import json_normalize url = "" resp = requests.get(url=url) # Handling nested JSON: df = json_normalize(resp.json()) df.head()
Code language: Python (python)
  • Save

As can be seen in the image above, the column names are quite long. This is quite impractical when we are going to create a time series plot, later, using Seaborn. We are now going to rename the columns so they become a bit easier to use.

In the code example below, we use Pandas rename method together with the Python module re. That is, we are using a regular expression to remove “statistics.# of” and “statistics.” from the column names. Finally, we are also replacing dots (“.”) with underscores (“_”) using the str.replace method:

import re df.rename(columns=lambda x: re.sub("statistics.# of","",x), inplace=True) df.rename(columns=lambda x: re.sub("statistics.","",x), inplace=True) df.columns = df.columns.str.replace("[.]", "_") df.head()
Code language: Python (python)

Renaming columns is something that is quicker, and easier, done using the Python package Pyjanitor (thanks Shantanu Oak, for pointing this out in the comment section). See how to use Pandas and Pyjanitor to rename columns, among other things, using Pyjanitor:

Time Series Plot from JSON Data using Seaborn

In the last example, in this post, we are going to use Seaborn to create a time series plot. The data we loaded from JSON to a dataframe contains data about delayed and canceled flights. We are going to use Seaborns lineplot method to create a time series plot of the number of canceled flights throughout 2003 to 2016, grouped by carrier code.

%matplotlib inline import matplotlib.pyplot as plt import seaborn as sns fig = plt.figure(figsize=(10, 7)) g = sns.lineplot(x="timeyear", y="flightscancelled", ci=False, hue="carriercode", data=df) g.set_ylabel("Flights Cancelled",fontsize=20) g.set_xlabel("Year",fontsize=20) plt.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)
Code language: Python (python)
  • Save

Note, we changed the font size as well as the x- and y-axis’ labels using the methods set_ylabel and set_xlabel. Furthermore, we also moved the legend using the legend method from matplotlib.

For more about exploratory data analysis using Python:


In this post, we have learned how to write a JSON file from a Python dictionary, how to load that JSON file using Python and Pandas. Furthermore, we have also learned how to use Pandas to load a JSON file from a URL to a dataframe, how to read a nested JSON file to a dataframe.

Here’s a link to a Jupyter Notebook containing all code examples in this post.

  • Save
Share via
Copy link
Powered by Social Snap