Press "Enter" to skip to content

How to Read and Write JSON Files using Python and Pandas

In this post, we will learn how to read and write JSON files using Python. In the first part, we are going to use the Python package json to create and read a JSON file as well as write a JSON file. After that, we are going to use Pandas json method to load JSON files into Pandas dataframe. Here, we will learn how to read from a JSON file locally and from an URL as well as how to read a nested JSON file using Pandas.

Finally, as a bonus, we will also learn how to manipulate data in Pandas dataframes, rename columns, and plot the data using Seaborn.

What is a JSON File?

JSON, short for JavaScript Object Notation, is a compact, text based format used to exchange data. This format that is common for downloading, and storing, information from web servers via so-called Web APIs. JSON is a text-based format and  when opening up a JSON file, we will recognize the structure. That is, it is not so different from Python’s structure for a dictionary.

Example JSON file

In the first Python parsing json example, we are going to use the Python module json to create a JSON file. After we’ve done that we are going to load the JSON file. In this Python JSON tutorial, we start by create a dictionary for our data:

import json

data = {"Sub_ID":["1","2","3","4","5","6","7","8" ],
        "Name":["Erik", "Daniel", "Michael", "Sven",
                "Gary", "Carol","Lisa", "Elisabeth" ],
        "Salary":["723.3", "515.2", "621", "731", 
                  "844.15","558", "642.8", "732.5" ],
        "StartDate":[ "1/1/2011", "7/23/2013", "12/15/2011",
                     "6/11/2013", "3/27/2011","5/21/2012", 
                     "7/30/2013", "6/17/2014"],
        "Department":[ "IT", "Manegement", "IT", "HR", 
                      "Finance", "IT", "Manegement", "IT"],
        "Sex":[ "M", "M", "M", 
              "M", "M", "F", "F", "F"]}

Python dictionary

Parsing JSON in Python

In the next Python parsing JSON example, we are going to read the JSON file, that we created above. Reading a JSON file in Python is pretty easy, we open the file using open. After this is done, we read the JSON file using the load method.

with open('data.json') as json_file:
    data = json.load(json_file)


Saving to a JSON file

In this section, we will learn how to write a json file in Python. In Python, this can be done using the module json . This module enables us both read and write content to and from a JSON file. This module converts the JSONs format to Python’s internal format for Data Structures. So we can work with JSON structures just as we do in the usual way with Python’s own data structures.

Python JSON Example:

In the example code below, we will learn how to make a json file in Python. First, we start by importing the json module. After we’ve done that, we open up a new file and use the dump method to write a json file using Python.

import json
with open('data.json', 'w') as outfile:
    json.dump(data, outfile)

It is, of course, possible to save JSON to other formats, such as xlsx, and CSV, and we will learn how to export a Pandas dataframe to CSV, later in this blog post.

How to Use Pandas to Load a JSON File

Now, if we are going to work with the data we might want to use Pandas to load the JSON file into a Pandas dataframe. This will enable us to manipulate data, do summary statistics, and data visualization using Pandas built-in methods. Note, we will cover this briefly later in this post also.

Pandas Read Json Example:

In the next example we are going to use Pandas read_json method to read the JSON file we wrote earlier (i.e., data.json). It’s fairly simple we start by importing pandas as pd:

import pandas as pd

df = pd.read_json('data.json')


The output, when working with Jupyter Notebooks, will look like this:

Data Manipulation using Pandas

Now that we have loaded the JSON file into a Pandas dataframe we are going use Pandas inplace method to modify our dataframe. We start by setting the Sub_ID column as index.

df.set_index('Sub_ID', inplace=True)

Pandas JSON to CSV Example

Now when we have loaded a JSON file into a dataframe we may want to save it in another format. For instance, we may want to save it as a CSV file and we can do that using Pandas to_csv method. It may be useful to store it in a CSV, if we prefer to browse through the data in a text editor or Excel.

In the Pandas JSON to CSV example below, we carry out the same data manipulation method.

Pandas json to CSV example file

Pandas to JSON Example

In this section, we are going to learn how to save Pandas dataframe to JSON. If we, for instance, have our data stored in a CSV file, locally, but want to enable the functionality of the JSON files we can Pandas to_json method:

df = pd.read_csv("data.csv")

# Save dataframe to JSON format


Learn more about working with CSV files using Pandas in the  Pandas Read CSV Tutorial

How to Load JSON from an URL

We have now seen how easy it is to create a JSON file, write it to our hard drive using Python Pandas, and, finally, how to read it using Pandas. However, as previously mentioned, many times the data in stored in the JSON format are on the web.

Thus, in this section of the Python json guide, we are going to learn how to use Pandas read_json method to read a JSON file from an URL. Most often, it’s fairly simple we just create a string variable pointing to the URL:

url = ""
df = pd.read_json(url)

Load JSON from an URL Second Example

When loading some data, using Pandas read_json seems to create a dataframe with dictionaries within each cell. One way to deal with these dictionaries, nested within dictionaries, is to work with the Python module request. This module also have a method for parsing JSON files. After we have parsed the JSON file we will use the method json_normalize to convert the JSON file to a dataframe.

Pandas Dataframe from JSON
import requests
from import json_normalize

url = ""
resp = requests.get(url=url)

df = json_normalize(resp.json())

As can be seen in the image above, the column names are quite long. This is quite impractical when we are going to create a time series plot, later, using Seaborn. We are now going to rename the columns so they become a bit easier to use.

In the code example below, we use Pandas rename method together with the Python module re. That is, we are using a regular expression to remove “statistics.# of” and “statistics.” from the column names. Finally, we are also replacing dots (“.”) with underscores (“_”) using the str.replace method:

import re

df.rename(columns=lambda x: re.sub("statistics.# of","",x), 
df.rename(columns=lambda x: re.sub("statistics.","",x), 

df.columns = df.columns.str.replace("[.]", "_")

Renaming columns is something that is quicker, and easier, done using the Python package Pyjanitor (thanks Shantanu Oak, for pointing this out in the comment section). See how to use Pandas and Pyjanitor to rename columns, amongst other things, using Pyjanitor:

Time Series Plot from JSON Data using Seaborn

In the last example, in this post, we are going to use Seaborn to create a time series plot. The data we loaded from JSON to a dataframe contains data about delayed and canceled flights. We are going to use Seaborns lineplot method to create a time series plot of the number of canceled flights throughout 2003 to 2016, grouped by carrier code.

%matplotlib inline

import matplotlib.pyplot as plt
import seaborn as sns

fig = plt.figure(figsize=(10, 7))
g = sns.lineplot(x="timeyear", y="flightscancelled", ci=False,
             hue="carriercode", data=df)

g.set_ylabel("Flights Cancelled",fontsize=20)

plt.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)

Note, we changed the font size as well as the x- and y-axis’ labels using the methods set_ylabel and set_xlabel. Furthermore, we also moved the legend using the legend method from matplotlib.

For more about exploratory data analysis using Python:


In this post we have learned how to write a JSON file from a Python dictionary, how to load that JSON file using Python and Pandas. Furthermore, we have also learned how to use Pandas to load a JSON file from an URL to a dataframe, how to read a nested JSON file to a dataframe.

Here’s a link to a Jupyter Notebook containing all code examples in this post.


    • Hey Shantanu,

      Thanks for your comment and your suggestion. I will look into the clean_names method provided by pyjanitor.



    • Thanks again Shantanu for letting me know about Pyjanitor. I ended up writing a blog post using it,



Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.

%d bloggers like this: