In this Pandas tutorial, we will learn how to import data from JSON to Excel in Python. This guide will cover 4 simple steps making use of Python’s json module, and the Python packages requests and Pandas.
The structure of this tutorial is as follows. In the first section, we will have a quick look at a basic example of how to convert JSON to an Excel file with Pandas and Python. After we have seen and briefly learned, the syntax we will continue with a section covering some examples on when this knowledge may be useful. In the third section, we will have a look at the prerequisites of this Python tutorial and how to install Pandas. After we are sure we have everything needed, we will go through four steps on how to save JSON to Excel in Python. Here, we will start by reading the JSON file from the hard drive and saving it as an Excel file. Furthermore, we will also look at an example when reading JSON from a URL and saving it as a .xlsx file. Finally, we will also use the Python package JSON to excel converter.
Basic Code Example to import JSON to Excel with Python
Here is the easiest way to convert JSON data to an Excel file using Python and Pandas:
import pandas as pd df_json = pd.read_json(‘DATAFILE.json’) df_json.to_excel(‘DATAFILE.xlsx’)
Briefly explained, we first import Pandas and then we create a dataframe using the read_json method. Finally, we save the dataframe to a .xlsx file with the to_excel method. Remember to change the string to the path we are you store the JSON data as well as where you want to save the Excel file.
In this section, the problem will be briefly formulated. Now, although we can work with data using only Python and its packages, we might collaborate with people that want to work in Excel. It is, of course, true that Microsoft Excel is packed with features to keep and organize tabular data. Furthermore, it might also be easier to get an overview of the data in Excel. Now, a lot of open data resources are storing data in a variety of different file formats. But sometimes, we might find out that the data is stored in the JSON format only. In some cases, we might want to obtain data from JSON and save it to Excel and send it to our collaborators. Now, in other cases, we just want to explore data ourselves, as mentioned before, using Excel.
Despite it is possible to import JSON to Excel with Microsoft Excel itself, the optimal way is automating the conversion by importing data from JSON to Excel worksheets programmatically. Especially, if we already know Python.
In this section, you will learn what you need to have installed to convert JSON data to Excel files.
Obviously, to follow this guide you will need to have Python, requests, and Pandas installed. Secondly, and not that obvious maybe, you will also need to install the package openpyxl. This package is what will be used, by Pandas, to create Excel files (e.g., .xlsx).
Now, there is one very easy way to install both Python and Pandas: installing a Python scientific distribution such as Anaconda, ActivePython, or Canopy (see here for a list of Python distributions). For example, if you install Anaconda you will get a library with useful Python packages, including Pandas.
How to Install Pandas and openpyxl
If you already have Python installed, you can use pip to install Python packages. To install Pandas and openpyxl using pip open up Windows Command Prompt or the Terminal you prefer and type the following code:
pip install pandas openpyxl
Sometimes, pip will warn us that there’s a newer version. If this is the case now, read the post about upgrading pip for more information.
Note, that it is generally a good idea to create a virtual environment and install your new packages in this environment. Now, that you have all that you need we will continue to the next section of this tutorial. If you use a tool such as pipx you can automatically install Python packages in virtual environments. Here’s a YouTube video on how to install Pandas and Anaconda:
4 Steps to Convert JSON to Excel in Python
In this section we will go through, step-by-step, how to import JSON data and save it as an Excel file using Python. Here’s a summary of what this chapter will cover: 1) importing pandas and json, 2) reading the JSON data from a directory, 3) converting the data to a Pandas dataframe, and 4) using Pandas to_excel method to export the data to an Excel file.
1. Importing the Pandas and json Packages
First, we start by importing Pandas and json:
import json import pandas as pd
Now, it may be evident but we are importing json and then Pandas to use json to read the data from the file and, after we have done this, we will use Pandas to save it as a .xlsx file.
2. Reading the JSON file
Now, we are ready to import the data from a file using the load method:
<pre><code class="lang-py">with open('./SimData/save_to_excel.json') as json_file: data = json.load(json_file)</code></pre>
As you can see in the code chunk above, we are first opening a file with Python (i.e., as json_file) with the with-statement. Moreover, in the with-statement we are using the load method. This is when we are actually reading the JSON file.
3. Creating a Pandas Dataframe
Now, before we can save the data, we have imported we need to create a dataframe:
df = pd.DataFrame(data)
In the code chunk above, we used Pandas DataFrame class to create a dataframe from the JSON data we have, in the previous section, loaded into Python. Basically, what we do is similar to converting a Python dictionary to a Pandas dataframe. Note, adding new columns to the dataframe, before saving it, is of course also possible.
4. Saving the Imported Data as a .xlsx File
Finally, we can use the to_excel method that is available in the dataframe object:
Now, if we want to convert JSON data to other formats such as CSV, SPSS, or STATA to name a few we can use other methods available in the dataframe object. For example, if we want to save the JSON data to a CSV file we can use the to_csv method. If you are interested in learning more about reading and writing data files with Python and Pandas check out the following blog posts:
- How to Read and Write Excel Files with Pandas
- How to Read and Write Stata (.dta) Files in R with Haven
Now, it is worth mentioning here that we actually can skip step 2 and use Pandas read_json method to import the JSON data to a dataframe. Here’s a full working example on how to save JSON data with only the Pandas package:
import pandas as pd df = pd.read_json('./SimData/save_to_excel.json') df.to_excel('./SimData/exported_json_data.xlsx')
In the next section, we will have a look at how to read JSON data from a URL and save it as an Excel file.
JSON to Excel: Reading data from a URL
In the previous section, we learned, step-by-step, how to automatically convert JSON data to an Excel file in Python. Here, we are going to use the same steps, but we are going to read the data from a URL. First, we are going to look at a simple example where we can use the same code as above. However, we are going to change the string so that it is pointing at the URL location instead of a directory.
import pandas as pd df = pd.read_json('http://api.open-notify.org/iss-now.json') df.to_excel('locationOfISS.xlsx')
Now, as you can see, we managed to read this JSON file with Python and save it as an Excel file using Pandas. In the next section, we will look at a bit more complex JSON file where the data is nested.
Nested JSON data to Excel
In this section, we will look at a bit more complex example. Now, there are times when the JSON data is nested and if we use the method above the Excel file will be messy. Luckily, we can fix this by using json_normalize from Pandas and the requests module:
import requests from pandas.io.json import json_normalize url = "https://think.cs.vt.edu/corgis/datasets/json/airlines/airlines.json" resp = requests.get(url=url) df = json_normalize(resp.json()) # Writing Excel File: df.to_excel('Airlines.xlsx')
In the code chunk above, we first imported requests and then json_normalize. Second, we created a string with the URL to the JSON data we want to save as an Excel file. Now, the next thing we do to use the get method which sends a GET request to our URL. Briefly, this is pretty much reading in the webpage that we want, and we can, then, use the json method to get the json data. Finally, we used the json_normalize to create a dataframe that we save as an Excel file. Again, saving the data we read from the JSON file is done with the to_excel method.
Now, as you may have noticed in the image of the output (i.e., the .xlsx files opened in Excel) we have a row that is not part of our original data. This row is the index row from the Pandas dataframe and we can, of course, get rid of this. In the next section, we will look at some of the arguments of the to_excel method. For example, if we want to get rid of the index row, we will learn how to do that. If you are interested check out the tutorial on how to read and write JSON with Python and Pandas.
Import JSON to Excel and Specifying the Sheet Name
In this section, we will specify the sheet name using the sheet_name argument. Now, it might be obvious, but the string we input will be the name of the Worksheet in the Excel file:
In the example code above, we named the sheet ‘Session1’. However, we still can see the index column in the Excel file. In the next section, we will have a look on how to remove this index.
JSON to Excel with Pandas Removing the Index Column
To remove the index column when converting JSON to Excel using Pandas we just use the index argument:
Now, this argument takes a bool (True/False) and the default is True. Thus, in the code example above we set it to False and we get this Excel file:
Of course, there are plenty of more arguments that we could use when converting JSON to Excel with Pandas. For instance, we can set the encoding of the excel file. Another thing we can do is to set the engine: openpyxl or xlsxwriter. Note, to use the to_excel method to save the JSON file to an Excel file you need to have one of them installed on your computer (see the previous section). See the documentation for more information on how to use the arguments of the to_excel method.
Other Options: the JSON to excel converter Package
Now, there are of course other methods and/or Python packages that we can use to convert JSON to Excel in Python. For example, we could work with the json module and choosing one of xlsxwriter or openpyxl packages. It is worth noting, however, that the resulting Python script will be a bit more complex. Furthermore, there is a Python package created just for the purpose of converting JSON to Excel: JSON to excel converter.
JSON to Excel with the JSON to excel converter package
Here’s a how to quickly convert JSON to an .xlsx file:
from json_excel_converter import Converter from json_excel_converter.xlsx import Writer url = 'https://think.cs.vt.edu/corgis/datasets/json/broadway/broadway.json' resp = requests.get(url=url) conv = Converter() conv.convert(resp.json(), Writer(file='./SimData/ExcelData.xlsx'))
This package is very easy to use. In the example code above, we converted the airlines.json data to an Excel file. Remember, this data is nested and one very neat thing with the JSON to excel converter package is that it can handles this very nicely (see the image below for the resulting Excel file).
Note, however, that you need to install the xlsxwriter package to be able to use the JSON to excel converter. Both can be installed using pip:
pip install json-excel-converter[xlsxwriter].
In this post, we have covered a lot of things related to opening JSON files and saving them as Excel files using Python. Certainly, knowing how to convert JSON data to Excel files might be useful in many situations. For instance, if we are getting the data and collaborating with someone who prefers using Excel. To summarize, in this tutorial we have used the Python packages Pandas, json, requests, and JSON to excel converter to read JSON files and save them as Excel files. First, we had a quick look at the syntax, then we learned 4 steps to converting JSON to Excel. After that, we have also learned how to read the JSON data from a URL, and how to work a bit with some of the arguments of the to_excel method.
Hopefully, you learned something from this Python tutorial and if you did find it useful: please do share it on your social media accounts. Finally, if you have anything you would like to learn please leave a comment below and I might put that together in a new tutorial.