In this Pandas tutorial, we will go through how to rename columns in a Pandas dataframe. First, we will learn how to rename a single column. Second, we will go on with renaming multiple columns. In the third example, we will also have a quick look at how to rename grouped columns. Finally, we will change the column names to lowercase.

First, however, we are going to look at a simple code example of how to rename a column in a Pandas dataframe.

Rename a Column in Pandas Dataframe

To rename a column we can use the rename method:

df.rename(columns={'OldName':'NewName'}, 
          inplace=True)

Why Bother with Renaming Variables?

Now, when we are working with a dataset, whether it is big data or a smaller data set, the columns may have names that need to be changed. For instance, if we have scraped our data from HTML tables using Pandas read_html, the column names may not be suitable for displaying our data later. Furthermore, renaming columns is often part of pre-processing our data.


Prerequisites

Now, before we go on to learn how to rename columns, we need to have Python 3.x and Pandas installed. Both can be installed through a scientific Python distribution, such as Anaconda or ActivePython. Alternatively, Pandas can be installed, like many Python packages, using pip: pip install pandas. Refer to the blog post about installing Python packages for more information.

If we install Pandas and get a message that there is a newer version of pip, we can upgrade pip using pip, conda, or Anaconda Navigator.

How to Rename Columns in Pandas?


So how do you change column names in Pandas? Well, after importing the data, we can change the column names in the Pandas dataframe either by using df.rename(columns={'OldName':'NewName'}, inplace=True) or by assigning a list to the columns attribute: df.columns = list_of_new_names.
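The two approaches can be tried side by side on a small dataframe. This is a minimal sketch: the toy data and the names renamed and relabeled are made up for illustration and are not part of the tutorial's dataset.

```python
import pandas as pd

# Hypothetical toy data; the column names are made up for illustration.
df = pd.DataFrame({"OldName": [1, 2], "Other": [3, 4]})

# Approach 1: rename selected columns with a mapping.
# Without inplace=True, rename returns a new dataframe and leaves df unchanged.
renamed = df.rename(columns={"OldName": "NewName"})

# Approach 2: replace every column name at once by assigning a list.
relabeled = df.copy()
relabeled.columns = ["A", "B"]

print(list(df.columns))
print(list(renamed.columns))
print(list(relabeled.columns))
```

Note that the mapping only touches the columns it mentions, whereas the list replaces all names by position.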

Example Data

In this tutorial on renaming columns in a Pandas dataframe, we are going to read an Excel file with Pandas to import data. More information about importing data from Excel files can be found in the Pandas read excel tutorial, previously posted on this blog.

import pandas as pd


xlsx_url = 'https://github.com/marsja/jupyter/blob/master/SimData/play_data.xlsx?raw=true'

df = pd.read_excel(xlsx_url, index_col=0)

Now, we used the index_col argument, because the first column in the Excel file we imported is the index column. If we want to get the column names from the Pandas dataframe we can use df.columns:

df.columns

Now, it is also possible that our data is stored in other formats such as CSV, SPSS, Stata, or SAS. Make sure to check out the post on how to use Pandas read_csv to learn more about importing data from .csv files.

How to Rename a Single Column in Pandas

In the first example, we will learn how to rename a single column in a Pandas dataframe. Note, in the code snippet below we use df.rename to change the name of the column “Subject ID”, and we use inplace=True so that the change is made to df itself rather than to a returned copy.

# inplace=True to affect the dataframe
df.rename(columns = {'Subject ID': 'SubID'}, 
          inplace=True)

df.head()
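One thing to watch for when renaming a single column: by default, df.rename silently ignores keys that do not match any column, so a typo in the old name leaves the dataframe unchanged. The sketch below, on made-up toy data, shows this and uses the errors="raise" argument of rename to turn the silent no-op into an error.

```python
import pandas as pd

# Hypothetical toy data standing in for the tutorial's Excel file.
df = pd.DataFrame({"Subject ID": [1, 2], "RT": [0.51, 0.62]})

# By default, keys that match no column are ignored without any error.
unchanged = df.rename(columns={"SubjectID": "SubID"})  # note the missing space
print(list(unchanged.columns))

# errors="raise" makes pandas complain about the unknown column name.
raised = False
try:
    df.rename(columns={"SubjectID": "SubID"}, errors="raise")
except KeyError as exc:
    raised = True
    print("KeyError:", exc)
```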

Renaming Multiple Columns in Pandas Example 1

To rename columns in Pandas dataframe we do as follows:

  1. Get the column names by using df.columns
  2. Use df.rename and pass in a dictionary that maps the old column names to the new ones

Here’s a working example on renaming columns in Pandas.

import pandas as pd

xlsx_url = 'https://github.com/marsja/jupyter/blob/master/SimData/play_data.xlsx?raw=true'
df = pd.read_excel(xlsx_url, index_col=0)
print(df.columns)

df.rename(columns = {'RT':'ResponseTime', 
                    'First Name':'Given Name',
                    'Subject ID':'SubID'},
         inplace=True)

Renaming Columns in Pandas Example 2

Another example on how to rename many columns in Pandas dataframe is to assign a list of new column names to df.columns:

import pandas as pd

xlsx_url = 'https://github.com/marsja/jupyter/blob/master/SimData/play_data.xlsx?raw=true'
df = pd.read_excel(xlsx_url, index_col=0)

# New column names
new_cols = ['SubID', 'Given Name', 'Day', 'Age', 'ResponseTime', 'Gender']

# Renaming the columns
df.columns = new_cols

df.head()
Figure: Resulting dataframe with renamed columns.

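As can be seen in the code above, we imported our data from an Excel file and created a list with the new column names. Finally, we renamed the columns by assigning the list to df.columns. Note that the list must contain one name for every column, in the same order as the columns appear in the dataframe.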
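Because assigning to df.columns works by position, the list has to have exactly as many names as there are columns. A small sketch on made-up toy data (not the tutorial's Excel file) shows what happens when it does not:

```python
import pandas as pd

# Hypothetical toy data with three columns.
df = pd.DataFrame({"Subject ID": [1], "First Name": ["Ann"], "RT": [0.5]})

# Too few names: pandas refuses the assignment with a ValueError.
length_error = None
try:
    df.columns = ["SubID", "Given Name"]
except ValueError as exc:
    length_error = exc
print(type(length_error).__name__)

# A full-length list works; names are matched by position only.
new_cols = ["SubID", "Given Name", "ResponseTime"]
df.columns = new_cols
print(list(df.columns))
```

When only some of the names should change, a dictionary passed to df.rename is usually the less error-prone choice, since it does not depend on the column order.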

Renaming Columns while Importing Data

In this section, we are going to learn how to rename columns while reading the Excel file. Now, this is also very simple. To accomplish this, we create a list of new names before we call read_excel and pass it to the names argument. Note, we need to add a name, in the list, for the index column as well.

import pandas as pd


xlsx_url = 'https://github.com/marsja/jupyter/blob/master/SimData/play_data.xlsx?raw=true'

# New column names
new_cols = ['Index', 'Subject_ID', 'Given Name', 'Day', 'Age', 'ResponseTime', 'Gender']

df = pd.read_excel(xlsx_url, names=new_cols, index_col=0)

df.columns
Figure: New column names.

Importantly, when changing the name of the columns while reading the data we need to know the number of columns before we load the data.
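The same idea can be tried without an Excel file. The sketch below is an assumption-laden stand-in: it uses pd.read_csv on an in-memory string instead of read_excel, and the data and names are made up. One difference to be aware of is that read_csv needs header=0 alongside names so that the existing header row is replaced rather than read as data.

```python
import io

import pandas as pd

# Hypothetical CSV text; the first column plays the role of the index column.
csv_text = "idx,Subject ID,RT\n0,1,0.51\n1,2,0.62\n"

# One name per column in the file, including the index column.
new_cols = ["Index", "SubID", "ResponseTime"]

# header=0 tells read_csv to discard the file's own header row in favor of names.
df = pd.read_csv(io.StringIO(csv_text), names=new_cols, header=0, index_col=0)

print(list(df.columns))
print(df.index.name)
print(df.shape)
```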

Renaming Grouped Columns in Pandas Dataframe

In this section, we are going to rename grouped columns in a Pandas dataframe. First, we are going to use the Pandas groupby method (if needed, check the post about the Pandas groupby method for more information). Second, we are going to rename the grouped columns using a Python list comprehension and df.columns, among other dataframe methods.

import pandas as pd


xlsx_url = 'https://github.com/marsja/jupyter/blob/master/SimData/play_data.xlsx?raw=true'
df = pd.read_excel(xlsx_url, index_col=0)

# Aggregate RT per day; string names avoid the warning that
# newer pandas versions emit when NumPy functions are passed to agg
grouped = df.groupby('Day').agg({'RT': ['mean', 'median', 'std']})
grouped
Figure: Grouped dataframe.

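Now, as the grouped output above shows, the column labels of the resulting dataframe have two levels (a MultiIndex): the original column name, RT, on top and the names of the aggregations below it. The code below flattens these labels into single-level column names.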
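To see what these two-level labels look like, we can list them: each one is a tuple. The sketch below uses made-up toy data (only the column names Day and RT mirror the tutorial). It also shows named aggregation as one possible alternative that produces flat names directly; the names RT_mean, RT_median, and RT_std are chosen here for illustration.

```python
import pandas as pd

# Hypothetical toy data; only the column names 'Day' and 'RT' mirror the tutorial.
df = pd.DataFrame({
    "Day": ["Mon", "Mon", "Tue", "Tue"],
    "RT": [0.50, 0.70, 0.40, 0.60],
})

# Aggregating one column with several functions yields two-level column labels.
grouped = df.groupby("Day").agg({"RT": ["mean", "median", "std"]})
print(list(grouped.columns))

# Named aggregation (keyword=(column, function)) yields flat names directly.
flat = df.groupby("Day").agg(
    RT_mean=("RT", "mean"),
    RT_median=("RT", "median"),
    RT_std=("RT", "std"),
)
print(list(flat.columns))
```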

grouped.columns = ['_'.join(col) for col in grouped.columns.to_flat_index()]
grouped
Figure: Renamed column names in Pandas.

Note, in the code above we used the to_flat_index method to turn the two-level column index into a plain index of tuples, such as ('RT', 'mean'), and then joined each tuple with an underscore. Now, there are other ways we can change the names of columns in a Pandas dataframe. For instance, if we only want to change the column names so that they are in lowercase, we can pass str.lower to df.rename:

df.rename(columns=str.lower).head()

Importantly, if we want the change to be permanent we need to add the inplace=True argument.
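As a small sketch of what "permanent" means here, the example below (made-up toy data) shows that rename leaves the original dataframe untouched unless we ask otherwise, along with two alternatives to inplace=True: assigning the result back, and lowercasing the column index directly with the .str accessor.

```python
import pandas as pd

# Hypothetical toy data standing in for the tutorial's Excel file.
df = pd.DataFrame({"Subject ID": [1, 2], "RT": [0.51, 0.62]})

# rename returns a new dataframe; df keeps its original names.
preview = df.rename(columns=str.lower)
names_after_preview = list(df.columns)
print(names_after_preview)
print(list(preview.columns))

# Option 1: assign the result back to the same name.
df = df.rename(columns=str.lower)
print(list(df.columns))

# Option 2: apply the string method to the column index itself.
other = pd.DataFrame({"Subject ID": [1, 2], "RT": [0.51, 0.62]})
other.columns = other.columns.str.lower()
print(list(other.columns))
```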

Changing Column Names in Pandas Dataframe to Lowercase

To change all column names to lowercase we can use the following code:

df.rename(columns=str.lower)
Figure: Column names changed to lowercase.

Now, this is one way to preprocess data in Python with Pandas. Another post on this blog covers data cleaning in Python with Pandas and Pyjanitor.

Conclusion

In this post, we have learned how to rename columns in Pandas dataframes. First, we learned how to rename a single column. Second, we renamed many columns using two methods: df.rename and df.columns. Third, we renamed the columns while importing the data from an Excel file. Fourth, we learned how to rename the columns of grouped dataframes in Pandas. Finally, we also changed the column names to lowercase.
