Renaming columns in Pandas can be a lifesaver when working with data. Sometimes, we might receive data from a colleague or download it from the internet, and the column names might not be informative or consistent with our analysis. This could lead to confusion and errors in data interpretation, so it’s essential to know how to rename columns in a pandas dataframe.

For instance, imagine you are working on a project where you have to analyze data collected from multiple sources. Each source might use different column names, making it difficult to compare and combine datasets. Renaming the columns can help standardize the names, making it easier to work with the data.

Thankfully, pandas provides a straightforward way to rename columns in a dataframe, allowing us to manipulate data quickly and efficiently.

This Pandas tutorial will cover the basics of renaming columns in pandas. We will start by exploring how to rename a single column, then move on to renaming multiple columns simultaneously. We will also learn to rename columns while importing data and change column names to lowercase. By the end of this post, you will have a solid understanding of how to rename columns in pandas and how to apply these skills to your data analysis projects.

How to Rename a Column in Pandas Dataframe

Renaming columns in a Pandas dataframe is simple. To rename a column, we can use the rename method:

df.rename(columns={'OldName': 'NewName'},
          inplace=True)

In the code example above, the column “OldName” will be renamed “NewName”. Furthermore, the inplace parameter applies the change to the dataframe itself instead of returning a renamed copy.
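
To try this without any data file, here is a minimal sketch on a toy dataframe. The column names and values are made up for illustration and are not part of the example data used later in this post.

```python
import pandas as pd

# Toy data: the column names and values are hypothetical
df = pd.DataFrame({'OldName': [1, 2, 3], 'Other': ['a', 'b', 'c']})

# Rename one column; columns not listed in the dictionary keep their names
df.rename(columns={'OldName': 'NewName'}, inplace=True)

print(list(df.columns))
```

Only the column named in the dictionary changes, so the dictionary does not need to list every column.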

Why Bother with Renaming Columns with Pandas?

When working with a dataset, whether big data or a smaller data set, some columns may have names that need to be changed. For instance, if we have scraped our data from HTML tables using Pandas read_html, the column names may not be suitable for displaying our data later. Renaming columns is therefore often part of pre-processing our data.


Prerequisites

Before we learn how to rename columns, we need to have Python 3.x and Pandas installed. Both can be installed with a scientific Python distribution, such as Anaconda or ActivePython. Alternatively, Pandas can be installed, like many Python packages, using pip: pip install pandas. Refer to the blog post about installing Python packages for more information.

If we install Pandas and we get a message that there is a newer version of Pip, we can upgrade pip using pip, conda, or Anaconda navigator.

How to Rename Columns in Pandas?


So how do you change column names in Pandas? After importing the data, we can change the column names in the Pandas dataframe either by using df.rename(columns={'OldName':'NewName'}, inplace=True) or by assigning a list of new names to the columns attribute: df.columns = list_of_new_names.

Example Data

This tutorial will read an Excel file with Pandas to import data. More information about importing data from Excel files can be found in the Pandas read excel tutorial, previously posted on this blog.

import pandas as pd

xlsx_url = 'https://github.com/marsja/jupyter/blob/master/SimData/play_data.xlsx?raw=true'

df = pd.read_excel(xlsx_url, index_col=0)

Now, we used the index_col argument, because the first column in the Excel file we imported is the index column. If we want to get the column names from the Pandas dataframe we can use df.columns:

df.columns
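
It can also help to look at the names as a plain Python list and to check that a column exists before renaming it. The sketch below uses a small made-up dataframe, not the Excel file above.

```python
import pandas as pd

# Toy data standing in for the imported file (hypothetical values)
df = pd.DataFrame({'Subject ID': [1, 2], 'RT': [0.45, 0.52]})

# The column names as a regular Python list
names = df.columns.tolist()
print(names)

# Check whether a column exists before trying to rename it
has_rt = 'RT' in df.columns
print(has_rt)
```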

Now, it is also possible that our data is stored in other formats such as CSV, SPSS, Stata, or SAS. Check out the post on how to use Pandas read_csv to learn more about importing data from .csv files.

How to Rename a Column in Pandas

In the first example, we will learn how to rename a single column in a Pandas dataframe. Note that in the code snippet below, we use df.rename to change the name of the column “Subject ID”, and we use inplace=True so that the change is applied to df itself.

# inplace=True to affect the dataframe
df.rename(columns = {'Subject ID': 'SubID'}, 
          inplace=True)

df.head()

In the code chunk above, we renamed a column in the dataframe df. We used the rename method with the columns argument to specify the current column name and the new name. Specifically, we renamed the Subject ID column to SubID by passing a dictionary with the old column name as the key and the new name as the value: {'Subject ID': 'SubID'}.

To ensure that the change is made in the original dataframe, we set the inplace argument to True. By default, rename returns a new dataframe and leaves df unchanged; with inplace=True, the change is applied to df directly.
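
The difference between the two modes can be sketched on a toy dataframe. The data below is made up; only the column name Subject ID is taken from the example above.

```python
import pandas as pd

# Toy data (hypothetical values)
df = pd.DataFrame({'Subject ID': [1, 2], 'RT': [0.45, 0.52]})

# Without inplace=True, rename returns a new dataframe and leaves df untouched
renamed = df.rename(columns={'Subject ID': 'SubID'})
print(list(df.columns))       # still the original names
print(list(renamed.columns))  # the new names

# With inplace=True, df itself is modified and the call returns None
result = df.rename(columns={'Subject ID': 'SubID'}, inplace=True)
print(result is None)
```

Because the in-place call returns None, writing df = df.rename(..., inplace=True) would replace the dataframe with None. Use either reassignment or inplace=True, not both.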

To summarize, this code is an example of how to use Pandas to rename a column in a dataframe, allowing us to customize the column names to suit our needs better.

In the next section, we will look at how to use the rename method to rename multiple columns.

How To Rename Columns in Pandas: Example 1

To rename columns in Pandas dataframe we do as follows:

  1. Get the column names by using df.columns (if we don’t know the names)
  2. Call df.rename with a dictionary that maps each old column name to its new name.

Here’s a working example of renaming columns in Pandas:

import pandas as pd

xlsx_url = 'https://github.com/marsja/jupyter/blob/master/SimData/play_data.xlsx?raw=true'
df = pd.read_excel(xlsx_url, index_col=0)
print(df.columns)

df.rename(columns = {'RT':'ResponseTime', 
                    'First Name':'Given Name',
                    'Subject ID':'SubID'},
         inplace=True)

In the code chunk above, we imported the pandas library as pd and used it to read an Excel file from a URL. Note that we stored the URL in the variable xlsx_url and passed it as an argument to the pd.read_excel() function. We also set index_col=0 to use the first column of the Excel file as the index of the resulting dataframe, which we assigned to the variable df. We then printed the column names of the dataframe using df.columns.

Next, we used the df.rename() method to rename three columns of the dataframe. We passed a dictionary as an argument to the columns parameter of the rename method. The keys of the dictionary represent the old column names, while the values represent the new column names. We used the inplace=True parameter to modify the original dataframe, rather than returning a new one. The columns that were renamed were 'RT' to 'ResponseTime', 'First Name' to 'Given Name', and 'Subject ID' to 'SubID'.
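
One thing to watch for with the dictionary approach is a misspelled key. By default, rename silently ignores keys that match no column. The sketch below, on made-up data, shows how the errors='raise' option of rename turns such a typo into an error instead.

```python
import pandas as pd

# Toy data (hypothetical values) with the same column names as above
df = pd.DataFrame({'Subject ID': [1, 2],
                   'First Name': ['Ann', 'Bo'],
                   'RT': [0.45, 0.52]})

# By default, a key that matches no column is silently ignored
unchanged = df.rename(columns={'Rt': 'ResponseTime'})  # typo: 'Rt', not 'RT'
print(list(unchanged.columns))

# errors='raise' turns the same typo into a KeyError
try:
    df.rename(columns={'Rt': 'ResponseTime'}, errors='raise')
    typo_detected = False
except KeyError:
    typo_detected = True

print(typo_detected)
```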

Renaming Columns in Pandas Example 2

Another example of how to rename many columns in Pandas dataframe is to assign a list of new column names to df.columns:

import pandas as pd

xlsx_url = 'https://github.com/marsja/jupyter/blob/master/SimData/play_data.xlsx?raw=true'
df = pd.read_excel(xlsx_url, index_col=0)

# New column names
new_cols = ['SubID', 'Given Name', 'Day', 'Age', 'ResponseTime', 'Gender']

# Renaming the columns
df.columns = new_cols

df.head()

[Figure: resulting dataframe with renamed columns]

In the code chunk above, we first create a variable xlsx_url that links to an Excel file stored on GitHub. Then, we use the pd.read_excel() function from the pandas package to read the Excel file and store it as a dataframe in the variable df. The index_col=0 argument specifies that the first column of the Excel file should be used as the index column of the dataframe.

Next, we create a list called new_cols containing the new column names we want to assign to the dataframe columns. We then assign this list to the df.columns attribute, which replaces all column names at once. The list must contain exactly one name for each column, in the same order as the existing columns; otherwise, Pandas raises an error or the names end up on the wrong columns.
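
The following sketch, on a made-up three-column dataframe, shows what happens when the list has the wrong length and then assigns a list of the right length.

```python
import pandas as pd

# Toy data (hypothetical values) with three columns
df = pd.DataFrame({'Subject ID': [1, 2],
                   'First Name': ['Ann', 'Bo'],
                   'RT': [0.45, 0.52]})

# A list with too few names raises a ValueError and leaves df unchanged
try:
    df.columns = ['SubID', 'Given Name']
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

print(mismatch_raised)

# A list with one name per column, in column order, works
df.columns = ['SubID', 'Given Name', 'ResponseTime']
print(list(df.columns))
```

Compared with the dictionary approach, assigning a list is convenient when every column gets a new name, but it depends on knowing the column order.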

Renaming Columns while Importing Data

In this section, we will learn how to rename columns while reading the Excel file. Now, this is also very simple. To accomplish this, we create a list before we use the read_excel method. Note we need to add a column name in the list for the index column.

import pandas as pd

xlsx_url = 'https://github.com/marsja/jupyter/blob/master/SimData/play_data.xlsx?raw=true'

# New column names
new_cols = ['Index', 'Subject_ID', 'Given Name', 'Day', 'Age', 'ResponseTime', 'Gender']

df = pd.read_excel(xlsx_url, names=new_cols, index_col=0)

df.columns

[Figure: new column names]

In the code chunk above, we first create a list called new_cols that contains the column names we want the dataframe to have. The list needs one name for every column in the file, in the order the columns appear, including the index column (here called Index).

Next, we use the pd.read_excel function to read the Excel file located at xlsx_url and create a pandas dataframe. We pass the list of new column names to the names parameter and use the index_col parameter to set the first column as the index.

By doing this, we can ensure that the column names match our desired naming convention and that the correct column is used as the index column.

Importantly, when changing the name of the columns while reading the data, we need to know the number of columns before we load the data.
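
The same idea can be tried without downloading anything. The sketch below swaps the Excel file for a small in-memory CSV (an assumption made only to keep the example self-contained) and uses pd.read_csv, which also accepts names and index_col. With read_csv, header=0 is needed so that the original header row is replaced rather than read as data.

```python
import io
import pandas as pd

# A small in-memory CSV standing in for the real file (made-up data)
csv_text = """,Subject ID,First Name,RT
0,1,Ann,0.45
1,2,Bo,0.52
"""

# One name per column in the file, including the index column
new_cols = ['Index', 'SubID', 'Given Name', 'ResponseTime']

# header=0 marks the first row as the old header, which names then replaces
df = pd.read_csv(io.StringIO(csv_text), header=0, names=new_cols, index_col=0)

print(list(df.columns))
print(df.index.name)
```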


In the following example, we will learn how to rename grouped columns in Pandas dataframe.

Renaming Grouped Columns in Pandas Dataframe

In this section, we are going to rename grouped columns in a Pandas dataframe. First, we are going to use the Pandas groupby method (if needed, check the post about the Pandas groupby method for more information). Second, we are going to rename the grouped columns using a Python list comprehension and df.columns, among other dataframe methods.

import pandas as pd

xlsx_url = 'https://github.com/marsja/jupyter/blob/master/SimData/play_data.xlsx?raw=true'
df = pd.read_excel(xlsx_url, index_col=0)

# Mean, median, and standard deviation of RT for each day.
# String names are used because recent pandas versions warn
# when NumPy functions such as np.mean are passed to agg.
grouped = df.groupby('Day').agg({'RT': ['mean', 'median', 'std']})
grouped

[Figure: grouped dataframe]

As the output shows, the grouped dataframe has two levels of column labels (a MultiIndex): RT on the top level and the names of the statistics below it. In the next code chunk, we will flatten these into single-level column names.

grouped.columns = ['_'.join(x) for x in grouped.columns.ravel()]
grouped

[Figure: renamed column names in the grouped dataframe]
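
As an alternative to flattening the names afterwards, the names can be set during the aggregation itself with pandas' named aggregation syntax. The sketch below uses a small made-up dataframe with the columns Day and RT and shows both approaches side by side. The output names RT_mean, RT_median, and RT_std are chosen here for illustration.

```python
import pandas as pd

# Toy data (hypothetical values) with the two columns used in the grouping
df = pd.DataFrame({'Day': ['Mon', 'Mon', 'Tue', 'Tue'],
                   'RT': [0.4, 0.6, 0.5, 0.7]})

# Approach 1: aggregate, then join the two column levels with an underscore
grouped = df.groupby('Day').agg({'RT': ['mean', 'median', 'std']})
grouped.columns = ['_'.join(col) for col in grouped.columns]

# Approach 2: named aggregation, where each keyword becomes a column name
named = df.groupby('Day').agg(RT_mean=('RT', 'mean'),
                              RT_median=('RT', 'median'),
                              RT_std=('RT', 'std'))

print(list(grouped.columns))
print(list(named.columns))
```

Named aggregation avoids the MultiIndex altogether, at the cost of spelling out each output column.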

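Note that the list comprehension that flattens the grouped columns calls the ravel method on the column index and joins each (column, statistic) pair with an underscore. Iterating over grouped.columns directly yields the same tuples, so the call to ravel is optional.

Now, there are other ways we can change the name of columns in a Pandas dataframe. For instance, if we only want to change the column names so that the names are in lower case, we can use str.lower.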

df.rename(columns=str.lower).head()

Importantly, rename returns a new dataframe here. If we want to keep the change, we must assign the result back to df or add the inplace=True argument.

Changing Column Names in Pandas Dataframe to Lowercase

To change all column names to lowercase we can use the following code:

df.rename(columns=str.lower)

[Figure: column names changed to lowercase]
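
Because the columns argument accepts any function that takes a name and returns a new one, the same pattern extends beyond lowercasing. The sketch below, on made-up data, lowercases the names and also replaces spaces with underscores. The lambda is an illustrative helper, not something from the original example.

```python
import pandas as pd

# Toy data (hypothetical values) with mixed-case names containing spaces
df = pd.DataFrame({'Subject ID': [1, 2],
                   'First Name': ['Ann', 'Bo'],
                   'RT': [0.45, 0.52]})

# Lowercase every column name
lowered = df.rename(columns=str.lower)
print(list(lowered.columns))

# Lowercase and replace spaces with underscores in one pass
snake = df.rename(columns=lambda name: name.lower().replace(' ', '_'))
print(list(snake.columns))
```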

Now, this is one way to preprocess data in Python with pandas. In another post on this blog, we can learn about data cleaning in Python with Pandas and Pyjanitor.

Video Guide: Pandas Rename Column(s)

If you prefer to learn audiovisually, here’s a YouTube Tutorial covering changing the variable names in Pandas dataframe.

Conclusion: How to use Pandas to Rename a Column

In this post, we have explored how to use Pandas to rename a column in a dataframe. Renaming columns is essential in data analysis, making the dataset more readable and understandable. Sometimes, we may need to rename columns if we have data from different sources with different naming conventions or if we receive data from other researchers who used different naming schemes.

We discussed the importance of renaming variables and reviewed the prerequisites needed to perform this task in Pandas. We then explored different methods to rename columns in Pandas, including renaming single and multiple columns using dictionaries.

We also covered renaming columns while importing data and renaming grouped columns in a dataframe. Additionally, we learned how to change column names to lowercase using Pandas.

Renaming columns is an easy process in Pandas that can significantly improve the readability and organization of data. With the examples and techniques covered in this post, we can rename columns in our datasets to fit the needs of each analysis.

So, there you have it! Now you know how to use Pandas to rename a column and can easily handle any dataset. Do not forget to share this post with your colleagues and friends, as well as comment below if you have any questions or suggestions for future tutorials.
