How to Rename Pandas DataFrame Column in Python

Pandas is one of the most common libraries for data analysis. Its two main data structures are the Series and the DataFrame (the older Panel structure has been removed from current versions of Pandas). These data structures help organize data in a specific order and structure. The DataFrame is the most commonly used of them, and renaming its columns is a task that most data analysts perform frequently.

In this article, you will learn how to rename a DataFrame column in Python.

What do you mean by renaming a DataFrame column?

Every column in a DataFrame has a label, and that label can be changed at any time. This process is called renaming the DataFrame column. It is often necessary to pull a subset of data from one DataFrame, place it in a new DataFrame, and adjust the column names to suit the data.
That is where data analysts use the following methods to rename DataFrame columns.

Method 1: Using the rename() method:

The most common technique for renaming DataFrame columns is to call the rename() method. Here we specify exactly which columns we want to rename. The columns parameter takes a dictionary of key:value pairs, where each key is an existing column label and each value is its new label.

Program:

import pandas as pd

# Each key becomes a column label; each list holds that column's values.
profile = {'Developer': ['Karl', 'Zeus', 'Su', 'Bill', 'Woz'],
           'Hacker': ['Kevin', 'Woz', 'Poulsen', 'Vivek', 'Manu'],
           'Seller': ['Dee', 'Sue', 'Karl', 'Steve', 'Ron']}
profile_pd = pd.DataFrame(profile)
print(profile_pd)

# Map the old label to the new one; inplace=True modifies profile_pd itself.
profile_pd.rename(columns={'Hacker': 'HACKER'}, inplace=True)
print("\n After modifying second column: \n", profile_pd.columns)
print(profile_pd)

Output:

Explanation:

First we import the Pandas module and alias it with a name (here pd). Next, we create a dictionary in which each value is a list. We then pass the dictionary to pd.DataFrame() to build the DataFrame, and print it using the print() function.

Now, with that DataFrame object, we call the rename() method and change the column name by passing a key-value pair and setting the inplace parameter to True, so that the existing DataFrame is modified rather than a copy being returned.

This changes the label 'Hacker' to 'HACKER'. After modifying the second column, we display the column labels and the whole DataFrame using print(). The output now shows the updated column name.
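The same method can also rename several columns in one call and return a new DataFrame instead of modifying the original. The following is a minimal sketch, assuming a shortened version of the profile data; the 'Manager' key is a deliberately wrong label used only for illustration.

```python
import pandas as pd

profile_pd = pd.DataFrame({
    'Developer': ['Karl', 'Zeus'],
    'Hacker': ['Kevin', 'Woz'],
    'Seller': ['Dee', 'Sue'],
})

# Without inplace=True, rename() returns a new DataFrame
# and leaves profile_pd untouched.
renamed = profile_pd.rename(columns={'Hacker': 'HACKER', 'Seller': 'SELLER'})

# By default, a key that matches no column is silently ignored.
unchanged = profile_pd.rename(columns={'Manager': 'MANAGER'})

# errors='raise' turns such a mismatched key into a KeyError instead.
try:
    profile_pd.rename(columns={'Manager': 'MANAGER'}, errors='raise')
    missing_key_raised = False
except KeyError:
    missing_key_raised = True

print(list(renamed.columns))
print(missing_key_raised)
```

Passing errors='raise' can be a useful guard against typos in the old column names, since the default behavior gives no warning when nothing was renamed.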

Method 2: Passing a list as a Parameter:

Another way to replace the default numeric column labels with strings is to pass a list of strings as the column names. This is a useful technique when you have to set the complete set of column names at once. Just keep in mind that the list must contain exactly one name for each column.

Program:

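import numpy as np
import pandas as pd

new_columns = ['Name', 'ID']
data = np.array([["Karl", 26], ["Suezane", 40], ["Dee", 16]])

# With no column names given, Pandas labels the columns 0 and 1.
df = pd.DataFrame(data)
print(df)
print()

# Build the DataFrame again, this time supplying the column labels.
df = pd.DataFrame(data, columns=new_columns)
print(df)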

Output:

Explanation:

First we import the NumPy module and alias it with a name (here np). We also import the Pandas module and alias it with a name (here pd). We create a list holding our new column names. Then, we create a NumPy array from a nested list and name it data.

Then we create a DataFrame from that NumPy array, which is another way of creating a DataFrame in Python, and print it. Because the array carries no labels of its own, the columns are numbered 0 and 1. Next we call pd.DataFrame(data, columns = new_columns), passing the new column names as the columns value. This builds the DataFrame again with the labels Name and ID in place of the default numbers.
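When the DataFrame already exists, the list of names can also be assigned directly to its columns attribute rather than rebuilding the DataFrame. The following is a minimal sketch of that variant using the same data; the third name 'Extra' is a hypothetical label added only to show what happens when the list has the wrong length.

```python
import numpy as np
import pandas as pd

data = np.array([["Karl", 26], ["Suezane", 40], ["Dee", 16]])
df = pd.DataFrame(data)  # columns are labeled 0 and 1 by default

# Assigning a list to df.columns relabels every column in place.
df.columns = ['Name', 'ID']
after_assignment = list(df.columns)

# The list must have exactly one name per column; a list of the
# wrong length raises a ValueError.
try:
    df.columns = ['Name', 'ID', 'Extra']
    length_mismatch_raised = False
except ValueError:
    length_mismatch_raised = True

print(after_assignment)
print(length_mismatch_raised)
```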

Method 3: Using the add_prefix() and add_suffix():

Two more methods change the column labels by adding a prefix or a suffix to the existing column names. Neither method replaces the existing label; each only extends it.

Their syntaxes are:

dataframe_name.add_prefix('prefix-string')
dataframe_name.add_suffix('suffix-string')

Program:

import pandas as pd

profile = {'Developer': ['Karl', 'Zeus', 'Su', 'Bill', 'Woz'],
           'Hacker': ['Kevin', 'Woz', 'Poulsen', 'Vivek', 'Manu'],
           'Seller': ['Dee', 'Sue', 'Karl', 'Steve', 'Ron']}
profile_pd = pd.DataFrame(profile)
print(profile_pd)
print(profile_pd.add_prefix('New_'))

Output:

Explanation:

First, we import the Pandas module and alias it with a name (here pd). Next, we create a dictionary in which each value is a list. We then pass the dictionary to pd.DataFrame() to build the DataFrame, and print it using the print() function.

Now, with that DataFrame object, we call the add_prefix() method to change the column names. add_prefix() adds the given string at the beginning of every column name. We wrap the call in print() to display the result; because add_prefix() returns a new DataFrame, profile_pd itself keeps its original labels.

Program:

import pandas as pd

profile = {'Developer': ['Karl', 'Zeus', 'Su', 'Bill', 'Woz'],
           'Hacker': ['Kevin', 'Woz', 'Poulsen', 'Vivek', 'Manu'],
           'Seller': ['Dee', 'Sue', 'Karl', 'Steve', 'Ron']}
profile_pd = pd.DataFrame(profile)
print(profile_pd)
print()
print(profile_pd.add_suffix('_New'))

Output:

Explanation:

First, we import the Pandas module and alias it with a name (here pd). Next, we create a dictionary in which each value is a list. We then pass the dictionary to pd.DataFrame() to build the DataFrame, and print it using the print() function.

Now, with that DataFrame object, we call the add_suffix() method to change the column names. add_suffix() adds the given string at the end of every column name. We wrap the call in print() to display the result; because add_suffix() returns a new DataFrame, profile_pd itself keeps its original labels.
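Both methods return a new DataFrame, so the result has to be stored if the new labels are meant to be kept. The following is a minimal sketch, assuming a one-row version of the profile data; the names prefixed and suffixed are chosen here only for illustration.

```python
import pandas as pd

profile_pd = pd.DataFrame({
    'Developer': ['Karl'],
    'Hacker': ['Kevin'],
    'Seller': ['Dee'],
})

# Each call returns a relabeled copy; profile_pd is not modified.
prefixed = profile_pd.add_prefix('New_')
suffixed = profile_pd.add_suffix('_New')

print(list(prefixed.columns))
print(list(suffixed.columns))
print(list(profile_pd.columns))
```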

Method 4: Using Lambda and Regular expression:

Programmers can also rename columns by removing a portion that all the column labels have in common, in a single step. This is possible by combining a lambda with a regular expression.

Program:

import re

import pandas as pd

profile = {'NewDeveloper': ['Karl', 'Zeus', 'Su', 'Bill', 'Woz'],
           'NewHacker': ['Kevin', 'Woz', 'Poulsen', 'Vivek', 'Manu'],
           'NewSeller': ['Dee', 'Sue', 'Karl', 'Steve', 'Ron']}
profile_pd = pd.DataFrame(profile)
print(profile_pd)
print()

# rename() calls the lambda once per label; re.sub() swaps 'New' for ''.
# The ^ anchors the match, so only a leading 'New' is removed.
profile_pd = profile_pd.rename(columns=lambda x: re.sub(r'^New', '', x))
print(profile_pd)

Output:

Explanation:

First we import the Pandas module and alias it with a name (here pd). We also import re, Python's regular expression module. Next, we create a dictionary in which each value is a list. We then pass the dictionary to pd.DataFrame() to build the DataFrame, and print it using the print() function.

Now, with that DataFrame object, we call the rename() method and pass a lambda expression as the columns parameter. The rename() method applies the lambda to every existing column name, and the lambda calls re.sub() to substitute the text 'New' in each label with an empty string, which removes it. After modifying the columns, we display the updated DataFrame using print().
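Whether the pattern is anchored matters when the common text can also appear elsewhere in a label. The following is a minimal sketch comparing the two patterns; the column 'NewNewsletter' is a hypothetical label invented for this illustration and is not part of the profile data.

```python
import re

import pandas as pd

df = pd.DataFrame({'NewDeveloper': ['Karl'], 'NewNewsletter': ['Weekly']})

# An unanchored pattern removes every occurrence of 'New' in a label.
loose = df.rename(columns=lambda label: re.sub('New', '', label))

# Anchoring with ^ removes 'New' only where it starts the label.
anchored = df.rename(columns=lambda label: re.sub(r'^New', '', label))

print(list(loose.columns))     # ['Developer', 'sletter']
print(list(anchored.columns))  # ['Developer', 'Newsletter']
```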

Conclusion:

All these techniques are useful, and each has its own purpose. The most commonly used is the rename() method. If you don't want to replace a column name entirely and simply want to add a substring to it, then add_suffix() or add_prefix() is the best choice.

You may also use the lambda and regular expression technique shown above, but it calls a Python function and runs a regular expression once for every label, which is extra work compared with a direct dictionary lookup.

Thus, it is somewhat less efficient than the other techniques, although with a typical number of columns the difference is rarely noticeable. The simplest and most direct are Methods 1 and 2, which do not require any additional computation or processing.