
How to Iterate Over Rows in a Pandas DataFrame?

Python is one of the most widely used programming languages for data analysis, thanks in large part to its ecosystem of data-centric packages that users can import and reuse instead of writing everything from scratch.

Pandas and NumPy are two of the most popular of these packages, and they make importing and analyzing data much easier. In this article, we will discuss how to iterate over rows in a Pandas DataFrame.

What is Iteration?

In general, iteration means processing the elements of a collection one after another until a stopping condition is met, typically when the elements run out. Users can do this with an explicit loop, such as a for statement, or implicitly, by calling a function that loops over the elements internally.
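To make the distinction concrete, here is a minimal sketch in plain Python. The salaries list is only illustrative data that mirrors the examples later in the article.

```python
# Illustrative data (hypothetical, mirrors the salary column used later).
salaries = [56000, 49000, 60000, 45000]

# Explicit iteration: a for loop visits each element in turn.
total_explicit = 0
for salary in salaries:
    total_explicit += salary

# Implicit iteration: sum() walks over the same list internally.
total_implicit = sum(salaries)

print(total_explicit, total_implicit)
```

Both approaches visit every element once; the implicit form simply hides the loop inside the function call.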

Different methods to iterate over rows in a DataFrame in Pandas:

Let us learn the different ways to iterate over rows in Pandas DataFrame:

  1. index attribute
  2. loc[] indexer
  3. iterrows()
  4. iloc[] indexer
  5. itertuples()
  6. apply() method

Method 1: Using index attribute to iterate DataFrames:

Code Snippet:

# First, we import pandas package with name as pd
import pandas as pd
# Define a dictionary containing employee data
data = {'Employee Name': ['Ashish', 'Binesh', 'Plaban', 'Akash'],
		'Age': [29, 31, 27, 34],
		'Department': ['Sales Department', 'Marketing Department', 'General Management', 'HR department'],
		'Salary': [56000, 49000, 60000, 45000]}
# Now we will convert the dictionary into DataFrame
demo = pd.DataFrame(data, columns=['Employee Name', 'Age', 'Department', 'Salary'])
print("Given DataFrame :\n", demo)
print("\nIterating over rows with index attribute :\n")
# iterate through each row and select
# 'Employee Name' and Salary column respectively
for i in demo.index:
    print(demo['Employee Name'][i], demo['Salary'][i])

Output:

Explanation:

In the above example, we have used the index attribute of the DataFrame to drive the loop. The expression demo.index holds the row labels, which for this DataFrame are the default integers 0 to 3.

On each pass through the loop, the label i is used to look up that row's values in the 'Employee Name' and 'Salary' columns of the fictional employee data.
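The following sketch shows what the index attribute contains, first with the default index and then after the index is replaced with employee names. The two-row DataFrame and the variable names are assumptions made for illustration.

```python
import pandas as pd

# Small illustrative DataFrame (hypothetical subset of the employee data).
demo = pd.DataFrame({"Employee Name": ["Ashish", "Binesh"],
                     "Salary": [56000, 49000]})

# With no index given, pandas assigns integer labels starting at 0.
default_labels = list(demo.index)

# After set_index(), the index holds the employee names instead.
relabeled = demo.set_index("Employee Name")
name_labels = list(relabeled.index)

# Iterating over the index still works, whatever the labels are.
for name in relabeled.index:
    print(name, relabeled.loc[name, "Salary"])
```

Looping over demo.index rather than over a range of integers keeps the loop correct when the labels are not 0, 1, 2, and so on.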

Method 2: Using the loc[] indexer to iterate DataFrames:

Code Snippet:

# First, we import pandas package with name as pd
import pandas as pd
# Define a dictionary containing employee data
data = {'Employee Name': ['Ashish', 'Binesh', 'Ashnir', 'Akash'],
		'Age': [29, 31, 27, 34],
		'Department': ['Sales Department', 'Marketing Department', 'General Management', 'HR department'],
		'Salary': [56000, 49000, 60000, 45000]}
# Now we will convert the dictionary into DataFrame
demo = pd.DataFrame(data, columns=['Employee Name', 'Age', 'Department', 'Salary'])
print("Given DataFrame :\n", demo)
print("\nIterating over rows with loc function :\n")
# iterate through each row and print
# the value in each column respectively
for i in range(len(demo)):
    print(demo.loc[i, "Employee Name"], demo.loc[i, "Age"], demo.loc[i, "Department"], demo.loc[i, "Salary"])

Output:

Explanation:

In the above example, we have used loc[] to access the elements of the DataFrame by label. The len() function returns the number of rows in the DataFrame, so range(len(demo)) yields 0 to 3. Because loc[] selects by label, this works here only because the DataFrame has the default integer index, whose labels are also 0 to 3. Each employee detail is read with an expression of the form demo.loc[i, "column name"].
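Because loc[] is label-based, a loop over range(len(demo)) can fail when the index labels are not 0, 1, 2, and so on. The sketch below assumes a hypothetical DataFrame whose labels are 101 and 102 to show the failure and a safer loop over demo.index.

```python
import pandas as pd

# Hypothetical DataFrame with non-default integer labels.
demo = pd.DataFrame({"Employee Name": ["Ashish", "Binesh"],
                     "Age": [29, 31]},
                    index=[101, 102])

# loc[] looks for the label 0, which does not exist here.
try:
    demo.loc[0, "Age"]
    lookup_failed = False
except KeyError:
    lookup_failed = True

# Iterating over the actual labels works for any index.
ages = [demo.loc[label, "Age"] for label in demo.index]
print(lookup_failed, ages)
```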

Method 3: Using the iterrows() function to iterate DataFrames:

Code Snippet:

# First, we import pandas package with name as pd
import pandas as pd
# Define a dictionary containing employee data
data = {'Employee Name': ['Ashish', 'Binesh', 'Plaban', 'Akash'],
		'Age': [29, 31, 27, 34],
		'Department': ['Sales Department', 'Marketing Department', 'General Management', 'HR department'],
		'Salary': [56000, 49000, 60000, 45000]}
# Now we will convert the dictionary into DataFrame
demo = pd.DataFrame(data, columns=['Employee Name', 'Age', 'Department', 'Salary'])
print("Given DataFrame :\n", demo)
print("\nIterating over rows with iterrows() function :\n")
# iterate through each row and select the
# 'Employee Name', 'Age', 'Department', and 'Salary' columns respectively
for i, row in demo.iterrows():
    print(row["Employee Name"], row["Age"], row["Department"], row["Salary"])

Output:

Explanation:

In the above code snippet, we have used the iterrows() method, which yields one (index, row) pair for each row of the fictional employee data. In the for loop, i is the row's index label, and row is a pandas Series that holds the row's values, keyed by column name.
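One consequence of each row being a Series is that a row has a single dtype, so values may be converted. The sketch below uses a hypothetical DataFrame with an integer column and a float column (the Rating column is invented for illustration) to show the integer being returned as a float.

```python
import pandas as pd

# Hypothetical numeric data: one integer column and one float column.
demo = pd.DataFrame({"Age": [29, 31], "Rating": [4.5, 3.8]})

# Take the first (index, row) pair produced by iterrows().
first_label, first_row = next(demo.iterrows())

# The row is a Series with one dtype for all of its values,
# so the integer Age is upcast to float.
print(first_label, type(first_row).__name__, first_row.dtype)
print(first_row["Age"])
```

If preserving each column's dtype matters, itertuples() is usually the safer choice.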

Method 4: Using the iloc[] indexer to iterate DataFrames:

Code Snippet:

# First, we import pandas package with name as pd
import pandas as pd
# Define a dictionary containing employee data
data = {'Employee Name': ['Ashish', 'Binesh', 'Plaban', 'Akash'],
		'Age': [29, 31, 27, 34],
		'Department': ['Sales Department', 'Marketing Department', 'General Management', 'HR department'],
		'Salary': [56000, 49000, 60000, 45000]}
# Now we will convert the dictionary into DataFrame
demo = pd.DataFrame(data, columns=['Employee Name', 'Age', 'Department', 'Salary'])
print("Given DataFrame :\n", demo)
print("\nIterating over rows with iloc function :\n")
# iterate through each row and select the
# 'Employee Name', 'Salary', and 'Department' columns by position
for i in range(len(demo)):
    print(demo.iloc[i, 0], demo.iloc[i, 3], demo.iloc[i, 2])

Output:

Explanation:

The above code snippet shows how to iterate over a Pandas DataFrame using iloc[], which selects by integer position rather than by label. In an expression such as demo.iloc[i, 0], i is the position of the row and 0 is the position of the column, so demo.iloc[i, 0] is the 'Employee Name' value of row i.
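Hard-coded column positions are easy to get wrong. As a minimal sketch, assuming a small hypothetical DataFrame with string row labels, the column position can be looked up by name with columns.get_loc() and then passed to iloc[].

```python
import pandas as pd

# Hypothetical DataFrame; the row labels are strings, not positions.
demo = pd.DataFrame({"Employee Name": ["Ashish", "Binesh"],
                     "Age": [29, 31],
                     "Salary": [56000, 49000]},
                    index=["a", "b"])

# Look up the integer position of the 'Salary' column by name.
salary_pos = demo.columns.get_loc("Salary")

# iloc[] ignores the labels and uses row and column positions only.
salaries = [demo.iloc[i, salary_pos] for i in range(len(demo))]
print(salary_pos, salaries)
```

Since iloc[] is purely positional, range(len(demo)) is always valid here regardless of the index labels.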

Method 5: Using itertuples() function to iterate DataFrame:

Code Snippet:

# First, we import pandas package with name as pd
import pandas as pd
# Define a dictionary containing employee data
data = {'Employee Name': ['Ashish', 'Binesh', 'Plaban', 'Akash'],
		'Age': [29, 31, 27, 34],
		'Department': ['Sales Department', 'Marketing Department', 'General Management', 'HR department'],
		'Salary': [56000, 49000, 60000, 45000]}
# Now we will convert the dictionary into DataFrame
demo = pd.DataFrame(data, columns=['Employee Name', 'Age', 'Department', 'Salary'])
print("Given DataFrame :\n", demo)
print("\nIterating over rows with itertuples() function :\n")
# iterate through each row and select
# 'Department' and 'Salary' column respectively
for row in demo.itertuples(index=True, name='Pandas'):
    print(getattr(row, "Department"), getattr(row, "Salary"))

Output:

Explanation:

In the above code sample, we have iterated over rows in a Pandas DataFrame using the itertuples() method, which returns each row as a named tuple. The method takes two arguments, index and name. When index is True (the default), the row's index label is included as the first element of each tuple. The name argument sets the type name of the returned named tuples, which is 'Pandas' by default.
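The sketch below inspects a single named tuple to show the effect of both arguments. The name 'Employee' is an arbitrary choice for illustration. Note that a column name that is not a valid Python identifier, such as 'Employee Name', is replaced with a positional field name.

```python
import pandas as pd

# Hypothetical two-row version of the employee data.
demo = pd.DataFrame({"Employee Name": ["Ashish", "Binesh"],
                     "Age": [29, 31]})

# Take the first named tuple; name sets the tuple's type name.
first = next(demo.itertuples(index=True, name="Employee"))

print(type(first).__name__)   # type name chosen via the name argument
print(first._fields)          # 'Employee Name' becomes a positional name
print(first.Index, first[1], first.Age)
```

This is why the article's snippet reads 'Department' and 'Salary' with getattr() but would not be able to read 'Employee Name' the same way; positional access such as first[1] works for any column.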

Method 6: Using the apply() function to iterate DataFrames:

Code Snippet:

# First, we import pandas package with name as pd
import pandas as pd
# Define a dictionary containing employee data
data = {'Employee Name': ['Ashish', 'Binesh', 'Plaban', 'Akash'],
		'Age': [29, 31, 27, 34],
		'Department': ['Sales Department', 'Marketing Department', 'General Management', 'HR department'],
		'Salary': [56000, 49000, 60000, 45000]}
# Now we will convert the dictionary into DataFrame
demo = pd.DataFrame(data, columns=['Employee Name', 'Age', 'Department', 'Salary'])
print("Given DataFrame :\n", demo)
print("\nIterating over rows with apply() function :\n")
# apply a function to each row that combines
# the 'Employee Name' and 'Age' columns
print(demo.apply(lambda row: row["Employee Name"] + " " + 
               str(row["Age"]), axis=1))

Output:

Explanation:

Here, in the above code snippet, we have used the apply() method of DataFrames to process the data row by row. The method applies a function along one axis of the Pandas DataFrame. Its axis parameter defaults to 0, which passes each column to the function; the snippet sets axis=1 so that the function receives each row instead.
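The difference between the two axis values can be seen in a short sketch. The numeric-only DataFrame and the lambda functions are assumptions chosen to keep the results easy to check.

```python
import pandas as pd

# Hypothetical numeric subset of the employee data.
demo = pd.DataFrame({"Age": [29, 31], "Salary": [56000, 49000]})

# axis=0 (the default): the function receives each column as a Series.
per_column = demo.apply(lambda col: col.max())

# axis=1: the function receives each row as a Series.
per_row = demo.apply(lambda row: row["Age"] + row["Salary"], axis=1)

print(per_column)
print(per_row)
```

The first result has one entry per column, and the second has one entry per row.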

Conclusion:

Iterating over pandas DataFrames is generally slow and not a best practice. Users should iterate only when it is necessary. The pandas package provides a rich collection of built-in functions and methods optimized to operate on large pandas objects.

So, users should favor these over iterative processes wherever possible. This article has covered six different methods to iterate over rows in a Pandas DataFrame.
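As a minimal sketch of what favoring built-in operations can look like, the example below computes the same result with a row loop and with a single column-wide operation. The 1000 bonus and the 50000 threshold are made-up values for illustration.

```python
import pandas as pd

# Hypothetical subset of the employee data used in this article.
demo = pd.DataFrame({"Employee Name": ["Ashish", "Binesh", "Plaban", "Akash"],
                     "Salary": [56000, 49000, 60000, 45000]})

# Row-by-row version: one Python-level step per row.
looped = []
for _, row in demo.iterrows():
    looped.append(row["Salary"] + 1000)

# Vectorized version: one operation on the whole column.
vectorized = demo["Salary"] + 1000

# Boolean indexing replaces a loop with an if statement.
high_earners = demo[demo["Salary"] > 50000]

print(vectorized.tolist())
print(high_earners["Employee Name"].tolist())
```

Both versions produce the same values, but the vectorized forms avoid a Python-level loop and are typically much faster on large DataFrames.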

