How can I iterate over rows in a Pandas DataFrame?

When working with pandas, you have probably needed to process each row in a DataFrame. Although pandas is designed for fast, vectorized operations, row-wise iteration is still useful in particular scenarios. Let’s look at the various methods for iterating over DataFrames and when to use each of them.

Method 1: Using iterrows() – For smaller datasets
Method 2: Using itertuples() – For larger datasets
Method 3: Using apply() – For complex row-wise transformations
Method 4: Index-based Iteration (iloc[] or loc[]) – For specific rows
Which Method: When to Use
Conclusion
Method to Iterate Over Rows in Pandas Dataframe – FAQs

Method 1: Using iterrows() – For smaller datasets

The iterrows() method lets you loop through the DataFrame, yielding each row as an (index, Series) pair. Although it is simple to use, it is slow on larger datasets because a new Series is built for every row. It is best used for small datasets that need quick operations.

Let us now understand this with the help of an example:

Imagine you are grading a small list of students based on their marks.

a. Code

Python
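A minimal sketch consistent with the output below. The column names Name and Marks, the helper grade_for, and the grading rule (80 or more earns an A, anything else a B) are assumptions chosen for illustration, not taken from the original listing.

```python
import pandas as pd

# Hypothetical sample data (names and marks match the output shown below).
df = pd.DataFrame({"Name": ["Eva", "Bobby", "Charles"], "Marks": [85, 62, 90]})


def grade_for(marks):
    # Assumed grading rule: 80 or more is an A, anything else is a B.
    return "A" if marks >= 80 else "B"


lines = []
# iterrows() yields (index, Series) pairs; the index is unused here.
for _, row in df.iterrows():
    lines.append(
        f"{row['Name']} scored {row['Marks']} and got grade {grade_for(row['Marks'])}."
    )

print("\n".join(lines))
```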

b. Output

Eva scored 85 and got grade A.  
Bobby scored 62 and got grade B.  
Charles scored 90 and got grade A.

Method 2: Using itertuples() – For larger datasets

itertuples() returns each row as a namedtuple, which makes it faster and more memory-efficient than iterrows(). It is best used for larger datasets where performance is important.

Now, consider an example where you need to calculate the annual salary of your employees in a larger dataset:

a. Code

Python
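A minimal sketch consistent with the output below. It assumes a hypothetical Monthly_Salary column and that the annual figure is twelve times the monthly one; the three rows stand in for a larger dataset.

```python
import pandas as pd

# Hypothetical sample data; the monthly figures are assumptions that
# reproduce the annual totals shown below.
df = pd.DataFrame(
    {
        "Name": ["Harry", "Hermione", "Ron"],
        "Monthly_Salary": [3000, 4000, 3500],
    }
)

lines = []
# itertuples() yields one namedtuple per row; index=False drops the index field.
for row in df.itertuples(index=False):
    annual = row.Monthly_Salary * 12
    lines.append(f"{row.Name} earns {annual} annually.")

print("\n".join(lines))
```

Column names that are valid Python identifiers, such as Monthly_Salary, can be read as attributes of the namedtuple.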

b. Output

Harry earns 36000 annually.  
Hermione earns 48000 annually.  
Ron earns 42000 annually.

Method 3: Using apply() – For complex row-wise transformations

The apply() method lets you apply a function along an axis of the DataFrame: to each row (axis=1) or to each column (axis=0). It is useful when you want concise row-wise calculations, including more complex or mathematical operations that combine several columns. Note that apply() with axis=1 is not truly vectorized, because the function is still called once per row.

Now, consider an example where you want to calculate the Body Mass Index (BMI) for a group of people.

a. Code

Python
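A minimal sketch consistent with the output below, assuming the standard BMI formula (weight in kilograms divided by the square of height in metres). The helper name bmi is hypothetical.

```python
import pandas as pd

# Sample data taken from the output table shown below.
df = pd.DataFrame(
    {
        "Name": ["Eva", "Bobby"],
        "Weight (kg)": [70, 85],
        "Height (m)": [1.75, 1.80],
    }
)


def bmi(row):
    # BMI = weight (kg) / height (m) squared
    return row["Weight (kg)"] / row["Height (m)"] ** 2


# axis=1 passes each row to the function as a Series.
df["BMI"] = df.apply(bmi, axis=1)

print(df)
```

For a simple formula like this one, the same column could also be computed without apply(), as df["Weight (kg)"] / df["Height (m)"] ** 2.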

b. Output

	Name	Weight (kg)	Height (m)	BMI
0	Eva	70	1.75	22.857143
1	Bobby	85	1.80	26.234568

Method 4: Index-based Iteration (iloc[] or loc[]) – For specific rows

iloc[] and loc[] give you precise indexing when you want to process or update specific rows in a DataFrame: iloc[] selects by integer position and loc[] selects by label. They are useful when you need control over which rows to access or modify, or when you apply conditional updates.

Consider an example where you want to flag transactions above a certain amount in a financial dataset.

test_data = {'Transaction ID': [101, 102, 103], 'Amount': [500, 1500, 750]}

a. Code

Python
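A minimal sketch consistent with the output below. The threshold of 1000 is an assumption; any value from 750 up to (but not including) 1500 would give the same flags for this data.

```python
import pandas as pd

test_data = {"Transaction ID": [101, 102, 103], "Amount": [500, 1500, 750]}
df = pd.DataFrame(test_data)

THRESHOLD = 1000  # assumed cut-off, not stated in the text

# Start with a default flag, then update individual rows by label.
df["Flag"] = "Normal"
for i in df.index:
    if df.loc[i, "Amount"] > THRESHOLD:
        df.loc[i, "Flag"] = "High"

print(df)
```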

b. Output

	Transaction ID	Amount	Flag
0	101	500	Normal
1	102	1500	High
2	103	750	Normal

Which Method: When to Use

Method	Best For
iterrows()	Smaller datasets and quick exploratory tasks.
itertuples()	Larger datasets that require better performance.
apply()	Complex row-wise transformations written as a function.
iloc[]/loc[]	Precise control over particular rows with conditional logic.

Conclusion

In conclusion, there are multiple ways to iterate over rows in a pandas DataFrame, and the right choice depends on your task’s complexity and dataset size. For small datasets or custom logic, you can use iterrows() or apply(). For better performance on larger datasets, use vectorized operations or itertuples(). If you want to learn more about data manipulation with pandas, check out our Data Science Course using Python.

Method to Iterate Over Rows in Pandas Dataframe – FAQs

What is the best way to iterate over the rows of a pandas DataFrame?

For smaller datasets, you can use iterrows(). For larger datasets, or for any dataset where performance is critical, you can use itertuples().

How do you iterate over multiple rows in pandas?

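To iterate over multiple rows, slice the DataFrame with iloc[] or loc[] and loop over the resulting subset. Keep in mind that loc[0:5] is label-based and includes both endpoints, whereas iloc[0:5] is position-based and excludes the end. Code:

for _, row in df.loc[0:5].iterrows():
    print(row)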
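As a small illustration of how the two slicing styles differ, here is a sketch on a hypothetical ten-row DataFrame with the default integer index. The column name value is made up for the example.

```python
import pandas as pd

# Hypothetical data: ten rows with the default integer index 0..9.
df = pd.DataFrame({"value": range(10)})

by_label = df.loc[0:5]      # label-based: rows labelled 0 through 5, end included
by_position = df.iloc[0:5]  # position-based: first five rows, end excluded

print(len(by_label), len(by_position))

# Iterate over just the positional subset.
for row in by_position.itertuples():
    print(row.Index, row.value)
```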

What is the alternative to loops in pandas?

Vectorized operations are the fastest alternative to explicit loops. Methods such as apply() or transform() are more concise than a hand-written loop, although apply() with axis=1 still calls the function once per row.
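As a rough sketch of what a vectorized replacement for a loop can look like, the transaction-flagging example can be written with a single NumPy call. The threshold of 1000 is an assumption, and numpy.where is one of several ways to express the condition.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Transaction ID": [101, 102, 103], "Amount": [500, 1500, 750]})

THRESHOLD = 1000  # assumed cut-off

# The comparison runs on the whole column at once; no Python-level loop.
df["Flag"] = np.where(df["Amount"] > THRESHOLD, "High", "Normal")

print(df)
```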

Is itertuples() faster than iterrows()?

Yes, itertuples() is faster because it does not convert every row to a Series object.
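The difference in what each method yields can be seen in a short sketch. The column names id and score are hypothetical. A side effect worth knowing is that iterrows() packs each row into a single Series, so an integer column sitting next to a float column comes back as a float.

```python
import pandas as pd

# Hypothetical data with one integer and one float column.
df = pd.DataFrame({"id": [1, 2], "score": [0.5, 1.5]})

# iterrows() builds a new Series for every row.
_, first_series = next(df.iterrows())

# itertuples() yields a lightweight namedtuple instead.
first_tuple = next(df.itertuples(index=False))

print(type(first_series).__name__)  # Series
print(first_series.dtype)           # float64: the integer id was upcast
print(first_tuple._fields)          # ('id', 'score')
```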
