Iterate over rows in a dataframe in Pandas
Import modules
import pandas as pd
import numpy as np
Create a dummy dataframe
raw_data = {'name': ['Willard Morris', 'Al Jennings', 'Omar Mullins', 'Spencer McDaniel'],
            'age': [20, 19, 22, 21],
            'favorite_color': ['blue', 'red', 'yellow', 'green'],
            'grade': [88, 92, 95, 70]}
df = pd.DataFrame(raw_data, columns=['name', 'age', 'favorite_color', 'grade'])
df
| | name | age | favorite_color | grade |
---|---|---|---|---|
0 | Willard Morris | 20 | blue | 88 |
1 | Al Jennings | 19 | red | 92 |
2 | Omar Mullins | 22 | yellow | 95 |
3 | Spencer McDaniel | 21 | green | 70 |
Iterate over rows in a Pandas dataframe
Using iterrows:
for index, row in df.iterrows():
    print(row["name"], row["age"])
Willard Morris 20
Al Jennings 19
Omar Mullins 22
Spencer McDaniel 21
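One thing to keep in mind with iterrows is that each row comes back as a Series, so values from columns of different numeric types can be upcast to a common dtype. The sketch below illustrates this on a small numeric dataframe; the `nums` dataframe and its `qty` and `price` columns are made up for the example and are not part of the data above.

```python
import pandas as pd

# Hypothetical dataframe with one integer and one float column
nums = pd.DataFrame({"qty": [1, 2], "price": [1.5, 2.5]})

# iterrows yields (index, Series); the Series holds the whole row in one dtype
_, first_row = next(nums.iterrows())
row_dtype = first_row.dtype
print(row_dtype)          # float64: the integer qty was upcast
print(first_row["qty"])   # 1.0

# itertuples builds each row from the individual columns, so qty stays an integer
first_tuple = next(nums.itertuples(index=False))
print(first_tuple.qty, first_tuple.price)
```

If the original types of the values matter, itertuples is usually the safer choice.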
Using itertuples:
for row in df.itertuples(index=True, name='Pandas'):
    print(getattr(row, "name"), getattr(row, "age"))
Willard Morris 20
Al Jennings 19
Omar Mullins 22
Spencer McDaniel 21
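The `index` and `name` arguments of itertuples control what each row looks like. As a small sketch (using a two-row dataframe called `people`, a name chosen only for this example): with the defaults, the index is available as `row.Index`; with `index=False` and `name=None`, plain tuples holding only the column values are returned.

```python
import pandas as pd

# Small stand-in dataframe for the example
people = pd.DataFrame({"name": ["Willard Morris", "Al Jennings"],
                       "age": [20, 19]})

# Default: named tuples whose first field, Index, is the row label
indexed = [(row.Index, row.name, row.age) for row in people.itertuples()]
print(indexed)  # [(0, 'Willard Morris', 20), (1, 'Al Jennings', 19)]

# index=False drops the index; name=None returns regular tuples
plain = list(people.itertuples(index=False, name=None))
print(plain)    # [('Willard Morris', 20), ('Al Jennings', 19)]
```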
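If you want to change the dataframe based on its rows, df.apply is preferred over writing to the rows inside a loop, since the rows yielded by iterrows may be copies and changes to them may not reach the dataframe: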
def valuation_formula(x):
    return x * 0.5
df['age_half'] = df.apply(lambda row: valuation_formula(row['age']), axis=1)
df.head()
| | name | age | favorite_color | grade | age_half |
---|---|---|---|---|---|
0 | Willard Morris | 20 | blue | 88 | 10.0 |
1 | Al Jennings | 19 | red | 92 | 9.5 |
2 | Omar Mullins | 22 | yellow | 95 | 11.0 |
3 | Spencer McDaniel | 21 | green | 70 | 10.5 |
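For simple arithmetic like the formula above, operating on the whole column at once is usually an option as well and avoids calling a Python function for every row. The following is a minimal sketch that rebuilds a reduced version of the dataframe and compares both approaches; the column names `age_half_apply` and `age_half_vectorized` are chosen only for this illustration.

```python
import pandas as pd

# Reduced copy of the dummy dataframe: only the columns needed here
df = pd.DataFrame({"name": ["Willard Morris", "Al Jennings",
                            "Omar Mullins", "Spencer McDaniel"],
                   "age": [20, 19, 22, 21]})

# Row-wise: the function is called once per row
df["age_half_apply"] = df.apply(lambda row: row["age"] * 0.5, axis=1)

# Column-wise: a single operation on the whole 'age' column
df["age_half_vectorized"] = df["age"] * 0.5

print(df[["age_half_apply", "age_half_vectorized"]])
```

Both columns contain the same values; df.apply remains useful when the per-row logic cannot be written as a column operation.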