Import modules:
import pandas as pd
Create some dummy data:
raw_data = {'name': ['Willard Morris', 'Al Jennings', 'Omar Mullins', 'Spencer McDaniel'],
            'age': [20, 19, 22, 21],
            'favorite_color': ['blue', 'blue', 'yellow', 'green'],
            'grade': [88, 92, 95, 70]}
df = pd.DataFrame(raw_data)
df.head()
|   | age | favorite_color | grade | name |
|---|-----|----------------|-------|------|
| 0 | 20 | blue | 88 | Willard Morris |
| 1 | 19 | blue | 92 | Al Jennings |
| 2 | 22 | yellow | 95 | Omar Mullins |
| 3 | 21 | green | 70 | Spencer McDaniel |
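The table above lists the columns alphabetically, which is how older pandas releases ordered dictionary keys. Recent releases generally keep the dictionary's insertion order, so `name` may appear first when you run the same code. A minimal sketch, assuming you want to reproduce the order shown here, passes an explicit `columns` list (the `column_order` name is only illustrative):

```python
import pandas as pd

raw_data = {'name': ['Willard Morris', 'Al Jennings', 'Omar Mullins', 'Spencer McDaniel'],
            'age': [20, 19, 22, 21],
            'favorite_color': ['blue', 'blue', 'yellow', 'green'],
            'grade': [88, 92, 95, 70]}

# Illustrative name: the column order used in the tables of this note.
column_order = ['age', 'favorite_color', 'grade', 'name']

# Passing columns= selects and orders the dictionary keys explicitly.
df = pd.DataFrame(raw_data, columns=column_order)
print(df.head())
```

The selections in the rest of this note return the same rows regardless of column order.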
Select rows based on column value:
# To select rows whose column value equals a scalar value, use ==:
df.loc[df['favorite_color'] == 'yellow']
|   | age | favorite_color | grade | name |
|---|-----|----------------|-------|------|
| 2 | 22 | yellow | 95 | Omar Mullins |
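It can help to see that the comparison itself produces a boolean Series, and that `.loc` keeps the rows where that Series is True. A small sketch (the `mask` and `selected` names are illustrative, not part of the original example):

```python
import pandas as pd

df = pd.DataFrame({'name': ['Willard Morris', 'Al Jennings', 'Omar Mullins', 'Spencer McDaniel'],
                   'age': [20, 19, 22, 21],
                   'favorite_color': ['blue', 'blue', 'yellow', 'green'],
                   'grade': [88, 92, 95, 70]})

# The comparison is evaluated element-wise and yields one boolean per row.
mask = df['favorite_color'] == 'yellow'
print(mask.tolist())

# .loc keeps only the rows where the mask is True.
selected = df.loc[mask]
print(selected)
```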
Select rows whose column value is in an iterable array:
# To select rows whose column value is in an iterable, which we'll define as array, use isin:
array = ['yellow', 'green']
df.loc[df['favorite_color'].isin(array)]
|   | age | favorite_color | grade | name |
|---|-----|----------------|-------|------|
| 2 | 22 | yellow | 95 | Omar Mullins |
| 3 | 21 | green | 70 | Spencer McDaniel |
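The values passed to `isin` do not have to be a list. As a rough sketch, a tuple or a set should select the same rows (the `wanted_*` names below are illustrative):

```python
import pandas as pd

df = pd.DataFrame({'name': ['Willard Morris', 'Al Jennings', 'Omar Mullins', 'Spencer McDaniel'],
                   'age': [20, 19, 22, 21],
                   'favorite_color': ['blue', 'blue', 'yellow', 'green'],
                   'grade': [88, 92, 95, 70]})

# Three list-like containers holding the same two colors.
wanted_list = ['yellow', 'green']
wanted_tuple = ('yellow', 'green')
wanted_set = {'yellow', 'green'}

# Each selection keeps the rows whose favorite_color is in the container.
from_list = df.loc[df['favorite_color'].isin(wanted_list)]
from_tuple = df.loc[df['favorite_color'].isin(wanted_tuple)]
from_set = df.loc[df['favorite_color'].isin(wanted_set)]

print(from_list.equals(from_tuple) and from_list.equals(from_set))
```

A single string is not treated as list-like here, so to match one value wrap it in a list, for example `isin(['yellow'])`.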
Select rows based on multiple column conditions:
# To select rows based on multiple conditions, combine them with &:
array = ['yellow', 'green']
df.loc[(df['age'] == 21) & df['favorite_color'].isin(array)]
|   | age | favorite_color | grade | name |
|---|-----|----------------|-------|------|
| 3 | 21 | green | 70 | Spencer McDaniel |
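The parentheses around `df['age'] == 21` matter: in Python, `&` and `|` bind more tightly than `==`, so an unparenthesized comparison is grouped differently and typically raises an error. The sketch below wraps each comparison in parentheses and also shows `|` for "either condition" (the `both` and `either` names are illustrative):

```python
import pandas as pd

df = pd.DataFrame({'name': ['Willard Morris', 'Al Jennings', 'Omar Mullins', 'Spencer McDaniel'],
                   'age': [20, 19, 22, 21],
                   'favorite_color': ['blue', 'blue', 'yellow', 'green'],
                   'grade': [88, 92, 95, 70]})

# AND: both conditions must hold for a row to be kept.
both = df.loc[(df['age'] == 21) & (df['favorite_color'] == 'green')]

# OR: a row is kept if at least one condition holds.
either = df.loc[(df['age'] == 21) | (df['favorite_color'] == 'yellow')]

print(both['name'].tolist())
print(either['name'].tolist())
```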
Select rows where column does not equal a value:
# To select rows where a column value does not equal a given value, use !=:
df.loc[df['favorite_color'] != 'yellow']
|   | age | favorite_color | grade | name |
|---|-----|----------------|-------|------|
| 0 | 20 | blue | 88 | Willard Morris |
| 1 | 19 | blue | 92 | Al Jennings |
| 3 | 21 | green | 70 | Spencer McDaniel |
Select rows whose column value is not in an iterable array:
# To return rows whose column value is not in an iterable, negate the isin condition with ~:
array = ['yellow', 'green']
df.loc[~df['favorite_color'].isin(array)]
|   | age | favorite_color | grade | name |
|---|-----|----------------|-------|------|
| 0 | 20 | blue | 88 | Willard Morris |
| 1 | 19 | blue | 92 | Al Jennings |
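Because `~` flips every boolean in the mask, the `isin` selection and its negation should split the DataFrame into two non-overlapping parts. A quick sketch to check this (the `mask`, `inside`, and `outside` names are illustrative):

```python
import pandas as pd

df = pd.DataFrame({'name': ['Willard Morris', 'Al Jennings', 'Omar Mullins', 'Spencer McDaniel'],
                   'age': [20, 19, 22, 21],
                   'favorite_color': ['blue', 'blue', 'yellow', 'green'],
                   'grade': [88, 92, 95, 70]})

array = ['yellow', 'green']

# Build the mask once, then use it and its negation.
mask = df['favorite_color'].isin(array)
inside = df.loc[mask]     # rows whose color is in array
outside = df.loc[~mask]   # rows whose color is not in array

# Together the two selections account for every row exactly once.
print(len(inside) + len(outside) == len(df))
```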