logo

[python-ds] 판다스: 조건 검색

 

참/거짓으로 선택

pd.options.display.max_rows = 6
df['age'] > 40
0        False
1         True
         ...
32559    False
32560     True
Name: age, Length: 32561, dtype: bool
df[df['age'] > 40]
ageworkclassfnlwgteducationeducation_nummarital_statusoccupationrelationshipracesexcapital_gaincapital_losshours_per_weeknative_countryincome
150Self-emp-not-inc83311Bachelors13Married-civ-spouseExec-managerialHusbandWhiteMale0013United-States<=50K
353Private23472111th7Married-civ-spouseHandlers-cleanersHusbandBlackMale0040United-States<=50K
649Private1601879th5Married-spouse-absentOther-serviceNot-in-familyBlackFemale0016Jamaica<=50K
................................................
3255453Private321865Masters14Married-civ-spouseExec-managerialHusbandWhiteMale0040United-States>50K
3255858Private151910HS-grad9WidowedAdm-clericalUnmarriedWhiteFemale0040United-States<=50K
3256052Self-emp-inc287927HS-grad9Married-civ-spouseExec-managerialWifeWhiteFemale15024040United-States>50K

13443 rows × 15 columns

 

쿼리

df.query('age > 40')
ageworkclassfnlwgteducationeducation_nummarital_statusoccupationrelationshipracesexcapital_gaincapital_losshours_per_weeknative_countryincome
150Self-emp-not-inc83311Bachelors13Married-civ-spouseExec-managerialHusbandWhiteMale0013United-States<=50K
353Private23472111th7Married-civ-spouseHandlers-cleanersHusbandBlackMale0040United-States<=50K
649Private1601879th5Married-spouse-absentOther-serviceNot-in-familyBlackFemale0016Jamaica<=50K
................................................
3255453Private321865Masters14Married-civ-spouseExec-managerialHusbandWhiteMale0040United-States>50K
3255858Private151910HS-grad9WidowedAdm-clericalUnmarriedWhiteFemale0040United-States<=50K
3256052Self-emp-inc287927HS-grad9Married-civ-spouseExec-managerialWifeWhiteFemale15024040United-States>50K

13443 rows × 15 columns

 

AND

AND는 &로 표시. 괄호에 유의.

40세 넘는 남자

(df['age'] > 40) & (df['sex'] == 'Male')
0        False
1         True
2        False
         ...
32558    False
32559    False
32560    False
Length: 32561, dtype: bool
df[(df['age'] > 40) & (df['sex'] == 'Male')]
ageworkclassfnlwgteducationeducation_nummarital_statusoccupationrelationshipracesexcapital_gaincapital_losshours_per_weeknative_countryincome
150Self-emp-not-inc83311Bachelors13Married-civ-spouseExec-managerialHusbandWhiteMale0013United-States<=50K
353Private23472111th7Married-civ-spouseHandlers-cleanersHusbandBlackMale0040United-States<=50K
752Self-emp-not-inc209642HS-grad9Married-civ-spouseExec-managerialHusbandWhiteMale0045United-States>50K
................................................
3255043Self-emp-not-inc27242Some-college10Married-civ-spouseCraft-repairHusbandWhiteMale0050United-States<=50K
3255243Private84661Assoc-voc11Married-civ-spouseSalesHusbandWhiteMale0045United-States<=50K
3255453Private321865Masters14Married-civ-spouseExec-managerialHusbandWhiteMale0040United-States>50K

9497 rows × 15 columns

df.query('age > 40 and sex == "Male"')
ageworkclassfnlwgteducationeducation_nummarital_statusoccupationrelationshipracesexcapital_gaincapital_losshours_per_weeknative_countryincome
150Self-emp-not-inc83311Bachelors13Married-civ-spouseExec-managerialHusbandWhiteMale0013United-States<=50K
353Private23472111th7Married-civ-spouseHandlers-cleanersHusbandBlackMale0040United-States<=50K
752Self-emp-not-inc209642HS-grad9Married-civ-spouseExec-managerialHusbandWhiteMale0045United-States>50K
................................................
3255043Self-emp-not-inc27242Some-college10Married-civ-spouseCraft-repairHusbandWhiteMale0050United-States<=50K
3255243Private84661Assoc-voc11Married-civ-spouseSalesHusbandWhiteMale0045United-States<=50K
3255453Private321865Masters14Married-civ-spouseExec-managerialHusbandWhiteMale0040United-States>50K

9497 rows × 15 columns

 

OR

OR는 |로 표시. 괄호에 유의.

남편 또는 아내

(df['relationship'] == 'Husband') | (df['relationship'] == 'Wife')
0        False
1         True
2        False
         ...
32558    False
32559    False
32560     True
Name: relationship, Length: 32561, dtype: bool
df[(df['relationship'] == 'Husband') | (df['relationship'] == 'Wife')]
ageworkclassfnlwgteducationeducation_nummarital_statusoccupationrelationshipracesexcapital_gaincapital_losshours_per_weeknative_countryincome
150Self-emp-not-inc83311Bachelors13Married-civ-spouseExec-managerialHusbandWhiteMale0013United-States<=50K
353Private23472111th7Married-civ-spouseHandlers-cleanersHusbandBlackMale0040United-States<=50K
428Private338409Bachelors13Married-civ-spouseProf-specialtyWifeBlackFemale0040Cuba<=50K
................................................
3255627Private257302Assoc-acdm12Married-civ-spouseTech-supportWifeWhiteFemale0038United-States<=50K
3255740Private154374HS-grad9Married-civ-spouseMachine-op-inspctHusbandWhiteMale0040United-States>50K
3256052Self-emp-inc287927HS-grad9Married-civ-spouseExec-managerialWifeWhiteFemale15024040United-States>50K

14761 rows × 15 columns

df['relationship'].unique()
array(['Not-in-family', 'Husband', 'Wife', 'Own-child', 'Unmarried',
       'Other-relative'], dtype=object)
df.query('relationship == "Husband" or relationship == "Wife"')
ageworkclassfnlwgteducationeducation_nummarital_statusoccupationrelationshipracesexcapital_gaincapital_losshours_per_weeknative_countryincome
150Self-emp-not-inc83311Bachelors13Married-civ-spouseExec-managerialHusbandWhiteMale0013United-States<=50K
353Private23472111th7Married-civ-spouseHandlers-cleanersHusbandBlackMale0040United-States<=50K
428Private338409Bachelors13Married-civ-spouseProf-specialtyWifeBlackFemale0040Cuba<=50K
................................................
3255627Private257302Assoc-acdm12Married-civ-spouseTech-supportWifeWhiteFemale0038United-States<=50K
3255740Private154374HS-grad9Married-civ-spouseMachine-op-inspctHusbandWhiteMale0040United-States>50K
3256052Self-emp-inc287927HS-grad9Married-civ-spouseExec-managerialWifeWhiteFemale15024040United-States>50K

14761 rows × 15 columns

 

isin

df['relationship'].isin(['Husband', 'Wife'])
0        False
1         True
2        False
         ...
32558    False
32559    False
32560     True
Name: relationship, Length: 32561, dtype: bool
df[df['relationship'].isin(['Husband', 'Wife'])]
ageworkclassfnlwgteducationeducation_nummarital_statusoccupationrelationshipracesexcapital_gaincapital_losshours_per_weeknative_countryincome
150Self-emp-not-inc83311Bachelors13Married-civ-spouseExec-managerialHusbandWhiteMale0013United-States<=50K
353Private23472111th7Married-civ-spouseHandlers-cleanersHusbandBlackMale0040United-States<=50K
428Private338409Bachelors13Married-civ-spouseProf-specialtyWifeBlackFemale0040Cuba<=50K
................................................
3255627Private257302Assoc-acdm12Married-civ-spouseTech-supportWifeWhiteFemale0038United-States<=50K
3255740Private154374HS-grad9Married-civ-spouseMachine-op-inspctHusbandWhiteMale0040United-States>50K
3256052Self-emp-inc287927HS-grad9Married-civ-spouseExec-managerialWifeWhiteFemale15024040United-States>50K

14761 rows × 15 columns

5 in [1, 2, 5]
True
df.query('relationship in ["Husband", "Wife"]')
ageworkclassfnlwgteducationeducation_nummarital_statusoccupationrelationshipracesexcapital_gaincapital_losshours_per_weeknative_countryincome
150Self-emp-not-inc83311Bachelors13Married-civ-spouseExec-managerialHusbandWhiteMale0013United-States<=50K
353Private23472111th7Married-civ-spouseHandlers-cleanersHusbandBlackMale0040United-States<=50K
428Private338409Bachelors13Married-civ-spouseProf-specialtyWifeBlackFemale0040Cuba<=50K
................................................
3255627Private257302Assoc-acdm12Married-civ-spouseTech-supportWifeWhiteFemale0038United-States<=50K
3255740Private154374HS-grad9Married-civ-spouseMachine-op-inspctHusbandWhiteMale0040United-States>50K
3256052Self-emp-inc287927HS-grad9Married-civ-spouseExec-managerialWifeWhiteFemale15024040United-States>50K

14761 rows × 15 columns

 

NOT

NOT은 ~로 표시

남편/아내 제외

~df['relationship'].isin(['Husband', 'Wife'])
0         True
1        False
2         True
         ...
32558     True
32559     True
32560    False
Name: relationship, Length: 32561, dtype: bool
df[~df['relationship'].isin(['Husband', 'Wife'])]
ageworkclassfnlwgteducationeducation_nummarital_statusoccupationrelationshipracesexcapital_gaincapital_losshours_per_weeknative_countryincome
039State-gov77516Bachelors13Never-marriedAdm-clericalNot-in-familyWhiteMale2174040United-States<=50K
238Private215646HS-grad9DivorcedHandlers-cleanersNot-in-familyWhiteMale0040United-States<=50K
649Private1601879th5Married-spouse-absentOther-serviceNot-in-familyBlackFemale0016Jamaica<=50K
................................................
3255522Private310152Some-college10Never-marriedProtective-servNot-in-familyWhiteMale0040United-States<=50K
3255858Private151910HS-grad9WidowedAdm-clericalUnmarriedWhiteFemale0040United-States<=50K
3255922Private201490HS-grad9Never-marriedAdm-clericalOwn-childWhiteMale0020United-States<=50K

17800 rows × 15 columns

df.query('not (age < 30 and race == "Black")')
df.query('not (relationship in ["Husband", "Wife"])')
ageworkclassfnlwgteducationeducation_nummarital_statusoccupationrelationshipracesexcapital_gaincapital_losshours_per_weeknative_countryincome
039State-gov77516Bachelors13Never-marriedAdm-clericalNot-in-familyWhiteMale2174040United-States<=50K
238Private215646HS-grad9DivorcedHandlers-cleanersNot-in-familyWhiteMale0040United-States<=50K
649Private1601879th5Married-spouse-absentOther-serviceNot-in-familyBlackFemale0016Jamaica<=50K
................................................
3255522Private310152Some-college10Never-marriedProtective-servNot-in-familyWhiteMale0040United-States<=50K
3255858Private151910HS-grad9WidowedAdm-clericalUnmarriedWhiteFemale0040United-States<=50K
3255922Private201490HS-grad9Never-marriedAdm-clericalOwn-childWhiteMale0020United-States<=50K

17800 rows × 15 columns

in의 경우에는 not in도 가능

df.query('relationship not in ["Husband", "Wife"]')
ageworkclassfnlwgteducationeducation_nummarital_statusoccupationrelationshipracesexcapital_gaincapital_losshours_per_weeknative_countryincome
039State-gov77516Bachelors13Never-marriedAdm-clericalNot-in-familyWhiteMale2174040United-States<=50K
238Private215646HS-grad9DivorcedHandlers-cleanersNot-in-familyWhiteMale0040United-States<=50K
649Private1601879th5Married-spouse-absentOther-serviceNot-in-familyBlackFemale0016Jamaica<=50K
................................................
3255522Private310152Some-college10Never-marriedProtective-servNot-in-familyWhiteMale0040United-States<=50K
3255858Private151910HS-grad9WidowedAdm-clericalUnmarriedWhiteFemale0040United-States<=50K
3255922Private201490HS-grad9Never-marriedAdm-clericalOwn-childWhiteMale0020United-States<=50K

17800 rows × 15 columns

Previous
판다스: 인덱스