by Jawad Haider
07 - Pandas Exercise Solutions
Pandas Exercises - Solutions
TASK: Import pandas
TASK: Read in the bank.csv file that is located under the 01-Crash-Course-Pandas folder. Pay close attention to where the .csv file is located! Please don’t post to the QA forums if you can’t figure this one out; instead, run our solutions notebook directly to see how it’s done.
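A minimal sketch of the import and the read step. To stay runnable without the file, it reads the same column layout from an in-memory buffer; the comma delimiter and the relative path mentioned in the comment are assumptions, so adjust them to wherever your copy of bank.csv lives.

```python
import io

import pandas as pd

# Stand-in for bank.csv: the header plus the first two rows shown in this notebook.
# A comma-separated layout is assumed here.
csv_text = """age,job,marital,education,default,balance,housing,loan,contact,day,month,duration,campaign,pdays,previous,poutcome,y
30,unemployed,married,primary,no,1787,no,no,cellular,19,oct,79,1,-1,0,unknown,no
33,services,married,secondary,no,4789,yes,yes,cellular,11,may,220,1,339,4,failure,no
"""

# With the real file, pass its path instead of a buffer, for example
# pd.read_csv('01-Crash-Course-Pandas/bank.csv') if the notebook runs one level
# above that folder (the exact relative path is an assumption).
df = pd.read_csv(io.StringIO(csv_text))
print(df.shape)
```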
TASK: Display the first 5 rows of the data set
| | age | job | marital | education | default | balance | housing | loan | contact | day | month | duration | campaign | pdays | previous | poutcome | y |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 30 | unemployed | married | primary | no | 1787 | no | no | cellular | 19 | oct | 79 | 1 | -1 | 0 | unknown | no |
| 1 | 33 | services | married | secondary | no | 4789 | yes | yes | cellular | 11 | may | 220 | 1 | 339 | 4 | failure | no |
| 2 | 35 | management | single | tertiary | no | 1350 | yes | no | cellular | 16 | apr | 185 | 1 | 330 | 1 | failure | no |
| 3 | 30 | management | married | tertiary | no | 1476 | yes | yes | unknown | 3 | jun | 199 | 4 | -1 | 0 | unknown | no |
| 4 | 59 | blue-collar | married | secondary | no | 0 | yes | no | unknown | 5 | may | 226 | 1 | -1 | 0 | unknown | no |
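A table like the one above is what `DataFrame.head()` returns; it defaults to the first 5 rows. The sketch below uses a small stand-in frame built from three columns of the displayed rows rather than the full file.

```python
import pandas as pd

# Stand-in frame: three columns from the five rows displayed in this notebook.
df = pd.DataFrame({
    'age': [30, 33, 35, 30, 59],
    'job': ['unemployed', 'services', 'management', 'management', 'blue-collar'],
    'marital': ['married', 'married', 'single', 'married', 'married'],
})

first_rows = df.head()   # same as df.head(5)
first_two = df.head(2)   # pass n to change how many rows come back
print(first_rows)
```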
TASK: What is the average (mean) age of the people in the dataset?
41.17009511170095
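One way to get a mean like this is to call `.mean()` on the column. The sketch runs on the five ages displayed earlier, so its result differs from the full-dataset value above.

```python
import pandas as pd

# Stand-in frame: the ages of the five rows displayed in this notebook.
df = pd.DataFrame({'age': [30, 33, 35, 30, 59]})

mean_age = df['age'].mean()
print(mean_age)
```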
TASK: What is the marital status of the youngest person in the dataset?
503
'single'
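The two outputs above are consistent with a two-step lookup: find the index label of the minimum age, then read the marital status at that label. This is a sketch of that approach on the five displayed rows, not necessarily the cell that produced the output.

```python
import pandas as pd

# Stand-in frame: two columns from the five rows displayed in this notebook.
df = pd.DataFrame({
    'age': [30, 33, 35, 30, 59],
    'marital': ['married', 'married', 'single', 'married', 'married'],
})

# idxmin() returns the index label of the smallest value
# (the first one if there is a tie).
youngest_idx = df['age'].idxmin()
youngest_marital = df.loc[youngest_idx, 'marital']
print(youngest_idx, youngest_marital)
```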
TASK: How many unique job categories are there?
12
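A count of distinct categories can be obtained with `.nunique()`. The sketch uses the job values from the five displayed rows, so it reports fewer categories than the full dataset.

```python
import pandas as pd

# Stand-in frame: the job column of the five rows displayed in this notebook.
df = pd.DataFrame({
    'job': ['unemployed', 'services', 'management', 'management', 'blue-collar'],
})

n_jobs = df['job'].nunique()
print(n_jobs)
```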
TASK: How many people are there per job category? (Take a peek at the expected output)
management 969
blue-collar 946
technician 768
admin. 478
services 417
retired 230
self-employed 183
entrepreneur 168
unemployed 128
housemaid 112
student 84
unknown 38
Name: job, dtype: int64
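Output shaped like this comes from `.value_counts()`, which counts rows per category and sorts by count in descending order. A sketch on the job values of the five displayed rows:

```python
import pandas as pd

# Stand-in frame: the job column of the five rows displayed in this notebook.
df = pd.DataFrame({
    'job': ['unemployed', 'services', 'management', 'management', 'blue-collar'],
})

job_counts = df['job'].value_counts()
print(job_counts)
```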
TASK: What percent of people in the dataset were married?
# Many, many ways to do this one! Here is just one way:
100*df['marital'].value_counts()['married']/len(df)
# df['marital'].value_counts()
61.86684361866843
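As an alternative sketch, the same percentage can be computed from a boolean mask, since the mean of a True/False column is the fraction of True values. It runs on the marital values of the five displayed rows, so the number differs from the full-dataset result.

```python
import pandas as pd

# Stand-in frame: the marital column of the five rows displayed in this notebook.
df = pd.DataFrame({
    'marital': ['married', 'married', 'single', 'married', 'married'],
})

# The comparison yields booleans; their mean is the share of married people.
percent_married = 100 * (df['marital'] == 'married').mean()
print(percent_married)
```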
TASK: There is a column labeled “default”. Use pandas’ .map() method to create a new column called “default code” which contains a 0 if there was no default, or a 1 if there was a default. Then show the head of the dataframe with this new column.
| | age | job | marital | education | default | balance | housing | loan | contact | day | month | duration | campaign | pdays | previous | poutcome | y | default code |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 30 | unemployed | married | primary | no | 1787 | no | no | cellular | 19 | oct | 79 | 1 | -1 | 0 | unknown | no | 0 |
| 1 | 33 | services | married | secondary | no | 4789 | yes | yes | cellular | 11 | may | 220 | 1 | 339 | 4 | failure | no | 0 |
| 2 | 35 | management | single | tertiary | no | 1350 | yes | no | cellular | 16 | apr | 185 | 1 | 330 | 1 | failure | no | 0 |
| 3 | 30 | management | married | tertiary | no | 1476 | yes | yes | unknown | 3 | jun | 199 | 4 | -1 | 0 | unknown | no | 0 |
| 4 | 59 | blue-collar | married | secondary | no | 0 | yes | no | unknown | 5 | may | 226 | 1 | -1 | 0 | unknown | no | 0 |
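A sketch of the `.map()` step with a dictionary. The three input values are made up for illustration so that both codes appear; it assumes the real column holds only "yes" and "no". Any value missing from the dictionary would map to NaN.

```python
import pandas as pd

# Made-up values for illustration; the real column comes from bank.csv.
df = pd.DataFrame({'default': ['no', 'yes', 'no']})

# Each value is looked up in the dictionary and replaced by its code.
df['default code'] = df['default'].map({'no': 0, 'yes': 1})
print(df.head())
```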
TASK: Using the pandas .apply() method, create a new column called “marital code”. This column will contain only a shortened code: the first letter of the marital status (for example, “m” for “married”, “s” for “single”, etc.). See if you can do this with a lambda expression. Lots of ways to do this one!
| | age | job | marital | education | default | balance | housing | loan | contact | day | month | duration | campaign | pdays | previous | poutcome | y | default code | marital code |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 30 | unemployed | married | primary | no | 1787 | no | no | cellular | 19 | oct | 79 | 1 | -1 | 0 | unknown | no | 0 | m |
| 1 | 33 | services | married | secondary | no | 4789 | yes | yes | cellular | 11 | may | 220 | 1 | 339 | 4 | failure | no | 0 | m |
| 2 | 35 | management | single | tertiary | no | 1350 | yes | no | cellular | 16 | apr | 185 | 1 | 330 | 1 | failure | no | 0 | s |
| 3 | 30 | management | married | tertiary | no | 1476 | yes | yes | unknown | 3 | jun | 199 | 4 | -1 | 0 | unknown | no | 0 | m |
| 4 | 59 | blue-collar | married | secondary | no | 0 | yes | no | unknown | 5 | may | 226 | 1 | -1 | 0 | unknown | no | 0 | m |
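One possible lambda-based version, sketched on the marital values of the five displayed rows. It assumes every entry is a non-empty string; `df['marital'].str[0]` would be an equivalent vectorized alternative.

```python
import pandas as pd

# Stand-in frame: the marital column of the five rows displayed in this notebook.
df = pd.DataFrame({
    'marital': ['married', 'married', 'single', 'married', 'married'],
})

# The lambda receives one status string at a time and keeps its first letter.
df['marital code'] = df['marital'].apply(lambda status: status[0])
print(df.head())
```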
TASK: What was the longest lasting duration?
3025
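The longest duration is the column maximum, which `.max()` returns directly. A sketch on the durations of the five displayed rows:

```python
import pandas as pd

# Stand-in frame: the duration column of the five rows displayed in this notebook.
df = pd.DataFrame({'duration': [79, 220, 185, 199, 226]})

longest = df['duration'].max()
print(longest)
```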
TASK: What is the most common education level for people who are unemployed?
secondary 68
tertiary 32
primary 26
unknown 2
Name: education, dtype: int64
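Counts like these can be produced by filtering to the unemployed rows first and then calling `.value_counts()` on the education column. In this sketch, built from the five displayed rows, only one person is unemployed, so the result is a single category.

```python
import pandas as pd

# Stand-in frame: two columns from the five rows displayed in this notebook.
df = pd.DataFrame({
    'job': ['unemployed', 'services', 'management', 'management', 'blue-collar'],
    'education': ['primary', 'secondary', 'tertiary', 'tertiary', 'secondary'],
})

# A boolean mask keeps only the rows where job equals 'unemployed'.
unemployed = df[df['job'] == 'unemployed']
education_counts = unemployed['education'].value_counts()

# idxmax() on the counts gives the label of the most common level.
most_common = education_counts.idxmax()
print(education_counts)
print(most_common)
```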
TASK: What is the average (mean) age of people who are unemployed?
40.90625
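The same filter can feed a mean: select the unemployed rows, then average their ages. A sketch on the five displayed rows, where the single unemployed person is 30:

```python
import pandas as pd

# Stand-in frame: two columns from the five rows displayed in this notebook.
df = pd.DataFrame({
    'age': [30, 33, 35, 30, 59],
    'job': ['unemployed', 'services', 'management', 'management', 'blue-collar'],
})

mean_unemployed_age = df[df['job'] == 'unemployed']['age'].mean()
print(mean_unemployed_age)
```

A `groupby('job')['age'].mean()` call would give the mean age for every job category at once, if you need more than one.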
Great job! That's the end of this part.