[1]:
import transportation_tutorials as tt
import pandas as pd
import numpy as np
To answer the question, use the following data files:
[2]:
districts = pd.read_csv(tt.data('FL-COUNTY-BY-DISTRICT'))
districts.head()
[2]:
County | District | |
---|---|---|
0 | Charlotte | 1 |
1 | Collier | 1 |
2 | DeSoto | 1 |
3 | Glades | 1 |
4 | Hardee | 1 |
[3]:
bridges = pd.read_csv(tt.data('FL-BRIDGES'))
# Recall the necessary cleaning for the bridges data file
bridges = bridges.replace('-', 0)
bridges[['Poor #', 'SD #']] = bridges[['Poor #', 'SD #']].astype(int)
bridges.fillna(0, inplace=True)
bridges.head()
[3]:
County | Total # | Good # | Fair # | Poor # | SD # | Total Area | Good Area | Fair Area | Poor Area | SD Area | |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | ALACHUA (001) | 111 | 64 | 47 | 0 | 0 | 64767 | 55794 | 8973 | 0.0 | 0.0 |
1 | BAKER (003) | 89 | 30 | 52 | 7 | 8 | 32162 | 19369 | 12282 | 510.0 | 623.0 |
2 | BAY (005) | 122 | 49 | 63 | 10 | 11 | 210039 | 98834 | 109628 | 1577.0 | 10120.0 |
3 | BRADFORD (007) | 62 | 23 | 37 | 2 | 2 | 9330 | 5492 | 3217 | 620.0 | 620.0 |
4 | BREVARD (009) | 241 | 160 | 81 | 0 | 0 | 364138 | 204179 | 159959 | 0.0 | 0.0 |
We need to create a table that answers (1). The first step is to attach the district number for each row (County) in the bridges table. The County
field in the bridges table is not just the county name in upper case, but also a three digit code number, but the districts file only has plain names in title case. We’ll need to strip the code numbers from bridges
(six characters, including the 3 digits, two parentheses, and a space), and convert the names in districts
to all upper case,
so there is an exact match for merging.
[4]:
bridges['County'] = bridges['County'].str[:-6]
districts['County'] = districts['County'].str.upper()
[5]:
bridges_2 = pd.merge(
bridges,
districts[['County','District']],
on='County',
)
[6]:
bridges_2.head()
[6]:
County | Total # | Good # | Fair # | Poor # | SD # | Total Area | Good Area | Fair Area | Poor Area | SD Area | District | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | ALACHUA | 111 | 64 | 47 | 0 | 0 | 64767 | 55794 | 8973 | 0.0 | 0.0 | 2 |
1 | BAKER | 89 | 30 | 52 | 7 | 8 | 32162 | 19369 | 12282 | 510.0 | 623.0 | 2 |
2 | BAY | 122 | 49 | 63 | 10 | 11 | 210039 | 98834 | 109628 | 1577.0 | 10120.0 | 3 |
3 | BRADFORD | 62 | 23 | 37 | 2 | 2 | 9330 | 5492 | 3217 | 620.0 | 620.0 | 2 |
4 | BREVARD | 241 | 160 | 81 | 0 | 0 | 364138 | 204179 | 159959 | 0.0 | 0.0 | 5 |
The we need to use transform
to compute the share of total ‘SD Area’ across the district in each county.
[7]:
bridges_2['SD Area Share in District'] = bridges_2.groupby('District')[['SD Area']].transform(lambda x: (x / x.sum()))
[8]:
bridges_2
[8]:
County | Total # | Good # | Fair # | Poor # | SD # | Total Area | Good Area | Fair Area | Poor Area | SD Area | District | SD Area Share in District | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | ALACHUA | 111 | 64 | 47 | 0 | 0 | 64767 | 55794 | 8973 | 0.0 | 0.0 | 2 | 0.000000 |
1 | BAKER | 89 | 30 | 52 | 7 | 8 | 32162 | 19369 | 12282 | 510.0 | 623.0 | 2 | 0.010061 |
2 | BAY | 122 | 49 | 63 | 10 | 11 | 210039 | 98834 | 109628 | 1577.0 | 10120.0 | 3 | 0.071854 |
3 | BRADFORD | 62 | 23 | 37 | 2 | 2 | 9330 | 5492 | 3217 | 620.0 | 620.0 | 2 | 0.010012 |
4 | BREVARD | 241 | 160 | 81 | 0 | 0 | 364138 | 204179 | 159959 | 0.0 | 0.0 | 5 | 0.000000 |
5 | BROWARD | 689 | 535 | 150 | 4 | 7 | 1192081 | 952849 | 238309 | 923.0 | 1372.0 | 4 | 0.036054 |
6 | CALHOUN | 49 | 19 | 29 | 1 | 1 | 76300 | 50437 | 25863 | 0.0 | 0.0 | 3 | 0.000000 |
7 | CHARLOTTE | 207 | 172 | 35 | 0 | 1 | 250385 | 229102 | 21284 | 0.0 | 1511.0 | 1 | 0.105179 |
8 | CITRUS | 41 | 32 | 9 | 0 | 0 | 21903 | 19948 | 1955 | 0.0 | 0.0 | 7 | 0.000000 |
9 | CLAY | 77 | 29 | 45 | 3 | 3 | 68282 | 19034 | 48860 | 388.0 | 388.0 | 2 | 0.006266 |
10 | COLLIER | 356 | 281 | 73 | 2 | 2 | 190375 | 158006 | 32207 | 162.0 | 162.0 | 1 | 0.011277 |
11 | COLUMBIA | 113 | 60 | 49 | 4 | 4 | 51928 | 32910 | 18304 | 715.0 | 715.0 | 2 | 0.011546 |
12 | DIXIE | 55 | 39 | 11 | 5 | 5 | 19017 | 10921 | 7033 | 1063.0 | 1063.0 | 2 | 0.017166 |
13 | DUVAL | 763 | 479 | 262 | 22 | 22 | 1795247 | 932820 | 811836 | 50590.0 | 50590.0 | 2 | 0.816969 |
14 | ESCAMBIA | 243 | 116 | 111 | 16 | 18 | 585435 | 259541 | 233423 | 92471.0 | 92585.0 | 3 | 0.657368 |
15 | FLAGLER | 96 | 72 | 23 | 1 | 1 | 70184 | 65303 | 4881 | 0.0 | 0.0 | 5 | 0.000000 |
16 | FRANKLIN | 52 | 6 | 44 | 2 | 2 | 211234 | 979 | 209810 | 445.0 | 445.0 | 3 | 0.003160 |
17 | GADSDEN | 120 | 63 | 49 | 8 | 8 | 138044 | 57793 | 77427 | 2824.0 | 2824.0 | 3 | 0.020051 |
18 | GILCHRIST | 11 | 3 | 8 | 0 | 0 | 4052 | 670 | 3382 | 0.0 | 0.0 | 2 | 0.000000 |
19 | GLADES | 59 | 43 | 16 | 0 | 0 | 42294 | 35081 | 7213 | 0.0 | 0.0 | 1 | 0.000000 |
20 | GULF | 62 | 18 | 38 | 6 | 6 | 44915 | 35451 | 5690 | 3774.0 | 3774.0 | 3 | 0.026796 |
21 | HAMILTON | 61 | 30 | 31 | 0 | 0 | 34327 | 21766 | 12560 | 0.0 | 0.0 | 2 | 0.000000 |
22 | HARDEE | 91 | 64 | 26 | 1 | 1 | 26997 | 20293 | 6548 | 155.0 | 155.0 | 1 | 0.010789 |
23 | HENDRY | 81 | 63 | 18 | 0 | 0 | 18807 | 14827 | 3980 | 0.0 | 0.0 | 1 | 0.000000 |
24 | HERNANDO | 58 | 53 | 5 | 0 | 0 | 29667 | 27895 | 1772 | 0.0 | 0.0 | 7 | 0.000000 |
25 | HIGHLANDS | 63 | 42 | 20 | 1 | 2 | 19106 | 10743 | 8147 | 216.0 | 281.0 | 1 | 0.019560 |
26 | HILLSBOROUGH | 808 | 654 | 152 | 2 | 2 | 1361594 | 1048235 | 311399 | 1959.0 | 1959.0 | 7 | 0.260748 |
27 | HOLMES | 156 | 71 | 77 | 8 | 9 | 86481 | 53332 | 32662 | 487.0 | 566.0 | 3 | 0.004019 |
28 | INDIAN RIVER | 122 | 84 | 37 | 1 | 1 | 145658 | 82890 | 62621 | 147.0 | 147.0 | 4 | 0.003863 |
29 | JACKSON | 146 | 72 | 72 | 2 | 3 | 77392 | 47785 | 29398 | 209.0 | 241.0 | 3 | 0.001711 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
34 | LEON | 152 | 86 | 64 | 2 | 2 | 103547 | 63816 | 39518 | 212.0 | 212.0 | 3 | 0.001505 |
35 | LEVY | 75 | 31 | 36 | 8 | 8 | 18031 | 5843 | 10466 | 1722.0 | 1722.0 | 2 | 0.027808 |
36 | LIBERTY | 77 | 22 | 49 | 6 | 6 | 25453 | 6217 | 18611 | 625.0 | 625.0 | 3 | 0.004438 |
37 | MADISON | 59 | 31 | 27 | 1 | 1 | 39957 | 26896 | 13061 | 0.0 | 0.0 | 2 | 0.000000 |
38 | MANATEE | 199 | 133 | 65 | 1 | 1 | 269545 | 157364 | 103091 | 9090.0 | 9090.0 | 1 | 0.632744 |
39 | MARION | 69 | 55 | 13 | 1 | 1 | 73228 | 45865 | 27364 | 0.0 | 0.0 | 5 | 0.000000 |
40 | MARTIN | 128 | 95 | 33 | 0 | 0 | 295820 | 248523 | 47297 | 0.0 | 0.0 | 4 | 0.000000 |
41 | MIAMI-DADE | 957 | 804 | 146 | 7 | 13 | 1743024 | 1315449 | 404966 | 22609.0 | 33739.0 | 6 | 0.322624 |
42 | MONROE | 100 | 55 | 39 | 6 | 6 | 527717 | 198916 | 257963 | 70838.0 | 70838.0 | 6 | 0.677376 |
43 | NASSAU | 102 | 50 | 48 | 3 | 3 | 100719 | 53027 | 45884 | 644.0 | 644.0 | 2 | 0.010400 |
44 | OKALOOSA | 216 | 111 | 92 | 13 | 13 | 269312 | 77988 | 171330 | 19993.0 | 19993.0 | 3 | 0.141953 |
45 | OKEECHOBEE | 65 | 45 | 20 | 0 | 0 | 28011 | 22654 | 5357 | 0.0 | 0.0 | 1 | 0.000000 |
46 | ORANGE | 754 | 601 | 150 | 3 | 3 | 1031407 | 779028 | 251598 | 780.0 | 780.0 | 5 | 0.028880 |
47 | OSCEOLA | 236 | 167 | 69 | 0 | 0 | 203856 | 160646 | 43210 | 0.0 | 0.0 | 5 | 0.000000 |
48 | PALM BEACH | 604 | 516 | 81 | 7 | 7 | 805336 | 647923 | 120878 | 36535.0 | 36535.0 | 4 | 0.960083 |
49 | PASCO | 123 | 105 | 18 | 0 | 0 | 94623 | 88405 | 6218 | 0.0 | 0.0 | 7 | 0.000000 |
50 | PINELLAS | 414 | 314 | 97 | 3 | 3 | 1185162 | 989323 | 190285 | 5554.0 | 5554.0 | 7 | 0.739252 |
51 | POLK | 357 | 273 | 82 | 2 | 3 | 228351 | 207445 | 20013 | 893.0 | 1559.0 | 1 | 0.108520 |
52 | PUTNAM | 68 | 25 | 40 | 3 | 3 | 85997 | 16783 | 69008 | 206.0 | 206.0 | 2 | 0.003327 |
53 | SANTA ROSA | 190 | 97 | 89 | 4 | 5 | 211179 | 39776 | 167685 | 3719.0 | 3838.0 | 3 | 0.027250 |
54 | SARASOTA | 295 | 230 | 64 | 1 | 2 | 224007 | 181891 | 41825 | 290.0 | 1608.0 | 1 | 0.111931 |
55 | SEMINOLE | 143 | 126 | 17 | 0 | 0 | 225280 | 182245 | 43035 | 0.0 | 0.0 | 5 | 0.000000 |
56 | SUMTER | 76 | 59 | 17 | 0 | 1 | 95533 | 88154 | 7379 | 0.0 | 665.0 | 5 | 0.024622 |
57 | SUWANNEE | 37 | 23 | 11 | 3 | 3 | 28317 | 18950 | 7368 | 1999.0 | 1999.0 | 2 | 0.032282 |
58 | TAYLOR | 94 | 22 | 63 | 9 | 9 | 24629 | 10014 | 12520 | 2095.0 | 2095.0 | 2 | 0.033832 |
59 | UNION | 28 | 7 | 20 | 1 | 1 | 12519 | 5843 | 5698 | 978.0 | 978.0 | 2 | 0.015794 |
60 | VOLUSIA | 227 | 144 | 76 | 7 | 7 | 326665 | 164435 | 147618 | 14612.0 | 14612.0 | 5 | 0.541025 |
61 | WAKULLA | 50 | 21 | 23 | 6 | 6 | 17783 | 10011 | 7649 | 123.0 | 123.0 | 3 | 0.000873 |
62 | WALTON | 219 | 125 | 86 | 8 | 9 | 104108 | 69478 | 34056 | 575.0 | 627.0 | 3 | 0.004452 |
63 | WASHINGTON | 96 | 57 | 34 | 5 | 6 | 72116 | 46988 | 23224 | 1904.0 | 3220.0 | 3 | 0.022862 |
64 rows × 13 columns
We can use idxmax
to get the index of the highest value in this new column, and the answer to part (2).
[9]:
bridges_2.loc[bridges_2['SD Area Share in District'].idxmax()]
[9]:
County PALM BEACH
Total # 604
Good # 516
Fair # 81
Poor # 7
SD # 7
Total Area 805336
Good Area 647923
Fair Area 120878
Poor Area 36535
SD Area 36535
District 4
SD Area Share in District 0.960083
Name: 48, dtype: object
Palm Beach County only has a few structurally deficient bridges, but they are big ones, including two that span the intracostal waterway.