age insurance provider
1 23 Aetna Dr. Zhang
2 47 BCBS Dr. Foyle
3 38 Medicaid Dr. Zhang
The nycflights13 package contains information on flights from NYC airports in 2013. The data is stored across several data frames:
airlines: information on each airline
airports: information on each airport
flights: information on each flight
planes: information on each plane
weather: hourly weather data
Question: What is the advantage of storing this data in multiple tables, instead of one BIG table?
patients
doctors
offices
insurance
I want to add location information to the patient table. What should the resulting table look like?
age insurance provider location
1 23 Aetna Dr. Zhang Winston-Salem
2 47 BCBS Dr. Foyle Greensboro
3 38 Medicaid Dr. Zhang Winston-Salem
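One way to produce a table like the one above is a left join in dplyr. The following is a minimal sketch, not necessarily the exact code used in class: the `doctors` lookup table and its `location` column are assumed here for illustration, and `patients_loc` is a hypothetical name for the result.

```r
library(dplyr)

# Patient table with the values shown above
patients <- tibble(
  age = c(23, 47, 38),
  insurance = c("Aetna", "BCBS", "Medicaid"),
  provider = c("Dr. Zhang", "Dr. Foyle", "Dr. Zhang")
)

# Hypothetical lookup table: one row per provider with an office location
# (this layout is an assumption made for the example)
doctors <- tibble(
  provider = c("Dr. Zhang", "Dr. Foyle"),
  location = c("Winston-Salem", "Greensboro")
)

# Keep every row of patients and add the matching location column
patients_loc <- patients |>
  left_join(doctors, join_by(provider))

patients_loc
```

Because `provider` appears in both tables, `join_by(provider)` is enough to link them; each patient row picks up the location of its provider.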
patients), and add more columns
join_by specifies how to link the tables
Flights information:
# A tibble: 3 × 5
time_hour origin dest tailnum carrier
<dttm> <chr> <chr> <chr> <chr>
1 2013-01-01 05:00:00 EWR IAH N14228 UA
2 2013-01-01 05:00:00 LGA IAH N24211 UA
3 2013-01-01 05:00:00 JFK MIA N619AA AA
Weather information:
# A tibble: 3 × 4
origin time_hour temp wind_speed
<chr> <dttm> <dbl> <dbl>
1 EWR 2013-01-01 01:00:00 39.0 10.4
2 EWR 2013-01-01 02:00:00 39.0 8.06
3 EWR 2013-01-01 03:00:00 39.0 11.5
Question: What if I want to get information about the weather for each flight?
# A tibble: 6 × 7
time_hour origin dest tailnum carrier temp wind_speed
<dttm> <chr> <chr> <chr> <chr> <dbl> <dbl>
1 2013-01-01 05:00:00 EWR IAH N14228 UA 39.0 12.7
2 2013-01-01 05:00:00 LGA IAH N24211 UA 39.9 15.0
3 2013-01-01 05:00:00 JFK MIA N619AA AA 39.0 15.0
4 2013-01-01 05:00:00 JFK BQN N804JB B6 39.0 15.0
5 2013-01-01 06:00:00 LGA ATL N668DN DL 39.9 16.1
6 2013-01-01 05:00:00 EWR ORD N39463 UA 39.0 12.7
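A table like the one above can be built by joining each flight to the weather recorded at its origin airport in the same hour. This is a sketch assuming dplyr 1.1.0 or later (for `join_by`) and the nycflights13 package; the intermediate names `flights_small`, `weather_small`, and `flights_weather` are hypothetical, and the column selection is only there to match the columns displayed.

```r
library(dplyr)
library(nycflights13)

# Keep only the columns displayed in the tables above
flights_small <- flights |>
  select(time_hour, origin, dest, tailnum, carrier)

weather_small <- weather |>
  select(origin, time_hour, temp, wind_speed)

# Keep every flight; add temp and wind_speed from the weather row
# with the same origin airport and the same hour
flights_weather <- flights_small |>
  left_join(weather_small, join_by(origin, time_hour))

head(flights_weather)
```

Two columns are needed to link the tables because weather is recorded per airport and per hour; joining on `origin` alone would match each flight to every hourly reading at that airport.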
Suppose our tables looked like this:
How would we specify the columns to link the tables?
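A common case is that the linking columns have different names in the two tables. The sketch below uses small made-up tables to show how `join_by` handles that; the table names `flights_alt` and `weather_alt` and the column name `departure_airport` are hypothetical and chosen only for illustration.

```r
library(dplyr)

# Hypothetical tables whose key columns have different names
flights_alt <- tibble(
  departure_airport = c("EWR", "LGA", "JFK"),
  dest = c("IAH", "IAH", "MIA")
)

weather_alt <- tibble(
  origin = c("EWR", "LGA", "JFK"),
  temp = c(39.0, 39.9, 39.0)
)

# join_by(x_column == y_column) pairs a column in the left table
# with a differently named column in the right table
joined_alt <- flights_alt |>
  left_join(weather_alt, join_by(departure_airport == origin))

joined_alt
```

The result keeps the key column name from the left table (`departure_airport`) and adds `temp` from the right table.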
Suppose our tables looked like this:
Patients in the system:
Suppose I want insurance information only for the patients who have an accepted insurance. What should the final table look like?
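One option for keeping only the rows that have a match in both tables is an inner join. The sketch below uses made-up data: the `insurance_info` table, its `copay` column, and the choice of which plans are accepted are all hypothetical and serve only to show the behavior.

```r
library(dplyr)

# Hypothetical patients table
patients <- tibble(
  age = c(23, 47, 38),
  insurance = c("Aetna", "BCBS", "Medicaid"),
  provider = c("Dr. Zhang", "Dr. Foyle", "Dr. Zhang")
)

# Hypothetical table of accepted plans (values invented for the example)
insurance_info <- tibble(
  insurance = c("Aetna", "BCBS"),
  copay = c(20, 30)
)

# inner_join keeps only patients whose insurance appears in insurance_info,
# and adds the insurance columns for those patients
accepted_patients <- patients |>
  inner_join(insurance_info, join_by(insurance))

accepted_patients
```

Unlike `left_join`, which would keep the unmatched patient with `NA` in the new column, `inner_join` drops that row entirely.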
Patients in the system:
https://sta279-f25.github.io/class_activities/ca_06.html
For next time, read: