This example shows how to use the appc package to add air pollution
exposure estimates for exact locations and time periods defined by
geocoded coordinates and a “key” date. For this example workflow, we
will simulate 20 random locations in Wayne County, Michigan and dates of
birth during 2023, but in practice this can be any set of geocoded
lat and lon columns with corresponding
dates.
library(dplyr)
library(appc)

d <-
  tigris::counties("MI", year = 2021, progress_bar = FALSE) |>
  suppressWarnings() |>
  # GEOID 26163 is Wayne County, Michigan
  filter(GEOID == "26163") |>
  sf::st_sample(20) |>
  sf::st_coordinates() |>
  tibble::as_tibble() |>
  rename(lat = Y, lon = X) |>
  # draw 20 distinct dates of birth from 2023
  mutate(dob = sample(seq(as.Date("2023-01-01"), as.Date("2023-12-31"), by = 1), size = 20))
d
#> # A tibble: 20 × 3
#> lon lat dob
#> <dbl> <dbl> <date>
#> 1 -83.5 42.3 2023-10-31
#> 2 -82.9 42.4 2023-11-27
#> 3 -83.4 42.1 2023-10-18
#> 4 -83.2 42.2 2023-05-10
#> 5 -83.2 42.3 2023-12-09
#> 6 -83.3 42.3 2023-05-11
#> 7 -82.9 42.4 2023-10-21
#> 8 -83.3 42.2 2023-08-22
#> 9 -83.2 42.2 2023-03-06
#> 10 -83.4 42.2 2023-08-19
#> 11 -83.2 42.1 2023-05-30
#> 12 -83.5 42.3 2023-05-27
#> 13 -83.2 42.2 2023-09-15
#> 14 -83.3 42.1 2023-11-16
#> 15 -83.4 42.2 2023-02-09
#> 16 -83.5 42.3 2023-07-12
#> 17 -83.5 42.4 2023-06-05
#> 18 -83.4 42.3 2023-01-12
#> 19 -83.3 42.3 2023-10-24
#> 20 -83.2 42.2 2023-02-11

For this example, we want to estimate the average fine particulate matter (PM2.5) from the date of birth until 90 days after birth. Using the dob column as the start date, we will define an end date in our example data and also define the s2 cell identifiers based on the latitude and longitude coordinates.
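
The exposure window runs from dob through dob + 90 with both endpoints included, so it covers 91 calendar days rather than 90. A minimal base R sketch of that arithmetic, using the first simulated date of birth from the table above (the variable name window_days is only illustrative):

```r
# First simulated date of birth from the example data
dob <- as.Date("2023-10-31")

# Adding an integer to a Date shifts it by that many days
dob_plus_90 <- dob + 90

# Every calendar day in the window, start and end dates included
window_days <- seq(dob, dob_plus_90, by = "day")

dob_plus_90
#> [1] "2024-01-29"
length(window_days)
#> [1] 91
```

Because both endpoints count, a 90-day offset always yields 91 daily values per location.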
d <- d |>
  mutate(
    dob_plus_90 = dob + 90,
    s2 = s2::as_s2_cell(s2::s2_geog_point(lon, lat))
  )
head(d)
#> # A tibble: 6 × 5
#> lon lat dob dob_plus_90 s2
#> <dbl> <dbl> <date> <date> <s2cell>
#> 1 -83.5 42.3 2023-10-31 2024-01-29 883b539fde6f468f
#> 2 -82.9 42.4 2023-11-27 2024-02-25 882529be4afb184b
#> 3 -83.4 42.1 2023-10-18 2024-01-16 883b5cadcae041d3
#> 4 -83.2 42.2 2023-05-10 2023-08-08 883b36dae69ea671
#> 5 -83.2 42.3 2023-12-09 2024-03-08 883b339ae99f458f
#> 6 -83.3 42.3 2023-05-11 2023-08-09 883b4b9750a4f19d

Next, use the newly created columns to call the
predict_pm25_date_range() function, which is a simpler
implementation of predict_pm25() that takes vectors of start and end dates
instead of lists of date vectors for each s2 cell location:
dplyr::mutate(d, pm25 = predict_pm25_date_range(s2, dob, dob_plus_90))
#> ℹ (down)loading random forest model
#> loaded rf_pm_v1 in 13s
#> ✔ (down)loading random forest model [13.2s]
#>
#> ℹ checking that s2 are within the contiguous US
#> ✔ checking that s2 are within the contiguous US [59ms]
#>
#> ℹ adding coordinates
#> ✔ adding coordinates [26ms]
#>
#> ℹ adding elevation
#> ✔ adding elevation [1.9s]
#>
#> ℹ adding HMS smoke data
#> intersecting smoke data ■■■■■■■ 20% | ETA: 4s
#> intersecting smoke data ■■■■■■■■■■■■■■■■■■■ 60% | ETA: 3s
#> ✔ adding HMS smoke data [7.9s]
#>
#> ℹ adding NARR
#> ✔ adding NARR [958ms]
#>
#> ℹ adding gridMET
#> ✔ adding gridMET [895ms]
#>
#> ℹ adding MERRA
#> ✔ adding MERRA [20.6s]
#>
#> ℹ adding time components
#> ✔ adding time components [115ms]
#> # A tibble: 20 × 6
#> lon lat dob dob_plus_90 s2 pm25
#> <dbl> <dbl> <date> <date> <s2cell> <list>
#> 1 -83.5 42.3 2023-10-31 2024-01-29 883b539fde6f468f <tibble [91 × 2]>
#> 2 -82.9 42.4 2023-11-27 2024-02-25 882529be4afb184b <tibble [91 × 2]>
#> 3 -83.4 42.1 2023-10-18 2024-01-16 883b5cadcae041d3 <tibble [91 × 2]>
#> 4 -83.2 42.2 2023-05-10 2023-08-08 883b36dae69ea671 <tibble [91 × 2]>
#> 5 -83.2 42.3 2023-12-09 2024-03-08 883b339ae99f458f <tibble [91 × 2]>
#> 6 -83.3 42.3 2023-05-11 2023-08-09 883b4b9750a4f19d <tibble [91 × 2]>
#> 7 -82.9 42.4 2023-10-21 2024-01-19 8824d53e3f2addd3 <tibble [91 × 2]>
#> 8 -83.3 42.2 2023-08-22 2023-11-20 883b462c96efefa9 <tibble [91 × 2]>
#> 9 -83.2 42.2 2023-03-06 2023-06-04 883b37c7f61125bf <tibble [91 × 2]>
#> 10 -83.4 42.2 2023-08-19 2023-11-17 883b44e161d94b5d <tibble [91 × 2]>
#> 11 -83.2 42.1 2023-05-30 2023-08-28 883b3f38bc72bc79 <tibble [91 × 2]>
#> 12 -83.5 42.3 2023-05-27 2023-08-25 883b5129692a69f7 <tibble [91 × 2]>
#> 13 -83.2 42.2 2023-09-15 2023-12-14 883b362d7f446e2d <tibble [91 × 2]>
#> 14 -83.3 42.1 2023-11-16 2024-02-14 883b41493a95f44f <tibble [91 × 2]>
#> 15 -83.4 42.2 2023-02-09 2023-05-10 883b451132452249 <tibble [91 × 2]>
#> 16 -83.5 42.3 2023-07-12 2023-10-10 883b56ca718e3a5b <tibble [91 × 2]>
#> 17 -83.5 42.4 2023-06-05 2023-09-03 8824abeaf2f4640b <tibble [91 × 2]>
#> 18 -83.4 42.3 2023-01-12 2023-04-12 883b4ef3cd3ded53 <tibble [91 × 2]>
#> 19 -83.3 42.3 2023-10-24 2024-01-22 883b496b4a4bf401 <tibble [91 × 2]>
#> 20 -83.2 42.2 2023-02-11 2023-05-12 883b3733a728db21 <tibble [91 × 2]>

If we are interested in average exposures (and their uncertainties) from the start date to the end date, rather than daily exposures, we can extract those directly:
d_pm25 <- with(d, predict_pm25_date_range(s2, dob, dob_plus_90, average = TRUE))
#> ℹ (down)loading random forest model
#> loaded rf_pm_v1 in 0s
#> ✔ (down)loading random forest model [21ms]
#>
#> ℹ checking that s2 are within the contiguous US
#> ✔ checking that s2 are within the contiguous US [81ms]
#>
#> ℹ adding coordinates
#> ✔ adding coordinates [66ms]
#>
#> ℹ adding elevation
#> ✔ adding elevation [632ms]
#>
#> ℹ adding HMS smoke data
#> intersecting smoke data ■■■■■ 15% | ETA: 12s
#> intersecting smoke data ■■■■■■■■■■ 30% | ETA: 11s
#> intersecting smoke data ■■■■■■■■■■■■■■■■ 50% | ETA: 8s
#> intersecting smoke data ■■■■■■■■■■■■■■■■■■■ 60% | ETA: 7s
#> intersecting smoke data ■■■■■■■■■■■■■■■■■■■■■■■ 75% | ETA: 5s
#> intersecting smoke data ■■■■■■■■■■■■■■■■■■■■■■■■■■■ 85% | ETA: 3s
#> ✔ adding HMS smoke data [24s]
#>
#> ℹ adding NARR
#> ✔ adding NARR [2s]
#>
#> ℹ adding gridMET
#> ✔ adding gridMET [2.1s]
#>
#> ℹ adding MERRA
#> ✔ adding MERRA [3.6s]
#>
#> ℹ adding time components
#> ✔ adding time components [22ms]
bind_cols(d, bind_rows(d_pm25))
#> # A tibble: 20 × 7
#> lon lat dob dob_plus_90 s2 pm25 pm25_se
#> <dbl> <dbl> <date> <date> <s2cell> <dbl> <dbl>
#> 1 -83.5 42.3 2023-10-31 2024-01-29 883b539fde6f468f 9.05 0.165
#> 2 -82.9 42.4 2023-11-27 2024-02-25 882529be4afb184b 9.06 0.156
#> 3 -83.4 42.1 2023-10-18 2024-01-16 883b5cadcae041d3 8.23 0.146
#> 4 -83.2 42.2 2023-05-10 2023-08-08 883b36dae69ea671 15.0 0.369
#> 5 -83.2 42.3 2023-12-09 2024-03-08 883b339ae99f458f 9.23 0.177
#> 6 -83.3 42.3 2023-05-11 2023-08-09 883b4b9750a4f19d 14.5 0.357
#> 7 -82.9 42.4 2023-10-21 2024-01-19 8824d53e3f2addd3 8.48 0.159
#> 8 -83.3 42.2 2023-08-22 2023-11-20 883b462c96efefa9 7.04 0.107
#> 9 -83.2 42.2 2023-03-06 2023-06-04 883b37c7f61125bf 8.15 0.123
#> 10 -83.4 42.2 2023-08-19 2023-11-17 883b44e161d94b5d 7.13 0.116
#> 11 -83.2 42.1 2023-05-30 2023-08-28 883b3f38bc72bc79 14.8 0.448
#> 12 -83.5 42.3 2023-05-27 2023-08-25 883b5129692a69f7 14.5 0.327
#> 13 -83.2 42.2 2023-09-15 2023-12-14 883b362d7f446e2d 7.17 0.121
#> 14 -83.3 42.1 2023-11-16 2024-02-14 883b41493a95f44f 9.01 0.158
#> 15 -83.4 42.2 2023-02-09 2023-05-10 883b451132452249 7.26 0.113
#> 16 -83.5 42.3 2023-07-12 2023-10-10 883b56ca718e3a5b 9.62 0.134
#> 17 -83.5 42.4 2023-06-05 2023-09-03 8824abeaf2f4640b 14.0 0.386
#> 18 -83.4 42.3 2023-01-12 2023-04-12 883b4ef3cd3ded53 7.63 0.128
#> 19 -83.3 42.3 2023-10-24 2024-01-22 883b496b4a4bf401 8.58 0.160
#> 20 -83.2 42.2 2023-02-11 2023-05-12 883b3733a728db21 7.68 0.134
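
One way to use the pm25_se column is to turn each average into an approximate interval. The sketch below assumes that pm25_se can be treated as a standard error with roughly normal error, which is an assumption made here for illustration and not something the package output itself states. It uses the first three rows of the table above, and the column names pm25_lower and pm25_upper are hypothetical.

```r
# First three rows of the averaged output shown above
avg <- data.frame(
  pm25 = c(9.05, 9.06, 8.23),
  pm25_se = c(0.165, 0.156, 0.146)
)

# Assumption: normal approximation, so a 95% interval is estimate +/- z * se
z <- qnorm(0.975)

avg$pm25_lower <- avg$pm25 - z * avg$pm25_se
avg$pm25_upper <- avg$pm25 + z * avg$pm25_se

round(avg, 2)
```

The same two lines can be applied to the full data frame produced by the bind_cols() call, since they only rely on the pm25 and pm25_se columns.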