This example details how to use the appc package to add air pollution
exposure estimates for exact locations and time periods defined by
geocoded coordinates and a “key” date. For this example workflow, we
will simulate 20 random locations in Wayne County, Michigan and dates of
birth during 2022, but in actuality this can be any set of geocoded
lat and lon columns with corresponding
dates.
d <-
tigris::counties("MI", year = 2021, progress_bar = FALSE) |>
suppressWarnings() |>
filter(GEOID == 26163) |>
sf::st_sample(20) |>
sf::st_coordinates() |>
tibble::as_tibble() |>
rename(lat = Y, lon = X) |>
mutate(dob = sample(seq(as.Date("2023-01-01"), as.Date("2023-12-31"), by = 1), size = 20))
d
#> # A tibble: 20 × 3
#> lon lat dob
#> <dbl> <dbl> <date>
#> 1 -83.5 42.3 2023-10-31
#> 2 -82.9 42.4 2023-11-27
#> 3 -83.4 42.1 2023-10-18
#> 4 -83.2 42.2 2023-05-10
#> 5 -83.2 42.3 2023-12-09
#> 6 -83.3 42.3 2023-05-11
#> 7 -82.9 42.4 2023-10-21
#> 8 -83.3 42.2 2023-08-22
#> 9 -83.2 42.2 2023-03-06
#> 10 -83.4 42.2 2023-08-19
#> 11 -83.2 42.1 2023-05-30
#> 12 -83.5 42.3 2023-05-27
#> 13 -83.2 42.2 2023-09-15
#> 14 -83.3 42.1 2023-11-16
#> 15 -83.4 42.2 2023-02-09
#> 16 -83.5 42.3 2023-07-12
#> 17 -83.5 42.4 2023-06-05
#> 18 -83.4 42.3 2023-01-12
#> 19 -83.3 42.3 2023-10-24
#> 20 -83.2 42.2 2023-02-11For this example, we want to estimate the average fine particulate matter from 90 days prior to birth until the date of birth. We define these dates and create a list-col of dates for each location in our example data:
d <- d |>
mutate(
start_date = dob - 90,
end_date = dob
) |>
rowwise() |>
mutate(dates = list(seq(start_date, end_date, by = 1))) |>
ungroup()
d
#> # A tibble: 20 × 6
#> lon lat dob start_date end_date dates
#> <dbl> <dbl> <date> <date> <date> <list>
#> 1 -83.5 42.3 2023-10-31 2023-08-02 2023-10-31 <date [91]>
#> 2 -82.9 42.4 2023-11-27 2023-08-29 2023-11-27 <date [91]>
#> 3 -83.4 42.1 2023-10-18 2023-07-20 2023-10-18 <date [91]>
#> 4 -83.2 42.2 2023-05-10 2023-02-09 2023-05-10 <date [91]>
#> 5 -83.2 42.3 2023-12-09 2023-09-10 2023-12-09 <date [91]>
#> 6 -83.3 42.3 2023-05-11 2023-02-10 2023-05-11 <date [91]>
#> 7 -82.9 42.4 2023-10-21 2023-07-23 2023-10-21 <date [91]>
#> 8 -83.3 42.2 2023-08-22 2023-05-24 2023-08-22 <date [91]>
#> 9 -83.2 42.2 2023-03-06 2022-12-06 2023-03-06 <date [91]>
#> 10 -83.4 42.2 2023-08-19 2023-05-21 2023-08-19 <date [91]>
#> 11 -83.2 42.1 2023-05-30 2023-03-01 2023-05-30 <date [91]>
#> 12 -83.5 42.3 2023-05-27 2023-02-26 2023-05-27 <date [91]>
#> 13 -83.2 42.2 2023-09-15 2023-06-17 2023-09-15 <date [91]>
#> 14 -83.3 42.1 2023-11-16 2023-08-18 2023-11-16 <date [91]>
#> 15 -83.4 42.2 2023-02-09 2022-11-11 2023-02-09 <date [91]>
#> 16 -83.5 42.3 2023-07-12 2023-04-13 2023-07-12 <date [91]>
#> 17 -83.5 42.4 2023-06-05 2023-03-07 2023-06-05 <date [91]>
#> 18 -83.4 42.3 2023-01-12 2022-10-14 2023-01-12 <date [91]>
#> 19 -83.3 42.3 2023-10-24 2023-07-26 2023-10-24 <date [91]>
#> 20 -83.2 42.2 2023-02-11 2022-11-13 2023-02-11 <date [91]>Next, we will use the lon and lat columns
to create the s2 geohash:
d <- d |> dplyr::mutate(s2 = s2::as_s2_cell(s2::s2_geog_point(lon, lat)))
d
#> # A tibble: 20 × 7
#> lon lat dob start_date end_date dates s2
#> <dbl> <dbl> <date> <date> <date> <list> <s2cell>
#> 1 -83.5 42.3 2023-10-31 2023-08-02 2023-10-31 <date [91]> 883b539fde6f468f
#> 2 -82.9 42.4 2023-11-27 2023-08-29 2023-11-27 <date [91]> 882529be4afb184b
#> 3 -83.4 42.1 2023-10-18 2023-07-20 2023-10-18 <date [91]> 883b5cadcae041d3
#> 4 -83.2 42.2 2023-05-10 2023-02-09 2023-05-10 <date [91]> 883b36dae69ea671
#> 5 -83.2 42.3 2023-12-09 2023-09-10 2023-12-09 <date [91]> 883b339ae99f458f
#> 6 -83.3 42.3 2023-05-11 2023-02-10 2023-05-11 <date [91]> 883b4b9750a4f19d
#> 7 -82.9 42.4 2023-10-21 2023-07-23 2023-10-21 <date [91]> 8824d53e3f2addd3
#> 8 -83.3 42.2 2023-08-22 2023-05-24 2023-08-22 <date [91]> 883b462c96efefa9
#> 9 -83.2 42.2 2023-03-06 2022-12-06 2023-03-06 <date [91]> 883b37c7f61125bf
#> 10 -83.4 42.2 2023-08-19 2023-05-21 2023-08-19 <date [91]> 883b44e161d94b5d
#> 11 -83.2 42.1 2023-05-30 2023-03-01 2023-05-30 <date [91]> 883b3f38bc72bc79
#> 12 -83.5 42.3 2023-05-27 2023-02-26 2023-05-27 <date [91]> 883b5129692a69f7
#> 13 -83.2 42.2 2023-09-15 2023-06-17 2023-09-15 <date [91]> 883b362d7f446e2d
#> 14 -83.3 42.1 2023-11-16 2023-08-18 2023-11-16 <date [91]> 883b41493a95f44f
#> 15 -83.4 42.2 2023-02-09 2022-11-11 2023-02-09 <date [91]> 883b451132452249
#> 16 -83.5 42.3 2023-07-12 2023-04-13 2023-07-12 <date [91]> 883b56ca718e3a5b
#> 17 -83.5 42.4 2023-06-05 2023-03-07 2023-06-05 <date [91]> 8824abeaf2f4640b
#> 18 -83.4 42.3 2023-01-12 2022-10-14 2023-01-12 <date [91]> 883b4ef3cd3ded53
#> 19 -83.3 42.3 2023-10-24 2023-07-26 2023-10-24 <date [91]> 883b496b4a4bf401
#> 20 -83.2 42.2 2023-02-11 2022-11-13 2023-02-11 <date [91]> 883b3733a728db21Directly use the s2 and dates columns to
call the predict_pm25() function:
d <- d |> dplyr::mutate(pm25 = predict_pm25(s2, dates))
#> ℹ (down)loading random forest model
#> loaded rf_pm_v0 in 9s
#> ✔ (down)loading random forest model [9.7s]
#>
#> ℹ checking that s2 are within the contiguous US
#> ✔ checking that s2 are within the contiguous US [54ms]
#>
#> ℹ adding coordinates
#> ✔ adding coordinates [26ms]
#>
#> ℹ adding elevation
#> ✔ adding elevation [2.2s]
#>
#> ℹ adding HMS smoke data
#> intersecting smoke data ■■■■■■■■■■ 30% | ETA: 4s
#> intersecting smoke data ■■■■■■■■■■■■■■■■■■■■■■■■■■■ 85% | ETA: 1s
#> ℹ adding HMS smoke data✔ adding HMS smoke data [13.6s]
#>
#> ℹ adding NARR
#> ✔ adding NARR [931ms]
#>
#> ℹ adding gridMET
#> ✔ adding gridMET [1.1s]
#>
#> ℹ adding MERRA
#> ✔ adding MERRA [1.7s]
#>
#> ℹ adding time components
#> ✔ adding time components [30ms]
d
#> # A tibble: 20 × 8
#> lon lat dob start_date end_date dates s2 pm25
#> <dbl> <dbl> <date> <date> <date> <list> <s2cell> <list>
#> 1 -83.5 42.3 2023-10-31 2023-08-02 2023-10-31 <date [91]> 883b539fde… <tibble>
#> 2 -82.9 42.4 2023-11-27 2023-08-29 2023-11-27 <date [91]> 882529be4a… <tibble>
#> 3 -83.4 42.1 2023-10-18 2023-07-20 2023-10-18 <date [91]> 883b5cadca… <tibble>
#> 4 -83.2 42.2 2023-05-10 2023-02-09 2023-05-10 <date [91]> 883b36dae6… <tibble>
#> 5 -83.2 42.3 2023-12-09 2023-09-10 2023-12-09 <date [91]> 883b339ae9… <tibble>
#> 6 -83.3 42.3 2023-05-11 2023-02-10 2023-05-11 <date [91]> 883b4b9750… <tibble>
#> 7 -82.9 42.4 2023-10-21 2023-07-23 2023-10-21 <date [91]> 8824d53e3f… <tibble>
#> 8 -83.3 42.2 2023-08-22 2023-05-24 2023-08-22 <date [91]> 883b462c96… <tibble>
#> 9 -83.2 42.2 2023-03-06 2022-12-06 2023-03-06 <date [91]> 883b37c7f6… <tibble>
#> 10 -83.4 42.2 2023-08-19 2023-05-21 2023-08-19 <date [91]> 883b44e161… <tibble>
#> 11 -83.2 42.1 2023-05-30 2023-03-01 2023-05-30 <date [91]> 883b3f38bc… <tibble>
#> 12 -83.5 42.3 2023-05-27 2023-02-26 2023-05-27 <date [91]> 883b512969… <tibble>
#> 13 -83.2 42.2 2023-09-15 2023-06-17 2023-09-15 <date [91]> 883b362d7f… <tibble>
#> 14 -83.3 42.1 2023-11-16 2023-08-18 2023-11-16 <date [91]> 883b41493a… <tibble>
#> 15 -83.4 42.2 2023-02-09 2022-11-11 2023-02-09 <date [91]> 883b451132… <tibble>
#> 16 -83.5 42.3 2023-07-12 2023-04-13 2023-07-12 <date [91]> 883b56ca71… <tibble>
#> 17 -83.5 42.4 2023-06-05 2023-03-07 2023-06-05 <date [91]> 8824abeaf2… <tibble>
#> 18 -83.4 42.3 2023-01-12 2022-10-14 2023-01-12 <date [91]> 883b4ef3cd… <tibble>
#> 19 -83.3 42.3 2023-10-24 2023-07-26 2023-10-24 <date [91]> 883b496b4a… <tibble>
#> 20 -83.2 42.2 2023-02-11 2022-11-13 2023-02-11 <date [91]> 883b3733a7… <tibble>With daily exposures, we could average fine particulate matter throughout the study period:
d |>
mutate(mean_pm25 = purrr::map_dbl(pm25, \(.) mean(.$pm25)))
#> # A tibble: 20 × 9
#> lon lat dob start_date end_date dates s2 pm25 mean_pm25
#> <dbl> <dbl> <date> <date> <date> <list> <s2ce> <list> <dbl>
#> 1 -83.5 42.3 2023-10-31 2023-08-02 2023-10-31 <date> 883b5… <tibble> 8.25
#> 2 -82.9 42.4 2023-11-27 2023-08-29 2023-11-27 <date> 88252… <tibble> 7.20
#> 3 -83.4 42.1 2023-10-18 2023-07-20 2023-10-18 <date> 883b5… <tibble> 8.91
#> 4 -83.2 42.2 2023-05-10 2023-02-09 2023-05-10 <date> 883b3… <tibble> 7.64
#> 5 -83.2 42.3 2023-12-09 2023-09-10 2023-12-09 <date> 883b3… <tibble> 7.63
#> 6 -83.3 42.3 2023-05-11 2023-02-10 2023-05-11 <date> 883b4… <tibble> 7.04
#> 7 -82.9 42.4 2023-10-21 2023-07-23 2023-10-21 <date> 8824d… <tibble> 9.03
#> 8 -83.3 42.2 2023-08-22 2023-05-24 2023-08-22 <date> 883b4… <tibble> 14.9
#> 9 -83.2 42.2 2023-03-06 2022-12-06 2023-03-06 <date> 883b3… <tibble> 9.16
#> 10 -83.4 42.2 2023-08-19 2023-05-21 2023-08-19 <date> 883b4… <tibble> 14.8
#> 11 -83.2 42.1 2023-05-30 2023-03-01 2023-05-30 <date> 883b3… <tibble> 7.92
#> 12 -83.5 42.3 2023-05-27 2023-02-26 2023-05-27 <date> 883b5… <tibble> 7.80
#> 13 -83.2 42.2 2023-09-15 2023-06-17 2023-09-15 <date> 883b3… <tibble> 12.7
#> 14 -83.3 42.1 2023-11-16 2023-08-18 2023-11-16 <date> 883b4… <tibble> 7.32
#> 15 -83.4 42.2 2023-02-09 2022-11-11 2023-02-09 <date> 883b4… <tibble> 9.21
#> 16 -83.5 42.3 2023-07-12 2023-04-13 2023-07-12 <date> 883b5… <tibble> 12.7
#> 17 -83.5 42.4 2023-06-05 2023-03-07 2023-06-05 <date> 8824a… <tibble> 8.38
#> 18 -83.4 42.3 2023-01-12 2022-10-14 2023-01-12 <date> 883b4… <tibble> 9.00
#> 19 -83.3 42.3 2023-10-24 2023-07-26 2023-10-24 <date> 883b4… <tibble> 8.58
#> 20 -83.2 42.2 2023-02-11 2022-11-13 2023-02-11 <date> 883b3… <tibble> 9.38