For each address in an addr vector (x), string distances are calculated against the addresses in a reference addr vector (ref_addr). Reference addresses whose optimal string alignment distance is less than or equal to the specified threshold are returned as matches. See stringdist::stringdist-metrics for more details on string metrics and the optimal string alignment (osa) method.
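To make the optimal string alignment (osa) distance concrete, here is a minimal base R sketch of the metric. It is not the package's implementation, and `osa_distance()` is a hypothetical helper written only for illustration. In practice the distance would come from the stringdist package.

```r
# Optimal string alignment (restricted Damerau-Levenshtein) distance.
# osa_distance() is a hypothetical helper for illustration only.
osa_distance <- function(a, b) {
  a <- strsplit(a, "", fixed = TRUE)[[1]]
  b <- strsplit(b, "", fixed = TRUE)[[1]]
  n <- length(a)
  m <- length(b)
  # d[i + 1, j + 1] is the distance between the first i characters of a
  # and the first j characters of b
  d <- matrix(0L, nrow = n + 1L, ncol = m + 1L)
  d[, 1L] <- 0:n
  d[1L, ] <- 0:m
  for (i in seq_len(n)) {
    for (j in seq_len(m)) {
      cost <- if (a[i] == b[j]) 0L else 1L
      best <- min(
        d[i, j + 1L] + 1L, # deletion
        d[i + 1L, j] + 1L, # insertion
        d[i, j] + cost     # substitution (or match when cost is 0)
      )
      # swapping two adjacent characters counts as a single edit
      if (i > 1L && j > 1L && a[i] == b[j - 1L] && a[i - 1L] == b[j]) {
        best <- min(best, d[i - 1L, j - 1L] + 1L)
      }
      d[i + 1L, j + 1L] <- best
    }
  }
  d[n + 1L, m + 1L]
}

# Which candidate street names are within one edit of "burnet"?
street_names <- c("burnet", "brunet", "burnett", "barnard")
dists <- vapply(street_names, osa_distance, integer(1), b = "burnet")
close_matches <- names(dists)[dists <= 1]
```

Under this sketch, a transposed pair of letters ("brunet") or a single extra letter ("burnett") stays within a distance of 1, while a name needing several edits ("barnard") does not.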

Usage

addr_match(
  x,
  ref_addr,
  stringdist_match = c("osa_lt_1", "exact"),
  match_street_type = TRUE,
  simplify = TRUE
)

addr_match_street_name_and_number(
  x,
  ref_addr,
  stringdist_match = c("osa_lt_1", "exact"),
  match_street_type = TRUE,
  simplify = TRUE
)

addr_match_street(
  x,
  ref_addr,
  stringdist_match = c("osa_lt_1", "exact"),
  match_street_type = TRUE
)

Arguments

x

an addr vector to match

ref_addr

an addr vector to search for matches in

stringdist_match

method for determining string match of street name: "osa_lt_1" requires an optimal string alignment (osa) distance less than or equal to 1; "exact" requires an exact match

match_street_type

logical; require street type to be identical to match?

simplify

logical; randomly select one addr from multi-matches and return an addr() vector instead of a list? (empty addr vectors and NULL values are converted to NA)
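The effect of simplify can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's code: `simplify_matches()` is a hypothetical helper, plain character vectors stand in for addr vectors, and the addresses are made-up placeholders.

```r
# simplify_matches() is a hypothetical stand-in for what simplify = TRUE does;
# character vectors are used here in place of addr vectors.
simplify_matches <- function(matches) {
  vapply(
    matches,
    function(m) {
      # empty vectors and NULL values become NA
      if (is.null(m) || length(m) == 0L) {
        return(NA_character_)
      }
      # sample by position so a single match is returned unchanged
      m[sample.int(length(m), 1L)]
    },
    character(1)
  )
}

# Placeholder match list: one match, two matches, no match, and NULL
matches <- list(
  "input 1" = "10 Example Street",
  "input 2" = c("20 Example Avenue", "20 Example Road"),
  "input 3" = character(0),
  "input 4" = NULL
)
simplified <- simplify_matches(matches)
```

Because one match is chosen at random when several exist, repeated calls can return different results for the same input. Use simplify = FALSE to inspect every candidate.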

Value

for addr_match() and addr_match_street_name_and_number(), a named list of possible addr matches for each addr in x; if simplify = TRUE, an addr vector with one match (or NA) for each addr in x

for addr_match_street(), a list of possible addr matches for each addr in x (as ref_addr indices)

Examples

addr(c("3333 Burnet Ave Cincinnati OH 45229", "5130 RAPID RUN RD CINCINNATI OHIO 45238")) |>
  addr_match(cagis_addr()$cagis_addr)
#> <addr[2]>
#> [1] 3333 Burnet Avenue Cincinnati OH 45229     
#> [2] 5130 Rapid Run Road Delhi Township OH 45238

addr(c("3333 Burnet Ave Cincinnati OH 45229", "5130 RAPID RUN RD CINCINNATI OHIO 45238")) |>
  addr_match(cagis_addr()$cagis_addr, simplify = FALSE) |>
  tibble::enframe(name = "input_addr", value = "ca") |>
  dplyr::mutate(ca = purrr::list_c(ca)) |>
  dplyr::left_join(cagis_addr(), by = c("ca" = "cagis_addr")) |>
  tidyr::unnest(cols = c(cagis_addr_data)) |>
  dplyr::select(-ca, -cagis_address)
#> # A tibble: 2 × 6
#>   input_addr     cagis_address_place cagis_address_type cagis_s2 cagis_parcel_id
#>   <chr>          <chr>               <chr>              <s2cell> <chr>          
#> 1 3333 Burnet A… NA                  BLD                8841b39… 010400020052   
#> 2 5130 Rapid Ru… NA                  BLD                8841c9f… 054000510478   
#> # ℹ 1 more variable: cagis_is_condo <lgl>