For an addr vector, string distances are calculated between each addr in x and the addrs in a reference addr vector (ref_addr).
A list of the matching reference addrs, those with optimal string alignment distances less than or equal to the specified threshold, is returned.
See stringdist::stringdist-metrics for more details on string metrics and the optimal string alignment (osa) method.
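To make the optimal string alignment (osa) distance concrete, here is a minimal base R sketch of the metric. The helper osa_distance() is hypothetical and written only for illustration; it is not part of addr or stringdist. In practice, stringdist::stringdist(a, b, method = "osa") computes the same quantity.

```r
# Illustrative helper (an assumption, not part of addr or stringdist).
# OSA distance: the minimum number of insertions, deletions, substitutions,
# and transpositions of adjacent characters, where no substring is edited
# more than once.
osa_distance <- function(a, b) {
  x <- strsplit(a, "", fixed = TRUE)[[1]]
  y <- strsplit(b, "", fixed = TRUE)[[1]]
  n <- length(x)
  m <- length(y)
  # d[i + 1, j + 1] is the distance between the first i characters of a
  # and the first j characters of b
  d <- matrix(0L, nrow = n + 1, ncol = m + 1)
  d[, 1] <- 0:n
  d[1, ] <- 0:m
  for (i in seq_len(n)) {
    for (j in seq_len(m)) {
      cost <- as.integer(x[i] != y[j])
      d[i + 1, j + 1] <- min(
        d[i, j + 1] + 1L, # deletion
        d[i + 1, j] + 1L, # insertion
        d[i, j] + cost    # substitution (or match when cost is 0)
      )
      # transposition of two adjacent characters
      if (i > 1 && j > 1 && x[i] == y[j - 1] && x[i - 1] == y[j]) {
        d[i + 1, j + 1] <- min(d[i + 1, j + 1], d[i - 1, j - 1] + 1L)
      }
    }
  }
  d[n + 1, m + 1]
}

osa_distance("burnet", "burnet")  # identical strings
osa_distance("burnet", "burnett") # one insertion
osa_distance("rapid", "rpaid")    # one adjacent transposition
```

A single misspelling or swapped pair of letters in a street name therefore counts as a distance of 1, while identical names have a distance of 0.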
Usage
addr_match(
  x,
  ref_addr,
  stringdist_match = c("osa_lt_1", "exact"),
  match_street_type = TRUE,
  simplify = TRUE
)

addr_match_street_name_and_number(
  x,
  ref_addr,
  stringdist_match = c("osa_lt_1", "exact"),
  match_street_type = TRUE,
  simplify = TRUE
)

addr_match_street(
  x,
  ref_addr,
  stringdist_match = c("osa_lt_1", "exact"),
  match_street_type = TRUE
)
Arguments
- x
an addr vector to match
- ref_addr
an addr vector to search for matches in
- stringdist_match
method for determining string match of street name: "osa_lt_1" requires an optimal string alignment distance less than 1; "exact" requires an exact match
- match_street_type
logical; require street type to be identical to match?
- simplify
logical; randomly select one addr from multi-matches and return an addr() vector instead of a list? (empty addr vectors and NULL values are converted to NA)
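As a hedged sketch of how these arguments combine, the call below requests exact street name matching, does not require street types to be identical, and keeps every candidate by setting simplify = FALSE. It assumes the addr package is installed; the two-element reference vector ref is made up for illustration and stands in for a much larger addr vector.

```r
library(addr)

x <- addr(c(
  "3333 Burnet Ave Cincinnati OH 45229",
  "5130 RAPID RUN RD CINCINNATI OHIO 45238"
))

# illustrative reference vector (an assumption, not real reference data)
ref <- addr(c(
  "3333 Burnet Avenue Cincinnati OH 45229",
  "5130 Rapid Run Road Delhi Township OH 45238"
))

matches <- addr_match(
  x,
  ref,
  stringdist_match = "exact", # street names must match exactly
  match_street_type = FALSE,  # do not require identical street types
  simplify = FALSE            # keep all candidates as a list
)

# one list element per input addr
length(matches) == length(x)
```

Keeping simplify = FALSE avoids the random selection among multi-matches, which is useful when the candidates need to be inspected or resolved by another rule.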
Value
for addr_match() and addr_match_street_name_and_number(), a named list of possible addr matches for each addr in x (or, when simplify = TRUE, an addr vector with one match per addr in x)
for addr_match_street(), a list of possible addr matches for each addr in x (as ref_addr indices)
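Because addr_match_street() returns positions in ref_addr rather than addr values, the candidates can be recovered by subsetting the reference vector. The following is a minimal sketch under that assumption; it assumes the addr package is installed, and the reference vector ref is made up for illustration.

```r
library(addr)

x <- addr("3333 Burnet Ave Cincinnati OH 45229")

# illustrative reference vector (an assumption, not real reference data)
ref <- addr(c(
  "3333 Burnet Avenue Cincinnati OH 45229",
  "5130 Rapid Run Road Delhi Township OH 45238"
))

idx <- addr_match_street(x, ref)

# look up the candidate reference addrs for the first input, if any were found
if (length(idx[[1]]) > 0 && !anyNA(idx[[1]])) {
  ref[idx[[1]]]
}
```

The guard covers inputs with no street match, for which the corresponding list element may be empty or missing.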
Examples
addr(c("3333 Burnet Ave Cincinnati OH 45229", "5130 RAPID RUN RD CINCINNATI OHIO 45238")) |>
  addr_match(cagis_addr()$cagis_addr)
#> 3333 Burnet Avenue Cincinnati OH 45229 5130 Rapid Run Road Delhi Township OH 45238
addr(c("3333 Burnet Ave Cincinnati OH 45229", "5130 RAPID RUN RD CINCINNATI OHIO 45238")) |>
  addr_match(cagis_addr()$cagis_addr, simplify = FALSE) |>
  tibble::enframe(name = "input_addr", value = "ca") |>
  dplyr::mutate(ca = purrr::list_c(ca)) |>
  dplyr::left_join(cagis_addr(), by = c("ca" = "cagis_addr")) |>
  tidyr::unnest(cols = c(cagis_addr_data)) |>
  dplyr::select(-ca, -cagis_address)
#> # A tibble: 2 × 6
#> input_addr cagis_address_place cagis_address_type cagis_s2 cagis_parcel_id
#> <chr> <chr> <chr> <s2cell> <chr>
#> 1 3333 Burnet A… NA BLD 8841b39… 010400020052
#> 2 5130 Rapid Ru… NA BLD 8841c9f… 054000510478
#> # ℹ 1 more variable: cagis_is_condo <lgl>