geocode() geocodes addr vectors using Census TIGER address features (see ?taf) by:

  1. searching for a matching street (see ?match_addr_street) within the same ZIP code and, if necessary, within similar ZIP codes

  2. using the address number to select the best address feature range and side of the street (even/odd), breaking ties on smallest width and spread

  3. linearly interpolating a geographic point along the best range line based on the actual and potential range of address numbers

  4. offsetting the interpolated point from the range line perpendicularly
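
Steps 3 and 4 can be illustrated with a small base R sketch. This is not the package's implementation: it assumes a straight two-point range line in planar coordinates measured in meters, uses only the range endpoints for interpolation, and the function name interpolate_offset() and its arguments are hypothetical.

```r
# Illustrative sketch only; interpolate_offset() is a hypothetical helper.
# Assumes a straight segment from `start` to `end` in planar meters.
interpolate_offset <- function(number, from, to, start, end,
                               offset = 10, side = c("left", "right")) {
  side <- match.arg(side)
  # fraction of the way along the address range;
  # a single-number range is placed at the midpoint
  frac <- if (to == from) 0.5 else (number - from) / (to - from)
  # linear interpolation between the segment endpoints
  point <- start + frac * (end - start)
  # unit vector along the segment, rotated 90 degrees (left-hand normal)
  direction <- (end - start) / sqrt(sum((end - start)^2))
  normal <- c(-direction[2], direction[1])
  if (side == "right") normal <- -normal
  # move the point `offset` meters away from the line
  point + offset * normal
}

# address 150 on a 100-200 range running 100 m east along the x axis
interpolate_offset(150, from = 100, to = 200, start = c(0, 0), end = c(100, 0))
```

In this sketch the address lands halfway along the segment and is then pushed 10 meters to its left; real TIGER range lines are multi-vertex and geodetic, so the package necessarily does more work than this.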

Only matched input addresses return non-missing matched ZIP code and street values. Missing or unmatched ZIP codes return missing matched ZIP code, street, geography, and s2 cell values. If all ranges on the matched ZIP code and street exclude the address number, only the geography and s2 cell values return NA.

Usage

geocode(
  x,
  name_phonetic_dist = 1L,
  name_fuzzy_dist = 2L,
  match_street_type = c("exact", "compatible", "ignore"),
  match_street_directional = c("exact", "swap", "ignore"),
  zip_variants = TRUE,
  zip_variant = c("minus1", "plus1", "sub5", "sub4", "swap"),
  year = as.character(2025:2011),
  version = "v1",
  taf_install = TRUE,
  taf_redownload = FALSE,
  offset = 10L,
  progress = interactive()
)

geocode_zip(
  x,
  offset = 10L,
  name_phonetic_dist = 1L,
  name_fuzzy_dist = 2L,
  match_street_type = c("exact", "compatible", "ignore"),
  match_street_directional = c("exact", "swap", "ignore"),
  zip_variants = TRUE,
  zip_variant = c("minus1", "plus1", "sub5", "sub4", "swap"),
  year = as.character(2025:2011),
  version = "v1",
  taf_install = TRUE,
  taf_redownload = FALSE,
  progress_callback = NULL,
  taf_check = TRUE
)

Arguments

x

an addr vector (?as_addr)

name_phonetic_dist

integer; maximum optimized string alignment distance between phonetic_street_key() of x and y to consider a possible match

name_fuzzy_dist

integer; maximum optimized string alignment distance between @name of x and y to consider a possible match
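
For intuition about the two distance arguments, the following base R sketch computes an optimized string alignment (OSA) distance, which counts an adjacent transposition as a single edit. The function osa_distance() is a hypothetical helper written for illustration; how the package itself computes the distance is not shown here.

```r
# Illustrative sketch only; osa_distance() is a hypothetical helper.
osa_distance <- function(a, b) {
  a <- strsplit(a, "")[[1]]
  b <- strsplit(b, "")[[1]]
  n <- length(a)
  m <- length(b)
  # d[i + 1, j + 1] is the distance between the first i characters of a
  # and the first j characters of b
  d <- matrix(0L, n + 1L, m + 1L)
  d[, 1] <- 0:n
  d[1, ] <- 0:m
  for (i in seq_len(n)) {
    for (j in seq_len(m)) {
      cost <- as.integer(a[i] != b[j])
      d[i + 1, j + 1] <- min(
        d[i, j + 1] + 1L, # deletion
        d[i + 1, j] + 1L, # insertion
        d[i, j] + cost    # substitution (or match)
      )
      # adjacent transposition counts as one edit
      if (i > 1 && j > 1 && a[i] == b[j - 1] && a[i - 1] == b[j]) {
        d[i + 1, j + 1] <- min(d[i + 1, j + 1], d[i - 1, j - 1] + 1L)
      }
    }
  }
  d[n + 1, m + 1]
}

osa_distance("MAIN", "MIAN")     # transposed letters: one edit
osa_distance("kitten", "sitting")
```

A plain Levenshtein distance would score the transposed "MAIN"/"MIAN" pair as two edits, so OSA is more forgiving of common typing errors at the same threshold.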

match_street_type

character; how to compare street pretype and posttype when selecting street candidates. "exact" requires pretype to match pretype and posttype to match posttype; "compatible" treats blank type fields as unknown but rejects candidates when known type information conflicts; "ignore" does not use street type fields when selecting candidates.
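
The three modes can be sketched for a single type field as below. This is an assumption-laden illustration, not the package's code: compare_street_type() is a hypothetical helper, and blank type fields are assumed to be stored as empty strings.

```r
# Illustrative sketch only; compare_street_type() is a hypothetical helper.
# `x` is one input type; `y` is a vector of candidate types.
# Blank (unknown) types are assumed to be represented as "".
compare_street_type <- function(x, y, mode = c("exact", "compatible", "ignore")) {
  mode <- match.arg(mode)
  switch(mode,
    exact = x == y,                          # fields must be identical
    compatible = x == "" | y == "" | x == y, # blank never conflicts
    ignore = rep(TRUE, length(y))            # type is not considered
  )
}

candidates <- c("Ave", "", "St")
compare_street_type("Ave", candidates, "exact")
compare_street_type("Ave", candidates, "compatible")
compare_street_type("Ave", candidates, "ignore")
```

Under these assumptions, a full comparison would apply the same rule to pretype and posttype separately and keep only candidates that pass both.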

match_street_directional

character; how to compare street predirectional and postdirectional when selecting street candidates. "exact" requires predirectional to match predirectional and postdirectional to match postdirectional; "swap" also permits predirectional to match postdirectional and postdirectional to match predirectional; "ignore" does not use street directional fields when selecting candidates.

zip_variants

logical; also match against common ZIP code variants of x (see zip_variant) in y?

zip_variant

character vector; zipcode variant types to use when zip_variants is TRUE; see ?zipcode_variant

year

character, length one; vintage of TIGER addrfeat (address feature) files (for example, "2025")

version

character, length one; major version of the package and taf dataset schema

taf_install

logical; install missing county TAF files needed for input ZIP codes and selected ZIP code variants before geocoding? If FALSE, geocoding proceeds with installed files only and warns when needed county files are missing.

taf_redownload

logical; re-download cached TIGER ZIP files when installing missing TAF counties?

offset

numeric; distance in meters to offset the geocoded point from the street line

progress

logical; show a ZIP-code progress bar while geocoding?

progress_callback

optional callback used internally by geocode() to update progress after ZIP-code reference data is loaded

taf_check

logical; check for missing TAF counties? Used internally by geocode() after checking once for the full input vector.

Value

A tibble with columns addr (the input addr vector), matched_zipcode (character vector), matched_street (addr_street vector), matched_geography (s2_geography point vector), and s2_cell (s2_cell vector).

Details

geocode_zip() is the workhorse function and operates on an addr vector in which all addresses share the same ZIP code. geocode() handles an addr vector with multiple ZIP codes by grouping the addresses by ZIP code and, by default, processing the groups serially. Grouping an addr vector by ZIP code and calling geocode_zip() on each group directly gives more control (e.g., over parallel processing).
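
The group-by-ZIP pattern can be sketched in base R as follows. To keep the sketch self-contained, geocode_one_zip() is a hypothetical stand-in for geocode_zip(), the addresses are plain strings rather than an addr vector, and the ZIP codes are supplied as a separate character vector because the accessor for the ZIP code of an addr vector is not shown on this page.

```r
# Illustrative sketch only; geocode_one_zip() is a hypothetical stand-in
# for geocode_zip() that returns one row per input address.
geocode_one_zip <- function(addresses) {
  data.frame(
    addr = addresses,
    n_in_group = length(addresses),
    stringsAsFactors = FALSE
  )
}

addresses <- c("1 Main St 45220", "2 Oak Ave 45229", "3 Elm St 45220")
zips <- c("45220", "45229", "45220")

# group input positions by ZIP code
groups <- split(seq_along(addresses), zips)
# process each single-ZIP group independently
results <- lapply(groups, function(i) geocode_one_zip(addresses[i]))
# combine and restore the original input order
out <- do.call(rbind, results)[order(unlist(groups)), ]
rownames(out) <- NULL
out
```

Because each group is processed independently, the lapply() call is the natural place to substitute a parallel map of the caller's choosing.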

If the mirai package is installed and mirai daemons have already been configured by the caller, geocode() uses them for ZIP-code-level parallel processing. Otherwise it falls back to sequential processing.

geocode() and geocode_zip() both download and install TIGER address features by county (?taf_install) as needed, based on the input addr ZIP codes (and possibly ZIP code variants). TAF install checks run before reading TAF ZIP files so that parallel geocoding workers do not try to download county files at the same time.

Examples

x <- as_addr(voter_addresses()[1:100])

# for example purposes, only install one county
Sys.setenv("R_USER_DATA_DIR" = tempfile())
taf_install("39061", "2025")
# and geocode without installing other counties
gcd <- geocode(x, taf_install = FALSE)
#> Warning: TAF files are missing for 23 county/counties needed for geocoding; proceeding with installed files only because taf_install = FALSE. Missing counties: 18161, 39017, 39135, 39113, 39165, 21067, 39025, 18025, .... Affected ZIPs: 45003 (plus1 from 45002), 45003 (sub5 from 45002), 45004 (sub5 from 45002), 45005 (sub5 from 45002), 45042 (sub4 from 45002), 45062 (sub4 from 45002), 45032 (sub4 from 45002), 40502 (swap from 45002), ....

# limiting installation to one county is only for example purposes and usually not required; e.g.

if (FALSE) { # \dontrun{
  gcd <- geocode(x)
} # }

gcd
#> # A tibble: 100 × 5
#>    addr                 matched_zipcode matched_street matched_geography s2_cell
#>    <addr>               <chr>           <addr_str>     <s2_geography>    <s2cel>
#>  1 3359 QUEEN CITY Ave… 45238           Queen City Ave POINT (-84.61106… 8841ca…
#>  2 1040 KREIS Ln CINCI… 45205           Kreis Ln       POINT (-84.58899… 8841b6…
#>  3 9960 DALY Rd CINCIN… 45231           Daly Rd        POINT (-84.52779… 88404b…
#>  4 413 VOLKERT Pl CINC… 45219           Volkert Pl     POINT (-84.52570… 8841b4…
#>  5 8519 LINDERWOOD Ln … 45255           Linderwood Ln  POINT (-84.31725… 8841a9…
#>  6 6361 BEECHMONT Ave … 45230           Beechmont Ave  POINT (-84.38246… 8841ae…
#>  7 10466 ADVENTURE Ln … 45242           Adventure Ln   POINT (-84.35959… 884053…
#>  8 3156 LOOKOUT Cir CI… 45208           Lookout Cir    POINT (-84.42829… 8841ad…
#>  9 310 WYOMING Ave CIN… 45215           Wyoming Ave    POINT (-84.46793… 88404d…
#> 10 118 SPRINGFIELD Pik… 45215           Springfield P… POINT (-84.47321… 88404d…
#> # ℹ 90 more rows

table(geocode_stage(gcd))
#> 
#>           none street_variant         street  range_variant          range 
#>              9              1              2              0             88 

geocode_table(gcd)
#> # A tibble: 100 × 5
#>    addr                     geocode_stage matched_zipcode matched_street s2_cell
#>    <chr>                    <chr>         <chr>           <chr>          <chr>  
#>  1 3359 QUEEN CITY Ave CIN… range         45238           Queen City Ave 8841ca…
#>  2 1040 KREIS Ln CINCINNAT… range         45205           Kreis Ln       8841b6…
#>  3 9960 DALY Rd CINCINNATI… range         45231           Daly Rd        88404b…
#>  4 413 VOLKERT Pl CINCINNA… range         45219           Volkert Pl     8841b4…
#>  5 8519 LINDERWOOD Ln CINC… range         45255           Linderwood Ln  8841a9…
#>  6 6361 BEECHMONT Ave CINC… range         45230           Beechmont Ave  8841ae…
#>  7 10466 ADVENTURE Ln CINC… range         45242           Adventure Ln   884053…
#>  8 3156 LOOKOUT Cir CINCIN… range         45208           Lookout Cir    8841ad…
#>  9 310 WYOMING Ave CINCINN… range         45215           Wyoming Ave    88404d…
#> 10 118 SPRINGFIELD Pike CI… range         45215           Springfield P… 88404d…
#> # ℹ 90 more rows

leaflet::leaflet(wk::wk_coords(gcd$matched_geography)) |>
  leaflet::addTiles() |>
  leaflet::addCircleMarkers(lng = ~x, lat = ~y, label = ~feature_id)

# use mirai for parallel processing
if (FALSE) { # \dontrun{
  mirai::daemons(2)
  geocode(x)
  mirai::daemons(0)
} # }