A single addr_number in y is chosen for each addr_number in x. If no exact match (compared using as.character) is found, possible matches within number_fuzzy_dist are searched for in y. If multiple possible matches are present in y, the best one is selected by the lowest absolute numeric difference from the @digits in x; ties are broken first by optimized string alignment (OSA) distance and then by preferring the lowest value in lexicographic order, with digits preceding alphabetic characters.

addr_number objects with missing @digits, or with empty strings for all of @prefix, @digits, and @suffix, are not matched; a missing value is returned instead.
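The selection rules above can be sketched in base R on plain digit strings. This is only an illustration under stated assumptions, not the package's implementation: osa_dist() and match_digits_sketch() are hypothetical helper names, the sketch ignores @prefix and @suffix, and it assumes every non-missing value is made of digits only.

```r
# Hypothetical helper: optimized string alignment (OSA) distance between two
# strings, i.e. Levenshtein distance plus adjacent transpositions.
osa_dist <- function(a, b) {
  a <- strsplit(a, "")[[1]]
  b <- strsplit(b, "")[[1]]
  n <- length(a)
  m <- length(b)
  d <- matrix(0L, n + 1L, m + 1L)
  d[, 1] <- 0:n
  d[1, ] <- 0:m
  for (i in seq_len(n)) {
    for (j in seq_len(m)) {
      cost <- as.integer(a[i] != b[j])
      d[i + 1, j + 1] <- min(
        d[i, j + 1] + 1L, # deletion
        d[i + 1, j] + 1L, # insertion
        d[i, j] + cost    # substitution
      )
      # adjacent transposition
      if (i > 1 && j > 1 && a[i] == b[j - 1] && a[i - 1] == b[j]) {
        d[i + 1, j + 1] <- min(d[i + 1, j + 1], d[i - 1, j - 1] + 1L)
      }
    }
  }
  as.integer(d[n + 1, m + 1])
}

# Hypothetical sketch of the matching rules, on digit strings only.
match_digits_sketch <- function(x, y, max_dist = 1L) {
  vapply(x, function(xi) {
    # missing or empty digits are never matched
    if (is.na(xi) || xi == "") return(NA_character_)
    # an exact match always wins
    if (xi %in% y) return(xi)
    dists <- vapply(y, osa_dist, integer(1), b = xi, USE.NAMES = FALSE)
    keep <- dists <= max_dist
    if (!any(keep)) return(NA_character_)
    cand <- y[keep]
    ord <- order(
      abs(as.numeric(cand) - as.numeric(xi)), # closest number first
      dists[keep],                            # then smallest OSA distance
      cand,                                   # then lexicographic order
      method = "radix"                        # C-locale: digits before letters
    )
    cand[ord[1]]
  }, character(1), USE.NAMES = FALSE)
}

x_digits <- c("1", "10", "228", "11", "22", "22", "22", "10", "99897", NA)
y_digits <- c("12", "11", "10", "22")
res_fuzzy <- match_digits_sketch(x_digits, y_digits)
res_exact <- match_digits_sketch(x_digits, y_digits, max_dist = 0L)
```

In this sketch, "1" is within OSA distance 1 of "12", "11", and "10"; "10" is chosen because it has the smallest absolute numeric difference. With max_dist = 0L, only exact matches survive.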

Usage

match_addr_number(x, y, number_fuzzy_dist = 1L)

Arguments

x, y

addr_number vectors to match

number_fuzzy_dist

integer; maximum optimized string alignment distance between @number of x and y to consider a possible match

Value

an addr_number vector, the same length as x, containing the best match in y for each addr_number in x; if no match is found, a missing value (addr_number()) is returned

Examples

x <- addr_number(
  prefix = "",
  digits = as.character(c(1, 10, 228, 11, 22, 22, 22, 10, 99897, NA)),
  suffix = ""
)

y <- addr_number(
  prefix = "",
  digits = as.character(c(12, 11, 10, 22)),
  suffix = ""
)

match_addr_number(x, y)
#> <addr_number> function ()  
#>  @ prefix: chr [1:10] "" "" "" "" "" "" "" "" NA NA
#>  @ digits: chr [1:10] "10" "10" "22" "11" "22" "22" "22" "10" NA NA
#>  @ suffix: chr [1:10] "" "" "" "" "" "" "" "" NA NA

match_addr_number(x, y, number_fuzzy_dist = 0L)
#> <addr_number> function ()  
#>  @ prefix: chr [1:10] NA "" NA "" "" "" "" "" NA NA
#>  @ digits: chr [1:10] NA "10" NA "11" "22" "22" "22" "10" NA NA
#>  @ suffix: chr [1:10] NA "" NA "" "" "" "" "" NA NA