Addresses that were not validated at the time of collection are often heterogeneously formatted and contain typographical and phonetic noise, making them difficult to compare or link to other sets of addresses. The goal of addr is to clean, parse, standardize, and match messy, real-world US addresses in R for use in data linkage. addr uses the included usaddress library to tag address components and build vctrs-based addr vectors, including the addr() vector and the addr_number(), addr_street(), and addr_place() subclass vectors. Both addr and addr_part vectors can be standardized and matched or joined using exact, best, or fuzzy linkages. Ultimately, this makes it possible to use addr vectors as columns in a data frame, which allows for powerful computing on nested address structures using standard R tools.
Installation
Install the latest stable release of addr from R-universe with:
install.packages("addr", repos = c("https://geomarker-io.r-universe.dev", "https://cloud.r-project.org"))
Or, install the development version of addr from GitHub with:
# install.packages("pak")
pak::pak("cole-brokamp/addr")
Installing addr from GitHub requires a working Rust toolchain; install one using rustup.