Editor’s note: Dave Currie joined Lob’s Atlas team in June 2020 as a remote contractor. Working as the team’s Machine Learning Engineer, he has helped to improve the accuracy of the Address Verification product by developing microservices that utilize machine learning. This article was written about one of these microservices.
When I tell people that my work is focused on improving an address verification product, I sometimes receive confused looks. If you think about a friend’s address, you might picture something like “1600 Pennsylvania Avenue, Washington, DC 20500”. An address as simple as this should be easy for a system to understand and verify whether it exists. In this case, you’re right. Standard addresses that you’ve seen countless times before are quite easy to verify, but not all addresses are so simple.
In this post, I’ll share how we use machine learning to continually improve our address verification product at Lob to ensure our customers’ mailpieces get delivered to as many recipients as possible.
Lob’s Address Verification product receives millions of addresses every day. When working at this scale, we see addresses with a range of formats:
Although these addresses all have some complexities, they still follow common patterns. A rules-based parsing system would be complex and increasingly difficult to iterate on as more patterns are added to it. This is where machine learning excels: it learns these patterns as you add more training examples.
Given the benefits that machine learning can provide to this problem, we wanted to get a solution into production as soon as possible. The question became: how do we quickly train a model, especially when Lob has so much address data to choose from? The answer: active learning.
Active learning is a cyclical process of identifying the most useful training examples, labelling them, and retraining the model. We started with a list of 100,000 unique addresses (this large number makes it more likely that uncommon address formats will be included in the dataset), labelled 10 of these with their address labels (e.g., primary number, street name, zip code), trained the model using just these 10 examples, then predicted the parsings, along with a confidence for each parsing, on the remaining 99,990 addresses.
After using just 10 training examples, it was easy to see that the model was beginning to understand patterns in the data. For example, primary numbers are often the numbers at the start of an address, and states are often the two letters before the zip code. Choosing the next set of 10 addresses to label and add to the training data is easy: pick 10 that the model has low confidence in how to parse.
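The selection step above can be sketched in a few lines. This is a simplified illustration, not Lob's production code: the addresses and confidence scores are made up, and in practice the confidence would come from the parser's own probabilities.

```python
def pick_next_batch(scored, batch_size=10):
    """Given (address, confidence) pairs from the current model, return
    the batch_size addresses the model is least sure about, so a human
    can label them and add them to the training set."""
    ranked = sorted(scored, key=lambda pair: pair[1])
    return [address for address, _ in ranked[:batch_size]]

# Illustrative scores: the unusual formats get the lowest confidence.
scored = [
    ("1 main st springfield il 62701", 0.97),
    ("po box 4 rr 2 anytown ks 66002", 0.41),
    ("bldg 7 ste 300 campus dr irvine ca 92612", 0.55),
]
print(pick_next_batch(scored, batch_size=2))
# -> ['po box 4 rr 2 anytown ks 66002', 'bldg 7 ste 300 campus dr irvine ca 92612']
```

Sorting by ascending confidence means each labelling round targets exactly the formats the model finds hardest, which is what makes the feedback loop converge quickly.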
This iterative process of training and labelling continued until the model could provide a net benefit to our address verification product. At this point, we moved our machine learning parsing model into production and provided our customers with the added benefit of a more accurate service. Model development will continue to further increase its accuracy, primarily by adding more training examples and better standardization of the input address.
We can train a performant address parser with fewer training examples by standardizing the input address. By reducing the complexity of the task, the model requires fewer training examples to become proficient. Methods to standardize the input address can include:
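As a hedged sketch of what standardization might look like, the snippet below lowercases the address, strips punctuation, and collapses common street-type variants to one form. The abbreviation table is illustrative only; it is not Lob's actual rule set.

```python
import re

# Hypothetical mapping of long forms to their common abbreviations.
ABBREVIATIONS = {
    "street": "st",
    "avenue": "ave",
    "boulevard": "blvd",
    "apartment": "apt",
    "northwest": "nw",
}

def standardize(address):
    """Lowercase, strip punctuation, and normalize common variants so the
    model sees one token where it might otherwise see several."""
    address = re.sub(r"[.,#]", " ", address.lower())
    tokens = [ABBREVIATIONS.get(tok, tok) for tok in address.split()]
    return " ".join(tokens)

print(standardize("1600 Pennsylvania Avenue NW, Washington, DC 20500"))
# -> 1600 pennsylvania ave nw washington dc 20500
```

With "Avenue" and "Ave." mapped to a single token, one labelled example teaches the model a pattern that would otherwise require several.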
A key feature of our address verification product is speed. Therefore, the library we chose to help build our address parser had to be up for the task. After comparing a few options, we chose spaCy. Given its state-of-the-art speed, named entity recognition feature, and documentation, spaCy is very suitable for this task.
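A minimal sketch of training a spaCy NER pipeline on address labels is shown below. The label names and the single training example are illustrative, not Lob's actual schema, and a real model would of course be trained on far more data.

```python
import spacy
from spacy.training import Example

# Start from a blank English pipeline with only an NER component.
nlp = spacy.blank("en")
ner = nlp.add_pipe("ner")
for label in ("PRIMARY_NUMBER", "STREET_NAME", "CITY", "STATE", "ZIP_CODE"):
    ner.add_label(label)

# One toy training example: character offsets into the address string.
text = "1600 pennsylvania ave washington dc 20500"
annotations = {"entities": [
    (0, 4, "PRIMARY_NUMBER"),
    (5, 21, "STREET_NAME"),
    (22, 32, "CITY"),
    (33, 35, "STATE"),
    (36, 41, "ZIP_CODE"),
]}

optimizer = nlp.initialize()
for _ in range(20):  # a few passes over the tiny training set
    example = Example.from_dict(nlp.make_doc(text), annotations)
    nlp.update([example], sgd=optimizer)

doc = nlp("1 main st springfield il 62701")
print([(ent.text, ent.label_) for ent in doc.ents])
```

spaCy's `nlp.update` API makes the active-learning loop straightforward: each newly labelled batch can be folded into further training passes without rebuilding the pipeline.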
Measuring the performance of an address parser (or any named entity recognition [NER] model) calls for something other than the traditional metrics used in regression or classification problems. We chose the Jaccard coefficient as it is well suited to evaluating NER models. The parser’s performance on each address label was measured, then these scores were aggregated into a weighted average to compare the overall performance of two versions of the model.
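To make the evaluation concrete, here is a small sketch of the Jaccard coefficient applied to one address label, followed by a weighted average across labels. The spans and weights are invented for illustration; they are not Lob's evaluation data.

```python
def jaccard(pred_tokens, gold_tokens):
    """Jaccard coefficient: |intersection| / |union| of two token sets."""
    if not pred_tokens and not gold_tokens:
        return 1.0  # both empty: perfect agreement
    return len(pred_tokens & gold_tokens) / len(pred_tokens | gold_tokens)

# Gold vs. predicted token indices for the STREET_NAME label of one address:
# the model correctly found "pennsylvania ave" but also grabbed "nw".
gold = {1, 2}
pred = {1, 2, 3}
street_score = jaccard(pred, gold)
print(round(street_score, 3))
# -> 0.667

# Aggregate per-label scores into a weighted average, here weighted by the
# number of gold tokens carrying each label.
per_label = {"STREET_NAME": (street_score, 2), "ZIP_CODE": (1.0, 1)}
total_weight = sum(w for _, w in per_label.values())
overall = sum(score * w for score, w in per_label.values()) / total_weight
print(round(overall, 3))
```

Because the Jaccard coefficient penalizes both missed tokens and spurious ones, a single weighted score gives a fair head-to-head comparison between model versions.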
With a well-defined problem and plenty of data that is ready for labelling, a machine learning solution can be delivered in a matter of weeks. The quick feedback loop that active learning provides will help you to reach the desired performance much faster than labelling random examples. If possible, reduce the complexity for the model by standardizing the input data.
See Lob in action. Request a demo today!