Editor's note: Jesus Martinez joined Lob's Print and Mail API team in June 2021 for a 3-month internship. He reworked our redactions and contributed to overall improvements within our APIs. This article was written about his experience improving user data redaction at Lob.
Engineering teams are keenly aware of the need to manage data privacy. Whether it's per contractual and compliance obligations (i.e., GDPR and CCPA) or based on customer request and retention policies, engineers need to be able to redact personal identifiable information (PII).
User data redaction involves obfuscating sensitive user data. In Lob’s case, it includes, but isn't limited to first name, last name, and email address.
At Lob, we have a way to programmatically handle redaction for resources like letters and postcards but our redaction system needed updating to include user data. We had a manual process to redact user data by querying the production users table to perform redactions but to scale, it needed to be automated. In this post, we'll walk through how we implemented our solution.
While scoping this project, I reviewed our redaction codebase to determine if rewriting it to include user data made sense. To not disrupt existing functionality, I opted to create an endpoint designed for user data redactions and make it work within the current redaction system.
One of my favorite parts of being an engineer is putting on my detective cap and digging into a problem. Working on the redactions project necessitated the use of a magnifying glass.
The solution I landed on was to store user data redactions in a database table. By making post requests to a new redactions endpoint Lob can store information about the user and have a worker run in the background checking daily for new users to redact. The worker does the following:
In addition to the endpoint, I created two migration scripts. The first script updated constraints in the redactions table and allows users to be a redactable resource. The second script added a ‘date_redacted’ and ‘redaction_id’ property to the users table and updated the model and validators so users could be treated as a redactable resource.
After my initial investigation, I proceeded to define a user redaction endpoint. Testing the new endpoint locally worked fine, but an issue surfaced when I deployed it to our staging environment. I discovered I needed to enable the step function to access the new redaction endpoint.
Lob uses a node-wrapper to make HTTP requests. I had to publish a new version of this wrapper to include the new user redaction capabilities. After writing many more tests, I deployed it to staging and everything worked as expected.
One of Lob’s core values is Own the Outcome. To help future Lob developers, I updated our internal documentation with implementation details on user data redaction and how to extend Lob’s redaction system to support future use cases.
This project was a success for several reasons:
I don’t think an intern can ask for anything more from an internship. Thanks, Lob.