Jane Doe will help you improve your project

1:30pm - 1:55pm on Saturday, October 5 in PennTop South

Rebeca Sarai

Slides: https://docs.google.com/presentation/d/1d1AEIg9_GLCL62E8Nkfcu4W5UyNXK8mR8ynDqcaomMo/edit?usp=sharing
Watch: https://youtu.be/5pqUjJelIcU

Description

In these days of privacy scandals, concern about securing customers’ data is greater than ever, and the solution is far from locking everything in a safe box. Sharing data is inevitable. In this talk we will approach the data anonymity problem, exploring how to use anonymization techniques to secure users’ personal information when analyzing, testing, processing, or sharing a database.

Customers’ data is important. The number of privacy laws has grown from 20 to 100 in recent years. To name a few: PCI compliance in the payment industry, the European GDPR, and the Brazilian LGPD. All these new regulations attempt to bridge an old gap: data anonymity. How do we handle data and protect the individuals it describes? Companies often face lawsuits seeking compensation for breaches of personal information in their databases.

Code must be tested. In a classic development workflow, production data is often copied onto test, QA, or staging environments, where it is exposed to the eyes of testers, receivers, or unauthorized developers on machines less protected than production environments. It is also not uncommon for files to be shared with external partners, who often require only a small part of the data transferred, and granting access to users’ data might be a breach. If on the one hand sharing data is both necessary and inevitable, on the other hand technologies that ensure the privacy of individuals’ details are no longer merely desirable, but essential.

A Jane Doe is a person without a name who is able to perform actions despite having no recollection of personal information. We will use this principle to approach two important areas in software development: how to streamline the testing of complex systems and how to manage data while securing users’ personal information. We will create a boilerplate project to demonstrate different anonymization and pseudonymization techniques, showing that solving the anonymity problem is much more complex than replacing first names, last names, and social security numbers - and doing all of that without bottlenecking Django projects.
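
To make the distinction between pseudonymization and anonymization concrete, here is a minimal sketch using only the Python standard library. It is not taken from the talk or its boilerplate project: the record fields, the `SECRET_KEY` value, and the helper names (`pseudonymize`, `generalize_age`, `anonymize_record`) are all hypothetical and chosen only for illustration.

```python
import hashlib
import hmac

# Hypothetical secret; in a real project it would be stored outside the codebase.
SECRET_KEY = b"replace-me"


def pseudonymize(value: str, key: bytes = SECRET_KEY) -> str:
    """Replace an identifier with a keyed hash.

    The same input always maps to the same token, so records can still be
    joined, but the original value cannot be read back from the token.
    """
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]


def generalize_age(age: int, width: int = 10) -> str:
    """Replace an exact age with a range of `width` years."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def anonymize_record(record: dict) -> dict:
    """Return a reduced copy intended for test or partner environments."""
    return {
        "user_id": pseudonymize(record["email"]),    # pseudonymization
        "age_range": generalize_age(record["age"]),  # generalization
        "city": record["city"],  # kept as-is; a quasi-identifier that may need review
        # "name" and "ssn" are dropped entirely (suppression)
    }


record = {
    "name": "Maria Silva",
    "email": "maria@example.com",
    "ssn": "000-00-0000",
    "age": 34,
    "city": "Recife",
}
safe = anonymize_record(record)
```

Even in this small example, dropping the name and social security number is not enough on its own: the remaining city and age range can still narrow down who a record belongs to, which is why the sketch marks `city` as needing review.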