Differential Privacy for Natural Language Processing


Principal Investigator

Macquarie University

Oracle Principal Investigator

Long Duong
Mark Johnson
Michael Wick, Principal Member Technical Staff
Stephen Green, Senior Director


The project investigates how to prevent sensitive attributes of a writer, such as race, gender or age, from being unintentionally disclosed by their text through machine learning-based inference. Its objective is to adapt current techniques for adversarial example generation to incorporate Differential Privacy, a mathematically rigorous privacy framework, in order to develop a text-to-text transformation system that provides formal privacy guarantees. The project will examine optimisation frameworks in which privacy-related constraints can be expressed on text-to-text transformations, and will evaluate those frameworks in terms of both the privacy they provide and the utility of the transformed text.
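
To make "formal guarantees of privacy" concrete: a randomised mechanism M is ε-differentially private if, for any two neighbouring inputs x and x' and any output y, P[M(x) = y] ≤ e^ε · P[M(x') = y]. The summary above does not say which mechanism the project uses, so the following is only a minimal sketch under stated assumptions. It applies the standard exponential mechanism to a single word, treating any two input words as neighbours. The vocabulary, the `similarity` scores and all function names are hypothetical and chosen purely for illustration.

```python
import math
import random

# Hypothetical toy vocabulary; not taken from the project.
VOCAB = ["he", "she", "they", "person"]


def similarity(original, candidate):
    """Toy utility in [0, 1]: prefer keeping the original word (an assumption)."""
    return 1.0 if original == candidate else 0.25


def output_distribution(word, epsilon, sensitivity=1.0):
    """Exponential mechanism: P(candidate) is proportional to
    exp(epsilon * utility / (2 * sensitivity)).

    sensitivity must bound how much the utility of any candidate can change
    when the input word changes; 1.0 is a safe bound for scores in [0, 1].
    """
    weights = [
        math.exp(epsilon * similarity(word, c) / (2.0 * sensitivity))
        for c in VOCAB
    ]
    total = sum(weights)
    return [w / total for w in weights]


def privatise_word(word, epsilon, rng):
    """Sample a replacement for `word` from the exponential mechanism."""
    probs = output_distribution(word, epsilon)
    return rng.choices(VOCAB, weights=probs, k=1)[0]


def worst_case_ratio(epsilon):
    """Largest ratio P[M(x) = y] / P[M(x') = y] over all inputs x, x' and outputs y."""
    dists = [output_distribution(w, epsilon) for w in VOCAB]
    return max(
        p[i] / q[i] for p in dists for q in dists for i in range(len(VOCAB))
    )


if __name__ == "__main__":
    eps = 1.0
    rng = random.Random(0)
    print("replacement:", privatise_word("she", eps, rng))
    # The guarantee holds when this ratio is at most e^epsilon.
    print("worst-case ratio:", worst_case_ratio(eps), "bound:", math.exp(eps))
```

A smaller ε flattens the output distribution, which strengthens privacy but makes replacements less faithful to the original word. This is the privacy-utility trade-off that the evaluation described above would need to measure.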