Tunable Differentially Private Synthetic Data
Project
Tunable Differentially Private Synthetic Data
Principal Investigator
University of Massachusetts at Amherst
Oracle Principal Investigator
Pallika Kanani, Research Director
Virendra Marathe, Consulting Member Technical Staff
Summary
Data sharing within the modern enterprise is extremely constrained by privacy concerns. Privacy-preserving synthetic data is an appealing solution: it allows existing analytics workflows and machine learning methods to be used, while the original data remains protected. But recent research has shown that unless a formal privacy standard is adopted, synthetic data can violate privacy in subtle ways. This proposal will develop novel methods for synthetic data generation that meet the rigorous standard of ε-differential privacy. The balance between privacy and accuracy can be controlled by adjusting the privacy-loss parameter ε, and, crucially, our work will allow the synthetic data to be tuned to provide maximal accuracy for tasks of interest, including both statistical query answering and fundamental machine learning tasks.
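To make the role of ε concrete, the sketch below uses the standard Laplace mechanism on a single counting query. It is a textbook illustration only, not the synthetic data method this project proposes. The function names (`laplace_noise`, `private_count`, `mean_abs_error`) and the example count of 1000 are hypothetical, chosen for this illustration.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(true_count, epsilon, rng):
    # A counting query changes by at most 1 when one record is added or
    # removed (sensitivity 1), so Laplace noise of scale 1/epsilon gives
    # epsilon-differential privacy for this single query.
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return true_count + laplace_noise(1.0 / epsilon, rng)


def mean_abs_error(epsilon, trials=20000, seed=0):
    # Empirical average error of the noisy count; the hypothetical
    # true count of 1000 is an assumption made for this example.
    rng = random.Random(seed)
    true_count = 1000
    total = sum(
        abs(private_count(true_count, epsilon, rng) - true_count)
        for _ in range(trials)
    )
    return total / trials


if __name__ == "__main__":
    for eps in (0.1, 1.0, 10.0):
        print(f"epsilon={eps}: mean absolute error ~ {mean_abs_error(eps):.3f}")
```

The expected absolute error of this mechanism is 1/ε, so a smaller ε (stronger privacy) yields proportionally noisier answers. Note that sampling Laplace noise with ordinary floating-point arithmetic, as done here, is known to be unsafe for real deployments; the code is meant only to show the privacy and accuracy trade-off.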