Tunable Differentially Private Synthetic Data
University of Massachusetts at Amherst
Oracle Principal Investigator
Pallika Kanani, Consulting Member Technical Staff
Virendra Marathe, Consulting Member Technical Staff
Data sharing within the modern enterprise is extremely constrained by privacy concerns. Privacy-preserving synthetic data is an appealing solution: it allows existing analytics workflows and machine learning methods to be used, while the original data remains protected. But recent research has shown that unless a formal privacy standard is adopted, synthetic data can violate privacy in subtle ways. This proposal will develop novel methods for synthetic data generation that meet the rigorous standard of ε-differential privacy. The balance between privacy and accuracy can be controlled by adjusting the privacy-loss parameter ε, and, crucially, our work will allow the synthetic data to be tuned to provide maximal accuracy for tasks of interest, including both statistical query answering and fundamental machine learning tasks.
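
To make the role of ε concrete, the following is a minimal illustrative sketch, not the method this proposal will develop. It applies the standard Laplace mechanism to a histogram over a single categorical attribute and then samples synthetic records from the noisy counts. The function names (laplace_noise, dp_histogram, sample_synthetic) are hypothetical, and the sketch assumes add/remove-one-record neighboring datasets (so the histogram has L1 sensitivity 1), a fixed domain chosen independently of the data, and a synthetic sample size chosen by the caller rather than derived from the private data.

```python
import random
from collections import Counter


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_histogram(records, domain, epsilon, rng):
    """Return noisy counts over `domain` using the Laplace mechanism.

    Assumption: adding or removing one record changes exactly one count
    by 1, so the L1 sensitivity is 1 and the noise scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    counts = Counter(records)
    scale = 1.0 / epsilon
    return {v: counts.get(v, 0) + laplace_noise(scale, rng) for v in domain}


def sample_synthetic(noisy_counts, n, rng):
    """Sample n synthetic records in proportion to the noisy counts.

    This is post-processing of the noisy histogram, so it consumes no
    additional privacy budget.
    """
    values = list(noisy_counts)
    weights = [max(noisy_counts[v], 0.0) for v in values]  # clip negatives
    if sum(weights) == 0:
        weights = [1.0] * len(values)  # fall back to uniform sampling
    return rng.choices(values, weights=weights, k=n)


if __name__ == "__main__":
    rng = random.Random(0)
    data = ["a"] * 50 + ["b"] * 30 + ["c"] * 20
    for eps in (0.1, 1.0, 10.0):
        noisy = dp_histogram(data, ["a", "b", "c"], eps, rng)
        synthetic = sample_synthetic(noisy, 100, rng)
        print(eps, Counter(synthetic))
```

Smaller values of ε add more noise to each count (stronger privacy, lower accuracy), while larger values keep the synthetic distribution closer to the original. This sketch treats every count identically; it does not attempt the task-specific tuning described above.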