Tunable Differentially Private Synthetic Data

Project

Tunable Differentially Private Synthetic Data

Principal Investigator

University of Massachusetts at Amherst

Oracle Principal Investigator

Pallika Kanani, Consulting Member Technical Staff
Virendra Marathe, Consulting Member Technical Staff

Summary

Data sharing within the modern enterprise is extremely constrained by privacy concerns. Privacy-preserving synthetic data is an appealing solution: it allows existing analytics workflows and machine learning methods to be used, while the original data remains protected. But recent research has shown that unless a formal privacy standard is adopted, synthetic data can violate privacy in subtle ways. This proposal will develop novel methods for synthetic data generation that meet the rigorous standard of ε-differential privacy. The balance between privacy and accuracy can be controlled by adjusting the privacy-loss parameter ε, and, crucially, our work will allow the synthetic data to be tuned to provide maximal accuracy for tasks of interest, including both statistical query answering and fundamental machine learning tasks.
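
To make the role of ε concrete, the following is a minimal sketch of one textbook way to produce ε-differentially private synthetic data for a single categorical attribute: add Laplace noise to a histogram, then sample from the noisy counts. It is an illustration only, not the method this proposal will develop. The function names (`laplace_noise`, `noisy_histogram`, `synthesize`) and the toy data are hypothetical, and the sketch assumes that neighboring datasets differ by adding or removing one record, so the histogram has L1 sensitivity 1.

```python
import random
from collections import Counter


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_histogram(records, domain, epsilon, rng):
    """Laplace mechanism applied to a histogram over a fixed, public domain.

    Assumption: add/remove-one-record neighboring, so the L1 sensitivity
    of the histogram is 1 and the noise scale is 1 / epsilon.
    """
    counts = Counter(records)
    scale = 1.0 / epsilon
    return {v: counts.get(v, 0) + laplace_noise(scale, rng) for v in domain}


def synthesize(noisy, n, rng):
    # Post-processing of a differentially private output does not weaken
    # the guarantee: clip negative counts, then sample n synthetic records
    # in proportion to the clipped noisy counts.
    values = list(noisy)
    weights = [max(noisy[v], 0.0) for v in values]
    if sum(weights) == 0:
        weights = [1.0] * len(values)  # fall back to uniform sampling
    return rng.choices(values, weights=weights, k=n)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical toy dataset with one categorical attribute.
    records = ["a"] * 50 + ["b"] * 30 + ["c"] * 20
    domain = ["a", "b", "c"]
    for epsilon in (0.1, 1.0, 10.0):
        noisy = noisy_histogram(records, domain, epsilon, rng)
        synthetic = synthesize(noisy, len(records), rng)
        print(epsilon, Counter(synthetic))
```

In this sketch the expected absolute error of each noisy count is 1/ε, so a smaller ε gives stronger privacy and noisier synthetic data, while a larger ε gives the reverse. Note that this simple mechanism spends its whole privacy budget on one histogram; it does not tune the output to a chosen set of queries or learning tasks, which is the capability the proposal targets.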