Exploiting Provenance to Enhance Data Management
Project
Exploiting Provenance to Enhance Data Management
Principal Investigator
Illinois Institute of Technology
Oracle Fellowship Recipient
Oracle Principal Investigator
Dieter Gawlick
Vasudha Krishnaswamy
Zhen Hua Liu
Summary
Provenance research has operated under the assumption that provenance information will be consumed by humans to support use cases such as auditing and debugging. In this project, we investigate novel applications of provenance. These applications range from low-level systems support, e.g., improving query processing performance and reducing resource usage, to high-level business support, e.g., assessing the value of data based on provenance. To support these use cases, we will develop low-overhead capture mechanisms for coarse-grained provenance (e.g., at the level of disk blocks). We then investigate the use of such provenance information to improve a wide range of query execution, query optimization, and self-tuning tasks. Our initial focus is on data skipping and caching. Furthermore, we will build a general framework for assessing data value that incorporates provenance and query frequency to determine the importance of data with respect to a workload.
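
To make the idea of block-level provenance more concrete, the following is a minimal sketch and not the project's actual design. It assumes a toy in-memory table split into numbered blocks, and that the "provenance sketch" of a selection query is simply the set of block ids that contributed at least one result row. All names (BLOCKS, scan, block_value) are hypothetical and chosen for illustration. Re-running the same query with the sketch lets the scan skip blocks that are known not to contribute, and summing query frequencies over sketches gives one simple, assumed notion of data value per block. The sketch is only safe to reuse while the data is unchanged and for the same (or a more selective) predicate.

```python
from collections import defaultdict

# Hypothetical toy table: block id -> list of (row_id, price) rows.
BLOCKS = {
    0: [(1, 5), (2, 8)],
    1: [(3, 40), (4, 55)],
    2: [(5, 7), (6, 90)],
}


def scan(predicate, block_ids=None):
    """Run a selection over the table.

    If block_ids is given, only those blocks are read (data skipping).
    Returns (result rows, provenance sketch, number of blocks read),
    where the sketch is the set of blocks that produced a result row.
    """
    candidates = sorted(BLOCKS) if block_ids is None else sorted(block_ids)
    rows, sketch = [], set()
    for b in candidates:
        matched = [r for r in BLOCKS[b] if predicate(r)]
        if matched:
            sketch.add(b)  # coarse-grained provenance: block level only
            rows.extend(matched)
    return rows, sketch, len(candidates)


def block_value(workload):
    """Assumed value measure: for each block, sum the frequencies of the
    queries whose provenance sketch contains that block.

    workload is a list of (sketch, frequency) pairs.
    """
    value = defaultdict(int)
    for sketch, freq in workload:
        for b in sketch:
            value[b] += freq
    return dict(value)


def expensive(row):
    """Example predicate: rows with price above 30."""
    return row[1] > 30


# First run reads every block and captures the sketch as a side effect.
rows_full, sketch, read_full = scan(expensive)
# Second run reuses the sketch and reads only the contributing blocks.
rows_skip, _, read_skip = scan(expensive, sketch)
```

In this toy setting the second scan returns the same rows while reading two blocks instead of three. A real system would additionally have to invalidate or maintain sketches under updates, which this sketch deliberately leaves out.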