Policy-regulated Management of ETL Evolution

Authors: 
Papastefanatos, G.; Vassiliadis, P.; Simitsis, A.; Vassiliou, Y.
Year: 
2009
Venue: 
Journal on Data Semantics (JoDS), Special issue on "Semantic Data Warehouses" (JoDS XIII), LNCS 5530, pp. 146-176, 2009, Springer
Citations: 
16
Citations range: 
10 - 49

In this paper, we discuss the problem of predicting the impact of changes that occur in the schema or structure of data warehouse sources. We abstract Extract-Transform-Load (ETL) activities as queries and sequences of views. ETL activities and their sources are uniformly modeled as a graph that is annotated with policies for the management of evolution events. Given a change at an element of the graph, our method detects the parts of the graph that are affected by this change and indicates how each of them is configured to respond to it. For many cases of ETL source evolution, we present rules that preserve both the syntactic and the semantic correctness of activities. Finally, we evaluate our approach experimentally on real-world ETL workflows used in the Greek public sector.
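
As a rough illustration of the idea of a policy-annotated graph, the following minimal sketch walks a toy dependency graph from a changed source and records how each reached node is set to respond. It is not the method of the paper: the policy names (`propagate`, `block`), the function `affected_nodes`, the default policy, and the node names are all assumptions introduced here for illustration.

```python
from collections import deque

# Hypothetical policy names, assumed for illustration only.
PROPAGATE, BLOCK = "propagate", "block"


def affected_nodes(dependents, policies, changed, default=PROPAGATE):
    """Return {node: policy} for every node reached by a change at `changed`.

    `dependents` maps a node to the nodes that read from it.
    A node annotated BLOCK is reported as affected but does not
    pass the event on to its own dependents.
    """
    reached = {}
    queue = deque(dependents.get(changed, []))
    while queue:
        node = queue.popleft()
        if node in reached:
            continue  # already visited via another path
        policy = policies.get(node, default)
        reached[node] = policy
        if policy == PROPAGATE:
            queue.extend(dependents.get(node, []))
    return reached


# Toy workflow (assumed): one source relation feeds two activities.
dependents = {
    "SRC.orders": ["clean_orders", "audit_orders"],
    "clean_orders": ["load_dw_orders"],
    "audit_orders": ["audit_report"],
}
policies = {"audit_orders": BLOCK}

result = affected_nodes(dependents, policies, "SRC.orders")
for node, policy in result.items():
    print(node, policy)
```

In this toy run the event stops at `audit_orders`, so `audit_report` is never reached, while the change flows through `clean_orders` to `load_dw_orders`. A breadth-first traversal is used only because it is the simplest way to show the event moving along the graph.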