Representing and Querying Data Transformations

Authors: 
Velegrakis, Y.; Miller, R.J.; Mylopoulos, J.
Year: 
2005
Venue: 
ICDE 2005
URL: 
http://csdl.computer.org/dl/proceedings/icde/2005/2285/00/22850081.pdf
Attachment:
Velegrakis2005RepresentingandQueryingData.pdf (229.46 KB)

Modern information systems often store data that has
been transformed and integrated from a variety of sources.
This integration may obscure the original source semantics
of data items. For many tasks, it is important to be
able to determine not only where data items originated,
but also why they appear in the integration as they do and
through what transformation they were derived. This problem
is known as data provenance. In this work, we consider
data provenance at the schema and mapping level. In particular,
we consider how to answer questions such as “what
schema elements in the source(s) contributed to this value”,
or “through what transformations or mappings was this
value derived?” Towards this end, we elevate schemas and
mappings to first-class citizens that are stored in a repository
and are associated with the actual data values. We also
develop an extended query language, called MXQL, that
allows meta-data to be queried as regular data, and we
describe its implementation.
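
The idea of storing mappings next to the data and querying them like ordinary data can be illustrated with a minimal sketch. It is written in Python rather than MXQL, and it is not the authors' implementation: the classes Mapping and Value, the repository contents, and the two helper functions are all hypothetical names chosen for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mapping:
    """A stored mapping (hypothetical structure, for illustration only)."""
    name: str
    source_elements: tuple  # source schema elements the mapping reads
    target_element: str     # integrated schema element the mapping populates


@dataclass(frozen=True)
class Value:
    """A data value annotated with the mapping that produced it."""
    data: str
    element: str  # integrated schema element holding the value
    mapping: str  # name of the mapping through which it was derived


# Hypothetical repository: mappings are kept as ordinary, queryable records.
MAPPINGS = {
    "m1": Mapping("m1", ("S1.Person.name", "S1.Person.surname"),
                  "T.Employee.fullName"),
    "m2": Mapping("m2", ("S2.Staff.salary",), "T.Employee.pay"),
}

# Hypothetical integrated data, each value linked to its mapping.
VALUES = [
    Value("Ada Lovelace", "T.Employee.fullName", "m1"),
    Value("5000", "T.Employee.pay", "m2"),
]


def contributing_elements(value):
    """Which source schema elements contributed to this value?"""
    return MAPPINGS[value.mapping].source_elements


def values_derived_through(mapping_name):
    """Which values were derived through the given mapping?"""
    return [v.data for v in VALUES if v.mapping == mapping_name]
```

Calling contributing_elements(VALUES[0]) looks up the mapping recorded with the value and returns the source elements that mapping reads, while values_derived_through("m2") goes the other way, from a mapping to the data it produced. Both lookups treat meta-data as plain records, which is the property the abstract attributes to MXQL.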