Managing Schema Mappings in Highly Heterogeneous Environments

Velegrakis, Y
Dissertation, Univ. of Toronto
Integration, transformation, and translation of data is increasingly important for modern
information systems and e-commerce applications. Views, and more generally, transformation
specifications, or mappings, provide the foundation for many data transformation applications.
Mappings are usually specified manually by data administrators who are familiar with
the semantics of the data and have a good knowledge of the transformation language. The
task of generating and managing mappings is laborious, time-consuming, and error-prone
since data administrators are called on to write complex mappings in which they specify
in tedious detail how the data is to be transformed. Even once deployed, mappings must
remain under constant supervision since changes in the structure of the data may require
changes in the mappings.

In this dissertation, we elaborate on the development of mapping management tools
that are intended to shield administrators from the laborious task of mapping management.
In particular, we present a novel framework for generating mappings between any
combination of XML and relational schemas. A set of high-level binary relationships
between the elements of the two schemas, specified by a user or generated by a
tool, is combined to form semantically meaningful mappings. These mappings
are guaranteed to be consistent with the constraints of the schemas. To handle schemas
that are dynamically modified, we describe a methodology for automatically detecting
the mappings that have become invalid as a result of a schema change and rewriting
them to become consistent with the modified schema. Each rewriting is generated in a
way that preserves, as much as possible, the semantics of the initial mapping. Finally,
we show how collections of schemas and mappings can be used in queries to provide a
better understanding of how data has been integrated and transformed.
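
The idea of detecting mappings that a schema change has invalidated can be illustrated with a small sketch. This is not the dissertation's algorithm: it assumes a deliberately simplified model in which a schema is a flat set of element names and a mapping merely records which source and target elements it references. The `Mapping` class, the `find_invalid_mappings` function, and all element names are hypothetical, chosen only for illustration. The sketch covers detection only, not the semantics-preserving rewriting step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mapping:
    """Hypothetical, simplified mapping: the elements it reads and writes."""
    name: str
    source_elements: frozenset
    target_elements: frozenset


def find_invalid_mappings(mappings, source_schema, target_schema):
    """Return the mappings that reference an element missing from either schema."""
    return [
        m for m in mappings
        if not m.source_elements <= source_schema
        or not m.target_elements <= target_schema
    ]


# Assumed example schemas: a relational source and an XML-like target.
source_schema = {"customer.id", "customer.name", "order.customer_id"}
target_schema = {"client/@id", "client/name"}

mappings = [
    Mapping("m1",
            frozenset({"customer.id", "customer.name"}),
            frozenset({"client/@id", "client/name"})),
    Mapping("m2",
            frozenset({"order.customer_id"}),
            frozenset({"client/@id"})),
]

# Simulate a schema change: the source drops the customer.name column.
changed_source = source_schema - {"customer.name"}

invalid = find_invalid_mappings(mappings, changed_source, target_schema)
invalid_names = [m.name for m in invalid]
print(invalid_names)
```

In this toy setting only `m1` is flagged, because it is the only mapping that still refers to the removed element. A realistic detector would also have to account for nested structure and schema constraints, which the flat-set model ignores.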