Rank Aggregation for Automatic Schema Matching

Domshlak, C.; Gal, A.; Roitman, H.
IEEE Transactions on Knowledge and Data Engineering
Citations range: 10 - 49

Schema matching is a basic operation of data integration, and several tools for automating it have been proposed and evaluated in the database community. Research in this area reveals that there is no single schema matcher that is guaranteed to succeed in finding a good mapping for all possible domains and, thus, an ensemble of schema matchers should be considered. In this paper, we introduce schema metamatching, a general framework for composing an arbitrary ensemble of schema matchers and generating a list of best-ranked schema mappings. Informally, schema metamatching stands for computing a “consensus” ranking of alternative mappings between two schemata, given the “individual” graded rankings provided by several schema matchers. We introduce several algorithms for this problem, varying from adaptations of some standard techniques for general quantitative rank aggregation to novel techniques specific to the problem of schema matching, and to combinations of both. We provide a formal analysis of the applicability and relative performance of these algorithms and evaluate them empirically on a set of real-world schemata.
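
To make the idea of a "consensus" ranking concrete, the following is a minimal illustrative sketch and not one of the algorithms from the paper. It assumes each matcher reports a score in [0, 1] for every candidate mapping, that a missing score counts as 0, and that the consensus score is the plain average over matchers. The function name consensus_ranking, the mapping identifiers m1 to m3, and the scores are all hypothetical.

```python
from typing import Dict, List, Tuple


def consensus_ranking(
    matcher_scores: List[Dict[str, float]], k: int
) -> List[Tuple[str, float]]:
    """Return the top-k candidate mappings by average score across matchers.

    Assumption: averaging is used as the aggregation function here purely
    for illustration; a mapping a matcher did not score is treated as 0.0.
    """
    # Collect every candidate mapping mentioned by at least one matcher.
    candidates = set().union(*matcher_scores)
    n = len(matcher_scores)
    # Aggregate the individual graded rankings into one score per mapping.
    avg = {
        c: sum(scores.get(c, 0.0) for scores in matcher_scores) / n
        for c in candidates
    }
    # Sort by descending consensus score; break ties by mapping identifier.
    return sorted(avg.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


# Hypothetical scores from two matchers over three candidate mappings.
matcher_a = {"m1": 0.75, "m2": 0.5, "m3": 0.25}
matcher_b = {"m1": 0.5, "m2": 1.0, "m3": 0.25}

top2 = consensus_ranking([matcher_a, matcher_b], k=2)
print(top2)  # [('m2', 0.75), ('m1', 0.625)]
```

In this toy input, neither matcher's individual top choice is shared by both: the first matcher prefers m1 and the second prefers m2, and the averaged scores place m2 first. Other aggregation functions (for example, minimum or a weighted sum) could be substituted on the marked line under the same assumptions.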