Managing Uncertainty in Schema Matching with Top-K Schema Mappings

Gal, A.
Journal on Data Semantics, 2006
Citations range: 50 - 99

In this paper, we propose to extend current practice in schema matching with the simultaneous use of top-K schema mappings rather than a single best mapping. This is a natural extension of existing methods (which can be considered to fall into the top-1 category), taking into account the imprecision inherent in the schema matching process. The essence of this method is the simultaneous generation and examination of the K best schema mappings to identify useful mappings. The paper discusses efficient methods for generating top-K mappings and proposes a generic methodology for the simultaneous utilization of top-K mappings. We also propose a concrete heuristic that aims at improving precision at the cost of recall. We have tested the heuristic on real as well as synthetic data and analyze the empirical results. The novelty of this paper lies in the robust extension of existing methods for schema matching, one that can gracefully accommodate less-than-perfect scenarios in which the exact mapping cannot be identified in a single iteration. Our proposal represents a step forward in achieving fully automated schema matching, which is currently semi-automated at best.
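
To make the idea of examining the K best mappings together more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's algorithm. The similarity matrix `sim`, the function names `top_k_mappings` and `stable_correspondences`, and the scoring rule (sum of pairwise similarities over a one-to-one mapping) are all hypothetical. The enumeration is brute force, so it is only usable on tiny schemas and does not stand in for the efficient generation methods the abstract mentions. The filtering step, which keeps only attribute pairs that appear in every one of the K mappings, is one possible way to trade recall for precision and may differ from the heuristic proposed in the paper.

```python
from itertools import permutations
import heapq


def top_k_mappings(sim, k):
    """Return the k highest-scoring one-to-one mappings by brute force.

    sim[i][j] is a hypothetical similarity score between attribute i of the
    source schema and attribute j of the target schema (square matrix).
    Each result is a (score, mapping) pair, where mapping[i] is the target
    attribute assigned to source attribute i.
    """
    n = len(sim)
    scored = (
        (sum(sim[i][p[i]] for i in range(n)), p)
        for p in permutations(range(n))
    )
    # nlargest returns the results sorted from best to worst score.
    return heapq.nlargest(k, scored)


def stable_correspondences(mappings):
    """Keep only the (source, target) pairs shared by every given mapping."""
    pair_sets = [set(enumerate(p)) for _, p in mappings]
    return set.intersection(*pair_sets)


# Toy example: source attribute 0 has one clear match, while source
# attributes 1 and 2 are ambiguous between target attributes 1 and 2.
sim = [
    [0.9, 0.1, 0.0],
    [0.2, 0.5, 0.45],
    [0.1, 0.5, 0.55],
]
top = top_k_mappings(sim, k=2)
stable = stable_correspondences(top)
print([p for _, p in top])  # the two best mappings, best first
print(stable)               # pairs on which both mappings agree
```

In this toy input the two best mappings agree only on the pair (0, 0), so the ambiguous pairs are dropped: fewer correspondences are returned, but the ones that remain are those on which the top mappings are unanimous.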