Database Schema Matching Using Machine Learning with Feature Selection

Authors: 
Berlin, J.; Motro, A.
Year: 
2002
Venue: 
CAiSE, 2002, LNCS
URL: 
http://dit.unitn.it/~accord/RelatedWork/Matching/Berlin_caise02.pdf
Citations: 
176
Citations range: 
100 - 499
Attachment: 
Berlin2002DatabaseSchemaMatchingUsing.pdf (189.39 KB)

Schema matching, the problem of finding mappings between
the attributes of two semantically related database schemas, is an important
aspect of many database applications such as schema integration,
data warehousing, and electronic commerce. Unfortunately, schema
matching remains largely a manual, labor-intensive process. Furthermore,
the effort required is typically linear in the number of schemas
to be matched; the next pair of schemas to match is not any easier than
the previous pair. In this paper we describe a system, called Automatch,
that uses machine learning techniques to automate schema matching.
Based primarily on Bayesian learning, the system acquires probabilistic
knowledge from examples that have been provided by domain experts.
This knowledge is stored in a knowledge base called the attribute dictionary.
When presented with a pair of new schemas that need to be
matched (and their corresponding database instances), Automatch uses
the attribute dictionary to find an optimal matching. We also report
initial results from the Automatch project.
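
The idea summarized in the abstract (learn per-attribute probabilities from expert-provided examples, store them in an attribute dictionary, then score the columns of a new schema against that dictionary) can be illustrated with a minimal sketch. This is not the Automatch implementation: the function names, the token-level naive Bayes model with add-one smoothing, and the brute-force search over assignments are all assumptions chosen for brevity, and the paper's actual feature selection and optimization steps are not reproduced here.

```python
import math
from collections import Counter
from itertools import permutations


def build_dictionary(examples):
    """Build a toy 'attribute dictionary' (hypothetical structure):
    maps each known attribute to token counts from example values."""
    return {
        attr: Counter(tok for value in values for tok in value.lower().split())
        for attr, values in examples.items()
    }


def log_likelihood(values, counts, vocab_size):
    """Naive Bayes log-likelihood of a column's values under one
    dictionary attribute, with add-one (Laplace) smoothing."""
    total = sum(counts.values())
    score = 0.0
    for value in values:
        for tok in value.lower().split():
            score += math.log((counts[tok] + 1) / (total + vocab_size))
    return score


def best_matching(columns, dictionary):
    """Assign each new column to a distinct dictionary attribute so that
    the summed log-likelihood is maximal. Brute force over permutations
    is a stand-in that is only practical for very small schemas."""
    # +1 reserves probability mass for tokens never seen in the examples.
    vocab_size = len(set().union(*dictionary.values())) + 1
    names = list(columns)
    best, best_score = None, -math.inf
    for perm in permutations(dictionary, len(names)):
        score = sum(
            log_likelihood(columns[col], dictionary[attr], vocab_size)
            for col, attr in zip(names, perm)
        )
        if score > best_score:
            best, best_score = dict(zip(names, perm)), score
    return best


# Toy data, invented for illustration only.
dictionary = build_dictionary({
    "name": ["alice smith", "bob jones"],
    "city": ["paris", "new york", "london"],
})
columns = {"col_a": ["london", "paris"], "col_b": ["bob smith"]}
matching = best_matching(columns, dictionary)
```

In this sketch the dictionary is reused for every new schema, which reflects the abstract's point that knowledge acquired from earlier examples should make later matching tasks easier. For schemas of realistic size, the exhaustive search would need to be replaced by a proper assignment or flow-based optimizer.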