Information integration

Bootstrapping pay-as-you-go data integration systems

Authors: 
Das Sarma, A; Dong, X; Halevy, A
Year: 
2008
Venue: 
Proc. ACM SIGMOD

Data integration systems offer a uniform interface to a set of data sources. Despite recent progress, setting up and maintaining a data integration application still requires significant upfront effort to create a mediated schema and semantic mappings from the data sources to the mediated schema. Many application contexts involving multiple data sources (e.g., the web, personal information management, enterprise intranets) do not require full integration in order to provide useful services, motivating a pay-as-you-go approach to integration.

Simplifying Information Integration: Object-Based Flow-of-Mappings Framework for Integration

Authors: 
Alexe, B; Gubanov, M; Hernández, MA; Ho, H; Huang, Jen-Wei; Katsis, Yannis; Popa, Lucian; Saha, Barna; Stanoi, Ioana
Year: 
2009
Venue: 
Proc. VLDB workshop on Business Intelligence for the Real-Time Enterprise, 2008, Springer Lecture Notes on Business Information Processing, Vol. 27

The Clio project at IBM Almaden investigates foundational aspects of data transformation, with particular emphasis on the design and execution of schema mappings. We now use Clio as part of a broader data-flow framework in which mappings are just one component. These data-flows express complex transformations between several source and target schemas and require multiple mappings to be specified. This paper describes research issues we have encountered as we try to create and run these mapping-based data-flows.

Accessing and Documenting Relational Databases through OWL ontologies

Authors: 
Curino, Carlo; Orsi, Giorgio; Panigati, Emanuele; Tanca, Letizia
Year: 
2009
Venue: 
FQAS

Relational databases have been designed to store high volumes of data and to provide an efficient query interface. Ontologies are geared towards capturing domain knowledge and annotations, and towards offering high-level, machine-processable views of data and metadata. The complementary strengths and weaknesses of these data models motivate the research effort we present in this paper. The goal of this work is to bridge the relational and ontological worlds, in order to leverage the efficiency and scalability of relational technologies to support an ontological, high-level view of data and metadata.

Clip: a Visual Language for Explicit Schema Mappings

Authors: 
Raffio, A.; Braga, D.; Ceri, S.; Papotti, P.; Hernandez, M.A.
Year: 
2008
Venue: 
ICDE conference

Many data integration solutions in the market today include tools for schema mapping, to help users visually relate elements of different schemas. Schema elements are connected with lines, which are interpreted as mappings, i.e. high-level logical expressions capturing the relationship between source and target data-sets; these are compiled into queries and programs that convert source-side data instances into target-side instances.

Schema Exchange: A Template-Based Approach to Data and Metadata Translation

Authors: 
Papotti, Paolo; Torlone, Riccardo
Year: 
2007
Venue: 
ER Conference

In this paper we study the problem of schema exchange, a natural extension of the data exchange problem to an intensional level. To this end, we first introduce the notion of schema template, a tool for the representation of a class of schemas sharing the same structure. We then define the schema exchange notion as the problem of (i) taking a schema that matches a source template, and (ii) generating a new schema for a target template, on the basis of a set of dependencies defined over the two templates. This framework allows the definition, once for all,

Data exchange with data-metadata translations

Authors: 
Hernández, Mauricio A.; Papotti, Paolo; Tan, Wang Chiew
Year: 
2008
Venue: 
VLDB

Data exchange is the process of converting an instance of one schema into an instance of a different schema according to a given specification. Recent data exchange systems have largely dealt with the case where the schemas are given a priori and transformations can only migrate data from the first schema to an instance of the second schema. In particular, the ability to perform data-metadata translations, that is, transformations in which data is converted into metadata or metadata is converted into data, is largely ignored.

Improving search and navigation by combining Ontologies and Social Tags

Authors: 
Bindelli, Silvia; Criscione, Claudio; Curino, Carlo A.; Drago, Mauro L.; Eynard, Davide; Orsi, Giorgio
Year: 
2008
Venue: 
OTM Workshop: Ambient Data Integration

The Semantic Web has the ambitious goal of enabling complex autonomous applications to reason on a machine-processable version of the World Wide Web. This, however, would require a coordinated effort not easily achievable in practice. On the other hand, spontaneous communities, based on social tagging, recently achieved noticeable consensus and diffusion.

Managing the History of Metadata in support for DB Archiving and Schema Evolution

Authors: 
Curino, Carlo A.; Moon, Hyun J.; Zaniolo, Carlo
Year: 
2008
Venue: 
ECDM

Modern information systems, and web information systems in particular, are faced with frequent database schema changes, which create the need to manage this evolution and preserve its history.

Information Systems Integration and Evolution: Ontologies at Rescue

Authors: 
Curino, Carlo A.; Tanca, Letizia; Zaniolo, Carlo
Year: 
2008
Venue: 
STSM

The life of a modern Information System is often characterized by (i) a push toward integration with other systems, and (ii) the evolution of its data management core in response to continuously changing application requirements. Most of the current proposals dealing with these issues from a database perspective rely on the formal notions of mapping and query rewriting.

A Model for Schema Integration in Heterogeneous Databases

Authors: 
Gal, A.; Trombetta, A.; Anaby-Tavor, A.; Montesi, D.
Year: 
2003
Venue: 
IDEAS

Schema integration is the process by which schemata from heterogeneous databases are conceptually integrated into a single cohesive schema. In this work we propose a modeling framework for schema integration, capturing the inherent uncertainty accompanying the integration process. The model utilizes a fuzzy framework to express a confidence measure, associated with the outcome of a schema integration process.

The Use of Machine-Generated Ontologies in Dynamic Information Seeking

Authors: 
Modica, G.; Gal, A.; Jamil, H.
Year: 
2001
Venue: 
CoopIS

Information seeking is the process in which human beings turn to information resources in order to increase their level of knowledge with respect to their goals. In this paper we offer a methodology for automating the evolution of ontologies and share the results of our experiments in supporting a user in seeking information using interactive systems. The main conclusion of our experiments is that if one narrows down the scope of the domain, ontologies can be extracted with a very high level of precision (more than 90% in some cases).

Information Integration Using Logical Views

Authors: 
Ullman, J.D.
Year: 
1997
Venue: 
Proc. of the 6th Int. Conf. on Database Theory (ICDT 1997)

A number of ideas concerning information-integration tools can be thought of as constructing answers to queries using views that represent the capabilities of information sources. We review the formal basis of these techniques, which are closely related to containment algorithms for conjunctive queries and/or Datalog programs. Then we compare the approaches taken by AT&T Labs' "Information Manifold" and the Stanford "Tsimmis" project in these terms.

Processing IQL Queries and Migrating Data in the AutoMed toolkit

Authors: 
Jasper, E.; Poulovassilis, A.; Zamboulis, L.
Year: 
2003

This technical report describes how IQL queries are processed in the AutoMed heterogeneous data integration system, and also how data migration can be supported.
We start with an outline of the IQL language in Section 2. We then consider in Section 3 an abstract representation of this textual IQL and describe the ASG class that implements this abstract representation.

Ontology-Based Integration of Information --- A Survey of Existing Approaches

Authors: 
Wache, H.; Vögele, T.; Visser, U.; Stuckenschmidt, H.; Schuster, G.; Neumann, H.; Hübner, S.
Year: 
2001
Venue: 
IJCAI-01 Workshop: Ontologies and Information Sharing, 2001

We review the use of ontologies for the integration of heterogeneous information sources. Based on an in-depth evaluation of existing approaches to this problem we discuss how ontologies are used to support the integration task. We evaluate and compare the languages used to represent the ontologies and the use of mappings between ontologies as well as to connect ontologies with information sources. We also ask for ontology engineering methods and tools used to develop ontologies for information integration. Based on the results of our analysis we

Schema Mediation for Large-Scale Semantic Data Sharing

Authors: 
Halevy, A.; Ives, Z.; Suciu, D.; Tatarinov, I.
Year: 
2004
Venue: 
VLDB Journal, 2004

A Theory of Attribute Equivalence in Databases with Application to Schema Integration

Authors: 
Larson, J. A.; Navathe, S. B.; Elmasri, R.
Year: 
1989
Venue: 
IEEE Transactions on Software Engineering, 1989

The authors present a common foundation for integrating pairs of entity sets, pairs of relationship sets, and an entity set with a relationship set. This common foundation is based on the basic principle of integrating attributes. Any pair of objects whose identifying attributes can be integrated can themselves be integrated. Several definitions of attribute equivalence are presented. These definitions can be used to specify the exact nature of the relationship between a pair of attributes. Based on these definitions, several strategies for attribute integration are presented and evaluated.

A Comparative Analysis of Methodologies for Database Schema Integration

Authors: 
Batini, C.; Lenzerini, M.; Navathe, S. B.
Year: 
1986
Venue: 
ACM Computing Surveys, 1986

One of the fundamental principles of the database approach is that a database allows a nonredundant, unified representation of all data managed in an organization. This is achieved only when methodologies are available to support integration across organizational and application boundaries.
Methodologies for database design usually perform the design activity by separately producing several schemas, representing parts of the application, which are subsequently merged. Database schema integration is the activity of integrating the schemas of existing
