What Four Million Mappings Can Tell You About Two Hundred Ontologies

Ghazvinian, A.; Noy, N. F.; Jonquet, C.; Shah, N. H.; Musen, M. A.
The field of biomedicine has embraced the Semantic Web probably
more than any other field. As a result, there are many biomedical ontologies
covering overlapping areas of the field. We have developed BioPortal,
an open community-based repository of biomedical ontologies. We analyzed ontologies
and terminologies in BioPortal and the Unified Medical Language System
(UMLS), creating more than 4 million mappings between concepts in these
ontologies and terminologies based on the lexical similarity of concept names
and synonyms. We then analyzed the mappings and what they tell us about the
ontologies themselves, the structure of the ontology repository, and the ways in
which the mappings can help in the process of ontology design and evaluation.
For example, we can use the mappings to guide users who are new to a field to
the most pertinent ontologies in that field, to identify areas of the domain that are
not covered sufficiently by the ontologies in the repository, and to identify which
ontologies will serve well as background knowledge in domain-specific tools.
While we used a specific (but large) ontology repository for the study, we believe
that the lessons we learned about the value of a large-scale set of mappings to
ontology users and developers are general and apply in many other domains.
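
As an illustration of the kind of lexical mapping described above, the following minimal sketch pairs concepts from two ontologies whenever a normalized preferred name or synonym of one matches a normalized name or synonym of the other. It is not the procedure used in the study: the normalization rule (lowercasing and stripping punctuation), the function names, and the toy ontologies ONT_A and ONT_B are all assumptions made for the example.

```python
import re
from collections import defaultdict


def normalize(label):
    """Lowercase a label and reduce it to space-separated alphanumeric tokens.

    This normalization is an assumption for illustration only.
    """
    return " ".join(re.findall(r"[a-z0-9]+", label.lower()))


def build_index(ontology):
    """Map each normalized name or synonym to the concept ids that carry it."""
    index = defaultdict(set)
    for concept_id, labels in ontology.items():
        for label in labels:
            key = normalize(label)
            if key:
                index[key].add(concept_id)
    return index


def lexical_mappings(source, target):
    """Return (source_id, target_id) pairs that share a normalized label."""
    src_index = build_index(source)
    tgt_index = build_index(target)
    pairs = set()
    for key in src_index.keys() & tgt_index.keys():
        for s in src_index[key]:
            for t in tgt_index[key]:
                pairs.add((s, t))
    return pairs


def fraction_mapped(source, target):
    """Fraction of source concepts with at least one mapping into target."""
    if not source:
        return 0.0
    mapped = {s for s, _ in lexical_mappings(source, target)}
    return len(mapped) / len(source)


# Hypothetical toy ontologies: concept id -> preferred name plus synonyms.
ONT_A = {
    "A:1": ["Myocardial Infarction", "Heart attack"],
    "A:2": ["Melanoma"],
    "A:3": ["Femur"],
}
ONT_B = {
    "B:10": ["heart-attack"],
    "B:11": ["Malignant melanoma", "melanoma"],
    "B:12": ["Tibia"],
}

if __name__ == "__main__":
    print(sorted(lexical_mappings(ONT_A, ONT_B)))
    print(fraction_mapped(ONT_A, ONT_B))
```

A per-pair statistic such as fraction_mapped is one simple way to summarize how strongly one ontology overlaps another; which statistics are most useful for recommending ontologies or detecting coverage gaps is a design choice that this sketch does not settle.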