Semantic Annotation of Natural History Collections

13 Pages. Posted: 12 Sep 2018. First Look: Accepted.

Lise Stork

Leiden University - Leiden Institute of Advanced Computer Science (LIACS)

Andreas Weber

University of Twente

Eulalia Gasso Miracle

Naturalis Biodiversity Center

Fons Verbeek

Leiden University - Leiden Institute of Advanced Computer Science (LIACS)

Aske Plaat

Leiden University - Leiden Institute of Advanced Computer Science (LIACS)

Jaap van den Herik

Leiden University - Leiden Institute of Advanced Computer Science (LIACS)

Katherine Wolstencroft

Leiden University - Leiden Institute of Advanced Computer Science (LIACS)

Abstract

Large collections from historical biodiversity expeditions are housed in natural history museums throughout the world. They can potentially serve as rich sources of data for cultural-historical and biodiversity research. However, they exist only as partially catalogued specimen repositories and as images of unstructured, non-standardised, hand-written text and drawings. Although many archival collections have been digitised, disclosing their content is challenging: the documents refer to historical place names and outdated taxonomic classifications, and they are written in multiple languages. Efforts to transcribe the hand-written text can make the content accessible, but semantically describing and interlinking the content would further facilitate research. We propose a semantic model that serves to structure the named entities in natural history archival collections. In addition, we present an approach for the semantic annotation of these collections whilst documenting their provenance. This approach serves as an initial step towards an adaptive learning approach for semi-automated extraction of named entities from natural history archival collections. The applicability of the semantic model and the annotation approach is demonstrated using image scans from a collection of 8,000 field book pages gathered by the Committee for Natural History of the Netherlands Indies between 1820 and 1850, and is evaluated together with domain experts from the field of natural and cultural history.

Keywords: Linked Data, Biodiversity, Natural History Collections, Ontologies, Semantic Annotation, History of Science

Suggested Citation

Stork, Lise and Weber, Andreas and Miracle, Eulalia Gasso and Verbeek, Fons and Plaat, Aske and Herik, Jaap van den and Wolstencroft, Katherine, Semantic Annotation of Natural History Collections (September 12, 2018). Journal of Web Semantics First Look. Available at SSRN: https://ssrn.com/abstract=3248498 or http://dx.doi.org/10.2139/ssrn.3248498

Lise Stork (Contact Author)

Leiden University - Leiden Institute of Advanced Computer Science (LIACS)

Postbus 9500
Leiden, 2300 RA
Netherlands

Andreas Weber

University of Twente

Postbus 217
Twente
Netherlands

Eulalia Gasso Miracle

Naturalis Biodiversity Center

Pesthuislaan 7
2333 BA Leiden
Netherlands

Fons Verbeek

Leiden University - Leiden Institute of Advanced Computer Science (LIACS)

Postbus 9500
Leiden, 2300 RA
Netherlands

Aske Plaat

Leiden University - Leiden Institute of Advanced Computer Science (LIACS)

Postbus 9500
Leiden, 2300 RA
Netherlands

Jaap van den Herik

Leiden University - Leiden Institute of Advanced Computer Science (LIACS)

Postbus 9500
Leiden, 2300 RA
Netherlands

Katherine Wolstencroft

Leiden University - Leiden Institute of Advanced Computer Science (LIACS)

Postbus 9500
Leiden, 2300 RA
Netherlands
