VU University Amsterdam; OCLC - Online Computer Library Center, Incorporated
In this paper we present the MultiFarm dataset, which has been designed as a benchmark for multilingual ontology matching. The MultiFarm dataset is composed of a set of ontologies translated into different languages and the corresponding alignments between these ontologies. It is based on the OntoFarm dataset, which has been used successfully for several years in the Ontology Alignment Evaluation Initiative (OAEI). By translating the ontologies of the OntoFarm dataset into eight different languages (Chinese, Czech, Dutch, French, German, Portuguese, Russian, and Spanish), we created a comprehensive set of realistic test cases. Based on these test cases, it is possible to evaluate and compare the performance of matching approaches with a special focus on multilingualism.
Keywords: Ontology Matching, Benchmarking, Multilingualism, Data Integration
Meilicke, Christian and García-Castro, Raúl and Freitas, Fred and van Hage, Willem Robert and Montiel-Ponsoda, Elena and Azevedo, Ryan Ribeiro and Stuckenschmidt, Heiner and Zamazal, Ondřej and Svatek, Vojtech and Tamilin, Andrei and Trojahn, Cassia and Wang, Shenghui, MultiFarm: A Benchmark for Multilingual Ontology Matching (April 11, 2012). Available at SSRN: https://ssrn.com/abstract=3198970 or http://dx.doi.org/10.2139/ssrn.3198970