Asymmetric Information Distances for Automated Taxonomy Construction
13 Pages. Posted: 27 Aug 2008
Date Written: August 25, 2008
Abstract
A novel method for automatically constructing taxonomies for specific research domains is presented. The proposed methodology uses term co-occurrence frequencies as an indicator of the semantic closeness between terms. To support the automated creation of taxonomies or subject classifications, we present a simple modification to the basic distance measure and describe a set of procedures by which these measures may be converted into estimates of the desired taxonomy. To demonstrate the viability of this approach, a pilot study on renewable energy technologies is conducted, in which the proposed method is used to construct a hierarchy of terms related to alternative energy. These techniques have many potential applications, but one activity in which we are particularly interested is the mapping and subsequent prediction of future developments in technology and research.
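As a rough illustration of the general idea, the sketch below derives a term hierarchy from co-occurrence counts using an asymmetric distance. It is not the measure or procedure used in the paper, which the abstract does not specify. The distance `-log P(y | x)`, the rule that a parent must occur in more documents than its child, the tie-break toward the more specific candidate, the toy document set, and all function names are assumptions made for illustration only.

```python
import math
from collections import Counter
from itertools import combinations


def cooccurrence_counts(docs):
    """Count how many documents contain each term and each unordered term pair."""
    term_freq = Counter()
    pair_freq = Counter()
    for doc in docs:
        terms = set(doc)
        term_freq.update(terms)
        for a, b in combinations(sorted(terms), 2):
            pair_freq[(a, b)] += 1
    return term_freq, pair_freq


def asymmetric_distance(x, y, term_freq, pair_freq):
    """Assumed measure: -log P(y | x). Small when y almost always accompanies x.

    Not symmetric: the distance from x to y generally differs from y to x.
    """
    joint = pair_freq[tuple(sorted((x, y)))]
    if joint == 0:
        return math.inf
    return -math.log(joint / term_freq[x])


def build_taxonomy(docs):
    """Map each term to its estimated parent (None for a root).

    Assumed heuristic: a parent must be strictly more frequent than the child,
    and among equally close candidates the less frequent (more specific) wins.
    """
    term_freq, pair_freq = cooccurrence_counts(docs)
    parents = {}
    for term in term_freq:
        best, best_key = None, None
        for cand in term_freq:
            if term_freq[cand] <= term_freq[term]:
                continue  # only broader terms may act as parents
            dist = asymmetric_distance(term, cand, term_freq, pair_freq)
            if math.isinf(dist):
                continue  # never co-occur, so no evidence of a relation
            key = (dist, term_freq[cand], cand)
            if best_key is None or key < best_key:
                best, best_key = cand, key
        parents[term] = best
    return parents


# Toy corpus (hypothetical): each document is reduced to its set of terms.
docs = [
    {"energy", "solar", "photovoltaic"},
    {"energy", "solar"},
    {"energy", "wind", "turbine"},
    {"energy", "wind"},
    {"energy"},
]
parents = build_taxonomy(docs)
for term in sorted(parents):
    print(term, "->", parents[term])
```

On this toy corpus the sketch places "energy" at the root, "solar" and "wind" beneath it, and "photovoltaic" and "turbine" under "solar" and "wind" respectively. Because a parent must be strictly more frequent than its child, the result cannot contain cycles and always forms a forest.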
Keywords: Taxonomy Construction, Asymmetric Information