Comparison of Generality Based Algorithm Variants for Automatic Taxonomy Generation

Andreas Henschel, Masdar Institute of Science and Technology (MIST)
Wei Lee Woon, Masdar Institute of Science and Technology (MIST)
Thomas Wachter, Dresden University of Technology
Stuart Madnick, Massachusetts Institute of Technology (MIT) - Sloan School of Management

September 24, 2009
MIT Sloan Research Paper No. 4758-09

Abstract: We compare a family of algorithms for the automatic generation of taxonomies, obtained by adapting the Heymann algorithm in various ways. The core algorithm determines the generality of terms and iteratively inserts them into a growing taxonomy. Variants of the algorithm are created by altering how, and how often, the generality of terms is calculated. We analyse the performance and the complexity of the variants, combined with a systematic threshold evaluation, on a set of seven manually created benchmark sets. We find that betweenness centrality calculated on unweighted similarity graphs often performs best, but it requires threshold fine-tuning and is computationally more expensive than closeness centrality. Finally, we show how an entropy-based filter can lead to more precise taxonomies.

Number of Pages in PDF File: 8
Series: working papers series
Date posted: September 25, 2009
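
The core loop summarised in the abstract (score each term's generality, then insert terms one by one into a growing taxonomy) can be illustrated with a small sketch. This is not the authors' implementation: it assumes closeness centrality on an unweighted, thresholded similarity graph as the generality measure, computes generality only once, and attaches each term under its most similar already-placed term. The function names, the two thresholds, the tie-breaking rule, and the toy similarity values are all hypothetical choices made for illustration.

```python
from collections import deque


def closeness_centrality(adj):
    """Closeness of every node in an unweighted graph, using breadth-first search.

    Scores are scaled by the fraction of nodes reachable, so that nodes in
    small disconnected components are not ranked above well-connected hubs.
    """
    n = len(adj)
    scores = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour in adj[node]:
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + 1
                    queue.append(neighbour)
        reachable = len(dist) - 1
        total = sum(dist.values())
        if total == 0 or n < 2:
            scores[source] = 0.0  # isolated node
        else:
            scores[source] = (reachable / total) * (reachable / (n - 1))
    return scores


def build_taxonomy(terms, sim, graph_threshold, attach_threshold, root="ROOT"):
    """Return a {term: parent} mapping (illustrative sketch, assumptions as stated above)."""
    # Unweighted similarity graph: keep an edge only if similarity passes the threshold.
    adj = {t: set() for t in terms}
    for i, a in enumerate(terms):
        for b in terms[i + 1:]:
            if sim(a, b) >= graph_threshold:
                adj[a].add(b)
                adj[b].add(a)

    generality = closeness_centrality(adj)

    # Insert the most general terms first; ties are broken alphabetically (an assumption).
    parent = {}
    for term in sorted(terms, key=lambda t: (-generality[t], t)):
        best = max(parent, key=lambda p: sim(term, p), default=None)
        if best is not None and sim(term, best) >= attach_threshold:
            parent[term] = best   # attach below the most similar placed term
        else:
            parent[term] = root   # nothing similar enough: attach to the root
    return parent


# Toy, made-up similarity values; unspecified pairs default to 0.0.
_pairs = {
    frozenset(("science", "physics")): 0.8,
    frozenset(("science", "biology")): 0.7,
    frozenset(("science", "chemistry")): 0.7,
    frozenset(("physics", "optics")): 0.9,
    frozenset(("science", "optics")): 0.3,
    frozenset(("physics", "biology")): 0.3,
    frozenset(("physics", "chemistry")): 0.3,
    frozenset(("biology", "chemistry")): 0.3,
}


def toy_sim(a, b):
    return _pairs.get(frozenset((a, b)), 0.0)


terms = ["science", "physics", "biology", "chemistry", "optics"]
taxonomy = build_taxonomy(terms, toy_sim, graph_threshold=0.4, attach_threshold=0.6)
print(taxonomy)
```

On this toy input, "science" is the most central term and is placed directly under the root, "physics", "biology" and "chemistry" attach beneath it, and "optics" attaches beneath "physics". Swapping in a different generality measure, or recomputing generality after each insertion, would correspond to the kinds of variants the abstract describes; both thresholds would need tuning on real data.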