http://ssrn.com/abstract=1478201
 
 

Comparison of Generality Based Algorithm Variants for Automatic Taxonomy Generation


Andreas Henschel


Masdar Institute of Science and Technology (MIST)

Wei Lee Woon


Masdar Institute of Science and Technology (MIST)

Thomas Wachter


Dresden University of Technology

Stuart Madnick


Massachusetts Institute of Technology (MIT) - Sloan School of Management

September 24, 2009

MIT Sloan Research Paper No. 4758-09

Abstract:     
We compare a family of algorithms for the automatic generation of taxonomies, obtained by adapting the Heymann algorithm in various ways. The core algorithm determines the generality of terms and iteratively inserts them into a growing taxonomy. Variants of the algorithm are created by altering how, and how often, the generality of terms is calculated. We analyse the performance and the complexity of the variants, combined with a systematic threshold evaluation, on a set of seven manually created benchmark sets. We find that betweenness centrality calculated on unweighted similarity graphs often performs best, but it requires threshold fine-tuning and is computationally more expensive than closeness centrality. Finally, we show how an entropy-based filter can lead to more precise taxonomies.
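
The core loop described in the abstract (rank terms by generality, then insert them one by one into a growing taxonomy) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes cosine similarity over term-document vectors, an unweighted similarity graph, closeness centrality computed once as the generality measure, and a single pass of insertions. The function names, the "<root>" marker, the toy vectors and both threshold values are hypothetical and chosen only for illustration.

```python
from collections import deque
from math import sqrt

ROOT = "<root>"  # hypothetical marker for the top of the taxonomy


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(u[k] * v.get(k, 0) for k in u)
    nu = sqrt(sum(x * x for x in u.values()))
    nv = sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def closeness(graph, source):
    """Closeness centrality of `source` in an unweighted graph (BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in graph[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    total = sum(dist.values())
    reached = len(dist) - 1
    if total == 0:
        return 0.0
    # Scale by the fraction of nodes reached so that terms in small
    # components do not look artificially central.
    return (reached / total) * (reached / (len(graph) - 1))


def build_taxonomy(vectors, graph_threshold, insert_threshold):
    """Return a child -> parent mapping for the given term vectors."""
    terms = sorted(vectors)
    sim = {(a, b): cosine(vectors[a], vectors[b])
           for a in terms for b in terms if a != b}
    # Unweighted similarity graph: keep an edge if similarity is high enough.
    graph = {t: [o for o in terms if o != t and sim[(t, o)] >= graph_threshold]
             for t in terms}
    generality = {t: closeness(graph, t) for t in terms}
    # Most general terms first; ties are broken alphabetically.
    order = sorted(terms, key=lambda t: (-generality[t], t))
    parent = {}
    for term in order:
        # Attach to the most similar term already in the taxonomy,
        # or to the root if nothing is similar enough.
        best = max(parent, key=lambda p: sim[(term, p)], default=None)
        if best is not None and sim[(term, best)] >= insert_threshold:
            parent[term] = best
        else:
            parent[term] = ROOT
    return parent


# Toy term-document occurrence vectors (made-up data).
vectors = {
    "energy": {"d1": 1, "d2": 1, "d3": 1, "d4": 1, "d5": 1, "d6": 1},
    "solar": {"d1": 1, "d2": 1, "d3": 1},
    "wind": {"d4": 1, "d5": 1, "d6": 1},
    "photovoltaic": {"d1": 1, "d2": 1},
    "turbine": {"d4": 1, "d5": 1},
}
taxonomy = build_taxonomy(vectors, graph_threshold=0.6, insert_threshold=0.6)
```

On this toy input the sketch places "energy" at the root, "solar" and "wind" beneath it, and "photovoltaic" and "turbine" beneath "solar" and "wind" respectively. The variants compared in the paper would differ in the generality step, for example by substituting betweenness centrality for closeness or by recalculating generality as the taxonomy grows; neither is shown here.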

Number of Pages in PDF File: 8

working papers series


Date posted: September 25, 2009  

Suggested Citation

Henschel, Andreas and Woon, Wei Lee and Wachter, Thomas and Madnick, Stuart, Comparison of Generality Based Algorithm Variants for Automatic Taxonomy Generation (September 24, 2009). MIT Sloan Research Paper No. 4758-09. Available at SSRN: http://ssrn.com/abstract=1478201 or http://dx.doi.org/10.2139/ssrn.1478201

Contact Information

Andreas Henschel (Contact Author)
Masdar Institute of Science and Technology (MIST)
MASDAR
PO Box 54115
Abu Dhabi
United Arab Emirates
Wei Lee Woon
Masdar Institute of Science and Technology (MIST)
MASDAR
PO Box 54115
Abu Dhabi
United Arab Emirates
Thomas Wachter
Dresden University of Technology
Helmholtzstr. 10
Dresden, 01069
Germany
Stuart E. Madnick
Massachusetts Institute of Technology (MIT) - Sloan School of Management
E53-321
Cambridge, MA 02142
United States
617-253-6671 (Phone)
617-253-3321 (Fax)


Paper statistics
Abstract Views: 4,800
Downloads: 57
Download Rank: 210,362
References:  18
