File(s) under permanent embargo
Evaluating parts-of-speech taggers for use in a text-to-scene conversion system
This paper presents parts-of-speech tagging as a first step towards an autonomous text-to-scene conversion system. It categorizes several freely available taggers according to the techniques each uses to identify word classes automatically. In addition, the performance of each identified tagger is verified experimentally. The SUSANNE corpus is used for testing and reveals the complexity of working with different tagsets, which results in substantially lower accuracies in our tests than those reported by the developers of each tagger. The taggers are then grouped to form a voting system in an attempt to raise accuracies, but in no case do the combined results improve upon the individual accuracies. Additionally, a new metric, agreement, is tentatively proposed as an indication of confidence in the output of a group of taggers where such output cannot be validated.
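The voting scheme and the agreement metric mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact definitions, so everything here is an assumption: plurality voting per token, ties broken in favour of the earliest-listed tagger, and agreement taken as the fraction of taggers that chose the winning tag. The function names `vote`, `combine`, and `accuracy` are hypothetical, and the tagger outputs are assumed to be already mapped to one common tagset.

```python
from collections import Counter


def vote(tags):
    """Combine the tags that several taggers assigned to ONE token.

    Returns (winning_tag, agreement). Agreement is assumed here to be the
    share of taggers that chose the winning tag; the paper may define it
    differently.
    """
    counts = Counter(tags)
    top = max(counts.values())
    # Assumed tie-break: the first tagger in the list wins a tie.
    winner = next(tag for tag in tags if counts[tag] == top)
    return winner, top / len(tags)


def combine(outputs):
    """Combine per-tagger tag sequences (one list per tagger) token by token."""
    if len({len(seq) for seq in outputs}) != 1:
        raise ValueError("all taggers must tag the same number of tokens")
    return [vote(column) for column in zip(*outputs)]


def accuracy(predicted, gold):
    """Fraction of tokens whose predicted tag matches the gold tag."""
    if len(predicted) != len(gold):
        raise ValueError("sequences must have the same length")
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


if __name__ == "__main__":
    # Invented outputs from three hypothetical taggers for a 3-token sentence.
    outputs = [
        ["DT", "NN", "VBZ"],
        ["DT", "NN", "NNS"],
        ["DT", "JJ", "VBZ"],
    ]
    for tag, agreement in combine(outputs):
        print(f"{tag}\t{agreement:.2f}")
```

Unlike accuracy, agreement computed this way needs no gold-standard tags, which is why it could serve as a confidence signal when the output cannot be validated.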
History
Event: South African institute of computer scientists and information technologists (2005: White River, South Africa)
Series: ACM International Conference Proceeding Series
Pagination: 20 - 28
Publisher: South African Institute for Computer Scientists and Information Technologists
Location: White River, South Africa
Place of publication: Pretoria, South Africa
Start date: 2005-09-20
End date: 2005-09-22
ISBN-10: 1595932585
Language: eng
Publication classification: E1.1 Full written paper - refereed
Copyright notice: 2005, SAICSIT
Editor/Contributor(s): J Bishop, D Kourie
Title of proceedings: SAICSIT '05 : Research for a Changing World – Proceedings of SAICSIT 2005