File(s) under permanent embargo
Hierarchical rule generalisation for speaker identification in fiction books
This paper presents a hierarchical pattern matching and generalisation technique and applies it to the problem of identifying the correct speaker of quoted speech in fiction books. Patterns from a training set are generalised to create a small number of rules, which can be used to identify items of interest within the text. The pattern matching technique is applied to finding the Speech-Verb, Actor and Speaker of quotes found in fiction books. The technique performs well over the training data, producing rule-sets many times smaller than the training set while retaining very high accuracy. A rule-set generalised from one book is less effective on different books than an approach based on hand-coded heuristics, but performance is comparable when testing on data closely related to the training set.
History
Event: South African institute of computer scientists and information technologists (2006 : Somerset West, South Africa)
Series: ACM conference proceedings series
Pagination: 31-40
Publisher: South African Institute for Computer Scientists and Information Technologists
Location: Somerset West, South Africa
Place of publication: Pretoria, South Africa
Start date: 2006-10-09
End date: 2006-10-11
ISBN-13: 9781595935670
ISBN-10: 1595935673
Language: eng
Publication classification: E1.1 Full written paper - refereed
Copyright notice: 2006, SAICSIT
Editor/Contributor(s): J Bishop, D Kourie
Title of proceedings: SAICSIT '06 : Research for a changing world : Proceedings of the 2006 annual research conference of the South African institute of computer scientists and information technologists on IT research in developing countries