Deakin University
File(s) under permanent embargo

Hierarchical rule generalisation for speaker identification in fiction books

conference contribution
posted on 2006-01-01, 00:00 authored by K Glass, Shaun Bangay
This paper presents a hierarchical pattern matching and generalisation technique, which is applied to the problem of locating the correct speaker of quoted speech found in fiction books. Patterns from a training set are generalised to create a small number of rules, which can be used to identify items of interest within the text. The pattern matching technique is applied to finding the Speech-Verb, Actor and Speaker of quotes found in fiction books. The technique performs well over the training data, resulting in rule-sets many times smaller than the training set, but providing very high accuracy. While the rule-set generalised from one book is less effective when applied to different books than an approach based on hand-coded heuristics, performance is comparable when testing on data closely related to the training set.

History

Event

South African institute of computer scientists and information technologists (2006 : Somerset West, South Africa)

Series

ACM conference proceedings series

Pagination

31 - 40

Publisher

South African Institute for Computer Scientists and Information Technologists

Location

Somerset West, South Africa

Place of publication

Pretoria, South Africa

Start date

2006-10-09

End date

2006-10-11

ISBN-13

9781595935670

ISBN-10

1595935673

Language

eng

Publication classification

E1.1 Full written paper - refereed

Copyright notice

2006, SAICSIT

Editor/Contributor(s)

J Bishop, D Kourie

Title of proceedings

SAICSIT '06 : Research for a changing world : Proceedings of the 2006 annual research conference of the South African institute of computer scientists and information technologists on IT research in developing countries
