Deakin University

File(s) under permanent embargo

Classification ensemble to improve medical Named Entity Recognition

conference contribution
posted on 2014-01-01, 00:00 authored by Sara Keretna, Chee Peng Lim, Douglas Creighton, K B Shaban
Accurate Named Entity Recognition (NER) is important for knowledge discovery in text mining. This paper proposes an ensemble machine learning approach to recognise Named Entities (NEs) in unstructured and informal medical text. Specifically, Conditional Random Field (CRF) and Maximum Entropy (ME) classifiers are applied individually to the test data set from the i2b2 2010 medication challenge. Each classifier is trained using a different set of features: the first set focuses on the contextual features of the data, while the second concentrates on the linguistic features of each word. The results of the two classifiers are then combined. The proposed approach achieves an f-score of 81.8%, a considerable improvement over the CRF and ME classifiers individually, which achieve f-scores of 76% and 66.3%, respectively, on the same data set.
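
The abstract does not state how the two classifiers' results are combined, so the following is only an illustrative sketch of one possible combination rule, not the authors' method. It assumes each classifier emits a (label, confidence) pair per token, that "O" marks a non-entity token, and that a simplified token-level f-score is used in place of the challenge's official scoring. The function names `combine` and `f_score` and the labels `DRUG` and `DOSE` are hypothetical.

```python
from typing import List, Tuple

Prediction = Tuple[str, float]  # (label, confidence); "O" means "not an entity"


def combine(crf: List[Prediction], me: List[Prediction]) -> List[str]:
    """Assumed rule: keep an entity label found by only one classifier;
    when both (or neither) tag the token, take the more confident label."""
    if len(crf) != len(me):
        raise ValueError("both classifiers must label the same tokens")
    combined = []
    for (c_label, c_conf), (m_label, m_conf) in zip(crf, me):
        if c_label == "O" and m_label != "O":
            combined.append(m_label)
        elif m_label == "O" and c_label != "O":
            combined.append(c_label)
        else:
            combined.append(c_label if c_conf >= m_conf else m_label)
    return combined


def f_score(gold: List[str], pred: List[str]) -> float:
    """Simplified token-level F1 over entity labels (an assumption; the
    challenge's own evaluation may score entities differently)."""
    true_pos = sum(1 for g, p in zip(gold, pred) if p != "O" and g == p)
    pred_pos = sum(1 for p in pred if p != "O")
    gold_pos = sum(1 for g in gold if g != "O")
    if true_pos == 0:
        return 0.0
    precision = true_pos / pred_pos
    recall = true_pos / gold_pos
    return 2 * precision * recall / (precision + recall)


# Made-up toy data: each classifier finds a different entity.
gold = ["O", "DRUG", "O", "DOSE"]
crf_out = [("O", 0.9), ("DRUG", 0.8), ("O", 0.7), ("O", 0.6)]
me_out = [("O", 0.8), ("O", 0.6), ("O", 0.9), ("DOSE", 0.7)]

ensemble = combine(crf_out, me_out)
crf_only = [label for label, _ in crf_out]
```

On this toy input the combined labels recover both entities while each classifier alone recovers one, which is the kind of complementarity an ensemble trained on different feature sets relies on. The data are invented for illustration and say nothing about the reported results.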

History

Event

2014 IEEE International Conference on Systems, Man and Cybernetics

Pagination

2630 - 2636

Publisher

IEEE

Location

San Diego, CA, USA

Place of publication

Piscataway, NJ

Start date

2014-10-05

End date

2014-10-08

ISBN-13

9781479938407

Language

English

Publication classification

E Conference publication; E1 Full written paper - refereed

Copyright notice

2014, IEEE

Title of proceedings

Proceedings of 2014 IEEE International Conference on Systems, Man and Cybernetics