File(s) under permanent embargo
Enhancing medical named entity recognition with an extended segment representation technique
journal contribution
posted on 2015-04-01, 00:00 authored by Sara Keretna, Chee Peng Lim, Douglas Creighton, K B Shaban
OBJECTIVE: The objective of this paper is to formulate an extended segment representation (SR) technique to enhance named entity recognition (NER) in medical applications.
METHODS: An extension to the IOBES (Inside/Outside/Begin/End/Single) SR technique is formulated. In the proposed extension, a new class is assigned to words that do not belong to a named entity (NE) in one context but appear as an NE in other contexts. Ambiguity in such cases can negatively affect the results of classification-based NER techniques. Assigning a separate class to words that can potentially cause ambiguity in NER allows a classifier to detect NEs more accurately, thereby increasing classification accuracy.
RESULTS: The proposed SR technique is evaluated using the i2b2 2010 medical challenge data set with eight different classifiers. Each classifier is trained separately to extract three different medical NEs, namely treatment, problem, and test. Across the three experiments, the extended SR technique improves the average F1-measure for seven of the eight classifiers. The kNN classifier shows an average reduction of 0.18% across the three experiments, while the C4.5 classifier records an average improvement of 9.33%.
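The tagging idea described under METHODS can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the extra class, so the tag `X`, the helper names, and the vocabulary-based rule for spotting potentially ambiguous words are all assumptions made here for illustration.

```python
from typing import Iterable, List, Set, Tuple


def iobes_tags(n_tokens: int, spans: List[Tuple[int, int]]) -> List[str]:
    """Standard IOBES tags for entity spans given as (start, end), end exclusive."""
    tags = ["O"] * n_tokens
    for start, end in spans:
        if end - start == 1:
            tags[start] = "S"          # single-token entity
        else:
            tags[start] = "B"          # first token of the entity
            tags[end - 1] = "E"        # last token of the entity
            for i in range(start + 1, end - 1):
                tags[i] = "I"          # tokens strictly inside the entity
    return tags


def entity_vocabulary(corpus: Iterable[Tuple[List[str], List[str]]]) -> Set[str]:
    """Collect every word that occurs inside an entity anywhere in the corpus."""
    return {
        tok.lower()
        for tokens, tags in corpus
        for tok, tag in zip(tokens, tags)
        if tag != "O"
    }


def extend_tags(tokens: List[str], tags: List[str], vocab: Set[str],
                extra_tag: str = "X") -> List[str]:
    """Assumed extension rule: an 'O' word that is an entity word elsewhere
    gets its own class (hypothetical tag name 'X')."""
    return [
        extra_tag if tag == "O" and tok.lower() in vocab else tag
        for tok, tag in zip(tokens, tags)
    ]


# "chest pain" is a problem entity in the first sentence ...
sent1 = ["patient", "reports", "chest", "pain"]
tags1 = iobes_tags(len(sent1), [(2, 4)])

# ... but "pain" is not an entity in the second one.
sent2 = ["pain", "clinic", "referral", "made"]
tags2 = iobes_tags(len(sent2), [])

vocab = entity_vocabulary([(sent1, tags1), (sent2, tags2)])
extended2 = extend_tags(sent2, tags2, vocab)
print(tags1)
print(extended2)
```

In this toy example the word "pain" in the second sentence is moved out of the generic outside class into the separate class, which is the kind of distinction the abstract says helps a classifier. How the paper actually identifies such words is not stated in the abstract.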
History
Journal
Computer methods and programs in biomedicine
Volume
119
Issue
2
Pagination
88 - 100
Publisher
Elsevier
Location
Amsterdam, The Netherlands
Publisher DOI
ISSN
1872-7565
eISSN
1872-7565
Language
eng
Publication classification
C Journal article; C1 Refereed article in a scholarly journal
Copyright notice
2015, Elsevier
Keywords
Biomedical text annotation
Biomedical text mining
Information extraction
Natural language processing
Unstructured electronic medical records
Science & Technology
Technology
Life Sciences & Biomedicine
Computer Science, Interdisciplinary Applications
Computer Science, Theory & Methods
Engineering, Biomedical
Medical Informatics
Computer Science
Engineering
CONCEPT EXTRACTION
CLINICAL DOCUMENTS
INFORMATION
ASSERTIONS
SYSTEM
WEB
Artificial Intelligence and Image Processing