Named Entity Recognition on Transcription using cascaded classifiers

Research output: Contribution to conference › Paper › peer-review

Abstract

This paper presents a Named Entity Recognition (NER) system for broadcast news transcriptions that combines two different classifiers. In addition, we present a comparative analysis of the results obtained by extracting Named Entities (NEs) from two types of documents: written and spoken. Written documents are those in which the text appears in standard written form, e.g. newspaper articles. Spoken (transcribed) documents are those in which orthographic information and punctuation are missing. In transcribed documents, the absence of these two features often causes a drop in NE recognition performance. Transcription errors add to the problem: when the Automatic Speech Recognition (ASR) system fails to recognize the correct sequence of words, NER performance degrades further. The system performed best on the Italian NER task at Evalita 2011, with an F1 of 63.50%. The results of this study will be considered for integration into Typhoon [3], a NER system developed by the HTL group at FBK, so that it can also handle transcribed broadcast news.
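For readers unfamiliar with the F1 figure quoted in the abstract, the following is a minimal sketch of how an entity-level F1 score can be computed. It assumes exact-match scoring over (start, end, type) spans; the function name `f1_score` and the toy data are hypothetical, and the official Evalita 2011 scorer may differ in its matching rules.

```python
def f1_score(gold, predicted):
    """Return (precision, recall, F1) over exact (start, end, type) matches.

    Assumption: an entity counts as correct only if both its span and
    its type match a gold entity exactly.
    """
    gold_set, pred_set = set(gold), set(predicted)
    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical toy annotations: (start_token, end_token, entity_type).
gold = [(0, 2, "PER"), (5, 6, "LOC"), (9, 11, "ORG"), (14, 15, "GPE")]
predicted = [(0, 2, "PER"), (5, 6, "ORG"), (9, 11, "ORG")]

precision, recall, f1 = f1_score(gold, predicted)
print(f"P={precision:.2%} R={recall:.2%} F1={f1:.2%}")
```

In this toy case two of the three predicted entities are correct and two of the four gold entities are found, so precision is 2/3, recall is 1/2, and F1 is their harmonic mean, 4/7.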
Original language: English
Publication status: Published - 2011
Externally published: Yes
Event: International Workshop on Evaluation of Natural Language and Speech Tools for Italian, EVALITA 2011 - Rome, Italy
Duration: 24 Jan 2012 to 25 Jan 2012

Conference

Conference: International Workshop on Evaluation of Natural Language and Speech Tools for Italian, EVALITA 2011
Country/Territory: Italy
City: Rome
Period: 24/01/12 to 25/01/12
