Abstract
The ability to recognize named entities (e.g., person, location, and organization names) in text has proven important for several areas of natural language processing, including Information Retrieval and Information Extraction. However, despite the progress made in Named Entity Recognition on written texts, recognizing named entities in automatic transcriptions of spoken documents is still far from solved. The output of Automatic Speech Recognition (ASR) often contains transcription errors; in addition, many named entities are out-of-vocabulary words and therefore unavailable to the ASR system. This paper presents a comparative analysis of named entity extraction from written texts and from transcriptions. For transcriptions, we used spoken broadcast news; for written texts, we used both newspapers from the same domain as the transcriptions and the manual transcriptions of the broadcast news. The comparison was carried out through a number of experiments using the best Named Entity Recognition system presented at Evalita 2007.
| Original language | English |
|---|---|
| Pages (from-to) | 71-89 |
| Number of pages | 19 |
| Journal | Studies in Computational Intelligence |
| Volume | 589 |
| Publication status | Published - 2015 |
| Externally published | Yes |
Keywords
- Automatic transcriptions
- Entity detection
- Named entity recognition
- Written texts
Title
Comparing named entity recognition on transcriptions and written texts