Named Entity Recognition for Short Text Messages
(2011) Conference of the Pacific-Association-for-Computational-Linguistics (PACLING) 27. p.178-187
- Abstract
- This paper describes a named entity recognition (NER) system for short text messages (SMS) running on a mobile platform. Most NER systems deal with text that is structured, formal, well written, with a good grammatical structure, and few spelling errors. SMS text messages lack these qualities and have instead a short-handed and mixed language studded with emoticons, which makes NER a challenge on this kind of material. We implemented a system that recognizes named entities from SMSes written in Swedish and that runs on an Android cellular telephone. The entities extracted are locations, names, dates, times, and telephone numbers with the idea that extraction of these entities could be utilized by other applications running on the telephone. We started from a regular expression implementation that we complemented with classifiers using logistic regression. We optimized the recognition so that the incoming text messages could be processed on the telephone with a fast response time. We reached an F-score of 86 for strict matches and 89 for partial matches. (C) 2011 Published by Elsevier Ltd. Selection and/or peer-review under responsibility of PACLING Organizing Committee.
Please use this url to cite or link to this publication:
https://lup.lub.lu.se/record/2494191
- author
- Ek, Tobias ; Kirkegaard, Camilla ; Jonsson, Håkan LU and Nugues, Pierre LU
- organization
- publishing date
- 2011
- type
- Chapter in Book/Report/Conference proceeding
- publication status
- published
- subject
- keywords
- Named entity recognition, Short text messages, SMS, Information extraction, Ensemble systems
- host publication
- Computational Linguistics and Related Fields
- volume
- 27
- pages
- 178 - 187
- publisher
- Elsevier
- conference name
- Conference of the Pacific-Association-for-Computational-Linguistics (PACLING)
- conference location
- Kuala Lumpur, Malaysia
- conference dates
- 2011-07-19 - 2011-07-21
- external identifiers
- wos:000299624700020
- scopus:83755171548
- ISSN
- 1877-0428
- DOI
- 10.1016/j.sbspro.2011.10.596
- language
- English
- LU publication?
- yes
- id
- 017a3b06-09e4-4dab-818b-9b140d26ab14 (old id 2494191)
- date added to LUP
- 2016-04-01 12:56:18
- date last changed
- 2022-01-27 08:23:51
@inproceedings{017a3b06-09e4-4dab-818b-9b140d26ab14,
  abstract = {{This paper describes a named entity recognition (NER) system for short text messages (SMS) running on a mobile platform. Most NER systems deal with text that is structured, formal, well written, with a good grammatical structure, and few spelling errors. SMS text messages lack these qualities and have instead a short-handed and mixed language studded with emoticons, which makes NER a challenge on this kind of material. We implemented a system that recognizes named entities from SMSes written in Swedish and that runs on an Android cellular telephone. The entities extracted are locations, names, dates, times, and telephone numbers with the idea that extraction of these entities could be utilized by other applications running on the telephone. We started from a regular expression implementation that we complemented with classifiers using logistic regression. We optimized the recognition so that the incoming text messages could be processed on the telephone with a fast response time. We reached an F-score of 86 for strict matches and 89 for partial matches. (C) 2011 Published by Elsevier Ltd. Selection and/or peer-review under responsibility of PACLING Organizing Committee.}},
  author = {{Ek, Tobias and Kirkegaard, Camilla and Jonsson, Håkan and Nugues, Pierre}},
  booktitle = {{Computational Linguistics and Related Fields}},
  issn = {{1877-0428}},
  keywords = {{Named entity recognition; Short text messages; SMS; Information extraction; Ensemble systems}},
  language = {{eng}},
  pages = {{178--187}},
  publisher = {{Elsevier}},
  title = {{Named Entity Recognition for Short Text Messages}},
  url = {{http://dx.doi.org/10.1016/j.sbspro.2011.10.596}},
  doi = {{10.1016/j.sbspro.2011.10.596}},
  volume = {{27}},
  year = {{2011}},
}