Using Sociolinguistic Inspired Features for Gender Classification of Web Authors
(2015) In Lecture Notes in Computer Science 9302. p.587-594
- Abstract
- In this article we present a methodology for classification of text from web authors, using sociolinguistic inspired text features. The proposed methodology uses a baseline text mining based feature set, which is combined with text features that quantify results from theoretical and sociolinguistic studies. Two combination approaches were evaluated and the evaluation results indicated a significant improvement in both combination cases. For the best performing combination approach the accuracy was 84.36%, in terms of percentage of correctly classified web posts.
Please use this url to cite or link to this publication:
https://lup.lub.lu.se/record/6b2f9200-be76-4c70-a0e8-aba4ad16b9a4
- author
- Simaki, Vasiliki (LU); Aravantinou, Christina; Mporas, Iosif and Megalooikonomou, Vasileios
- publishing date
- 2015
- type
- Chapter in Book/Report/Conference proceeding
- publication status
- published
- subject
- keywords
- text classification algorithms, sociolinguistics, gender identification
- host publication
- Text, Speech, and Dialogue : 18th International Conference, TSD 2015, Pilsen, Czech Republic, September 14-17, 2015, Proceedings
- series title
- Lecture Notes in Computer Science
- editor
- Král, Pavel and Matoušek, Václav
- volume
- 9302
- pages
- 587 - 594
- publisher
- Springer
- external identifiers
- scopus:84951770293
- ISSN
- 1611-3349
- 0302-9743
- ISBN
- 978-3-319-24032-9
- 978-3-319-24033-6
- DOI
- 10.1007/978-3-319-24033-6_66
- language
- English
- LU publication?
- no
- id
- 6b2f9200-be76-4c70-a0e8-aba4ad16b9a4
- date added to LUP
- 2017-06-02 19:10:09
- date last changed
- 2024-06-23 18:27:29
@inproceedings{6b2f9200-be76-4c70-a0e8-aba4ad16b9a4,
  abstract  = {{In this article we present a methodology for classification of text from web authors, using sociolinguistic inspired text features. The proposed methodology uses a baseline text mining based feature set, which is combined with text features that quantify results from theoretical and sociolinguistic studies. Two combination approaches were evaluated and the evaluation results indicated a significant improvement in both combination cases. For the best performing combination approach the accuracy was 84.36%, in terms of percentage of correctly classified web posts.}},
  author    = {{Simaki, Vasiliki and Aravantinou, Christina and Mporas, Iosif and Megalooikonomou, Vasileios}},
  booktitle = {{Text, Speech, and Dialogue : 18th International Conference, TSD 2015, Pilsen, Czech Republic, September 14-17, 2015, Proceedings}},
  editor    = {{Král, Pavel and Matoušek, Václav}},
  isbn      = {{978-3-319-24032-9}},
  issn      = {{1611-3349}},
  keywords  = {{text classification algorithms; sociolinguistics; gender identification}},
  language  = {{eng}},
  pages     = {{587--594}},
  publisher = {{Springer}},
  series    = {{Lecture Notes in Computer Science}},
  title     = {{Using Sociolinguistic Inspired Features for Gender Classification of Web Authors}},
  url       = {{http://dx.doi.org/10.1007/978-3-319-24033-6_66}},
  doi       = {{10.1007/978-3-319-24033-6_66}},
  volume    = {{9302}},
  year      = {{2015}},
}