Using Sociolinguistic Inspired Features for Gender Classification of Web Authors

Simaki, Vasiliki; Aravantinou, Christina; Mporas, Iosif and Megalooikonomou, Vasileios (2015) In Lecture Notes in Computer Science 9302. p.587-594
Abstract
In this article we present a methodology for classification of text from web authors, using sociolinguistic inspired text features. The proposed methodology uses a baseline text mining based feature set, which is combined with text features that quantify results from theoretical and sociolinguistic studies. Two combination approaches were evaluated and the evaluation results indicated a significant improvement in both combination cases. For the best performing combination approach the accuracy was 84.36%, in terms of percentage of correctly classified web posts.
author
Simaki, Vasiliki; Aravantinou, Christina; Mporas, Iosif and Megalooikonomou, Vasileios
publishing date
2015
type
Chapter in Book/Report/Conference proceeding
publication status
published
keywords
text classification algorithms, sociolinguistics, gender identification
in
Lecture Notes in Computer Science
editor
Král, Pavel and Matoušek, Václav
volume
9302
pages
587 - 594
publisher
Springer
external identifiers
  • scopus:84951770293
ISSN
0302-9743
1611-3349
ISBN
978-3-319-24032-9
978-3-319-24033-6
DOI
10.1007/978-3-319-24033-6_66
language
English
LU publication?
no
id
6b2f9200-be76-4c70-a0e8-aba4ad16b9a4
date added to LUP
2017-06-02 19:10:09
date last changed
2017-11-14 09:49:49
@inbook{6b2f9200-be76-4c70-a0e8-aba4ad16b9a4,
  abstract     = {In this article we present a methodology for classification of text from web authors, using sociolinguistic inspired text features. The proposed methodology uses a baseline text mining based feature set, which is combined with text features that quantify results from theoretical and sociolinguistic studies. Two combination approaches were evaluated and the evaluation results indicated a significant improvement in both combination cases. For the best performing combination approach the accuracy was 84.36%, in terms of percentage of correctly classified web posts.},
  author       = {Simaki, Vasiliki and Aravantinou, Christina and Mporas, Iosif and Megalooikonomou, Vasileios},
  editor       = {Král, Pavel and Matoušek, Václav},
  isbn         = {978-3-319-24032-9},
  issn         = {0302-9743},
  keyword      = {text classification algorithms,sociolinguistics,gender identification},
  language     = {eng},
  pages        = {587--594},
  publisher    = {Springer},
  series       = {Lecture Notes in Computer Science},
  title        = {Using Sociolinguistic Inspired Features for Gender Classification of Web Authors},
  url          = {http://dx.doi.org/10.1007/978-3-319-24033-6_66},
  volume       = {9302},
  year         = {2015},
}