Lund University Publications

LUND UNIVERSITY LIBRARIES

A Varied Density-based Clustering Approach for Event Detection from Heterogeneous Twitter Data

Ghaemi, Zeinab and Farnaghi, Mahdi LU (2019) In ISPRS International Journal of Geo-Information 8(2).
Abstract
Extracting latent knowledge from Twitter by applying spatial clustering to geotagged tweets makes it possible to discover events and their locations. DBSCAN (density-based spatial clustering of applications with noise), which has been widely used to retrieve events from geotagged tweets, cannot efficiently detect clusters when there is significant spatial heterogeneity in the dataset, as is the case for Twitter data, where the distribution of users and the intensity of tweeting vary over the study area. This study proposes the VDCT (Varied Density-based spatial Clustering for Twitter data) algorithm, which extracts clusters from geotagged tweets while accounting for spatial heterogeneity. The algorithm employs exponential spline interpolation to determine different search radii for cluster detection. In addition to spatial proximity, the algorithm also takes textual similarity among tweets into account. To examine the efficiency of the algorithm, geotagged tweets collected during a hurricane in the United States were used for event detection. The output clusters of VDCT were compared to those of DBSCAN. Visual and quantitative comparison of the results demonstrated the feasibility of the proposed method.
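The general idea in the abstract, a DBSCAN-style search in which each point has its own radius and neighbours must also be textually similar, can be illustrated with a minimal sketch. This is not the authors' VDCT implementation: the per-point radius here is simply the distance to the k-th nearest neighbour (a stand-in for the exponential spline interpolation the abstract mentions), the text measure is Jaccard similarity on whitespace tokens (an assumption), and all function names, parameters, and default values are hypothetical.

```python
from math import hypot

NOISE = -1


def jaccard(a, b):
    """Jaccard similarity of two token sets (assumed text measure)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def local_radii(points, k):
    """Per-point search radius: distance to the k-th nearest neighbour.

    Stand-in only; the paper derives radii by exponential spline
    interpolation, which is not reproduced here.
    """
    radii = []
    for i, (xi, yi) in enumerate(points):
        dists = sorted(
            hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )
        radii.append(dists[min(k, len(dists)) - 1] if dists else 0.0)
    return radii


def neighbours(i, points, tokens, radii, min_sim):
    """Points within point i's own radius that are also textually similar."""
    xi, yi = points[i]
    return [
        j
        for j, (xj, yj) in enumerate(points)
        if hypot(xi - xj, yi - yj) <= radii[i]
        and jaccard(tokens[i], tokens[j]) >= min_sim
    ]


def cluster(points, texts, k=2, min_pts=3, min_sim=0.2):
    """DBSCAN-style expansion with per-point radii and a text filter."""
    tokens = [set(t.lower().split()) for t in texts]
    radii = local_radii(points, k)
    labels = [None] * len(points)
    cid = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i, points, tokens, radii, min_sim)
        if len(seeds) < min_pts:
            labels[i] = NOISE
            continue
        labels[i] = cid
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cid  # former noise becomes a border point
                continue
            if labels[j] is not None:
                continue
            labels[j] = cid
            nb = neighbours(j, points, tokens, radii, min_sim)
            if len(nb) >= min_pts:  # j is a core point: keep expanding
                queue.extend(nb)
        cid += 1
    return labels
```

Calling `cluster(points, texts)` with a list of `(x, y)` tuples and a parallel list of tweet texts returns one label per point, with `-1` marking noise. Because each radius adapts to local density, a tight group and a loose group can both form clusters, while a distant point with unrelated text is rejected by the similarity filter even though its own radius is large.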
type
Contribution to journal
publication status
published
keywords
Density-based clustering, Spatial clustering, Spatial heterogeneity, Text Similarity, Twitter
in
ISPRS International Journal of Geo-Information
volume
8
issue
2
article number
82
publisher
MDPI AG
external identifiers
  • scopus:85068459420
ISSN
2220-9964
DOI
10.3390/ijgi8020082
language
English
LU publication?
yes
id
6917fbbf-6530-486b-9f36-e97f107c5765
date added to LUP
2019-05-06 12:23:11
date last changed
2022-04-25 23:07:11
@article{6917fbbf-6530-486b-9f36-e97f107c5765,
  abstract     = {{Extracting latent knowledge from Twitter by applying spatial clustering to geotagged tweets makes it possible to discover events and their locations. DBSCAN (density-based spatial clustering of applications with noise), which has been widely used to retrieve events from geotagged tweets, cannot efficiently detect clusters when there is significant spatial heterogeneity in the dataset, as is the case for Twitter data, where the distribution of users and the intensity of tweeting vary over the study area. This study proposes the VDCT (Varied Density-based spatial Clustering for Twitter data) algorithm, which extracts clusters from geotagged tweets while accounting for spatial heterogeneity. The algorithm employs exponential spline interpolation to determine different search radii for cluster detection. In addition to spatial proximity, the algorithm also takes textual similarity among tweets into account. To examine the efficiency of the algorithm, geotagged tweets collected during a hurricane in the United States were used for event detection. The output clusters of VDCT were compared to those of DBSCAN. Visual and quantitative comparison of the results demonstrated the feasibility of the proposed method.}},
  author       = {{Ghaemi, Zeinab and Farnaghi, Mahdi}},
  issn         = {{2220-9964}},
  keywords     = {{Density-based clustering; Spatial clustering; Spatial heterogeneity; Text Similarity; Twitter}},
  language     = {{eng}},
  month        = {{02}},
  number       = {{2}},
  publisher    = {{MDPI AG}},
  series       = {{ISPRS International Journal of Geo-Information}},
  title        = {{A Varied Density-based Clustering Approach for Event Detection from Heterogeneous Twitter Data}},
  url          = {{http://dx.doi.org/10.3390/ijgi8020082}},
  doi          = {{10.3390/ijgi8020082}},
  volume       = {{8}},
  year         = {{2019}},
}