Lund University Publications

Rough based symmetrical clustering for gene expression profile analysis

Sarkar, Anasua and Maulik, Ujjwal (2015) In IEEE Transactions on Nanobioscience 14(4). p.360-367
Abstract

Identification of coexpressed genes is the central goal in microarray gene expression data analysis. Point symmetry-based clustering is an important unsupervised learning technique for recognizing symmetrical convex or non-convex shaped clusters. To enable fast automatic clustering of large microarray data, this article proposes a distributed, time-efficient, scalable parallel rough-set-based hybrid approach to point symmetry-based clustering. A natural basis for analyzing gene expression data with the symmetry-based algorithm is to group together genes with similar symmetrical patterns of expression. Rough-set theory helps in faster convergence and in an initial automatic optimal classification, thereby solving the problem of not knowing the number of clusters in microarray data in advance. The new parallel implementation with the K-means algorithm also achieves linear speedup on large microarray datasets. The proposed algorithm is compared with another parallel symmetry-based K-means and a parallel version of the existing K-means over four artificial and benchmark microarray datasets. Experiments were also carried out on three skewed cancer gene expression datasets. Statistical analyses are performed to establish the significance of the new implementation, and the biological relevance of the clustering solutions is also analyzed.
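
The abstract names a point-symmetry based distance but does not define it. As a rough illustration only, the sketch below uses one definition common in the point-symmetry clustering literature: reflect a point through a candidate cluster center, average the distances from that reflection to its nearest data points, and scale by the Euclidean distance to the center. The function names, the `knear` parameter, and the toy data are assumptions made for this example; this is not the authors' rough-set or parallel implementation.

```python
import math


def point_symmetry_distance(x, center, data, knear=2):
    """Assumed point-symmetry distance of x with respect to center.

    The reflection of x through center is 2*center - x. If the data are
    symmetric about center, that reflection lies close to real data points,
    so the averaged nearest-neighbour distance is small.
    """
    reflected = tuple(2 * c - xi for xi, c in zip(x, center))
    # Distances from the reflected point to its knear closest data points.
    nearest = sorted(math.dist(reflected, p) for p in data)[:knear]
    symmetry = sum(nearest) / len(nearest)
    # Scale by the ordinary Euclidean distance between x and the center.
    return symmetry * math.dist(x, center)


def assign(data, centers):
    """Label each point with the index of its most symmetric center."""
    labels = []
    for x in data:
        dists = [point_symmetry_distance(x, c, data) for c in centers]
        labels.append(dists.index(min(dists)))
    return labels


# Toy data (hypothetical): two small symmetric groups.
ring_a = [(1, 0), (-1, 0), (0, 1), (0, -1)]
ring_b = [(11, 10), (9, 10), (10, 11), (10, 9)]
labels = assign(ring_a + ring_b, centers=[(0, 0), (10, 10)])
```

In a K-means-style loop, an assignment step like `assign` would alternate with recomputing each center as the mean of its members; how the rough-set step selects the initial number of clusters is not described in the abstract and is left out here.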

author
Sarkar, Anasua and Maulik, Ujjwal
publishing date
2015
type
Contribution to journal
publication status
published
keywords
Automatic clustering algorithm, K-means algorithm, microarray gene expression data, point-symmetry based distance, rough set decision rules
in
IEEE Transactions on Nanobioscience
volume
14
issue
4
article number
7097734
pages
8 pages
publisher
IEEE - Institute of Electrical and Electronics Engineers Inc.
external identifiers
  • scopus:84930933394
ISSN
1536-1241
DOI
10.1109/TNB.2015.2421323
language
English
LU publication?
no
id
20911c99-e668-4de3-a565-3e613435af9c
date added to LUP
2018-09-13 10:16:30
date last changed
2022-01-31 05:15:38
@article{20911c99-e668-4de3-a565-3e613435af9c,
  abstract     = {{Identification of coexpressed genes is the central goal in microarray gene expression data analysis. Point symmetry-based clustering is an important unsupervised learning technique for recognizing symmetrical convex or non-convex shaped clusters. To enable fast automatic clustering of large microarray data, this article proposes a distributed, time-efficient, scalable parallel rough-set-based hybrid approach to point symmetry-based clustering. A natural basis for analyzing gene expression data with the symmetry-based algorithm is to group together genes with similar symmetrical patterns of expression. Rough-set theory helps in faster convergence and in an initial automatic optimal classification, thereby solving the problem of not knowing the number of clusters in microarray data in advance. The new parallel implementation with the K-means algorithm also achieves linear speedup on large microarray datasets. The proposed algorithm is compared with another parallel symmetry-based K-means and a parallel version of the existing K-means over four artificial and benchmark microarray datasets. Experiments were also carried out on three skewed cancer gene expression datasets. Statistical analyses are performed to establish the significance of the new implementation, and the biological relevance of the clustering solutions is also analyzed.}},
  author       = {{Sarkar, Anasua and Maulik, Ujjwal}},
  issn         = {{1536-1241}},
  keywords     = {{Automatic clustering algorithm; K-means algorithm; microarray gene expression data; point-symmetry based distance; rough set decision rules}},
  language     = {{eng}},
  month        = {{06}},
  number       = {{4}},
  pages        = {{360--367}},
  publisher    = {{IEEE - Institute of Electrical and Electronics Engineers Inc.}},
  journal      = {{IEEE Transactions on Nanobioscience}},
  title        = {{Rough based symmetrical clustering for gene expression profile analysis}},
  url          = {{http://dx.doi.org/10.1109/TNB.2015.2421323}},
  doi          = {{10.1109/TNB.2015.2421323}},
  volume       = {{14}},
  year         = {{2015}},
}