Rough based symmetrical clustering for gene expression profile analysis

Sarkar, Anasua; Maulik, Ujjwal

Sarkar, Anasua and Maulik, Ujjwal (2015) In IEEE Transactions on Nanobioscience 14(4). p.360-367

Abstract: Identification of coexpressed genes is the central goal in microarray gene expression data analysis. Point symmetry-based clustering is an important unsupervised learning technique for recognizing symmetrical convex or non-convex shaped clusters. To enable fast automatic clustering of large microarray data, this article proposes a distributed, time-efficient, scalable parallel rough-set-based hybrid approach for the point symmetry-based clustering algorithm. A natural basis for analyzing gene expression data with the symmetry-based algorithm is to group together genes with similar symmetrical patterns of expression. Rough-set theory helps in faster convergence and initial automatic optimal classification, thereby solving the problem of the unknown number of clusters in microarray data. The new parallel implementation with the K-means algorithm also achieves linear speedup in timing on large microarray datasets. The proposed algorithm is compared with another parallel symmetry-based K-means and a parallel version of existing K-means over four artificial and benchmark microarray datasets. Experiments were also carried out on three skewed cancer gene expression datasets. Statistical analyses are also performed to establish the significance of the new implementation, and the biological relevance of the clustering solutions is also analyzed.
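
The abstract names a point-symmetry based distance but does not define it. As a rough illustration only, one form commonly seen in the point-symmetry clustering literature reflects a point through a candidate cluster center and measures how close that mirror image lies to actual data points, scaled by the Euclidean distance to the center. The sketch below assumes that form. The function name, the `knear` parameter, and the use of NumPy are illustrative assumptions, not the authors' implementation, and the rough-set and parallel components are omitted.

```python
import numpy as np

def point_symmetry_distance(x, center, data, knear=2):
    """Illustrative point-symmetry distance of x with respect to center.

    Assumed form: mean distance from the mirror image of x to its
    `knear` nearest data points, multiplied by the Euclidean distance
    between x and the center.
    """
    x = np.asarray(x, dtype=float)
    center = np.asarray(center, dtype=float)
    data = np.asarray(data, dtype=float)

    reflected = 2.0 * center - x                       # mirror image of x through the center
    dists = np.linalg.norm(data - reflected, axis=1)   # distances from the mirror image to every data point
    nearest = np.sort(dists)[:knear]                   # the knear smallest of those distances
    return nearest.mean() * np.linalg.norm(x - center)
```

With `knear=1`, a point whose exact mirror image is present in the data gets distance zero. Averaging over more than one neighbour makes the measure less sensitive to a single coincidental match.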

Please use this url to cite or link to this publication: https://lup.lub.lu.se/record/20911c99-e668-4de3-a565-3e613435af9c

author

Sarkar, Anasua and Maulik, Ujjwal

publishing date

2015-06-01

type

Contribution to journal

publication status

published

subject

Bioinformatics and Computational Biology

keywords

Automatic clustering algorithm, K-means algorithm, microarray gene expression data, point-symmetry based distance, rough set decision rules

in

IEEE Transactions on Nanobioscience

volume

14

issue

4

article number

7097734

pages

8 pages

publisher

IEEE - Institute of Electrical and Electronics Engineers Inc.

external identifiers

scopus:84930933394

ISSN

1536-1241

DOI

10.1109/TNB.2015.2421323

language

English

LU publication?

no

id

20911c99-e668-4de3-a565-3e613435af9c

date added to LUP

2018-09-13 10:16:30

date last changed

2025-10-14 12:21:17

@article{20911c99-e668-4de3-a565-3e613435af9c,
  abstract     = {{Identification of coexpressed genes is the central goal in microarray gene expression data analysis. Point symmetry-based clustering is an important unsupervised learning technique for recognizing symmetrical convex or non-convex shaped clusters. To enable fast automatic clustering of large microarray data, this article proposes a distributed, time-efficient, scalable parallel rough-set-based hybrid approach for the point symmetry-based clustering algorithm. A natural basis for analyzing gene expression data with the symmetry-based algorithm is to group together genes with similar symmetrical patterns of expression. Rough-set theory helps in faster convergence and initial automatic optimal classification, thereby solving the problem of the unknown number of clusters in microarray data. The new parallel implementation with the K-means algorithm also achieves linear speedup in timing on large microarray datasets. The proposed algorithm is compared with another parallel symmetry-based K-means and a parallel version of existing K-means over four artificial and benchmark microarray datasets. Experiments were also carried out on three skewed cancer gene expression datasets. Statistical analyses are also performed to establish the significance of the new implementation, and the biological relevance of the clustering solutions is also analyzed.}},
  author       = {{Sarkar, Anasua and Maulik, Ujjwal}},
  issn         = {{1536-1241}},
  keywords     = {{Automatic clustering algorithm; K-means algorithm; microarray gene expression data; point-symmetry based distance; rough set decision rules}},
  language     = {{eng}},
  month        = {{06}},
  number       = {{4}},
  pages        = {{360--367}},
  publisher    = {{IEEE - Institute of Electrical and Electronics Engineers Inc.}},
  journal      = {{IEEE Transactions on Nanobioscience}},
  title        = {{Rough based symmetrical clustering for gene expression profile analysis}},
  url          = {{http://dx.doi.org/10.1109/TNB.2015.2421323}},
  doi          = {{10.1109/TNB.2015.2421323}},
  volume       = {{14}},
  year         = {{2015}},
}
