Gene microarray data analysis using parallel point-symmetry-based clustering

Sarkar, Anasua; Maulik, Ujjwal

Sarkar, Anasua and Maulik, Ujjwal (2015) In International Journal of Data Mining and Bioinformatics 11(3). p.277-300

Abstract: Identification of co-expressed genes is the central goal in microarray gene expression analysis. Point-symmetry-based clustering is an important unsupervised learning technique for recognising symmetrical convex- or nonconvex-shaped clusters. To enable fast clustering of large microarray data, we propose a distributed time-efficient scalable approach for point-symmetry-based K-Means algorithm. A natural basis for analysing gene expression data using symmetry-based algorithm is to group together genes with similar symmetrical expression patterns. This new parallel implementation also satisfies linear speedup in timing without sacrificing the quality of clustering solution on large microarray data sets. The parallel point-symmetry-based K-Means algorithm is compared with another new parallel symmetry-based K-Means and existing parallel K-Means over eight artificial and benchmark microarray data sets, to demonstrate its superiority, in both timing and validity. The statistical analysis is also performed to establish the significance of this message-passing-interface based point-symmetry K-Means implementation. We also analysed the biological relevance of clustering solutions.

Please use this url to cite or link to this publication: https://lup.lub.lu.se/record/fc1b698d-0372-4001-a5e4-3ee6702c3a6c

author

Sarkar, Anasua and Maulik, Ujjwal

publishing date

2015-01-01

type

Contribution to journal

publication status

published

keywords

Bioinformatics, Cluster validity measures, Clustering algorithm, K-Means algorithm, Microarray gene expression, Parallel algorithm, Point-symmetry based distance

in

International Journal of Data Mining and Bioinformatics

volume

11

issue

3

pages

24 pages

publisher

Inderscience Publishers

external identifiers

pmid:26333263
scopus:84922583793

ISSN

1748-5673

DOI

10.1504/IJDMB.2015.067320

language

English

LU publication?

no

id

fc1b698d-0372-4001-a5e4-3ee6702c3a6c

date added to LUP

2018-10-09 09:46:08

date last changed

2026-01-09 01:06:57

@article{fc1b698d-0372-4001-a5e4-3ee6702c3a6c,
  abstract     = {{Identification of co-expressed genes is the central goal in microarray gene expression analysis. Point-symmetry-based clustering is an important unsupervised learning technique for recognising symmetrical convex- or nonconvex-shaped clusters. To enable fast clustering of large microarray data, we propose a distributed time-efficient scalable approach for point-symmetry-based K-Means algorithm. A natural basis for analysing gene expression data using symmetry-based algorithm is to group together genes with similar symmetrical expression patterns. This new parallel implementation also satisfies linear speedup in timing without sacrificing the quality of clustering solution on large microarray data sets. The parallel point-symmetry-based K-Means algorithm is compared with another new parallel symmetry-based K-Means and existing parallel K-Means over eight artificial and benchmark microarray data sets, to demonstrate its superiority, in both timing and validity. The statistical analysis is also performed to establish the significance of this message-passing-interface based point-symmetry K-Means implementation. We also analysed the biological relevance of clustering solutions.}},
  author       = {{Sarkar, Anasua and Maulik, Ujjwal}},
  issn         = {{1748-5673}},
  keywords     = {{Bioinformatics; Cluster validity measures; Clustering algorithm; K-Means algorithm; Microarray gene expression; Parallel algorithm; Point-symmetry based distance}},
  language     = {{eng}},
  month        = {{01}},
  number       = {{3}},
  pages        = {{277--300}},
  publisher    = {{Inderscience Publishers}},
  journal      = {{International Journal of Data Mining and Bioinformatics}},
  title        = {{Gene microarray data analysis using parallel point-symmetry-based clustering}},
  url          = {{http://dx.doi.org/10.1504/IJDMB.2015.067320}},
  doi          = {{10.1504/IJDMB.2015.067320}},
  volume       = {{11}},
  year         = {{2015}},
}

Lund University Publications
