Lund University Publications

A new evolutionary rough fuzzy integrated machine learning technique for microRNA selection using next-generation sequencing data of breast cancer

Sarkar, Jnanendra Prasad ; Saha, Indrajit ; Rakshit, Somnath ; Pal, Monalisa ; Wlasnowolski, Michal ; Sarkar, Anasua ; Maulik, Ujjwal and Plewczynski, Dariusz (2019) 2019 Genetic and Evolutionary Computation Conference, GECCO 2019, p. 1846-1854
Abstract

MicroRNAs (miRNAs) play an important role in various biological processes by regulating gene expression. Their abnormal expression may lead to cancer. Therefore, analysis of miRNA expression data may yield biological insight for cancer diagnosis. In this regard, many feature selection methods have recently been developed to identify such miRNAs. These methods have their own merits and demerits, as the task is very challenging. Thus, in this article, we propose a novel wrapper-based feature selection technique that integrates Rough and Fuzzy sets, Random Forest and Particle Swarm Optimization to identify putative miRNAs that can solve the underlying biological problem effectively, i.e. to separate tumour and control samples. Here, Rough and Fuzzy sets help to address the vagueness and overlapping characteristics of the dataset while performing clustering. Random Forest, in turn, is applied to perform the classification task on the clustering results to yield better solutions. The integrated clustering and classification tasks are treated as the underlying optimization problem for the Particle Swarm Optimization method, where particles encode features, in this case miRNAs. The performance of the proposed wrapper-based method has been demonstrated quantitatively and visually on next-generation sequencing data of breast cancer from The Cancer Genome Atlas (TCGA). Finally, the selected miRNAs are validated through biological significance tests. The code and dataset used in this paper are available online.
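
The general idea of a wrapper in which particles encode feature subsets can be illustrated with a minimal binary Particle Swarm Optimization sketch. This is not the authors' implementation: a leave-one-out nearest-centroid classifier stands in for the rough-fuzzy clustering and Random Forest stages, synthetic two-class data stands in for the TCGA miRNA expression data, and every function name and parameter value (`make_data`, `fitness`, `binary_pso`, `w`, `c1`, `c2`, `penalty`) is an assumption made for illustration.

```python
import math
import random


def make_data(rng, n_per_class=20, n_features=8, n_informative=2, shift=3.0):
    """Synthetic two-class data; only the first n_informative features differ by class."""
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(shift * label if j < n_informative else 0.0, 1.0)
                   for j in range(n_features)]
            X.append(row)
            y.append(label)
    return X, y


def fitness(mask, X, y, penalty=0.01):
    """Leave-one-out nearest-centroid accuracy on the selected features,
    minus a small penalty proportional to the fraction of features kept."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty subset cannot classify anything
    correct = 0
    for i in range(len(X)):
        # Class centroids computed without the held-out sample i.
        sums = {0: [0.0] * len(cols), 1: [0.0] * len(cols)}
        counts = {0: 0, 1: 0}
        for k in range(len(X)):
            if k == i:
                continue
            counts[y[k]] += 1
            for c, j in enumerate(cols):
                sums[y[k]][c] += X[k][j]
        dist = {
            label: sum((X[i][j] - sums[label][c] / counts[label]) ** 2
                       for c, j in enumerate(cols))
            for label in (0, 1)
        }
        pred = 0 if dist[0] <= dist[1] else 1
        correct += pred == y[i]
    return correct / len(X) - penalty * len(cols) / len(mask)


def binary_pso(X, y, rng, n_particles=10, n_iters=15, w=0.7, c1=1.5, c2=1.5):
    """Binary PSO: each particle is a 0/1 mask over features."""
    d = len(X[0])
    pos = [[rng.randint(0, 1) for _ in range(d)] for _ in range(n_particles)]
    vel = [[0.0] * d for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p, X, y) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            for j in range(d):
                vel[i][j] = (w * vel[i][j]
                             + c1 * rng.random() * (pbest[i][j] - pos[i][j])
                             + c2 * rng.random() * (gbest[j] - pos[i][j]))
                # The sigmoid of the velocity is the probability that bit j is set.
                prob = 1.0 / (1.0 + math.exp(-vel[i][j]))
                pos[i][j] = 1 if rng.random() < prob else 0
            f = fitness(pos[i], X, y)
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit


rng = random.Random(0)  # fixed seed so the run is repeatable
X, y = make_data(rng)
best_mask, best_fit = binary_pso(X, y, rng)
print("selected feature mask:", best_mask, "fitness:", round(best_fit, 3))
```

In a wrapper of this kind the classifier is retrained for every candidate subset, so the fitness function dominates the running time; the subset-size penalty is one simple way to favour smaller feature sets.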

author
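Sarkar, Jnanendra Prasad ; Saha, Indrajit ; Rakshit, Somnath ; Pal, Monalisa ; Wlasnowolski, Michal ; Sarkar, Anasua ; Maulik, Ujjwal and Plewczynski, Dariusz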
publishing date
2019
type
Chapter in Book/Report/Conference proceeding
publication status
published
host publication
GECCO 2019 - Proceedings of the 2019 Genetic and Evolutionary Computation Conference Companion
pages
9 pages
publisher
Association for Computing Machinery (ACM)
conference name
2019 Genetic and Evolutionary Computation Conference, GECCO 2019
conference location
Prague, Czech Republic
conference dates
2019-07-13 - 2019-07-17
external identifiers
  • scopus:85070648858
ISBN
9781450367486
DOI
10.1145/3319619.3326836
language
English
LU publication?
no
id
cba91f71-2714-4104-abb7-17341f27197c
date added to LUP
2019-08-28 11:05:32
date last changed
2022-07-05 18:28:30
@inproceedings{cba91f71-2714-4104-abb7-17341f27197c,
  abstract     = {{MicroRNAs (miRNAs) play an important role in various biological processes by regulating gene expression. Their abnormal expression may lead to cancer. Therefore, analysis of miRNA expression data may yield biological insight for cancer diagnosis. In this regard, many feature selection methods have recently been developed to identify such miRNAs. These methods have their own merits and demerits, as the task is very challenging. Thus, in this article, we propose a novel wrapper-based feature selection technique that integrates Rough and Fuzzy sets, Random Forest and Particle Swarm Optimization to identify putative miRNAs that can solve the underlying biological problem effectively, i.e. to separate tumour and control samples. Here, Rough and Fuzzy sets help to address the vagueness and overlapping characteristics of the dataset while performing clustering. Random Forest, in turn, is applied to perform the classification task on the clustering results to yield better solutions. The integrated clustering and classification tasks are treated as the underlying optimization problem for the Particle Swarm Optimization method, where particles encode features, in this case miRNAs. The performance of the proposed wrapper-based method has been demonstrated quantitatively and visually on next-generation sequencing data of breast cancer from The Cancer Genome Atlas (TCGA). Finally, the selected miRNAs are validated through biological significance tests. The code and dataset used in this paper are available online.}},
  author       = {{Sarkar, Jnanendra Prasad and Saha, Indrajit and Rakshit, Somnath and Pal, Monalisa and Wlasnowolski, Michal and Sarkar, Anasua and Maulik, Ujjwal and Plewczynski, Dariusz}},
  booktitle    = {{GECCO 2019 - Proceedings of the 2019 Genetic and Evolutionary Computation Conference Companion}},
  isbn         = {{9781450367486}},
  language     = {{eng}},
  month        = {{07}},
  pages        = {{1846--1854}},
  publisher    = {{Association for Computing Machinery (ACM)}},
  title        = {{A new evolutionary rough fuzzy integrated machine learning technique for microRNA selection using next-generation sequencing data of breast cancer}},
  url          = {{http://dx.doi.org/10.1145/3319619.3326836}},
  doi          = {{10.1145/3319619.3326836}},
  year         = {{2019}},
}