Identification of Heat Shock Protein families and J-protein types by incorporating Dipeptide Composition into Chou's general PseAAC

Ahmad, Saeed; Kabir, Muhammad; Hayat, Maqsood

Ahmad, Saeed ; Kabir, Muhammad and Hayat, Maqsood (2015) In Computer Methods and Programs in Biomedicine 122(2). p.165-174

Abstract: Heat Shock Proteins (HSPs), which are found in all living organisms, are essential for cell growth and viability. HSPs manage the folding and unfolding of proteins, control the quality of newly synthesized proteins, and protect cellular homeostatic processes from environmental stress. On the basis of functionality, HSPs are categorized into six major families: (i) HSP20 or sHSP, (ii) HSP40 or J-proteins, (iii) HSP60 or GroEL/ES, (iv) HSP70, (v) HSP90, and (vi) HSP100. Identifying HSP families and sub-families through conventional approaches is expensive and laborious. It is therefore highly desirable to establish an automatic, robust, and accurate computational method that predicts HSPs quickly and reliably. In this regard, a computational model is developed for the prediction of HSP families. In this model, protein sequences are formulated using three discrete methods: Split Amino Acid Composition, Pseudo Amino Acid Composition, and Dipeptide Composition. Several learning algorithms are evaluated to select the best one for a high-throughput computational model. The leave-one-out test is applied to assess the performance of the proposed model. The empirical results showed that the support vector machine achieved promising results using the Dipeptide Composition feature space. The proposed model achieves 90.7% accuracy on the HSPs dataset and 97.04% accuracy for J-protein types, which is higher than the accuracies reported by existing methods in the literature.
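As a rough illustration of the Dipeptide Composition feature space and the leave-one-out test named in the abstract, the sketch below maps a protein sequence to a 400-dimensional vector of ordered residue-pair frequencies and scores a classifier by holding out one sample at a time. The function names, the choice to skip pairs containing non-standard residues, and the 1-nearest-neighbour stand-in classifier are all assumptions made for illustration. The paper reports its best results with a support vector machine, which is not reproduced here.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 ordered pairs
INDEX = {dp: i for i, dp in enumerate(DIPEPTIDES)}


def dipeptide_composition(sequence):
    """Return a 400-dimensional vector of dipeptide frequencies.

    Each entry is the count of one ordered residue pair divided by the
    number of counted overlapping pairs. Pairs containing a non-standard
    residue (e.g. 'X') are skipped; that handling is an assumption.
    """
    seq = sequence.upper()
    counts = [0] * len(DIPEPTIDES)
    total = 0
    for i in range(len(seq) - 1):
        idx = INDEX.get(seq[i:i + 2])
        if idx is not None:
            counts[idx] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [c / total for c in counts]


def nearest_neighbour(train_x, train_y, query):
    """1-nearest-neighbour by squared Euclidean distance (stand-in classifier)."""
    def dist(v):
        return sum((a - b) ** 2 for a, b in zip(v, query))
    best = min(range(len(train_x)), key=lambda j: dist(train_x[j]))
    return train_y[best]


def leave_one_out_accuracy(features, labels, classify):
    """Hold out each sample once, fit on the rest, and score the prediction."""
    correct = 0
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if classify(train_x, train_y, features[i]) == labels[i]:
            correct += 1
    return correct / len(features)
```

Calling `leave_one_out_accuracy(features, labels, nearest_neighbour)` on a list of `dipeptide_composition` vectors returns the fraction of held-out samples classified correctly. Any classifier with the same `(train_x, train_y, query)` signature can be substituted for the stand-in.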

Please use this url to cite or link to this publication: https://lup.lub.lu.se/record/60ec70e7-eeff-4d43-918d-3aedcb2c1f7e

author

Ahmad, Saeed ; Kabir, Muhammad and Hayat, Maqsood

publishing date

2015-11

type

Contribution to journal

publication status

published

keywords

Heat Shock Proteins, J-protein, KNN, PNN, SVM

in

Computer Methods and Programs in Biomedicine

volume

122

issue

2

pages

165 - 174

publisher

Elsevier

external identifiers

scopus:84944276251
pmid:26233307

ISSN

0169-2607

DOI

10.1016/j.cmpb.2015.07.005

language

English

LU publication?

no

additional info

id

60ec70e7-eeff-4d43-918d-3aedcb2c1f7e

date added to LUP

2024-07-03 11:38:20

date last changed

2026-01-02 15:45:55

@article{60ec70e7-eeff-4d43-918d-3aedcb2c1f7e,
  abstract     = {{Heat Shock Proteins (HSPs), which are found in all living organisms, are essential for cell growth and viability. HSPs manage the folding and unfolding of proteins, control the quality of newly synthesized proteins, and protect cellular homeostatic processes from environmental stress. On the basis of functionality, HSPs are categorized into six major families: (i) HSP20 or sHSP, (ii) HSP40 or J-proteins, (iii) HSP60 or GroEL/ES, (iv) HSP70, (v) HSP90, and (vi) HSP100. Identifying HSP families and sub-families through conventional approaches is expensive and laborious. It is therefore highly desirable to establish an automatic, robust, and accurate computational method that predicts HSPs quickly and reliably. In this regard, a computational model is developed for the prediction of HSP families. In this model, protein sequences are formulated using three discrete methods: Split Amino Acid Composition, Pseudo Amino Acid Composition, and Dipeptide Composition. Several learning algorithms are evaluated to select the best one for a high-throughput computational model. The leave-one-out test is applied to assess the performance of the proposed model. The empirical results showed that the support vector machine achieved promising results using the Dipeptide Composition feature space. The proposed model achieves 90.7% accuracy on the HSPs dataset and 97.04% accuracy for J-protein types, which is higher than the accuracies reported by existing methods in the literature.}},
  author       = {{Ahmad, Saeed and Kabir, Muhammad and Hayat, Maqsood}},
  issn         = {{0169-2607}},
  keywords     = {{Heat Shock Proteins; J-protein; KNN; PNN; SVM}},
  language     = {{eng}},
  number       = {{2}},
  pages        = {{165--174}},
  publisher    = {{Elsevier}},
  journal      = {{Computer Methods and Programs in Biomedicine}},
  title        = {{Identification of Heat Shock Protein families and J-protein types by incorporating Dipeptide Composition into Chou's general PseAAC}},
  url          = {{http://dx.doi.org/10.1016/j.cmpb.2015.07.005}},
  doi          = {{10.1016/j.cmpb.2015.07.005}},
  volume       = {{122}},
  year         = {{2015}},
}

