Lund University Publications

Greedy de novo motif discovery to construct motif repositories for bacterial proteomes

Khakzad, Hamed; Malmström, Johan and Malmström, Lars (2019). In BMC Bioinformatics 20(Suppl 4).
Abstract

BACKGROUND: Bacterial surfaces are complex systems, constructed from membranes, peptidoglycan and, importantly, proteins. These proteins are critical regulators of how the bacterium interacts with and survives in its environment. A full catalog of the motifs in protein families and their relative degree of conservation is a prerequisite for targeting the protein-protein interactions that bacterial surface proteins make with host proteins.

RESULTS: In this paper, we propose a greedy approach that iteratively identifies conserved motifs in large sequence families. Each iteration discovers a motif de novo and masks all occurrences of that motif. The remaining unmasked sequences are subjected to the next round of motif detection until no more significant motifs can be found. We demonstrate the utility of the method by constructing a proteome-wide motif repository for Group A Streptococcus (GAS), a significant human pathogen. GAS produces numerous surface proteins that interact with over 100 human plasma proteins, helping the bacteria to evade the host immune response. Using the repository, we found that proteins that are part of the bacterial surface have motif architectures that differ from those of intracellular proteins.
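
The discover-mask-repeat loop described above can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which motif finder or significance test is used, so the sketch swaps in a deliberately simple stand-in (the exact k-mer shared by the most sequences, with a minimum-support threshold in place of a statistical test). The names `find_best_kmer`, `greedy_motifs`, `MASK`, `k` and `min_support` are all hypothetical.

```python
from collections import Counter

MASK = "X"  # hypothetical mask character; assumed not to be needed as a residue code


def find_best_kmer(seqs, k, min_support):
    """Stand-in motif finder (an assumption, not the paper's method):
    return the exact k-mer found in the most sequences, or None if no
    unmasked k-mer occurs in at least min_support sequences."""
    support = Counter()
    for seq in seqs:
        # Count each k-mer once per sequence and skip windows touching a mask.
        kmers = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        support.update(km for km in kmers if MASK not in km)
    if not support:
        return None
    # Highest support first; ties broken alphabetically for determinism.
    kmer, count = min(support.items(), key=lambda kv: (-kv[1], kv[0]))
    return kmer if count >= min_support else None


def greedy_motifs(seqs, k=4, min_support=2):
    """Greedy loop: discover one motif, mask all of its occurrences,
    and repeat on the masked sequences until nothing qualifies."""
    seqs = list(seqs)
    motifs = []
    while True:
        motif = find_best_kmer(seqs, k, min_support)
        if motif is None:
            break
        motifs.append(motif)
        seqs = [s.replace(motif, MASK * k) for s in seqs]
    return motifs, seqs


# Toy sequences invented for illustration only.
motifs, masked = greedy_motifs(["AAGPLTEQKR", "MMGPLTWWEQKR", "GPLTCC"])
print(motifs)
print(masked)
```

Masking is what makes the procedure greedy: once a motif is accepted, its positions are removed from consideration, so later rounds can only report motifs in the remaining unmasked residues. The loop terminates because every accepted motif masks at least one previously unmasked window.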

CONCLUSIONS: We show that the M protein, a coiled-coil homodimer that extends over 500 Å from the cell wall, has a motif architecture that differs between GAS strains. As the M protein is known to bind a variety of plasma proteins, the results indicate that the different motif architectures are responsible for the quantitative differences in the plasma proteins that the strains bind. The speed and applicability of the method enable its application to all major human pathogens.

type
Contribution to journal
publication status
published
in
BMC Bioinformatics
volume
20
issue
Suppl 4
article number
141
publisher
BioMed Central (BMC)
external identifiers
  • scopus:85064429556
  • pmid:30999854
ISSN
1471-2105
DOI
10.1186/s12859-019-2686-8
language
English
LU publication?
yes
id
71581a61-5e1a-4114-91e5-2071479c9d2f
date added to LUP
2019-04-28 21:38:49
date last changed
2024-09-17 18:28:25
@article{71581a61-5e1a-4114-91e5-2071479c9d2f,
  abstract     = {{<p>BACKGROUND: Bacterial surfaces are complex systems, constructed from membranes, peptidoglycan and, importantly, proteins. These proteins are critical regulators of how the bacterium interacts with and survives in its environment. A full catalog of the motifs in protein families and their relative degree of conservation is a prerequisite for targeting the protein-protein interactions that bacterial surface proteins make with host proteins.</p><p>RESULTS: In this paper, we propose a greedy approach that iteratively identifies conserved motifs in large sequence families. Each iteration discovers a motif de novo and masks all occurrences of that motif. The remaining unmasked sequences are subjected to the next round of motif detection until no more significant motifs can be found. We demonstrate the utility of the method by constructing a proteome-wide motif repository for Group A Streptococcus (GAS), a significant human pathogen. GAS produces numerous surface proteins that interact with over 100 human plasma proteins, helping the bacteria to evade the host immune response. Using the repository, we found that proteins that are part of the bacterial surface have motif architectures that differ from those of intracellular proteins.</p><p>CONCLUSIONS: We show that the M protein, a coiled-coil homodimer that extends over 500 Å from the cell wall, has a motif architecture that differs between GAS strains. As the M protein is known to bind a variety of plasma proteins, the results indicate that the different motif architectures are responsible for the quantitative differences in the plasma proteins that the strains bind. The speed and applicability of the method enable its application to all major human pathogens.</p>}},
  author       = {{Khakzad, Hamed and Malmström, Johan and Malmström, Lars}},
  issn         = {{1471-2105}},
  language     = {{eng}},
  month        = {{04}},
  number       = {{Suppl 4}},
  publisher    = {{BioMed Central (BMC)}},
  series       = {{BMC Bioinformatics}},
  title        = {{Greedy de novo motif discovery to construct motif repositories for bacterial proteomes}},
  url          = {{http://dx.doi.org/10.1186/s12859-019-2686-8}},
  doi          = {{10.1186/s12859-019-2686-8}},
  volume       = {{20}},
  year         = {{2019}},
}