Approximate clustering of fingerprint vectors with missing values
(2005) Computing: The Australasian Theory Symposium p.57-60- Abstract
- We study the problem of clustering fingerprints with at most p missing values (CMV (p) for short) naturally arising in oligonucleotide fingerprinting, which is an efficient method for characterizing DNA clone libraries.We show that already CMV(2) is NP-hard. We also show that a greedy algorithm yields a min(1 + ln n, 2+pln l) approximation for CMV(p), and can be implemented to run in O(nl2p) time. Furthermore, we introduce other variants of the problem of clustering fingerprints with at most p missing values based on slightly different optimization criteria and show that they can be approximated in polynomial time with ratios 22p-1 and 2(1 - [EQUATION]), respectively.
Please use this url to cite or link to this publication:
https://lup.lub.lu.se/record/631604
- author
- Figueroa, Andres ; Goldstein, Avraham ; Jiang, Tao ; Kurowski, Maciej ; Lingas, Andrzej LU and Persson, Mia LU
- organization
- publishing date
- 2005
- type
- Chapter in Book/Report/Conference proceeding
- publication status
- published
- subject
- host publication
- Proceedings of the 2005 Australasian symposium on Theory of computing
- pages
- 57 - 60
- publisher
- Australian Computer Society
- conference name
- Computing: The Australasian Theory Symposium
- conference location
- Newcastle, Australia
- conference dates
- 0001-01-02
- external identifiers
-
- scopus:84864025053
- ISSN
- 1445-1336
- ISBN
- 1-920682-23-6
- project
- VR 2002-4049
- language
- English
- LU publication?
- yes
- id
- 89d36a69-b532-47df-90e1-247a685ab7fd (old id 631604)
- alternative location
- http://crpit.com/confpapers/CRPITV41Figueroa.pdf
- date added to LUP
- 2016-04-01 15:48:47
- date last changed
- 2022-01-28 07:16:52
@inproceedings{89d36a69-b532-47df-90e1-247a685ab7fd, abstract = {{We study the problem of clustering fingerprints with at most p missing values (CMV (p) for short) naturally arising in oligonucleotide fingerprinting, which is an efficient method for characterizing DNA clone libraries.We show that already CMV(2) is NP-hard. We also show that a greedy algorithm yields a min(1 + ln n, 2+pln l) approximation for CMV(p), and can be implemented to run in O(nl2p) time. Furthermore, we introduce other variants of the problem of clustering fingerprints with at most p missing values based on slightly different optimization criteria and show that they can be approximated in polynomial time with ratios 22p-1 and 2(1 - [EQUATION]), respectively.}}, author = {{Figueroa, Andres and Goldstein, Avraham and Jiang, Tao and Kurowski, Maciej and Lingas, Andrzej and Persson, Mia}}, booktitle = {{Proceedings of the 2005 Australasian symposium on Theory of computing}}, isbn = {{1-920682-23-6}}, issn = {{1445-1336}}, language = {{eng}}, pages = {{57--60}}, publisher = {{Australian Computer Society}}, title = {{Approximate clustering of fingerprint vectors with missing values}}, url = {{http://crpit.com/confpapers/CRPITV41Figueroa.pdf}}, year = {{2005}}, }