Estimation of escape probabilities for PPM based on universal source coding theory
(1997) 1997 IEEE International Symposium on Information Theory, ISIT 1997 In IEEE International Symposium on Information Theory - Proceedings p.65-65- Abstract
Some of the best compression ratios for text compression are provided by the PPM (prediction by partial matching) class of algorithms. These algorithms are based on arithmetic coding using a fixed-depth Markov chain model of the source, i.e., the subsequence of symbols generated in any state s of the source is assumed to be the output of a memoryless subsource w=w(s). One of the most crucial steps in any PPM algorithm is the choice of the escape probability, i.e., the probability of a previously unseen symbol appearing in a given state. Let A={1,...,M} be the source alphabet, x/sup k/=x/ix/ i...x/sub k//spl isin/A/sup k/ be a sequence of symbols generated by a subsource w with unknown parameters, and m=m(x/sup k/)... (More)
Some of the best compression ratios for text compression are provided by the PPM (prediction by partial matching) class of algorithms. These algorithms are based on arithmetic coding using a fixed-depth Markov chain model of the source, i.e., the subsequence of symbols generated in any state s of the source is assumed to be the output of a memoryless subsource w=w(s). One of the most crucial steps in any PPM algorithm is the choice of the escape probability, i.e., the probability of a previously unseen symbol appearing in a given state. Let A={1,...,M} be the source alphabet, x/sup k/=x/ix/ i...x/sub k//spl isin/A/sup k/ be a sequence of symbols generated by a subsource w with unknown parameters, and m=m(x/sup k/) be the number of different symbols in x/sup k/. In most incarnations of PPM, the expression for the escape probability used for coding is of the form /spl thetav/(E|x/sup k/)=k(k,m)/k+m/spl alpha/+k(k,m), where /spl alpha/ is a parameter. The encoding of the "escape symbol" E is followed by a description of the new symbol, which is independent of the choice of escape probability and is not discussed. In almost all PPM schemes, the expression for the escape probability has been chosen heuristically. Following the method of Shtarkov et al. (see Probl. of Information Transmission, vol.31, no.2, p.20-35, 1995) we use a universal coding approach.
(Less)
- author
- Åberg, Jan LU ; Shtarkov, Yu M. and Smeets, B. J.M. LU
- organization
- publishing date
- 1997
- type
- Chapter in Book/Report/Conference proceeding
- publication status
- published
- subject
- host publication
- Proceedings 1997 IEEE International Symposium on Information Theory
- series title
- IEEE International Symposium on Information Theory - Proceedings
- article number
- 612980
- pages
- 1 pages
- publisher
- IEEE - Institute of Electrical and Electronics Engineers Inc.
- conference name
- 1997 IEEE International Symposium on Information Theory, ISIT 1997
- conference location
- Ulm, Germany
- conference dates
- 1997-06-29 - 1997-07-04
- external identifiers
-
- scopus:0030718271
- ISSN
- 2157-8095
- ISBN
- 0780339568
- 9780780339569
- DOI
- 10.1109/ISIT.1997.612980
- language
- English
- LU publication?
- yes
- id
- 159a00a8-053c-4727-be1c-c0100d66dc26
- date added to LUP
- 2021-11-05 02:36:28
- date last changed
- 2022-02-02 01:10:57
@inproceedings{159a00a8-053c-4727-be1c-c0100d66dc26, abstract = {{<p>Some of the best compression ratios for text compression are provided by the PPM (prediction by partial matching) class of algorithms. These algorithms are based on arithmetic coding using a fixed-depth Markov chain model of the source, i.e., the subsequence of symbols generated in any state s of the source is assumed to be the output of a memoryless subsource w=w(s). One of the most crucial steps in any PPM algorithm is the choice of the escape probability, i.e., the probability of a previously unseen symbol appearing in a given state. Let A={1,...,M} be the source alphabet, x/sup k/=x/<sub>i</sub>x/ <sub>i</sub>...x/sub k//spl isin/A/sup k/ be a sequence of symbols generated by a subsource w with unknown parameters, and m=m(x/sup k/) be the number of different symbols in x/sup k/. In most incarnations of PPM, the expression for the escape probability used for coding is of the form /spl thetav/(E|x/sup k/)=k(k,m)/k+m/spl alpha/+k(k,m), where /spl alpha/ is a parameter. The encoding of the "escape symbol" E is followed by a description of the new symbol, which is independent of the choice of escape probability and is not discussed. In almost all PPM schemes, the expression for the escape probability has been chosen heuristically. Following the method of Shtarkov et al. (see Probl. of Information Transmission, vol.31, no.2, p.20-35, 1995) we use a universal coding approach.</p>}}, author = {{Åberg, Jan and Shtarkov, Yu M. and Smeets, B. J.M.}}, booktitle = {{Proceedings 1997 IEEE International Symposium on Information Theory}}, isbn = {{0780339568}}, issn = {{2157-8095}}, language = {{eng}}, pages = {{65--65}}, publisher = {{IEEE - Institute of Electrical and Electronics Engineers Inc.}}, series = {{IEEE International Symposium on Information Theory - Proceedings}}, title = {{Estimation of escape probabilities for PPM based on universal source coding theory}}, url = {{http://dx.doi.org/10.1109/ISIT.1997.612980}}, doi = {{10.1109/ISIT.1997.612980}}, year = {{1997}}, }