GREEDY MACRO TEXT COMPRESSION
| dc.contributor.author | Witten, Ian H. | |
| dc.contributor.author | Bell, Timothy | |
| dc.date.accessioned | 2008-02-27T22:27:12Z | |
| dc.date.available | 2008-02-27T22:27:12Z | |
| dc.date.computerscience | 1999-05-27 | |
| dc.date.issued | 1987-12-01 | |
| dc.description.abstract | Text compression schemes can be divided into two classes: statistical, where symbols are assigned codes based on their probabilities, and macro, where groups of consecutive symbols (phrases) are replaced by indexes into some dictionary. A subset of macro coding called Greedy Macro (GM) accounts for the vast majority of macro schemes in the literature, including all variations of the popular Ziv-Lempel method. At each coding step, GM schemes encode as many symbols as possible with a single index to the dictionary. Although this parsing strategy is not optimal, no optimal macro scheme can be implemented with a bounded coding delay. This paper defines GM coding and establishes an algorithm which takes any such scheme and constructs a statistical coding method that achieves exactly the same compression rate. Thus GM schemes can never achieve better compression than statistical ones. The conclusion is that research aimed at increasing compression should concentrate on statistical methods, leaving macro schemes for applications in which compression efficiency can be sacrificed for speed and modest memory requirements. | eng |
| dc.description.notes | We are currently acquiring citations for the work deposited into this collection. We recognize the distribution rights of this item may have been assigned to another entity, other than the author(s) of the work.If you can provide the citation for this work or you think you own the distribution rights to this work please contact the Institutional Repository Administrator at digitize@ucalgary.ca | |
| dc.identifier.department | 1987-285-33 | |
| dc.identifier.doi | http://dx.doi.org/10.11575/PRISM/31160 | |
| dc.identifier.uri | http://hdl.handle.net/1880/46158 | |
| dc.language.iso | Eng | |
| dc.publisher.corporate | University of Calgary | |
| dc.publisher.faculty | Science | |
| dc.subject | Computer Science | eng |
| dc.title | GREEDY MACRO TEXT COMPRESSION | eng |
| dc.type | unknown | |
| thesis.degree.discipline | Computer Science | eng |