Universal variable-to-fixed length source codes
Abstract
A universal variable-to-fixed length algorithm for binary memoryless sources which converges to the entropy of the source at the optimal rate is known. We study the problem of universal variable-to-fixed length coding for the class of Markov sources with finite alphabets. We give an upper bound on the performance of the code for large dictionary sizes and show that the code is optimal in the sense that no codes exist that have better asymptotic performance. The optimal redundancy is shown to be H log log M/log M where H is the entropy rate of the source and M is the code size. This result is analogous to Rissanen's result for fixed-to-variable length codes. We investigate the performance of a variable-to-fixed coding method which does not need to store the dictionaries, either at the coder or the decoder. We also consider the performance of both these source codes on individual sequences. For individual sequences we bound the performance in terms of the best code length achievable by a class of coders. All the codes that we consider are prefix-free and complete.