Wednesday, April 21, 2010

Zipf's law and entropy

Are they related?
Zipf's Law is named after the linguist George Kingsley Zipf, who discovered the law when studying the distribution of words: the second most common word in a text typically shows up about half as often as the most common word. The law has been observed in many other contexts, including firm sizes and income distribution, which follow the closely related Pareto distribution.
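
As a rough illustration (not from the original post), the rank-frequency pattern Zipf described can be sketched in a few lines of Python; the vocabulary size and total word count below are made-up numbers.

```python
# A minimal sketch: under an idealized Zipf law the frequency of the word of
# rank r is roughly proportional to 1/r, so the 2nd most common word appears
# about half as often as the 1st, the 3rd about a third as often, and so on.
def zipf_frequencies(n_words, total=1_000_000):
    """Expected counts for the top n_words ranks under an idealized 1/r law."""
    weights = [1.0 / r for r in range(1, n_words + 1)]
    norm = sum(weights)
    return [round(total * w / norm) for w in weights]

if __name__ == "__main__":
    for rank, count in enumerate(zipf_frequencies(5), start=1):
        print(f"rank {rank}: ~{count} occurrences")
```
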
That description is a Huffman encoder for language. Normally a Huffman encoder minimizes the expected length of a multi-symbol message by coding symbols according to their frequency: the more frequent a symbol, the shorter its code. When the symbol error (SNR) is equal across all the coded messages, the channel operates at capacity. Here, however, the SNR is given, biologically determined, and the lengths and frequencies are selected over time to match that fixed error term: a channel is created to match the given error.
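
A minimal sketch of the Huffman idea this paragraph appeals to, using a made-up word-frequency table: frequent symbols receive short codes, so the assigned code length tracks -log2(probability).

```python
# Sketch of Huffman's algorithm: repeatedly merge the two least-frequent nodes;
# each merge pushes the symbols it contains one level deeper, so rare symbols
# end up with longer codes. The word frequencies below are purely illustrative.
import heapq
import itertools
import math

def huffman_code_lengths(freqs):
    """Return {symbol: code length in bits} for a frequency table."""
    counter = itertools.count()  # tie-breaker so the heap never compares dicts
    heap = [(f, next(counter), {s: 0}) for s, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, d1 = heapq.heappop(heap)
        f2, _, d2 = heapq.heappop(heap)
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (f1 + f2, next(counter), merged))
    return heap[0][2]

if __name__ == "__main__":
    freqs = {"the": 0.35, "of": 0.18, "and": 0.12, "to": 0.10, "cat": 0.02}
    total = sum(freqs.values())
    for word, length in sorted(huffman_code_lengths(freqs).items(), key=lambda kv: kv[1]):
        p = freqs[word] / total
        print(f"{word:>4}: {length} bits (ideal -log2 p = {-math.log2(p):.2f})")
```
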

This is related to the uncertainty constant. If we are limited to decoding eight-bit symbols in everyday language, then the optimum distribution of lengths and frequencies for language will follow the x*log(x) rule, where x is the frequency of the thing said.
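
A small sketch of the x*log(x) term in question, assuming an idealized Zipf vocabulary (the 10,000-word size is an arbitrary choice): Shannon entropy sums -p*log2(p) over the word frequencies, which is the average number of bits an ideal code needs per word.

```python
# Entropy of an idealized Zipf-distributed vocabulary; each term -p*log2(p)
# is the x*log(x) contribution of one word, and the sum is the average code
# length (in bits per word) that an optimal encoder could achieve.
import math

def zipf_probs(n, s=1.0):
    """Normalized Zipf probabilities p(r) ~ 1/r**s for ranks 1..n."""
    weights = [1.0 / r**s for r in range(1, n + 1)]
    z = sum(weights)
    return [w / z for w in weights]

def entropy_bits(probs):
    """Shannon entropy in bits: H = -sum p*log2(p)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

if __name__ == "__main__":
    probs = zipf_probs(10_000)  # hypothetical 10,000-word vocabulary
    print(f"entropy per word: {entropy_bits(probs):.2f} bits")
```
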

In the same way, cities will organize themselves into a Huffman encoding machine for transaction sizes and transaction rates in intercity trade.
