Entropy definition [Under Information theory > Shannon's paper]

In the first article, we said that to improve the transmission rate of English text, we can assign shorter bit representations to more probable letters and longer ones to less probable letters. This idea can be extended further: mapping words, rather than letters, gives a still better transmission rate. For example, "the" and "is" occur very often and so can be given short bit representations. With letter mapping, we would have needed more than 3 bits to represent "the"; with word mapping, we may use fewer than 3 bits.

Can we improve further? Yes! Applying the grammar rules of English may give us better rates still.

So it seems that to improve the information rate, we need better and better knowledge of the English language.

The idea here is that the transmission rate can be improved as long as we invent better and better representations of the information (symbols). But, of course, there will be a limit to the achievable transmission rate. Can this limit be calculated theoretically? Is there a measure of information rate?

The answer is entropy.

Say an information source produces symbols at a constant rate, each symbol occurring with a certain probability, independent of the previous symbol or symbols. We map every symbol to a bit representation based on these probabilities and start transmitting. At the receiving end, we decode the representations and update two counters: one for the number of symbols and one for the number of bits. The entropy is the average number of bits per symbol, computed from these two counters after a sufficiently long duration (assuming the best possible mapping).
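The two-counter experiment can be sketched in code. The sketch below is a minimal illustration (not from the article): it builds a Huffman prefix code for a small hypothetical 4-symbol source, "transmits" a long random stream of symbols, and reports the average number of bits per symbol from the two counters.

```python
import heapq
import itertools
import random

def huffman_code(probs):
    """Build a Huffman prefix code from a symbol -> probability dict."""
    counter = itertools.count()  # tie-breaker so heapq never compares dicts
    heap = [(p, next(counter), {s: ""}) for s, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)  # two least probable groups
        p2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + bits for s, bits in c1.items()}
        merged.update({s: "1" + bits for s, bits in c2.items()})
        heapq.heappush(heap, (p1 + p2, next(counter), merged))
    return heap[0][2]

# Hypothetical source: 4 symbols with skewed probabilities.
probs = {"A": 0.5, "B": 0.25, "C": 0.125, "D": 0.125}
code = huffman_code(probs)  # A gets 1 bit, B gets 2, C and D get 3

# Transmit a long stream and keep the two counters.
random.seed(1)
symbols = random.choices(list(probs), weights=probs.values(), k=100_000)
n_bits = sum(len(code[s]) for s in symbols)
avg = n_bits / len(symbols)
print(avg)  # close to this source's entropy, 1.75 bits per symbol
```

With a good code, the measured average settles near the source's entropy; a naive fixed-length code would have needed 2 bits for every symbol.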

For example,

A) Say there are 32 symbols and each symbol is equally probable. The number of bits per symbol would be log2(32), so the entropy would be 5 bits per symbol.
B) Now take a different case: out of the 32 symbols, a particular symbol Sx has a probability of 50% and the rest have equal probability. Since Sx has the highest probability, it makes sense to assign it just one bit. Due to its higher probability, Sx occurs more often, making the average not 5 bits per symbol but something smaller than 5 bits per symbol.

Shannon defines entropy as (theorem 2, page 11):

H = − K Σ pi log pi

where K is a positive constant, pi is the probability of the i-th symbol, and the summation runs over all possible symbols.

For the above examples (taking K = 1):

A) 32 symbols with equal probability (1/32 each)

H = − 32 · (1/32) · log2(1/32) = 5 bits per symbol.

B) 32 symbols, Sx with probability 0.5 and the rest with 0.5/31 each

H = − ( 0.5 · log2(0.5) + 31 · (0.5/31) · log2(0.5/31) ) ≈ 3.48 bits per symbol.

As expected, the entropy for case B is lower than the entropy for case A.
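As a quick check, the two entropies above can be computed directly from the formula. This is a minimal Python sketch (the function name is my own):

```python
from math import log2

def entropy(probs):
    """Shannon entropy H = -sum(p * log2(p)), in bits per symbol (K = 1)."""
    return -sum(p * log2(p) for p in probs if p > 0)

print(entropy([1/32] * 32))            # case A: 5.0 bits per symbol
print(entropy([0.5] + [0.5/31] * 31))  # case B: about 3.48 bits per symbol
```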

In this article, we defined entropy as a measure of the best representation that can be achieved. But what exactly is the significance of entropy? We will look at that in the next article.

References: Claude E. Shannon, "A Mathematical Theory of Communication"; John R. Pierce, "An Introduction to Information Theory".

Copyright © Samir Amberkar 2010-11
