Combined entropy
In the last article, we briefly mentioned combined entropy; let us spend a little more time understanding the concept. Say we have two information sources, Ia and Ib, each having its own set of symbols and bit representations, i.e. entropies H(a) and H(b) respectively. What would be their combined entropy, i.e. H(a,b)?
Let us take a couple of examples.
Case I: Independent sources
Say Ia has two symbols, S1a and S2a, with equal probabilities, and Ib has four symbols, S1b, S2b, S3b and S4b, again with equal probabilities. A simple example of an Ia transmission would be S1a S2a S1a S2a S1a ... and that of Ib would be S1b S2b S3b S4b S1b S2b S3b S4b S1b ... If we combine the Ia and Ib transmissions, the symbols of the resultant transmission would be (S1a S1b), (S1a S2b), (S1a S3b), (S1a S4b), (S2a S1b), (S2a S2b), (S2a S3b), (S2a S4b), assuming Ia and Ib are independent. An example of the combined transmission (let us call it Ic) would be (S1a S1b) (S1a S2b) (S1a S3b) (S1a S4b) (S2a S1b) (S2a S2b) (S2a S3b) (S2a S4b). Note that all these examples maintain the probabilities that we have assumed, and also that the probabilities of the Ia and Ib symbols are maintained in the example transmission for Ic.
Ia would require 1 bit per symbol for representation, Ib would require 2 bits per symbol, and Ic would require 3 bits per symbol. Now if you look at the above example, Ic requiring 3 bits per symbol makes sense, as Ic symbols contain both Ia and Ib symbols. Since Ia and Ib symbols are produced independently, the average number of bits per symbol required for Ic has to be the sum of the average bits per symbol required for Ia and that required for Ib.
In other words, H(a,b) = H(a) + H(b).
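To make Case I concrete, here is a minimal numerical sketch in Python (ours, not part of the original article); the symbol names and probabilities follow the example above, and the helper function entropy is our own.

```python
from math import log2

def entropy(probs):
    """Shannon entropy in bits of a probability distribution."""
    return -sum(p * log2(p) for p in probs if p > 0)

# Ia: two equally likely symbols; Ib: four equally likely symbols.
p_a = {"S1a": 0.5, "S2a": 0.5}
p_b = {"S1b": 0.25, "S2b": 0.25, "S3b": 0.25, "S4b": 0.25}

# Independent sources: the joint probability is the product of the marginals,
# giving the eight combined symbols listed above.
p_ab = {(sa, sb): pa * pb for sa, pa in p_a.items() for sb, pb in p_b.items()}

H_a = entropy(p_a.values())      # 1 bit
H_b = entropy(p_b.values())      # 2 bits
H_ab = entropy(p_ab.values())    # 3 bits

print(H_a, H_b, H_ab)            # 1.0 2.0 3.0  ->  H(a,b) = H(a) + H(b)
```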
Case II: Dependent sources
What if Ib is dependent on Ia? Say Ib's symbols S1b & S2b are produced only if the symbol produced by Ia is S1a; likewise, Ib produces S3b & S4b only when Ia produces S2a. The possible symbols for the combined transmission would be (S1a S1b), (S1a S2b), (S2a S3b), (S2a S4b). The entropies of Ia and Ib remain the same as before, but the number of bits per symbol required for Ic would be 2. This is because, for the combined transmission, once the Ia symbol is produced, we have only two choices left among the Ib symbols => two choices mean 1 bit per symbol => that makes the total 1 bit per symbol for Ia + 1 bit per symbol for the Ib symbol = 2 bits per symbol. What we have used here is called conditional entropy (or equivocation).
In other words, H(a,b) = H(a) + Ha(b), where Ha(b) denotes the conditional entropy of b given a, i.e. the average uncertainty in Ib's symbol once Ia's symbol is known.
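The same kind of numerical check can be done for Case II. The sketch below (again ours) assumes that once Ia's symbol is fixed, Ib picks either of its two allowed symbols with probability 1/2; it computes Ha(b) directly from the conditional probabilities and confirms that H(a) + Ha(b) equals the joint entropy of 2 bits.

```python
from math import log2

def entropy(probs):
    """Shannon entropy in bits of a probability distribution."""
    return -sum(p * log2(p) for p in probs if p > 0)

# Only four combined symbols are possible; each is assumed to occur with
# probability 1/4 (Ib picks either allowed symbol with probability 1/2).
p_ab = {("S1a", "S1b"): 0.25, ("S1a", "S2b"): 0.25,
        ("S2a", "S3b"): 0.25, ("S2a", "S4b"): 0.25}

# Marginal distribution of Ia recovered from the joint distribution.
p_a = {}
for (sa, _), p in p_ab.items():
    p_a[sa] = p_a.get(sa, 0) + p

# Ha(b): average, over Ia's symbols, of the entropy of the choices left to Ib.
H_b_given_a = 0.0
for sa, pa in p_a.items():
    cond = [p / pa for (xa, _), p in p_ab.items() if xa == sa]
    H_b_given_a += pa * entropy(cond)

H_a = entropy(p_a.values())
H_ab = entropy(p_ab.values())
print(H_a, H_b_given_a, H_ab)    # 1.0 1.0 2.0  ->  H(a,b) = H(a) + Ha(b)
```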
The Case II situation can also be seen from the other direction: say we know the symbol produced by Ib; by knowing the conditional probabilities, we can correctly guess the symbol produced by Ia. So we need only the transmission of the Ib symbols, making the combined entropy 2 bits per symbol. This matches our earlier value.
In the example, we have taken a many-to-one dependency (a number of Ib symbols depend on one Ia symbol), so knowing the Ib symbol removes all uncertainty about the Ia symbol. There could also be a many-to-many dependency, in which case knowing the Ib symbol only reduces that uncertainty, and the conditional term is no longer zero. Taking care of this, we can write:
H(a,b) = H(b) + Hb(a), where Hb(a) is the conditional entropy of a given b.
Thus H(a,b) = H(a) + Ha(b) = H(b) + Hb(a) = H(b,a).
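As a quick check that this relation holds when neither source fully determines the other, the sketch below uses a made-up many-to-many joint distribution (the numbers are ours, chosen only for illustration) and verifies that H(a) + Ha(b) and H(b) + Hb(a) both equal H(a,b).

```python
from math import isclose, log2

def entropy(probs):
    """Shannon entropy in bits of a probability distribution."""
    return -sum(p * log2(p) for p in probs if p > 0)

# A made-up many-to-many joint distribution: S2b and S3b can follow either
# Ia symbol, so knowing one source only reduces (does not remove) the
# uncertainty about the other.
p_ab = {("S1a", "S1b"): 0.2, ("S1a", "S2b"): 0.2, ("S1a", "S3b"): 0.1,
        ("S2a", "S2b"): 0.1, ("S2a", "S3b"): 0.2, ("S2a", "S4b"): 0.2}

# Marginal distributions of Ia and Ib.
p_a, p_b = {}, {}
for (sa, sb), p in p_ab.items():
    p_a[sa] = p_a.get(sa, 0) + p
    p_b[sb] = p_b.get(sb, 0) + p

def conditional_entropy(joint, marginal, idx):
    """Average entropy of the remaining coordinate once coordinate idx is known."""
    h = 0.0
    for s, ps in marginal.items():
        cond = [p / ps for pair, p in joint.items() if pair[idx] == s]
        h += ps * entropy(cond)
    return h

H_ab = entropy(p_ab.values())
lhs_a = entropy(p_a.values()) + conditional_entropy(p_ab, p_a, 0)  # H(a) + Ha(b)
lhs_b = entropy(p_b.values()) + conditional_entropy(p_ab, p_b, 1)  # H(b) + Hb(a)

print(isclose(lhs_a, H_ab), isclose(lhs_b, H_ab))   # True True
```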
Since entropy is about uncertainties, we can recheck the above equations from the uncertainty point of view.
When information sources are independent, it makes logical sense that the uncertainty associated with the combined transmission will be the sum of the uncertainties associated with the individual sources.
When sources are dependent, once the symbol of a particular source becomes known, the uncertainty associated with the connected (or dependent) source(s) decreases (there are fewer options to choose from), and vice versa. And so, for dependent sources, conditional probabilities apply!
The above equations seem to be doing just what we expected from thinking of entropy as a measure of uncertainty.
References: A Mathematical Theory of Communication by Claude E. Shannon, An Introduction to Information Theory by John R. Pierce.
Copyright © Samir Amberkar 2010-11