Points: Grade: 0-15 0 16-18 1 19-21 2 22-24 3 25-27 4 28-30 5You must get at least grade 1 to pass the course.

Problems:

1.1 max 3p | A discrete
memoryless source emits a sequence of statistically independent binary
digits with probabilities p(1) = 0.005 and p(0) = 0.995. The digits
are taken 100 at a time and a binary codeword is provided for every
sequence of 100 digits containing three of fewer ones.
| |||||||

1.2 max 2p | Show that the
entropy of the probability distribution, (p_{1}, ...,
p_{i}, ..., p_{j}, ..., p_{m}), is less than
the entropy of the distribution (p_{1}, ...,
(p_{i}+p_{j})/2, ..., (p_{i}+p_{j})/2,
..., p_{m}). Show that in general any transfer of probability
that makes the distribution more uniform increases the entropy. |
|||||||

2.1 max 2p | Determine a Huffman code for the text of this problem. | |||||||

2.2 max 2p | Consider all possible 256×256 pixel images with 8 bit grayscale pixels. How many images can be compressed by 20 bits or more using a uniquely decodable code? Compare the result with the number of all possible images. | |||||||

2.3 max 2p | Assume
there is a source emitting symbols a and b with probability P(a |
ß)=ß. Given a sequence s with N_{a} symbols a and
N_{b} symbols b, what is the probability P(a | s) if p(ß) =
[ß(1-ß)]^{-1/2} / pi. | |||||||

3.1 max 2p | The noisy channel coding theorem states that for a discrete memoryless channel, it is possible to achieve a finite rate of transmission with arbitrarily small error probability. Discuss the reasons why in real applications higher error probabilities are unavoidable. | |||||||

3.2 max 2p | Which
distributions p(x) maximize the differential entropy of x for each of
the following constraints:
- E(|x|)=1
- Absolute value |x| is at most one
- E(x^2)=1
| |||||||

4.1 max 2p | Recompute the posterior probability (10.33) in the book by using the given conjugate priors for the mean and variance. | |||||||

4.2 max 2p | With Laplace's method, what is the approximation
for the posterior probability density if the posterior has n local
maxima at x_{1}, x_{2}, ..., x_{n}, the
corresponding joint densities of x_{i} and the data D are
p(x_{1}, D), ..., p(x_{n}, D) and the corresponding
Hessian matrices of -log p(x_{i}, D) are H_{1}, ...,
H_{n}. | |||||||

5.1 max 2p |
Estimate in bits the total sensory experience that you have had in
your life - visual information, auditory information etc. Estimate
how much information you have memorized. Compare these with the
capacity of your brain assuming you have 10^{11} neurons
each making 1000 synaptic connections and that the capacity result
for one neuron (two bits per connection) applies. Is your brain full yet?
| |||||||

6.1 max 2p | In the
MML principle the coding cost is measured by assuming a random
discretisation of the parameters w_{i} with accuracies
e_{i}. The coding cost of the accuracies is neglected. Show
that the MML principle can be seen as a special case of ensemble
learning and that the bits-back argument also justifies the neglection
of the coding cost of the accuracies. (Hint: consider what kind of
ensemble distribution for the weights is implicitly used in the MML
principle.) | |||||||

7.1 max 1p | Exercise 20.1 in the book, page 387. | |||||||

7.2 max 2p | Exercise 21.3 in the book, page 403. | |||||||

8.1 max 2p | Exercise 19.1 in the book, page 379. | |||||||

8.2 max 2p | Exercise 13.3 in the book, page 276. |

That's all folks. Deadline 15^{th} February.

Tästä sivusta vastaa Harri.Lappalainen@hut.fi.

Sivua on viimeksi päivitetty 11.11.1998.

URL: http://www.cis.hut.fi/cis/edu/Tik-61.181/exercises.html