[an error occurred while processing this directive] [an error occurred while processing this directive] [an error occurred while processing this directive] [an error occurred while processing this directive] [an error occurred while processing this directive] [an error occurred while processing this directive] [an error occurred while processing this directive] [an error occurred while processing this directive] [an error occurred while processing this directive] [an error occurred while processing this directive] [an error occurred while processing this directive] /Opinnot/T-61.5060/2007/solutions/sol1.shtml [an error occurred while processing this directive] [an error occurred while processing this directive] [an error occurred while processing this directive]

## Solutions for the 1st exercises

1. Possible tree implementations: RB-tree, AVL-tree, 2-3-4 tree
A hash table:

• hash function?
• array size (random or prime)?
• growable?
• collision maneuver: chaining or open addressing?
• secondary (n-ary?) hash function and its form?

Well known implementations:

• C++ STL, D: RB-tree
• C++ Boost / ANSI C++ 2009: Hash table with some extra tricks
• Java, .NET: Simple dynamic hash table / RB-tree
• Perl, Ruby, Python, Lisp: Simple dynamic hash table
• Awk: Non-growing simple hash table

1. By the binomial distribution this is P(k = 3) = Bin(5,3)/2^5 = 5/16.
2. Similarly, P(k >= 3) = (Bin(5,3) + Bin(5,4) + Bin(5,5))/2^5 = 1/2.
3. P(k = 3 | k >= 3) = P(k = 3)/P(k >= 3) = 5/8. (Here Bin(n,k) is the binomial coefficient "n-choose-k".)

2. P(3/3 heads, type = 1) = 1/3*1/8 = 1/24,
P(3/3 heads, type = 2) = 1/3*1/64 = 1/192,
P(3/3 heads, type = 3) = 1/3*27/64 = 9/64.
P(type = x | 3/3 heads) = P(3/3 heads, type = x) / sum_y ( P(3/3 heads, type = y) ).
P(type = 1 | 3/3 heads) = 2/9,
P(type = 2 | 3/3 heads) = 1/36,
P(type = 3 | 3/3 heads) = 3/4.

3. Let the running time be f(n). There exists numbers A and n0 such that
for all n >= n0, f(n) <= An^2.

4. Var(X) = E((X-m)^2) = E(X^2 - 2*m*X + m^2) = E(X^2) - 2*m*E(X) + m^2 = E(X^2) - m^2, where m = E(X).

5. ```rand('twister', sum(100*clock));
rows = 1000;
cols = 100;
M = randint(rows, cols);
a = randint(1,1,[1, rows]);

dists = sum(abs(M - repmat(M(a,:), rows, 1)), 2);
plot(dists);

figure;
Z = squareform(cols*pdist(M, 'hamming'));
imagesc(Z); colorbar;
```
Plots in the exercise

6. ```rand('twister', sum(100*clock));
rows = 1000;
cols = 100;
M = zeros(rows, cols);
for j = 1:cols,
M(:,j) = randsrc(rows, 1, [ 0, 1; (j-1)/j, 1/j ]);
end
a = randint(1,1,[1, rows]);

dists = sum(abs(M - repmat(M(a,:), rows, 1)), 2);
plot(dists); title(sprintf('Distances to row %d', a));

figure;
Z = squareform(cols*pdist(M, 'hamming'));
imagesc(Z); colorbar; title('Distances');

figure;
Zsums = squareform(pdist(sum(M,2)));
imagesc(Zsums); colorbar; title('Sums');
```
Distances to a certain row
Row distances and their sums

The rarity of ones makes the differences show up big as there aren't that many distances to end up in. Also, the sums of the rows explain quite a bit of the values and their patterns, because even a single extra one is something a bit extraordinary. This is why typically all vectors are either close to or far away from all the others.

You are at: CIS → /Opinnot/T-61.5060/2007/solutions/sol1.shtml

Page maintained by webmaster at cis.hut.fi, last updated Monday, 15-Oct-2007 18:23:44 EEST

[an error occurred while processing this directive]
 WWW www.cis.hut.fi