Laboratory of Computer and Information Science / Neural Networks Research Centre CIS Lab Helsinki University of Technology

Laboratory of Computer and Information Science > Teaching > T-61.5060 > Exercises 2007 > Solutions 1

Solutions for the 1st exercises

  1. Possible tree implementations: RB-tree, AVL-tree, 2-3-4 tree
    A hash table:

    • hash function?
    • array size (random or prime)?
    • growable?
    • collision maneuver: chaining or open addressing?
    • secondary (n-ary?) hash function and its form?

    Well known implementations:

    • C++ STL, D: RB-tree
    • C++ Boost / ANSI C++ 2009: Hash table with some extra tricks
    • Java, .NET: Simple dynamic hash table / RB-tree
    • Perl, Ruby, Python, Lisp: Simple dynamic hash table
    • Awk: Non-growing simple hash table


    1. By the binomial distribution this is P(k = 3) = Bin(5,3)/2^5 = 5/16.
    2. Similarly, P(k >= 3) = (Bin(5,3) + Bin(5,4) + Bin(5,5))/2^5 = 1/2.
    3. P(k = 3 | k >= 3) = P(k = 3)/P(k >= 3) = 5/8. (Here Bin(n,k) is the binomial coefficient "n-choose-k".)


  2. P(3/3 heads, type = 1) = 1/3*1/8 = 1/24,
    P(3/3 heads, type = 2) = 1/3*1/64 = 1/192,
    P(3/3 heads, type = 3) = 1/3*27/64 = 9/64.
    P(type = x | 3/3 heads) = P(3/3 heads, type = x) / sum_y ( P(3/3 heads, type = y) ).
    P(type = 1 | 3/3 heads) = 2/9,
    P(type = 2 | 3/3 heads) = 1/36,
    P(type = 3 | 3/3 heads) = 3/4.


  3. Let the running time be f(n). There exists numbers A and n0 such that
    for all n >= n0, f(n) <= An^2.


  4. Var(X) = E((X-m)^2) = E(X^2 - 2*m*X + m^2) = E(X^2) - 2*m*E(X) + m^2 = E(X^2) - m^2, where m = E(X).

  5. rand('twister', sum(100*clock));
    rows = 1000;
    cols = 100;
    M = randint(rows, cols);
    a = randint(1,1,[1, rows]);
    
    dists = sum(abs(M - repmat(M(a,:), rows, 1)), 2);
    plot(dists);
    
    figure;
    Z = squareform(cols*pdist(M, 'hamming'));
    imagesc(Z); colorbar;
    
    Plots in the exercise

  6. rand('twister', sum(100*clock));
    rows = 1000;
    cols = 100;
    M = zeros(rows, cols);
    for j = 1:cols,
    	M(:,j) = randsrc(rows, 1, [ 0, 1; (j-1)/j, 1/j ]);
    end
    a = randint(1,1,[1, rows]);
    
    dists = sum(abs(M - repmat(M(a,:), rows, 1)), 2);
    plot(dists); title(sprintf('Distances to row %d', a));
    
    figure;
    Z = squareform(cols*pdist(M, 'hamming'));
    imagesc(Z); colorbar; title('Distances');
    
    figure;
    Zsums = squareform(pdist(sum(M,2)));
    imagesc(Zsums); colorbar; title('Sums');
    
    Distances to a certain row
    Row distances and their sums

    The rarity of ones makes the differences show up big as there aren't that many distances to end up in. Also, the sums of the rows explain quite a bit of the values and their patterns, because even a single extra one is something a bit extraordinary. This is why typically all vectors are either close to or far away from all the others.


You are at: CIS → T-61.5060 Exercises 2007, Solutions 1

Page maintained by t615060@james.hut.fi, last updated Monday, 15-Oct-2007 18:23:44 EEST