Randomized algorithms |
Exercises 2, due February 16, 2000
- Assume we have a file containing N records, where N is so
large that we cannot keep all of the file in memory at one time.
We wish to select a random subset of n records from the file.
We will use the following algorithm:
Set t:=0, m:=0.
Repeat until m=n:
Generate a random number U, uniformly distributed between 0 and 1.
If U<(n-m)/(N-t), then
Add the next record from the file into the subset.
Skip the next record.
Prove that the algorithm doesn't try to read more than N records
from the file. Prove that each of the
subsets is equally likely to be selected.
- Let p(x,y) be the probability that exactly x records are
selected from among the first y by the previous algorithm.
- What is the expected value of t when the algorithm
terminates? What is the variance?
- Devise an algorithm to select a random permutation of n objects using O(n) random numbers. Each permutation should be
Wednesday, 09-Feb-2000 20:43:29 EET