Design and Analysis of Algorithms Midterm
Problems
- Write pseudo-code for a double hashing scheme. You may simply
call your hashing functions h1() and h2(). (In other words, you
needn't actually write the hashing functions.)
Example code here.
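One possible sketch, in Python standing in for pseudo-code. The hash functions h1 and h2 here are placeholder division-method choices (any suitable pair works, as the problem allows); the table is a fixed-size array of (key, value) slots.

```python
# Double hashing: probe sequence (h1(key) + i * h2(key)) % m, i = 0, 1, 2, ...

def h1(key, m):                    # primary hash (placeholder choice)
    return key % m

def h2(key, m):                    # secondary hash: never 0, and with m
    return 1 + key % (m - 1)       # prime it is relatively prime to m

def insert(table, key, value):
    m = len(table)
    for i in range(m):
        slot = (h1(key, m) + i * h2(key, m)) % m
        if table[slot] is None or table[slot][0] == key:
            table[slot] = (key, value)
            return slot
    raise RuntimeError("hash table is full")   # open addressing can fill up

def search(table, key):
    m = len(table)
    for i in range(m):
        slot = (h1(key, m) + i * h2(key, m)) % m
        if table[slot] is None:
            return None            # hit an empty slot: key is absent
        if table[slot][0] == key:
            return table[slot][1]
    return None

table = [None] * 11
insert(table, 24, "a")
insert(table, 35, "b")   # 35 % 11 == 24 % 11 == 2: collides, re-probes
```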
- What operations do we want in the ADT iterator?
Consider including rewind() in the ADT: is it always going to
be possible to implement it? If "no," in which situations
would it not be possible?
- Get iterator.
- Get next value.
- Detect end of iterable.
Can't rewind if data is gone, e.g., live stock feed.
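A minimal sketch of the ADT in Python; the method names (has_next, next, rewind) are illustrative, not prescribed.

```python
# Iterator ADT over a list source.
class ListIterator:
    def __init__(self, data):
        self._data = data
        self._pos = 0

    def has_next(self):            # detect end of iterable
        return self._pos < len(self._data)

    def next(self):                # get next value
        value = self._data[self._pos]
        self._pos += 1
        return value

    def rewind(self):              # possible here because the data persists;
        self._pos = 0              # not possible for, e.g., a live stock feed

it = ListIterator([10, 20, 30])
values = [it.next() for _ in range(3)]   # [10, 20, 30]
it.rewind()
```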
- Let us say we have an algorithm that runs in 8(lg n) + 10 time.
We want to show its order of complexity is θ(lg n). Give
an example of a set of k1, k2,
and n0 that would do this, so that
k1 * lg n < 8(lg n) + 10 < k2 * lg n for all n ≥ n0.
Example values:
k1 = 7
k2 = 9
n0 = 2048
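A quick numeric check of these constants (note k1 must fall below the leading constant 8 and k2 above it; 9 lg n exceeds 8 lg n + 10 once lg n > 10):

```python
import math

# Verify k1 * lg n < 8 * lg n + 10 < k2 * lg n over a range of n >= n0.
k1, k2, n0 = 7, 9, 2048
ok = all(
    k1 * math.log2(n) < 8 * math.log2(n) + 10 < k2 * math.log2(n)
    for n in range(n0, n0 + 10000)
)
```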
- Imagine you have an 8-sided die, numbered 1 through 8. Use an
indicator random variable to show how many 8s we expect to get
in 100 rolls.
See here for example.
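A sketch of the indicator-variable argument, worked in Python: let X_i be 1 when roll i shows an 8 and 0 otherwise, so E[X_i] = 1/8, and by linearity of expectation E[X] = sum of the E[X_i] over 100 rolls.

```python
from fractions import Fraction

# Indicator variable X_i: 1 if roll i shows an 8, else 0.
# E[X_i] = P(roll i is an 8) = 1/8 for a fair 8-sided die.
# Linearity of expectation: E[X] = E[X_1] + ... + E[X_100] = 100 * 1/8.
p_eight = Fraction(1, 8)
expected_eights = sum(p_eight for _ in range(100))   # 100 * 1/8 = 12.5
```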
- Draw the binary search tree that arises from inserting, in
order:
24, 32, 10, 6, 88, 20, 28, 22, 1, 99
- In the above tree, describe the process by which we find the
successor of 22.
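Both answers can be sketched together in Python: build the tree with parent pointers, then find the successor of 22. Since 22 has no right child, we climb until we leave a right subtree (22 is the right child of 20, 20 the right child of 10, and 10 the left child of 24), so the successor is 24.

```python
class Node:
    def __init__(self, key, parent=None):
        self.key, self.parent = key, parent
        self.left = self.right = None

def insert(root, key):
    parent, cur = None, root
    while cur is not None:             # walk down to find the insert point
        parent = cur
        cur = cur.left if key < cur.key else cur.right
    node = Node(key, parent)
    if parent is None:
        return node
    if key < parent.key:
        parent.left = node
    else:
        parent.right = node
    return root

def minimum(node):
    while node.left is not None:
        node = node.left
    return node

def successor(node):
    if node.right is not None:         # successor is min of right subtree
        return minimum(node.right)
    up = node.parent                   # otherwise climb until we come
    while up is not None and node is up.right:   # out of a LEFT subtree
        node, up = up, up.parent
    return up

root = None
for k in [24, 32, 10, 6, 88, 20, 28, 22, 1, 99]:
    root = insert(root, k)

# Resulting shape:
#             24
#        10        32
#      6    20   28   88
#     1       22         99

node22 = root.left.right.right         # 24 -> 10 -> 20 -> 22
succ = successor(node22)               # climbs 22 -> 20 -> 10, lands on 24
```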
- Use an indicator random variable to show how many numbers we
have to pick from a set of n numbers before we are
more likely than not to have drawn the same number twice.
See here for example.
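One way to make this concrete: compute the smallest number of draws k (uniform, with replacement, from n values) for which the probability of a repeat exceeds 1/2, using P(all k distinct) = ∏ (n − i)/n for i = 0 .. k−1. A sketch:

```python
from fractions import Fraction

# Smallest k such that P(some value repeats among k draws from n) > 1/2.
def picks_for_likely_repeat(n):
    p_distinct = Fraction(1)
    k = 0
    while p_distinct > Fraction(1, 2):
        p_distinct *= Fraction(n - k, n)   # next draw avoids the k seen so far
        k += 1
    return k

picks_for_likely_repeat(365)   # classic birthday problem: 23
```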
- Write pseudo-code for a hash table (you can just assume a hash
function of h) that resolves collisions by chaining, and that
looks up a (possibly chained) key and returns the
associated data.
Example code here.
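A Python sketch of one such table. The hash function h here is a stand-in (division method, as the problem allows), and each slot holds its chain as a list of (key, value) pairs.

```python
# Hash table with chaining: colliding keys share a slot's chain.
def h(key, m):                         # stand-in hash function
    return key % m

def make_table(m):
    return [[] for _ in range(m)]      # each slot starts as an empty chain

def put(table, key, value):
    chain = table[h(key, len(table))]
    for i, (k, _) in enumerate(chain):
        if k == key:                   # key already present: overwrite
            chain[i] = (key, value)
            return
    chain.append((key, value))         # otherwise append to the chain

def lookup(table, key):
    for k, v in table[h(key, len(table))]:   # walk the (possibly chained) slot
        if k == key:
            return v
    return None                        # key not found

t = make_table(7)
put(t, 3, "three")
put(t, 10, "ten")    # 10 % 7 == 3 % 7: chains in the same slot as key 3
```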
- Write pseudo-code to search a BST for a particular key.
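A minimal iterative version in Python; the Node class is just scaffolding for the example.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def bst_search(node, key):
    # Walk down: go left when the key is smaller, right when larger.
    while node is not None and node.key != key:
        node = node.left if key < node.key else node.right
    return node            # the matching node, or None if the key is absent

root = Node(24, Node(10, Node(6), Node(20)), Node(32))
```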
- The logistic equation is
x_n = r * x_(n-1) * (1 - x_(n-1)).
Write memoized pseudo-code that uses recursion, but also
uses direct addressing to store results.
Remember to both store value and retrieve the value!
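One way this could look in Python: the memo is a plain array indexed directly by n (direct addressing), and the recursive step both stores the computed value and retrieves it on later calls.

```python
# Memoized logistic recurrence x_n = r * x_(n-1) * (1 - x_(n-1)).
def make_logistic(r, x0, max_n):
    memo = [None] * (max_n + 1)        # direct-address table: slot n holds x_n
    memo[0] = x0

    def x(n):
        if memo[n] is None:                 # not yet computed?
            prev = x(n - 1)                 # recurse on x_(n-1)...
            memo[n] = r * prev * (1 - prev) # ...and STORE the result
        return memo[n]                      # RETRIEVE the stored value

    return x

x = make_logistic(3.5, 0.5, 100)
x(10)
```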
Multiple Choice
- Compared to chaining, open addressing has the downside that:
- It is very slow.
- The hash table can fill up. *
- It can't resolve collisions.
- The hash table takes a huge amount of space.
- We have one algorithm for processing customer records
with run time of O(n), and another with
run time of O(lg n) + 1000. In what circumstances might we want
to choose the O(n) algorithm?
- No circumstances.
- If our programmers are really bad.
- We believe our program will always be dealing with a
number of records less than one thousand. *
- If n is very large.
- Under what circumstances might we most reasonably expect to
receive worst-case input to our binary search tree?
- We are loading the tree from an existing data store
which is sorted. *
- We insert data into the tree as it comes along to us.
- The data has been randomly shuffled.
- All of the above.
- When we randomize an algorithm, we then speak of its
- necessary running time.
- best-case running time.
- expected running time. *
- unbounded running time.
- The linearity of expectations means that, for two random variables X
and Y, the expectation of their sum X + Y equals
- the product of the expectation of X
and the expectation of Y
- the difference of the expectation of X
and the expectation of Y
- the square of the expectation of X
and the expectation of Y
- the sum of the expectation of X and
the expectation of Y *
- The worst-case running time of a search on a binary search
tree is
- θ(n^2)
- θ(n) *
- θ(lg n)
- θ(1)
- If we have a binary search tree of height h, then
all searching functions such as min, max, and successor
will run in
- O(h) time. *
- O(h^2) time.
- O(ln h) time.
- O(lg h) time.
- According to the binary search tree property
- for all nodes y higher in the tree than x,
x.key ≤ y.key.
- for all nodes y in the right subtree of x,
x.key ≤ y.key. *
- for all nodes y lower in the tree than x,
x.key ≤ y.key.
- all nodes in the right subtree of x will
be the successors of x.
- The expected value of a random variable X is:
- the weighted sum of all its values, weighted
according to the probability that they happen. *
- the single value that occurs most often.
- the value in the middle of its probability
distribution.
- all of the above, as they are equivalent.
- Θ-notation applied to a function implies
it is
- a function we know little about.
- asymptotically bound from above and below. *
- asymptotically bound from below only.
- asymptotically bound from above only.
- For large n, an algorithm will run the slowest
if its time complexity is:
- O(20n lg n)
- O(n!) *
- O(n^3)
- O(2^n)
- An algorithm is most like:
- a carefully designed recipe. *
- a car engine.
- an organic molecule.
- a literary work.
- O-notation applied to a function implies
- it is a function we know little about.
- it is asymptotically bound from above and below.
- only that it is asymptotically bound from below.
- only that it is asymptotically bound from above. *
- For large n, an algorithm will run the fastest
if its time complexity is:
- O((1.1)^n)
- O(n!)
- O(n^2) *
- O(n^3)
- Besides running time, we can also measure algorithm performance
by:
- disk usage
- memory usage
- power consumed
- all of the above *
- If f(n) = ω(g(n)), that means
- g dominates f asymptotically
- f is bounded below by g asymptotically
- f is bounded above by g asymptotically
- f dominates g asymptotically *
- Which of these functions grows most slowly?
- lg n
- lg* n *
- ln n
- log_10 n
- Ω-notation applied to a function implies
- it is a function we know little about.
- it is asymptotically bound from above and below.
- only that it is asymptotically bound from below. *
- only that it is asymptotically bound from above.
- In algorithm analysis, we usually analyze algorithms
in terms of
- actual running time in nanoseconds.
- the number of disk operations.
- the number of basic operations. *
- CPU time used.
- In algorithm analysis, we usually look at
- worst-case performance. *
- average-case performance.
- best-case performance.
- median-case performance.
- The chief advantage of a doubly-linked list over a singly-linked list is
- we can walk the list in either direction. *
- inserts are faster.
- deletes are faster.
- less memory is used.
- A good use of direct-address tables would be
- memoization of a numeric algorithm.
- coding the Sieve of Eratosthenes.
- marking zipcodes seen in a mailing application.
- all of the above. *
- The worst case of hashing with chaining occurs when
- the input comes in sorted already.
- all inputs hash to different values.
- the input is purely random.
- all inputs hash to the same value. *
- The "clustering" problem with open addressing with linear
probing means that
- "clusters" of keys will all hash to the same value.
- "clusters" of keys will build up in a linked list.
- the hash table clusters around its central value.
- once inputs start to cluster in one area of the
hash table, they will become more likely to do so. *
- A random variable is actually
- the mean of all possible values.
- an algorithm.
- a precise real number value.
- a function. *