Insertion sort animation:
Pseudo-code:
We use pseudo-code instead of "real" code. It keeps the "code" language-independent and lets us express ideas in the simplest way possible.
A key idea in proving correctness is the loop invariant: a property of the data that holds throughout the algorithm's main loop. We must show the algorithm maintains the invariant at three points: initialization (it holds before the first iteration), maintenance (if it holds before an iteration, it still holds before the next), and termination (when the loop ends, the invariant implies the algorithm is correct).
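As a sketch of how this plays out in code (assuming a standard ascending-order insertion sort; the function name `insertion_sort` is ours), here is Python with comments marking the three points of the invariant "the prefix a[0:i] is sorted":

```python
def insertion_sort(a):
    # Initialization: before the first iteration, a[0:1] holds a single
    # element and is therefore trivially sorted.
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        # Maintenance: a[0:i] is sorted; shift elements greater than
        # key one slot right to open the correct position for key.
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key
        # After this iteration, a[0:i+1] is sorted.
    # Termination: the loop ends after i = len(a) - 1, so the sorted
    # prefix is the entire list.
    return a
```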
We surely do not want to talk about actual time in nanoseconds, as that depends on too many external things: what other processes are running, details of the hardware, compiler switches, and so on. So instead we come up with a very primitive abstract machine (RAM = the Random Access Machine: a CPU plus directly addressable memory), which we analyze instead of a real computer. The primitive operations are then memory reads/writes and operations in the CPU, such as address arithmetic, additions, multiplications, divisions, etc. Our idea of "running time" is then simply the number of primitive operations performed when the program runs.
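To make the counting idea concrete, here is a small sketch (the tally below is our own rough bookkeeping, not an official definition of the RAM model's costs) that counts reads, writes, and additions while summing a list:

```python
def sum_with_count(a):
    """Sum a list, tallying a rough count of primitive operations."""
    ops = 0
    total = 0          # one memory write
    ops += 1
    for x in a:        # per element: one read, one addition, one write
        total += x
        ops += 3
    return total, ops

total, ops = sum_with_count([4, 8, 15, 16])
# ops grows linearly with the input size: 1 + 3n for n elements
```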
Even for inputs of a fixed size (say, your favorite program sorting 10 numbers), different specific inputs will produce different performance. For example, a clever algorithm may notice that the input is already sorted and not try sorting it again. So, we distinguish between best- and worst-case performance (minimum and maximum number of operations performed by the algorithm, over all possible legal inputs of a given size). (There is also the concept of "average-case," but it is tricky, as it requires a definition of what average means: technically, you need to specify a distribution of the inputs. We will mostly stay away from average-case analysis and focus on the worst case; we will occasionally look at the best case too.)
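As a sketch of how best and worst cases can differ (the early-exit check here is our own illustration, not an algorithm from the text), consider a sort that first tests whether the input is already sorted. In the best case it does only about n comparisons; in the worst case it pays for the scan plus a full sort:

```python
def is_sorted(a):
    # At most len(a) - 1 comparisons: a single linear scan.
    return all(a[i] <= a[i + 1] for i in range(len(a) - 1))

def lazy_sort(a):
    # Best case (already sorted): one linear scan, then done.
    if is_sorted(a):
        return a
    # Otherwise fall back to a full sort (here Python's built-in).
    return sorted(a)
```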
What we really want is not the running time (as the number of operations, say, in the worst case) for a specific input size, but the running time as a function of the input size, over all sizes. For example, it may be that a given sorting algorithm requires 4n² + 4n + 24 operations to sort n items.
In order to make life easier, we will focus not on the exact value of such a function, but on its asymptotic behavior.
In particular, we want to focus on the highest power of n in any function expressing the run time of an algorithm. That is because, as input size grows, that power will dwarf all others in significance. Consider the contribution of n² and n in the runtime of the sorting algorithm mentioned above for various values of n:
n | n²
---|---
10 | 100
100 | 10,000
1,000 | 1,000,000
10,000 | 100,000,000
100,000 | 10,000,000,000
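A quick computation (using the hypothetical cost function 4n² + 4n + 24 from above) shows how completely the n² term comes to dominate the total:

```python
def cost(n):
    """Hypothetical operation count from the example above."""
    return 4 * n**2 + 4 * n + 24

# What fraction of the total cost comes from the quadratic term?
for n in (10, 1000, 100000):
    share = (4 * n**2) / cost(n)
    print(n, round(share, 6))
# Even at n = 10 the n² term is most of the cost; by n = 100,000
# the other terms are negligible.
```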
Now consider that Amazon processes tens of millions of transactions per day: of what significance is the n term in the performance of their servers, if they are using an algorithm with an n² term involved to sort them?
We will try to provide you with as many opportunities as possible to try out these algorithms. Here is Python code for a swap routine, which our textbook uses periodically:
```python
l = [1, 2, 3]

def swap(l, i, j):
    temp = l[i]
    l[i] = l[j]
    l[j] = temp
```
Paste that code into the Python console below and hit return. Then type 'swap(l, 0, 2)' and hit return. Then type 'l'. You should see '[3, 2, 1]': you swapped elements 0 and 2!
Now type in the following:
```python
def bubble_sort(l):
    for i in range(0, len(l) - 1):
        for j in range(len(l) - 1, i, -1):
            if l[j] < l[j - 1]:
                print("Swapping " + str(l[j]) + " and " + str(l[j - 1]))
                swap(l, j, j - 1)
    return l
```
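To check your typing, here is the same routine in a self-contained form (the swap definition repeated and the print statements omitted for brevity), along with a sample call:

```python
def swap(l, i, j):
    temp = l[i]
    l[i] = l[j]
    l[j] = temp

def bubble_sort(l):
    # Repeatedly bubble the smallest remaining element toward the front.
    for i in range(0, len(l) - 1):
        for j in range(len(l) - 1, i, -1):
            if l[j] < l[j - 1]:
                swap(l, j, j - 1)
    return l

result = bubble_sort([3, 1, 2])
# result is now [1, 2, 3]
```

In the console, typing 'bubble_sort([3, 1, 2])' should print the swaps performed and then display the sorted list.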