Algorithms and Analysis
Bart Massey
What is an algorithm?
Instructions for solving a problem
"Mechanical": do not require special reasoning
Things to worry about:
- Can it be executed? (c.f. Kelly-Bootle's Algorithm)
- Will it finish?
- How long will it take?
- Will it produce a "right answer"?
Instances, Problems
In CS, a "problem" is a collection of related "instances"
Problem descriptions are in a special format
Example: POSITIVE INTEGER PRODUCT
Instance: Positive integers m and n
Find: The product m × n
Note that 6×5 is an instance of POSITIVE INTEGER PRODUCT
Classes
In CS, a "class" is a collection of problems
- Example: The class of "arithmetic problems"
While there's lots of ways to classify (collect) problems, one is special: performance
We will talk more about analysis in a bit
Algorithms solve instances
In CS, an "algorithm" is a set of detailed, organized steps for solving all instances of a problem
Algorithms are always described by "pseudocode" — instructions for what things to do in what order
Example: MULTIPLICATION BY ITERATED ADDITION
To multiply positive integers m and n:
p ← 0
repeat m times
p ← p + n
return p
Algorithms take time and use resources
For a computing algorithm
- Time is usually measured in clock cycles
- The usual resource is memory, measured in bits
Questions about MULTIPLICATION BY ITERATED ADDITION:
- Will it always complete? (yes)
- How much storage does it take? (one integer's worth)
- How long does it take? (m cycles…maybe?)
- Does it produce a "right answer"? (yes, but needs proof)
Uses lg m + lg n bits of storage. We write the number of bits needed to represent an integer m as |m|.
Takes m steps to complete: maybe m + |n| when large addition costs are taken into account.
The Church-Turing Theses
Alonzo Church: Pioneer of mathematics of algorithms, pre-computer
Alan Turing: Pioneer of computing
Weak Church-Turing Thesis: The class of problems computable on a Turing Machine is the class of problems computable on any other physical computer
Strong Church-Turing Thesis: The class of problems tractably computable…
Measuring performance
Decide what operations to count, and for how much
For a given instance, run (notionally) the pseudocode and compute the cost
Problem: How to count instances
For a given algorithm each instance of the problem it solves likely takes a different amount of time
Can't really report performance on all possible instances: need a statistic
Obvious statistics don't work: for example, MBIA has an infinite number of arbitrarily large instances, so the mean over all instances is infinity
Approximation: Group instances by input size
For MBIA, the input size is |m| + |n|
Note that MBIA runs in same time for any given |m|, within a factor of two.
Report "asymptotic" runtime as function of |m|.
Approximation: Don't worry about small inputs
- For MBIA, cost of initialization of p is boring as m gets large: leave it out of cost function
Approximation: Don't worry about constants
For MBIA, ignore that factor of two. Report the runtime as O(|m|)
"Big-O" notation is formalizable. We'll talk about the details next lecture.
Approximation: Use a statistic to combine instances
Can take the maximum, minimum, or average runtime over all instances of given input sizes. For MBIA they are all the same, but…
The maximum is especially interesting, since that's the "worst-case"
Complexity Classes
Consider the asymptotic worst-case big-O complexity of a bunch of algorithms for multiplication
The best of these provides an upper bound on the worst-case running time for some hypothetical "best" multiplication algorithm
It is sensible to group together algorithms that have a given worst-case running time
We will talk next time about these "complexity classes" and what they mean to computing
Why are we doing all this?
You are often asked to provide a computer program to solve a problem
Sometimes this requires a non-trivial algorithm
You have to pick or invent an algorithm that gets you the performance you need
Thus, you have to know what the performance is
This whole business provides a practical approach to that
Book Example: "Robot Tour" Optimization
AKA Euclidean Travelling Salesman, Traveling Salesperson, etc.
Find the shortest path from point to point in a scatter of 2D points that visits each point exactly once.
ROBOT TOUR Input: A set P of 2D points Output: A sequence S of points such that |S| = |P|, ran(S) = P, and the "wraparound sum" of Euclidean distances D(S) = sum(i=1..|S| . dist(S[i], S[(i + 1) mod |P|])) is minimal over all such sequences.
Four algorithms
Exhaustive trial
Nearest neighbor
Nearest neighbor to endpoint (new)
Iterated closest pair
Book Example: Movie Jobs
Formalizing the problem
Five algorithms
Exhaustive Trial
Earliest Job
Longest/Shortest job
Earliest Completion
Answering the questions about Earliest Completion:
Will it always complete? With the right answer?
Will it run in a reasonable amount of time?
Is it easy to specify and implement?
Computer Science is about algorithms
The one thing that everyone in our CS Dept does is algorithms
It is at the core of what computer science (vs software engineering, computer programming etc) is
Master this material: it will serve you very well