Algorithms and Analysis

Bart Massey

What is an algorithm?

  • Instructions for solving a problem

  • "Mechanical": do not require special reasoning

  • Things to worry about:

    • Can it be executed? (c.f. Kelly-Bootle's Algorithm)
    • Will it finish?
    • How long will it take?
    • Will it produce a "right answer"?

Instances, Problems

  • In CS, a "problem" is a collection of related "instances"

  • Problem descriptions are in a special format

  • Example: POSITIVE INTEGER PRODUCT

    Instance: Positive integers m and n

    Find: The product m × n

  • Note that 6×5 is an instance of POSITIVE INTEGER PRODUCT

Classes

  • In CS, a "class" is a collection of problems

    • Example: The class of "arithmetic problems"
  • While there's lots of ways to classify (collect) problems, one is special: performance

  • We will talk more about analysis in a bit

Algorithms solve instances

  • In CS, an "algorithm" is a set of detailed, organized steps for solving all instances of a problem

  • Algorithms are always described by "pseudocode" — instructions for what things to do in what order

  • Example: MULTIPLICATION BY ITERATED ADDITION

    To multiply positive integers m and n:
      p ← 0
      repeat m times
        p ← p + n
      return p

Algorithms take time and use resources

  • For a computing algorithm

    • Time is usually measured in clock cycles
    • The usual resource is memory, measured in bits
  • Questions about MULTIPLICATION BY ITERATED ADDITION:

    • Will it always complete? (yes)
    • How much storage does it take? (one integer's worth)
    • How long does it take? (m cycles…maybe?)
    • Does it produce a "right answer"? (yes, but needs proof)
  • Uses lg m + lg n bits of storage. We write the number of bits needed to represent an integer m as |m|.

  • Takes m steps to complete: maybe m + |n| when large addition costs are taken into account.

The Church-Turing Theses

  • Alonzo Church: Pioneer of mathematics of algorithms, pre-computer

  • Alan Turing: Pioneer of computing

  • Weak Church-Turing Thesis: The class of problems computable on a Turing Machine is the class of problems computable on any other physical computer

  • Strong Church-Turing Thesis: The class of problems tractably computable…

Measuring performance

  1. Decide what operations to count, and for how much

  2. For a given instance, run (notionally) the pseudocode and compute the cost

Problem: How to count instances

  • For a given algorithm each instance of the problem it solves likely takes a different amount of time

  • Can't really report performance on all possible instances: need a statistic

  • Obvious statistics don't work: for example, MBIA has an infinite number of arbitrarily large instances, so the mean over all instances is infinity

Approximation: Group instances by input size

  • For MBIA, the input size is |m| + |n|

  • Note that MBIA runs in same time for any given |m|, within a factor of two.

  • Report "asymptotic" runtime as function of |m|.

Approximation: Don't worry about small inputs

  • For MBIA, cost of initialization of p is boring as m gets large: leave it out of cost function

Approximation: Don't worry about constants

  • For MBIA, ignore that factor of two. Report the runtime as O(|m|)

  • "Big-O" notation is formalizable. We'll talk about the details next lecture.

Approximation: Use a statistic to combine instances

  • Can take the maximum, minimum, or average runtime over all instances of given input sizes. For MBIA they are all the same, but…

  • The maximum is especially interesting, since that's the "worst-case"

Complexity Classes

  • Consider the asymptotic worst-case big-O complexity of a bunch of algorithms for multiplication

  • The best of these provides an upper bound on the worst-case running time for some hypothetical "best" multiplication algorithm

  • It is sensible to group together algorithms that have a given worst-case running time

  • We will talk next time about these "complexity classes" and what they mean to computing

Why are we doing all this?

  • You are often asked to provide a computer program to solve a problem

  • Sometimes this requires a non-trivial algorithm

  • You have to pick or invent an algorithm that gets you the performance you need

  • Thus, you have to know what the performance is

  • This whole business provides a practical approach to that

Book Example: "Robot Tour" Optimization

  • AKA Euclidean Travelling Salesman, Traveling Salesperson, etc.

  • Find the shortest path from point to point in a scatter of 2D points that visits each point exactly once.

          ROBOT TOUR
          Input: A set P of 2D points
          Output: A sequence S of points such that
          |S| = |P|, ran(S) = P, and the "wraparound sum" of
          Euclidean distances
    
             D(S) = sum(i=1..|S| . dist(S[i], S[(i + 1) mod |P|]))
    
          is minimal over all such sequences.
    
  • Four algorithms

    • Exhaustive trial

    • Nearest neighbor

    • Nearest neighbor to endpoint (new)

    • Iterated closest pair

Book Example: Movie Jobs

  • Formalizing the problem

  • Five algorithms

    • Exhaustive Trial

    • Earliest Job

    • Longest/Shortest job

    • Earliest Completion

  • Answering the questions about Earliest Completion:

    • Will it always complete? With the right answer?

    • Will it run in a reasonable amount of time?

    • Is it easy to specify and implement?

Computer Science is about algorithms

  • The one thing that everyone in our CS Dept does is algorithms

  • It is at the core of what computer science (vs software engineering, computer programming etc) is

  • Master this material: it will serve you very well