# NP-Completeness

Bart Massey

## Classes, Problems, Instances

Problem: Collection of instances related by a problem description

Instance: A single input

Example:

`MEDIAN VALUE Instance: A nonempty collection V of n values, with a total order between values Solution: A value v drawn from V such that exactly n div 2 values in V are less than or equal to V and the remaining values in V are greater than or equal to V`

Class: Collection of problems related by a common property

Classes are usually about time or space complexity

## Decision Problem

Problem where the answer for a given instance is either

*true*or*false*Can transform non-decision problems into decision problems

`MEDIAN BOUND Instance: A nonempty collection V of n values, with a total order between values. A value k in V. Question: Are at least n div 2 values in V less than or equal to than k?`

Classic method for optimization problems is binary search on size limit on output

Optimization Problem: Find a shortest path for the Traveling Salesperson

Decision Problem: Is there a path of length <= l for the Traveling Salesperson?

Binary Search: Keep doubling l until the answer is yes, then binary search between last two answers. If TSP is O(n

^{k}), then decision version is O(n^{k}), since l is bounded by existence of relatively short tours (double-tree, see below)

## P and NP

Polytime: O(n

^{k}) for some (any) kNon-deterministic Polynomial Time: decision problem known to be in P for a

*nondeterministic machine*Examples: TSP, Movie Jobs

## Nondeterministic Machine

Augment a computer (Turing Machine, traditionally) with a conditional branch that notionally

Executes both arms in parallel (no limit on parallelism)

Always picks the arm that will lead to deciding yes if if there is such an arm

If the NDTM returns yes, then answer yes

Problem is in NP if NDTM can produce yes in polytime on problems with a yes answer (thus no in polytime on problems with a no answer)

## C-Hardness, C-Completeness

A problem L is

*hard*for a complexity class C if L's complexity class is as hard as any problem in CA problem L is

*polytime-hard*for C if any instance I' of a problem L' in C can be solved in polytime given the solution of some instance I of LIf the problem is also in C, it is

*complete*for C

## P-Completeness

Example: The Circuit Value Problem

`CIRCUIT VALUE Instance: A binary encoding of a Boolean combinatorial circuit C composed of AND, OR, and NOT gates, inputs x1, x2, ..., xn, and a designated output y. Question: Is y=true on inputs x1, x2, ..., xn?`

Clearly Circuit Value is in P

It is possible to prove that every instance of any problem L in P can be encoded as a Circuit Value instance polynomial in the size of the L-instance

Circuit Value is P-complete

## NP-Completeness

Example: PROP Satisfiability

`PROP SAT Instance: A propositional formula with atoms X1..Xn Question: Is there an assignment of true/false to the atoms such that the formula evaluates to true?`

Prop SAT is in NP, since one can write a polytime nondeterministic program that assigns true/false to the atoms of X1..Xn in a way that makes the formula true if there is any satisfying assignment (thus, can get UNSAT by running the program for long enough)

It is possible to prove that every instance of any decision problem L in P can be encoded as a Prop SAT instance polynomial in the size of the L-instance such that the SAT instance has a satisfying assignment iff the L instance answer is yes. (Cook's Theorem, 1971; also a Russian proof by Levin at roughly the same time.)

Prop SAT is in NP and NP-hard, so NP-complete. Problems known to be in P are in NP, but none are known to be NP-hard

If we could find a deterministic polytime algorithm for an NP-hard problem, then P = NP. Is there one? It's a mystery

## Other NPC Problems: Reduction

Building an analogue of Cook's proof for Prop SAT for an arbitrary problem L in NP is quite difficult. Prop SAT was chosen specifically to make the proof easy

Alternative technique: Show that Prop SAT is

*reducible*to LShow that every SAT instance can be reencoded in polytime as an instance of L

This reencoding shows that if we could solve instances of L in polytime we could solve instances of any problem in NP in polytime

Thus L is NP-hard, and since it is in NP, NP-complete

## Example: 3SAT

```
PROP CNF 3SAT
Instance: A propositional formula encoded in
Conjunctive Normal Form, with atoms X1..Xn. No
clause of the formula has more than three literals
Question: Is there an assignment of true/false to
the atoms such that the formula evaluates to true?
```

Can reduce SAT to 3SAT *

Represent the SAT instance as a parse tree of its boolean formula. Let yi be the output of node i with inputs as given

Write the equivalence as a conjunctive formula over the nodes. Each clause will have at most three atoms

Convert each clause to two 3CNF clauses by implication and positive-nand = negative-nor

Augment a short clause with copies of literals

This collection of clauses is satisfiable iff the original clause is satisfiable

Thus, we can transform any SAT instance to a 3SAT instance in polytime in such a way that the 3SAT instance is equivalent to the SAT instance

3SAT is NP-hard and in NP, so is NP-Complete

Now we have two NPC problems, with 3SAT being a

*restriction*of SATWhat about 2SAT? No, there is no known way to transform a SAT instance into a 2SAT instance in polytime. So 2SAT is not known to be NP-hard. (It might be, if P=NP.) There's a deterministic polytime solution to 2SAT (2SAT is in P), so this makes sense

## The NPC Family

There are literally hundreds of problems that have been proven NPC by chains of reductions starting with SAT

Once you add a problem to NPC, you can reduce from it to prove other problems NPC

This is where the NPC proof of TSP and Movie Jobs comes from

Note that only decision problems can be NPC: the binary search trick is used a lot for optimization problems

## Subset Sum is NPC, so Knapsack is

```
KNAPSACK
Instance: A collection L of n lengths. A bound B.
A target k.
Question: Is the a subcollection of L that adds up
to less than B but more than k.
SUBSET SUM
Instance: A collection of n positive integers w1..wn and a
and an integer target size s
Question: Is there a subset of w1..wn that adds up to
exactly s?
```

Subset Sum has been proven NPC by a chain of reductions

Reduce Subset Sum to Knapsack

For a Subset Sum instance (W, s), create a Knapsack of capacity s with items p1..pn whose weight and value are w1..wn and answer the question of whether there is a solution of weight s

It is easy to see that the answer is yes for the Knapsack instance iff it is yes for the Subset Sum instance

Knapsack is NPC

## Why Should You Care?

Garey and Johnson's classic book on NPC starts with a story about your boss wanting you to find an efficient general algorithm for a problem, but you can't

Way better to tell your boss "I proved your problem NPC and so I won't find an efficient algorithm without winning a Turing Award" than "This problem is too hard for me"

## "Practical Example:" Tentmates

Dennis Sasha, Dr. Dobb's Journal 1998

`TENTMATES Given: Set T of tents. Capacity function s: T → N+. Set C of campers, with |C| = sum(ran s). Preference function p: C × C → -10..10 where p(c1, c2) is the preference of c1 for c2 and forall c in C . p(c, c) = 0. Find: Assignment function a: C → T satisfying forall t in T . |{c | a(c) = t}| = s(t) and maximizing sum[c1, c2 in C, a(c1) = a(c2)] p(c1, c2)`

So…maximizing happiness of tentmates

Can come up with good polytime approximations, and good search algorithms

But a polytime algorithm is elusive

Convert to a decision problem and reduce from k-Clique, which is known NPC

This is your final homework

## What To Do If Your Problem Is NPC

Try to find an efficient algorithm for the largest instance sizes you expect to encounter

Try to find a restriction of your problem that is in P

Try to find a good approximation algorithm that is in P