tomclegg.net

NP-complete
Posted January 16, 2001

Today I'm going to talk about computers instead of physics. But I'm not going to talk about GigaBytes and MegaHertz, because KCR already has a show about that. This is a science show, though, and there is a discipline called "computer science." I don't think that's a very good name for the subject, but that's what everyone calls it. Except the University of Waterloo, where it's part of the mathematics department. Sciences generally deal with trying to understand the nature of the universe, and they mostly boil down to testing mathematical models to see how accurately they describe real patterns in nature. You could describe computer science the same way, except that instead of looking for patterns in nature, you're looking for patterns in mathematics.

One thing you need to know in order to understand computer science is that a computer is not a little grey box that buzzes and complains about illegal instructions. To a computer scientist, a computer doesn't even have to exist in order to be interesting. Probably the best way to describe what a computer is, is with a bit of history.

Alan Turing is the man who invented the stored-program digital computer. He didn't build one; he just wrote a paper about it. His design is now commonly known as the Turing machine, and although it has still never been built, it has a permanent place in computer science, because it is capable of solving exactly the same set of problems as any modern computer.
The original idea of the Turing machine was to design a
machine that could churn out all provable logic statements,
one by one, starting from a few fundamental assumptions, or
axioms, and some rules describing how to combine them to
prove new statements. The assumptions and rules are
written on a tape which is fed into the computer. The
computer knows how to read a symbol from the tape, erase the
symbol and replace it with something else, and move the tape
forward and backward a certain number of spaces. And thus
the art of computer programming was born. If you put the
right sequence of symbols on the tape and put it in the
computer, the computer will spit the tape out some time
later with a list of the first 100 prime numbers. If you
put a different sequence of symbols on the tape, the
computer will do something else entirely.
The most important thing about the Turing machine, which sets it apart from all machines that had ever been invented before, is that there is no distinction between input, output, and instructions. They're all just a bunch of symbols on the tape, until you put it in the computer. And when the computer stops and the tape comes out, there's still just a bunch of symbols on the tape, although some of them might have changed since you put it in. But you now have the option of reading the tape and interpreting it as "output," or putting it back in the computer and interpreting it as a "program."
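To make the tape-and-symbols picture concrete, here's a minimal sketch of a Turing machine simulator in Python. The rule-table format, the state names, and the little bit-flipping machine are all invented for illustration; a formal definition has a few more details (a fixed alphabet, a designated blank symbol), but the three operations are exactly the ones described above: read a symbol, replace it, move the tape.

```python
def run_turing_machine(tape, rules, state="start", pos=0, max_steps=10_000):
    # `tape` maps position -> symbol; blank squares read as " ".
    # `rules` maps (state, symbol) -> (new_symbol, move, new_state).
    tape = dict(tape)
    for _ in range(max_steps):
        if state == "halt":
            break
        symbol = tape.get(pos, " ")
        new_symbol, move, state = rules[(state, symbol)]
        tape[pos] = new_symbol   # erase and replace the symbol
        pos += move              # move the tape forward or backward
    return tape

# A tiny example machine: walk right, inverting bits, until it
# reaches a blank square.
flip_bits = {
    ("start", "0"): ("1", +1, "start"),
    ("start", "1"): ("0", +1, "start"),
    ("start", " "): (" ", 0, "halt"),
}

result = run_turing_machine({0: "1", 1: "0", 2: "1"}, flip_bits)
# positions 0, 1, 2 now hold "0", "1", "0"
```

A different rule table on the same simulator computes something entirely different, which is the whole point: the machine is fixed, and the tape decides what happens.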
Alan Turing immediately figured out a whole bunch of fun
things about this hypothetical machine. For one thing, he
noticed that you can write a program whose output is itself
a program. He also noticed that you can write a program
that reads another program from the tape and carries out
its instructions: one computer simulating another.
Of course, this raises the question that people like to ask when they're interested in Artificial Intelligence, which is, "what if the other computer is a thousand times more powerful? Maybe the other machine can do something that this one can't!" Well, another thing Alan Turing noticed was that all hardware limitations can be overcome by software. If the so-called more powerful computer has a deterministic operation, which basically means it's predictable, then its operation can be programmed into your so-called less powerful computer. As a rough analogy, if you know how to do long multiplication, and I only know how to add, then you can figure out what 10 x 10 is a lot quicker than I can, but it would just be a matter of time before I added 10 + 10 + 10 and so on and arrived at the same number. So your method may be faster, but it's not more powerful in the sense that there's no question that you can answer with your more powerful multiplication that I can't answer with my less powerful addition.
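The addition-versus-multiplication analogy can be written out directly. Here's a sketch of a machine that only knows how to add answering a multiplication question anyway; the function name is my own, and it assumes non-negative whole numbers.

```python
def multiply_by_adding(a, b):
    # Simulating the "more powerful" operation (multiplication)
    # using only the "less powerful" one (repeated addition).
    # Slower, but it answers exactly the same questions.
    total = 0
    for _ in range(b):
        total = total + a
    return total

# 10 x 10, the slow way: 10 + 10 + ... + 10, ten times.
answer = multiply_by_adding(10, 10)  # 100
```

The simulation costs time, never capability, which is the sense in which the two machines are equally powerful.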
Once we've established that a Turing machine is capable of
all possible computation, then we're left with a couple of
big questions. The biggest one is, "is anything impossible to compute?" Guess who gave us the first example of something that can't be computed? Remember, we're still in the same essay where he invented computers, and now he's telling us the fundamental limitations of logic machines. The impossible programs are the ones that tell us things about other programs. You can't write the program called EQUIVALENT, which examines two other programs and tells you whether they are functionally identical. This is a pity, since it would let us find fast algorithms to solve problems just by trying all possible algorithms and seeing whether they're equivalent to a known, slow algorithm. You can't write the program called CRASH-TEST, which examines another program and tells you whether it will ever crash. Now this is a real pity, for reasons that should be obvious.

There's a nice little proof that shows why you can't write a CRASH-TEST program. Let's say you gave me a program that claimed to be CRASH-TEST. All I have to do to prove you wrong is give you a program that fools your program into thinking it never crashes, but then goes ahead and crashes anyway. This program is called CRASH, and here's how it works, in two easy steps.

- It runs your CRASH-TEST program on itself (that is, on CRASH), to see what the prediction is.
- If CRASH-TEST said "NO, this program is safe," then it crashes. If CRASH-TEST said "YES, this program will crash," then it stops without crashing.
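The two-step CRASH program can be sketched in Python. The function alleged_crash_test below is a stand-in for whatever CRASH-TEST someone hands us; whatever answer it gives, crash() does the opposite, so the prediction is always wrong.

```python
def alleged_crash_test(program):
    # Stand-in for someone's claimed CRASH-TEST. This one predicts
    # "safe" for everything; any other strategy loses the same way.
    return False  # False means "no, this program won't crash"

def crash():
    # Step 1: ask the alleged CRASH-TEST about ourselves.
    will_crash = alleged_crash_test(crash)
    # Step 2: do the opposite of whatever it predicted.
    if not will_crash:
        raise RuntimeError("crashed anyway")   # predicted safe: crash
    return "stopped without crashing"          # predicted crash: halt
```

If alleged_crash_test returned True instead, crash() would halt cleanly and contradict the prediction in the other direction. Either way, the claimed CRASH-TEST is wrong about CRASH.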
Not very difficult! It's enough for a mathematician to prove that there is such a thing as an unwritable program. He doesn't need to list them all, which is an especially good thing if he just proved that listing them all would be impossible.

Alan Turing went on to do more great things, like calling the police about a robbery, telling them that he was gay, taking hormone injections and growing breasts so he didn't have to go to jail for being a pervert, and eventually committing suicide anyway. Oh, he also helped the Allies win the second world war, by intercepting the Germans' radio signals and repeatedly breaking their unbreakable encryption scheme.

But that's not important any more, because a couple of short decades later, someone in Toronto, of all places, thought up another interesting category of computer programs. The category is called NP-complete, and it basically means, "just as hard as the hardest NP problem." Which doesn't help much if you don't know what NP means. Guess what. I'm going to tell you what NP means.
NP stands for Nondeterministic Polynomial. Which is
concise, if not exactly illuminating. If a problem is NP,
that means you can easily verify whether someone has found
the right answer. For example, if someone tells you that
the combination to their suitcase is 4-5-1, then you can
just dial up 4-5-1 and see if it opens. The important thing
is that it's easy to check the answer, even though it might take a very long time to find it in the first place.

So that's NP... How about NP-complete? Well, Stephen Cook proved in 1971 that a problem called Circuit-SAT is at least as hard to solve as any other NP problem. To solve Circuit-SAT is to read a schematic for a piece of electronics and say whether there's any combination of inputs that will make it output a 1. Ever since then, people have been proving the same thing about a whole bunch of other problems, using an interesting technique called polynomial transformation. Polynomial transformation bears some resemblance to the CRASH-TEST proof. The idea is to argue that if someone were to give me a fast algorithm that solves my pet problem, then I could turn that program into a program that solves Circuit-SAT quickly too. That makes my problem at least as hard as Circuit-SAT, so, like Circuit-SAT, it must be just as hard as any other NP problem. So we've got two problems that are NP-complete, and we can prove a third and a fourth the same way.
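The easy-to-check, hard-to-find distinction shows up clearly in code. Here's a toy version of Circuit-SAT in Python; the three-input circuit is a made-up example of mine, not anything from Cook's paper. Verifying a proposed answer takes one evaluation of the circuit, while solving from scratch means trying every one of the 2^n input combinations.

```python
from itertools import product

def circuit(a, b, c):
    # A made-up three-input "schematic" built from AND/OR/NOT gates.
    return (a or b) and ((not a) or c) and ((not b) or (not c))

def verify(inputs):
    # The NP part: checking a proposed answer is easy.
    # Just run the circuit once and look at the output.
    return circuit(*inputs)

def solve():
    # The hard part: the only general method known is to try
    # every combination of inputs (2**3 = 8 of them here; a real
    # circuit with hundreds of inputs is a different story).
    for inputs in product([False, True], repeat=3):
        if verify(inputs):
            return inputs
    return None  # no combination works
```

With three inputs the brute-force search is instant; the trouble is that adding one input doubles the work, which is why nobody calls this "solving" Circuit-SAT fast.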
Keep in mind that nobody has ever settled whether these NP-complete problems are genuinely hard: nobody has found a fast algorithm for one, but nobody has proved that no fast algorithm exists, either. This shouldn't be a big surprise. Gödel proved in 1931 that there's no such thing as a perfect logic system. Normally the imperfection takes the form of a logic statement that can't be proved true or false. I'll just leave you with the most famous example, and tonight while you're trying to fall asleep, you can follow the parallel with the CRASH problem, about the program that can't be written.
First a quick refresher. There's a set of Dogs, and a set
of Cats. Some animals are Dogs, and some are Cats. There's
no animal that is both a Dog and a Cat. There are some
animals that are neither Dogs nor Cats. If an animal is in
the set of Cats, then it is not in the set of Dogs. That
should all sound pretty reasonable.
Now I want you to consider the set of all Reflexive sets. A
reflexive set is one that contains itself. The set of all
non-empty sets is Reflexive, because it is not empty. On
the other hand, the set of all Dogs is not Reflexive,
because it is a set, not a Dog.

And if you think reflexive sets are confusing, then consider
the set of non-reflexive sets. Non-reflexive sets are things
like the empty set. The empty set does not contain itself;
of course it doesn't, because it is empty. The set of Dogs
is non-reflexive. So tell me this:
If I've got a set called NR, and it contains all of the
non-reflexive sets, is it really non-reflexive? I'm asking
because if it is non-reflexive, then it belongs in NR, so
NR contains itself, which makes it reflexive after all. And
if it is reflexive, then it contains itself, even though it
is only supposed to contain non-reflexive sets.

Kinda confusing... I'll say it again, in case it helps. If
I have a set called NR, and it contains all of the
non-reflexive sets, is NR itself a reflexive set or a
non-reflexive set?
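The paradox can even be acted out in Python, if we represent a set by its membership test: a function that answers whether something belongs to it. This encoding is my own illustration, not standard set theory; the point is just that asking whether NR contains itself never settles down.

```python
def NR(s):
    # The set of all non-reflexive sets: s belongs to NR exactly
    # when s does NOT contain itself.
    return not s(s)

# Is NR reflexive? Answering means evaluating NR(NR), which asks
# for "not NR(NR)": the question refers back to itself forever,
# and Python eventually gives up with a RecursionError.
try:
    NR(NR)
    answer = "it settled down (impossible)"
except RecursionError:
    answer = "no consistent answer"
```

Python reports the endless regress as a stack overflow; set theory reports it by forbidding sets like NR from existing in the first place.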