Knowing that P means “in polynomial time”, you might be tempted to think that NP means “in non-polynomial time”. That kind of goes in the right direction, but it actually means “in non-deterministic polynomial time”. Explaining what non-deterministic calculations are would be a bit too complicated for an ELI5, so let’s simplify a bit. A regular computer must make all decisions (for example which way to turn when calculating a shortest route between two points) based on the problem input alone. A non-deterministic computer can guess instead. For judging complexity, we look at the case where it just happens to always guess right.
Even when guessing right, such a computer doesn’t solve a problem immediately, because the number of guesses it needs depends on the input (for example the number of road junctions between our points). NP is the class of problems that a non-deterministic computer can solve in polynomial time, i.e. in O(n^a) steps for some fixed a.
Obviously, we don’t really have computers that always guess right. Quantum computers speed up a few specific problems, but they are not known to be always-guess-right machines either. Still, there are three important properties that let us understand NP problems in terms of regular computers:
A non-deterministic computer can do everything a regular computer can do (and more), so every problem that’s part of P is also part of NP.
Every problem that takes n guesses with x options for each guess can be simulated on a regular computer in O(x^n) steps by just trying all combinations of options and keeping one that works. With some math, we can show that this still holds if we have O(n^a) guesses instead of n. Our base x might be different, but we always end up with some polynomial in n in the exponent.
While finding a solution on a regular computer may need exponential time, we can always check whether a proposed solution is correct in polynomial time.
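To make the last two properties a bit more concrete, here is a small Python sketch. It uses subset sum (pick some numbers from a list so that they add up to a target) as the example problem, which is my choice and not something from the explanation above; the function names are made up for illustration. Each number is one "guess" with x = 2 options (take it or leave it), so a regular computer can simulate the guessing by trying up to 2^n combinations, while checking a single combination is cheap.

```python
from itertools import product


def verify(numbers, target, choices):
    """Cheap check: do the chosen numbers add up to the target?"""
    return sum(n for n, take in zip(numbers, choices) if take) == target


def brute_force(numbers, target):
    """Simulate the guessing on a regular computer.

    Every number is one guess with two options (take it or not),
    so this loop runs up to 2**len(numbers) times.
    """
    for choices in product([False, True], repeat=len(numbers)):
        if verify(numbers, target, choices):
            return choices  # first combination that passes the check
    return None  # no combination works


numbers = [3, 34, 4, 12, 5, 2]
solution = brute_force(numbers, 9)
print(solution, verify(numbers, 9, solution))
```

The lucky non-deterministic computer would only ever run `verify` once, on the combination it guessed; the regular computer has to loop.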
One important example of a problem in NP is finding the prime factors of a number. Nobody knows a fast way to do this on a regular computer, which is why it is an important building block in cryptography. It’s also an intuitive example of checking the result being easy: we just need to multiply the factors together and see if we get our original number. Okay, technically we also need to check that each of the factors we get is really prime, but that’s also doable in polynomial time.
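As a rough sketch of that check in Python (the function names are hypothetical): multiply the claimed factors and test each one for primality. One loud caveat: the `is_prime` below uses plain trial division, which is NOT polynomial in the number of digits. It only stands in for a real polynomial-time primality test (AKS is one such test) to keep the example short.

```python
from math import isqrt, prod


def is_prime(n):
    """Trial division; a simple stand-in, not a polynomial-time test."""
    if n < 2:
        return False
    for d in range(2, isqrt(n) + 1):
        if n % d == 0:
            return False
    return True


def check_factorization(number, factors):
    """Accept only if the factors multiply to number and are all prime."""
    return prod(factors) == number and all(is_prime(f) for f in factors)


print(check_factorization(84, [2, 2, 3, 7]))  # product matches, all prime
print(check_factorization(84, [4, 3, 7]))     # product matches, 4 is not prime
```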
Now for the important thing: we don’t know if there is some shortcut that lets us simulate NP problems on a regular computer in polynomial time (even with a very high exponent) which would make NP equal to P.
What we do know is that there are some special problems (either from NP or even more complex) where every single problem from NP can be rephrased as a combination of that special problem (let’s call it L) plus some extra work that’s in P (for example converting our inputs and outputs to/from a format that fits L). Doing this rephrasing is absolutely mind-bending but there are clever computer scientists who have found a whole group of such problems. We call them NP-hard.
Why does this help us? Because finding a polynomial-time solution for just a single NP-hard problem would mean that by definition we can solve every single problem from NP by solving this polynomial-time NP-hard problem plus some polynomial-time extra work, so polynomial-time work overall. This would instantly make NP equal to P.
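Here is a tiny illustration of "rephrase the problem, then let the solver for L do the work". The pair of problems is my own pick for the sketch, not from the text above: "does this graph have k nodes with no edges between them?" (independent set) is answered by flipping the edges, which is cheap polynomial-time extra work, and asking "does the flipped graph have k nodes that are all connected?" (clique). The clique solver here is plain brute force and only plays the role of L.

```python
from itertools import combinations


def has_clique(nodes, edges, k):
    """Stand-in solver for the special problem L (brute force)."""
    edge_set = {frozenset(e) for e in edges}
    return any(
        all(frozenset(pair) in edge_set for pair in combinations(group, 2))
        for group in combinations(nodes, k)
    )


def complement(nodes, edges):
    """Polynomial-time conversion: keep exactly the missing edges."""
    edge_set = {frozenset(e) for e in edges}
    return [p for p in combinations(nodes, 2) if frozenset(p) not in edge_set]


def has_independent_set(nodes, edges, k):
    """Solve the other problem by converting the input and calling L."""
    return has_clique(nodes, complement(nodes, edges), k)


# A path 1 - 2 - 3 - 4
nodes = [1, 2, 3, 4]
edges = [(1, 2), (2, 3), (3, 4)]
print(has_independent_set(nodes, edges, 2))
print(has_independent_set(nodes, edges, 3))
```

If someone swapped in a polynomial-time `has_clique`, `has_independent_set` would become polynomial-time too without changing a line of it, which is the whole argument in miniature.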
This leaves us with the definition of NP-complete. This is simply the class of problems that are both NP-hard and themselves in NP. This definition is useful for finding out if a problem is NP-hard but I think I’ve done enough damage to your 5-year-old brain.
I see it as:
P: is a problem that gets solved and proved easily.
NP: is a problem that is difficult to solve but easy to prove.
P=NP, i.e. NP-complete: as difficult to solve as it is to prove.
NP-hard: no single solution, might require multiple “NP” solutions (e.g. a different algorithm for each input element)
The diagram is pretty good but your interpretation is not quite right, especially for NP-complete and NP-hard.
NP-hard means “at least as hard as all problems in NP”, proven by the fact that any single NP-hard problem can be used to solve the entire class of all NP problems.
NP-complete means “at least as hard as all problems in NP and itself also in NP”, so the intersection between NP and NP-hard.
The thing about P = NP or P != NP is something different. We don’t know if P and NP are the same thing or not; we don’t have a proof in either direction. We only know that P is a subset of NP. If we could find a P solution for any single NP-hard problem, we would know that P = NP. That would have massive consequences for cryptography and cyber-security because modern encryption relies on the assumption that encrypting something with a key (P) is easier than guessing the key (NP).
On the other hand, at some point we might find a mathematical proof that we can never find a P solution to an NP-hard problem which would make P != NP. Proving that something doesn’t exist is usually extremely hard and there is the option that even though P != NP we will never be able to prove it and are left to wonder for all eternity.
One important addendum: complexity classes always consider how hard a problem is depending on the input size. Sorting is in P (usually O(n*log(n)), so one of the easiest problems overall), but given a few trillion inputs, it would be pretty much impossible to solve on consumer hardware. On the other hand, problems like 3-SAT, the knapsack problem or travelling salesman are all NP-hard, but with small enough inputs (up to a few dozen or so) they are easy to solve, even with pen and paper, and are even regularly included in puzzle books.
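For a feel of how harmless a tiny NP-hard instance is, here is a brute-force travelling salesman sketch in Python. The distance table is made up for illustration. It simply tries every round trip starting and ending at city 0, which is (n-1)! trips: fine for a handful of cities, hopeless long before n gets large.

```python
from itertools import permutations


def shortest_tour(dist):
    """Try every round trip from city 0 and keep the shortest one."""
    n = len(dist)
    best_len, best_tour = None, None
    for order in permutations(range(1, n)):
        tour = (0, *order, 0)
        length = sum(dist[a][b] for a, b in zip(tour, tour[1:]))
        if best_len is None or length < best_len:
            best_len, best_tour = length, tour
    return best_len, best_tour


# Hypothetical symmetric distances between 4 cities
dist = [
    [0, 1, 4, 3],
    [1, 0, 2, 5],
    [4, 2, 0, 1],
    [3, 5, 1, 0],
]
best_len, best_tour = shortest_tour(dist)
print(best_len, best_tour)
```

With 4 cities that is 6 trips to compare; with 15 cities it is already more than 87 billion.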
Alright, part 2, let’s get to NP.
I had a huge reply, but after some googling to try and understand, I’m gonna go with this wiki image:
https://upload.wikimedia.org/wikipedia/commons/thumb/a/a0/P_np_np-complete_np-hard.svg/1024px-P_np_np-complete_np-hard.svg.png
(Black graph on transparent background, this might be better: https://en.m.wikipedia.org/wiki/NP_(complexity)#/media/File:P_np_np-complete_np-hard.svg )
That was awesome, thank you!