The halting problem: Difference between revisions

From Helpful
Jump to navigation Jump to search
mNo edit summary
mNo edit summary
 
(2 intermediate revisions by the same user not shown)
Line 3: Line 3:
Given the code of a program, can you say that it will ever stop, as in, not continue running forever?
Given the code of a program, can you say that it will ever stop, as in, not continue running forever?


{{comment|(Note that this problem addresses correct, deterministic problems and solutions.  Yes, bad thread locking can cause race conditions that [[deadlock]] or [[livelock]], but that's not the point of the exercise.)}}


For some code, absolutely.  
 
It would be trivial to make some examples that will always halt.
 
For some specific code running on some specific data, absolutely.
 
It would be trivial to make some code-and-data examples that will always halt, and some others that run forever.




Yet, once you introduce programs for which the behaviour depends on the input,  
Yet, once you introduce programs for which the behaviour depends on the input,  
can you prove any given code will halt for all possible input?
can you prove any given code will halt ''for all possible input''?
 
Also, you can figure this out for ''any'' possible code given to you?
That's suddenly some hard proofy math problem.
 
 
Spoilers: Nope, you cannot. 
 
This problem is [https://en.wikipedia.org/wiki/Undecidable_problem undecidable], meaning it is ''proven'' to be impossible to give a correct yes-or-no answer.
 
 




Also, you can figure this out for ''any'' possible code given to you?
'''''The''''' halting problem often refers to this question in a general sense.
: i.e. the concept being discussed here




'''''A''''' halting problem can describe this property of a particular question
: For example, "I want you to discover the fastest possible machine code to compile this to" turns out to be a halting problem.{{verify}}




-->


Spoilers: Nope. This problem is [https://en.wikipedia.org/wiki/Undecidable_problem undecidable], meaning it is ''proven'' to be impossible to give a correct yes-or-no answer.


==="I suppose, but um, why is that interesting ''at all''?"===
<!--
Or, "So, is the halting problem just a academic exercise?"


Partly, yes.


(Note that this problem addresses correct, deterministic things. Yes, bad thread locking can cause race conditions that deadlock or livelock, but that's not the point of the exercise)


Mathematicians like this sort of thing, though will probably already seen this stuff through Godel and Turing.


Programmers may care in theory, but a lot of the examples you will see in discussions do not matter
to ''most'' programmers.




But partly no, it's not ''just'' a thought experiment, it has some uses.


As to software we didn't make, well, a lot of software is written in a relatively straightforward recipe form, and finishes just because what it does doesn't vary, much or at all, with the data you give it.
One, knowing that there are some ''perfectly valid questions'' that ''cannot be answered in reasonable time'',
can save you a lot of time trying to do so, by recognizing the earmarks of such problems.  


It's related to recognizing [[NP-complete]] and [[NP-hard]] problems.


However, a lot of real-world, practically useful problems do different things with different data.
It's related to saying
: "this is suboptimal, but we know that we can't necessarily do much better"
: or, sometimes "this is a dirty quick fix, we ''do'' know of a much better solution but just haven't gotten to it"


This tends to increase the amount of possible states and possible edge cases.
(mathematicians will also care for this sort of concept, to not waste too much time)
In some cases in a very constrained way, in which case a complete proof it will halt will just be longer and more involved.
 
 
Consider asking a compiler for the fastest possible code (or shortest possible, or any such optimum).  


In other cases it's so explosively bad, or unconstrained, that a proof is intractible.
You ''can'' reason about "it will never be faster than this" style bounds,
and determine those bounds ''without'' having to figure out precisely what code that is.


Because it turns out that asking for ''the'' best possible answer ''is'' a halting problem.


And sometimes it's more of a mathematical property.
Knowing that, it suddenly becomes really useful that,
if we can say with certainty code can fundamentally never be more than (roughly) 10% faster than it is now,
that is valuable, e.g. to tradeoffs made at compile time - it makes cost ''and'' benefit (roughly) quantifiable,
and makes it easier to decide ''when'' we are happy with the result.


One example lies in Goldbach's conjecture - "every even integer can be expressed as a sum of two primes".
(...though arguably that's crossing over into [[complexity theory]]).


There's no short proof anyone has thought of,  
In other words: So it's really useful to know when something is really close to optimal, and go with that.  
so people have written program that just starts trying. Once it finds a counterexample, it will halt.
This is how we tackle optimizing compilers in the real world.


Will it halt?


If the conjecture is false, which is provable with one counterexample, it will finish.
If the conjecture is true, it will not. And people have run exactly such programs for a looong while, and so far gotten only answers.


Which makes it certainly ''presumable'' the conjecture is true, but is actually an unsolved problem.
And in some ways, with this approach it will remain unsolved ''if'' it is true.


But we cannot know whether that will halt.
Only if and when it does.


Usually, this undecidability is because of an search space that is or becomes infinite.


There are some cases that ''look'' finite and merely intractible.


'''So, is the halting problem just a academic exercise?'''


Partly, yes.
Sometimes that difference depending on the precise wording of the question.


A lot of the examples don't matter a lot to most programmers,
and mathematicians will probably already know this stuff through Godel and Turing.


Asking a virus scanner "do you have a ''known'' virus, verbatim" has a linear-time solution (that is, linear with virus size, and amount of data to be scanned).


And partly no, for a few reasons.
Asking anti-virus scanner "is this PC 100% certainly free of code that may have malicious actions at any time in the future", however, is a halting problem.


One, knowing that there are some ''perfectly valid questions'' that ''cannot be answered in reasonable time'', can save you a lot of time even trying to do so, by recognizing the earmarks of such problems.  
Asking a virus scanner "do you have something that looks sort of like a known virus", leaving it up to it what 'looks like' means, makes it look for certain means of variation. This ''may'' take [[exponentially]] more CPU time (with virus size), catches more cases, and but still halts.






Also, discussions of things like this sometimes puts rough bounds on how hard optimal solutions are, which can matter to complex dynamic problems, in that you can sometimes say
https://cs.stackexchange.com/questions/32845/why-really-is-the-halting-problem-so-important
: this is sorta-bad-but-we-can't-do-much better
: or that you probably ''can'' do a lot better


https://www.quora.com/Why-is-%E2%80%9Cthe-halting-problem%E2%80%9D-a-problem-Why-does-it-exist


For example, asking a compiler for the fastest possible code (or shortest possible, or any such optimum).


You ''can'' reason about "it will never be faster than this" bounds, without having to figure out precisely what code that is, so if you can estimate our generated code is close to that, you're fine.
It also relates somewhat to the limits of machines dealing with logic -- boolean logic, at least.


Because in fact, asking for the best possible answer ''is'' a halting problem.
For any particular case it ''might'' answer, but you cannot know for sure that it does.


So it's really useful to know that something is really close to optimal and go with that.
This is how we tackle optimizing compilers in the real world.


A lot of software is written in a relatively straightforward recipe form, and finishes just because what it does doesn't vary, much or at all, with the data you give it.


...though arguably that's just [[complexity theory]].


A lot of real-world data varies may take different approaches with different data. That may be the ''point''.


This tends to increase the amount of possible states and possible edge cases.
In some cases in a very constrained way, in which case a complete proof it will halt will is only more work.


Usually, this undecidability is because of an search space that is infinite. There are some cases that ''look'' finite and merely intractible, and may or may not be depending on how you ask the question.
In other cases it's so explosively bad, or unconstrained, that a proof is intractible.




Asking a virus scanner 'is this a known virus verbatim' has a linear-time solution (with virus size and amount of data to be scanned).


{{comment|(Asking it whether it might be a permutation on one is usually much slower, because depending on how you define variation it might be, say, exponentially more CPU time (with virus size), but still halt.)}}
And sometimes it's more of a mathematical property. That may be hard to ''know'',  
but often things have all the earmarks of 'timesink/insanity awaits those who try'.


Asking anti-virus scanner 'is this 100% certainly free of malicious actions at any time in the future', however, is a halting problem.


One example lies in Goldbach's conjecture, "every even integer can be expressed as a sum of two primes".


There's no short proof anyone has thought of,
so people have written program that just starts finding an answer for every integer.
Once it finds a counterexample, it will halt.


https://cs.stackexchange.com/questions/32845/why-really-is-the-halting-problem-so-important
Will it halt?


It also relates somewhat to the limits of machines dealing with logic -- boolean logic, at least.
If the conjecture is false, which is provable with any one counterexample, it will finish.
If the conjecture is true, it will never finish.  




There are heuristics that suggest it might be true and never halt.


People have run exactly such programs for a long while, and so far gotten only answers (up to 4E18 so far).


Which makes it certainly ''presumable'' the conjecture is true, but is actually an unsolved problem.


And in some ways, ''if'' the conjecture is true,
''that means'' it will be impossible to solve in this exhaustive-search way.


https://www.quora.com/Why-is-%E2%80%9Cthe-halting-problem%E2%80%9D-a-problem-Why-does-it-exist


But we cannot know whether that will halt.
Only if and when it does.


-->
-->

Latest revision as of 14:32, 25 March 2024


"I suppose, but um, why is that interesting at all?"