An algorithm is a computational process that takes a problem instance and in a finite amount of time produces a solution. . . . It is hard to make the definition of algorithm more precise except by saying that a computational process is anything that can be done by a program for a computing machine, and in that case one must accept that a human being with paper and pencil is a kind of computing machine. (Floyd and Beigel, The Language of Machines, Computer Science Press, W.H. Freeman, 1994, p. 444.)
Problems which are intended to be solved by a computational process may be stated in a variety of ways:
A decision problem is stated as a question with a "yes" or "no" answer, such as:
If the problem is stated as a Boolean statement --- an assertion which is either true or false --- we call the statement a predicate.
If there is an effective procedure for answering the question or evaluating the assertion, we say that the problem is decidable.
Other problems require a specific, unique answer:
This type of problem can be viewed as the evaluation of a function, since the answer is unique. It can also be viewed as a mapping of an input value to an output value. If we have a function that computes smallest prime factors, we give it 23171 as an input value and it yields 17 as an output value. Note that problems which we might think of as procedures or processes, such as sorting, can also be viewed as functions which map inputs to outputs --- a sorting program maps a given unsorted list of names to a unique sorted list of names.
A function is said to be computable if there is an effective procedure (an algorithm) for evaluating the function which, given any input or set of input values for which the function is defined, produces the correct result and halts in a finite amount of time. (If the function is undefined for some input values [i.e., it is a partial function rather than a total function], the procedure is not required to halt on those inputs. For example, we can regard the problem of finding the smallest prime factor of a given natural number as computable even if the procedure fails to halt when given 0 as an input, assuming that we define the function to be 1 if the input number is prime or 1.)
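As a rough illustration of a computable function, here is a minimal Python sketch of the smallest-prime-factor example. It follows the convention stated above (the result is 1 when the input is prime or 1). The function name and the decision to raise an error for inputs below 1, rather than loop forever, are assumptions made for this sketch.

```python
def smallest_prime_factor(n: int) -> int:
    """Smallest prime factor of n, or 1 when n is prime or n == 1."""
    if n < 1:
        # Assumption: reject 0 (and negatives) instead of failing to halt.
        raise ValueError("the function is undefined for n < 1")
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d  # the first divisor found by an upward search is prime
        d += 1
    return 1  # no divisor up to sqrt(n), so n is prime or 1


print(smallest_prime_factor(23171))  # 17
```

Trial division is only one possible effective procedure; any procedure that halts with the right answer on every defined input would serve equally well to show computability.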
A decision problem can be reformulated as a function by defining a function which returns 0 if the answer is 'no' or the predicate is false or 1 if the answer is 'yes' or the predicate is true.
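The reformulation can be sketched in a few lines of Python. The helper name `as_function` and the example predicate `is_even` are hypothetical, chosen only for illustration.

```python
from typing import Callable


def as_function(predicate: Callable[[int], bool]) -> Callable[[int], int]:
    """Wrap a yes/no predicate as a function that returns 1 or 0."""
    return lambda n: 1 if predicate(n) else 0


def is_even(n: int) -> bool:
    """Example decision problem: is n even?"""
    return n % 2 == 0


even_fn = as_function(is_even)
print(even_fn(10), even_fn(7))  # 1 0
```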
Some problems may have multiple correct answers or no correct answers. (This is the general case which could be viewed as including the previous two categories.)
Mathematicians call such problems relations, because the answers are not unique.
A relation may be computable even if it doesn't produce a result for some inputs, as would happen if we provided a prime number as the input to a prime-factor-finding procedure. We can reformulate a relation as a function by specifying a particular value to be returned if there is no answer to the problem, and by defining the function to return a list of all the possible answers (or a randomly selected answer) if there are multiple correct answers. It therefore suffices to talk about computable functions and to ignore the distinction between functions and relations for the purposes of determining computability.
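A small Python sketch of this reformulation, under the assumption that the relation in question is "d is a divisor of n with 1 < d < n". The function returns the list of all answers, and the empty list plays the role of the particular value returned when there is no answer. The name `proper_divisors` is chosen for this sketch.

```python
def proper_divisors(n: int) -> list[int]:
    """All d with 1 < d < n that divide n; [] when no answer exists."""
    return [d for d in range(2, n) if n % d == 0]


print(proper_divisors(12))  # [2, 3, 4, 6]
print(proper_divisors(13))  # []
```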
In 1936, Alan Turing published a paper called "On computable numbers, with an application to the Entscheidungsproblem [decision problem]", in which he addressed a previously unsolved mathematical problem posed by the German mathematician David Hilbert in 1928: Is there, in principle, any definite mechanical method or process by which all mathematical questions could be decided?
To answer the question (in the negative), Turing proposed a simple abstract computing machine, modelled after a mathematician with a pencil, an eraser, and a stack of pieces of paper (Floyd and Beigel, p. 444). He asserted that any function is a computable function if it is computable by one of these abstract machines. He then investigated whether there were some mathematical problems which could not be solved by any such abstract machine. Such abstract computing machines are now called "Turing machines". One particular Turing machine, called a "Universal Turing Machine", served as the model for designing actual programmable computers.
A Turing machine (TM) consists of a control unit and a read/write head positioned over a tape of unlimited length which contains a finite string (sequence) of characters from some alphabet (designated set of possible characters). The tape is conceptually divided into squares or frames, each of which can hold precisely one character. The possible operations of a TM are:
In most formulations, the tape is "one-way infinite", extending infinitely to the right. The head cannot move left from the left end of the tape; if it moves right beyond the end of the string on the tape to a previously unvisited square, this square is assumed to be blank (i.e., a blank is presumed to be a valid character in the designated alphabet). [The Turing machine simulator in the online materials accompanying the textbook does not behave this way --- you have to be sure to provide sufficient 'b' characters to represent blanks at the end of the input string.]
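The machine described above can be sketched as a short Python simulator. This is one possible formulation, not the textbook's simulator. The rule format `(state, symbol) -> (new_state, symbol_to_write, move)`, the state names `q0` and `halt`, the extra "stay put" move `N`, and the step limit are all assumptions made for illustration. Unlike the textbook simulator mentioned above, this sketch supplies blanks (written as `b`) automatically when the head moves past the end of the input.

```python
BLANK = "b"  # blank symbol, written as 'b' to match the note above


def run_tm(rules, tape, state="q0", halt_states=("halt",), max_steps=10_000):
    """Run a one-tape Turing machine; return (final_state, tape_contents).

    rules maps (state, symbol) -> (new_state, symbol_to_write, move),
    where move is "L" (left), "R" (right), or "N" (stay put).
    The machine stops in a halt state or when no rule applies.
    """
    cells = list(tape) or [BLANK]
    head = 0
    steps = 0
    while state not in halt_states and (state, cells[head]) in rules:
        if steps >= max_steps:
            raise RuntimeError("step limit reached; the machine may not halt")
        state, cells[head], move = rules[(state, cells[head])]
        if move == "R":
            head += 1
            if head == len(cells):
                cells.append(BLANK)  # previously unvisited squares are blank
        elif move == "L":
            head = max(head - 1, 0)  # cannot move left off the left end
        steps += 1
    return state, "".join(cells).rstrip(BLANK)


# Example machine: add 1 to a unary number by appending a '1'.
INCREMENT = {
    ("q0", "1"): ("q0", "1", "R"),      # skip over the existing 1s
    ("q0", BLANK): ("halt", "1", "N"),  # write a 1 on the first blank
}

print(run_tm(INCREMENT, "111"))  # ('halt', '1111')
```

The step limit exists only so that a non-halting machine raises an error in this sketch instead of running forever; an abstract Turing machine has no such limit.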
Most actual computers do not use a tape for input, storage, and output --- they use random-access memory (RAM), stacks, and registers. So why do we use a Turing machine --- a tape machine --- as a model of computing?
One might also ask, "Aren't Turing machines slow?" Since the Turing machine is an abstract machine --- that is, it exists only as a mental concept, or as a diagram on paper --- its speed is both irrelevant and unknown. All we need to assume is that each operation of a Turing machine takes some non-zero amount of time to execute. This allows us to talk about Turing machines that never halt for some inputs. (If operations were assumed to be instantaneous, then even a process that required an infinite number of steps would finish right away, wouldn't it?)
Several other researchers tried to address the Decision Problem by other methods. Alonzo Church introduced recursive partial functions as a formalization of algorithmically computable functions. Emil Post proposed symbol manipulation systems for making logical deductions. When it was proven that all three models were equivalent (i.e., they defined the same class of functions, and agreed as to which of them are computable), Church recognized that "all formalizations of algorithms were destined to yield the same class of computable functions" (Denning, Dennis, and Qualitz, Machines, Languages, and Computation, Prentice-Hall, 1978, p. 477) and proposed what has come to be known as the Church-Turing thesis (given here as formulated by Floyd and Beigel, p. 444):
A Turing machine program can simulate any physically realizable computational process at all --- including that of the most powerful digital computers or a human being.
This is a thesis, not a theorem. It cannot be proven, because the term "computational process" (equivalently, "algorithm") is not formally defined. Turing proposed his abstract machines precisely for the purpose of formalizing the concept of algorithm.
Turing also showed that it is possible (actually, fairly easy) to design a single Turing machine which can simulate the computations of any Turing machine, given an encoded description of the target TM and its initial configuration (the "input" string on its tape, the initial state, and the position of the head). Such a machine is called a Universal Turing Machine (UTM).
Physical programmable computers are effectively Universal Turing Machines which simulate the computations of any Turing machine (think, "algorithm" or special-purpose computer) by executing a program, which we can think of as a description of the computational process to be performed.
The existence of the UTM, together with the Church-Turing thesis, implies that "there is a certain minimal level of computational ability that is sufficient for any algorithmic computation" (Denning, Dennis, and Qualitz, p. 486).
The most startling result of Turing's 1936 paper was his demonstration that there are well-defined problems that cannot be solved by any computational procedure. If these problems are formulated as functions, we call such functions noncomputable; if formulated as predicates, they are called undecidable. Using Turing's concept of the abstract machine, we would say that a function is noncomputable if there exists no Turing machine that could compute it.
The proof that there are functions which are noncomputable uses a method of converting problems to numbers invented by Kurt Goedel.
Just as we can encode any Turing machine in a description which can be submitted to a Universal Turing Machine for simulation, it is also possible to assign a "serial number" (a unique positive integer) to every possible Turing machine. Many different encoding methods are possible --- the textbook presents one method using binary strings. This implies that the set of all Turing machines is countable, that is, is in one-to-one correspondence with the integers.
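To make the "serial number" idea concrete, the following Python sketch numbers every finite binary string with a unique positive integer. It is not the textbook's encoding of Turing machines; it only shows that binary strings, and therefore any set of machine descriptions written as binary strings, can be counted. The function names are chosen for this sketch.

```python
def string_for_serial(n: int) -> str:
    """Binary string assigned to serial number n (n >= 1)."""
    if n < 1:
        raise ValueError("serial numbers start at 1")
    return bin(n)[3:]  # bin(n) looks like '0b1...'; drop '0b' and the leading 1


def serial_for_string(s: str) -> int:
    """Inverse mapping: serial number of the binary string s."""
    return int("1" + s, 2)


print([string_for_serial(n) for n in range(1, 8)])
# ['', '0', '1', '00', '01', '10', '11']
```

Not every binary string describes a valid machine, but that does not matter for countability: the valid descriptions form a subset of a countable set.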
However, the set of all functions f: N -> N (matchings of inputs to outputs over the natural numbers) is known to be uncountable. We must conclude that Turing machines are able to compute only a subset of the number-theoretic functions.
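The standard reason the set of functions is uncountable is a diagonal argument: given any list of functions f_0, f_1, f_2, ..., we can build a function that differs from f_i at input i, so no list can contain every function. A minimal Python sketch follows; the names `diagonal` and `example_enumeration` are hypothetical, and the sample list f_i(n) = i * n is only an illustration.

```python
def diagonal(enumeration):
    """Return a function g with g(i) != f_i(i) for every i,
    where f_i = enumeration(i) is the i-th function in the list."""
    return lambda n: enumeration(n)(n) + 1


def example_enumeration(i):
    """Illustrative list of functions: f_i(n) = i * n."""
    return lambda n: i * n


g = diagonal(example_enumeration)
# g disagrees with f_i at input i, so g is not in the list.
print(all(g(i) != example_enumeration(i)(i) for i in range(100)))  # True
```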
We might assume that those functions which are noncomputable can't be useful. However, Turing also backed his claim with a concrete example, showing that a particular function which would be very useful to computer scientists is noncomputable: the halting problem.
The same problem can be stated in an alternative formulation which highlights its significance for computer scientists:
- Epimenides' Paradox:
- This sentence is false.
- The text below a picture of a pipe in Rene Magritte's The Air and the Song (1964):
- Ceci n'est pas une pipe.
- Bertrand Russell's Barber Paradox:
- The barber in a certain town had a sign on the wall saying, "I shave those men, and only those, who do not shave themselves."
The proof that the halting problem is noncomputable relies on the same device illustrated in the preceding paradoxes and used by Goedel to prove his Incompleteness Theorem: self-reference.
Suppose there is an algorithm H to solve the halting problem: given an encoding of a program P and an input string x, it returns 'true' if program P halts on input x, and 'false' otherwise. Use H as a subroutine to perform the conditional test in the following program H' (as formulated by Floyd and Beigel, p. 479):
    input x, a string which encodes a program
    if program x halts on input x then
        loop forever
    else
        halt
Note that this program passes its input string to the subroutine H both as the encoding of a program (or, equivalently, a Turing machine) and as the input string for which we are to determine if program x halts. This means that H' halts on input x if and only if program x doesn't halt on input x.
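What happens if we give H' its own description as its input string? If H' halts on its own description, then H must have answered 'false', meaning that H' does not halt on that input; if H' does not halt, then H must have answered 'true', meaning that it does halt. Either way we reach a contradiction, so the assumed algorithm H cannot exist.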
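The self-application can be acted out in Python. In this sketch, Python callables stand in for encoded programs, `halts` is a hypothetical candidate for H supplied by the caller (assumed to be deterministic), and the names `make_h_prime` and `refute` are invented for illustration. Whatever the candidate answers about H' run on itself, the construction makes that answer wrong.

```python
def make_h_prime(halts):
    """Build H' from a claimed halting tester halts(program, input)."""
    def h_prime(program):
        if halts(program, program):
            while True:  # loop forever
                pass
        return "halted"
    return h_prime


def refute(halts):
    """Show that the candidate halts() is wrong about H' run on itself."""
    h_prime = make_h_prime(halts)
    if halts(h_prime, h_prime):
        # h_prime(h_prime) would now enter the infinite loop, so we do
        # not actually run it; the 'halts' prediction is wrong.
        return "candidate said 'halts', but H' loops forever on itself"
    h_prime(h_prime)  # returns at once, contradicting the prediction
    return "candidate said 'does not halt', but H' halted on itself"


print(refute(lambda program, x: True))
print(refute(lambda program, x: False))
```

The two lambdas are deliberately naive candidates; the point is that `refute` produces a counterexample for any candidate, which is the content of the proof.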
Hoare and Allison re-state the conclusion this way (in "Incomputability", Computing Surveys 4, no. 3 [Sept. 1972]):
Any language containing conditionals and recursive function definitions which is powerful enough to program its own interpreter cannot be used to program its own 'terminates' function.
There are many problems related to programs which are undecidable --- so many, in fact, that H.G. Rice proved the following theorem (using a complicated definition of "nontrivial"):
Any nontrivial property of programs is undecidable.
There are also undecidable problems in other subject areas. Note that proving that a problem is noncomputable does not mean that a solution cannot be computed in some cases (i.e., for some inputs), but that it is impossible to construct a single procedure that works for every input.
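The last point can be illustrated with a partial halting test that gives a definite answer for some inputs and gives up on the rest. In this Python sketch, programs are modeled as generators that yield once per step; that modeling choice, the step budget, and all names are assumptions made for illustration.

```python
def halts_within(program, arg, budget):
    """Return True if program(arg) finishes within `budget` steps,
    or None ("unknown") otherwise. It never answers False."""
    steps = program(arg)
    for _ in range(budget):
        try:
            next(steps)  # advance the program by one step
        except StopIteration:
            return True  # the program finished
    return None  # budget exhausted: it may halt later, or never


def countdown(n):
    """Halts after n steps."""
    while n > 0:
        n -= 1
        yield


def loop_forever(_):
    """Never halts."""
    while True:
        yield


print(halts_within(countdown, 3, 10))       # True
print(halts_within(loop_forever, 0, 1000))  # None
```

No choice of budget turns this into a full solution: a program that needs one more step than the budget is indistinguishable here from one that never halts.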