Mastering the Ackermann Function

Yesterday I came in touch with a curious, astonishing mathematical function. It's called Ackermann function.

Mathematically speaking, it is a well-defined total function. That is, it has defined values for every integer input (= total function), and this value is not ambiguous (every input has one and one only possible output value) (= well-defined).

Speaking about computer science, this function is computable, but it's not a primitive recursive function. In other words, you can implement an algorithm to express the function using while-loops (= computable), but still you cannot write an equivalent algorithm using only do-loops (= not primitive recursive). I suggest you to try this statement.

The function has this form:

As you can see, it's form is fairly simple. But, even if it seems simple, its values explode quickly. A(4, 2), for example, has more than 19.000 digits.

Ackermann Function on the complex plane

Representation of the Ackermann Function on the complex plane

What I want to talk about is the possible implementation of such a function by means of a computer algorithm. I will use Python for the next examples, but the description is language-independent.

Recursive approach

The quicker approach to solving such a problem is by implementing a recursive algorithm, since the definition of the function itself is recursive. Let's try with something like this:

import sys

def A(x, y):
  if x == 0:
    return y+1
  else:
    if y == 0:
      return A(x-1, 1)
    else:
      return A(x-1, A(x, y-1))

x = int(sys.argv[1])
y = int(sys.argv[2])

print "Result of A(%d, %d) is %d" % (x, y, A(x, y))

You will want to try it calling from command line:

python ackermann.py 4 2

You will face a brutal reality: this algorithm will not end for the input (4, 2), since it quickly reaches the maximum recursion depth provided by Python. We will get better results starting from lower inputs.

You will find that the maximum computable values, for each possible x are:

  • A(1, 997) = 999
  • A(2, 497) = 997
  • A(3, 6) = 509
  • A(4, 0) = 13

For every input (x, y) with x greater than 5 you will get no result, for each y. As you can see, the function quickly diverges in complexity when x = 3, and much more when x = 4.

What we can try to do is jump as much iterations as possible with a simple trick: saving already computed values in order to not compute them anymore. This could simplify the navigation through the recursion graph.

Let's try this implementation (with some debug information):

import sys

res = {}
jumps = 0
recursions = 0

#------ Start Method -------#
def A(x, y):
  global res, jumps, recursions

  recursions += 1
  try:
    jumps += 1
    return res[x][y]
  except Exception, e:
    jumps -= 1
    res[x] = {}

  if x == 0:
    res[x][y] = y+1
  else:
    if y == 0:
      res[x][y] = A(x-1, 1)
    else:
      res[x][y] = A(x-1, A(x, y-1))

  return res[x][y]
#------ End Method -------#

x = int(sys.argv[1])
y = int(sys.argv[2])

try:
  print "Result of A(%d, %d) is %d" % (x, y, A(x, y))
  print "%d total recursions, %d operations avoided" % (recursions, jumps)
except Exception, e:
  print "Exception occurred after %d recursions" % (recursions)

As you can see, the res variable is loaded with each computed value (at lines 20, 23, 25), and at the beginning of the function we try to return the value of A(x, y) if already computed (at line 14). The jumps variable keeps track of the number of the iterations saved by this implementation, while the recursions variable keeps track of the total number of single calls the the A function.

You can now still try to call the function to see what the maximum computable values are. You will find this:

  • A(1, 995) = 997
  • A(2, 496) = 995
  • A(3, 6) = 509
  • A(4, 0) = 13

We got three results: the maximum computable inputs have decreased ((1, 995) instead of (1,997) for example), the number of jumps increases with increasing values of the inputs, thus the computation time has decreased. Sadly, our target was not to reduce computable time, but reduce recursion depth.

We should try something different, to truly master the Ackermann Function. This something different involves computing the analytic form of the function. This approach will show a completely different world.

Probably I'll talk about that in a future article. Let me know what you think about it, and submit your different implementations (different languages, different approaches) but still focusing on the recursion optimization.

Lascia un commento

Il tuo indirizzo email non sarĂ  pubblicato. I campi obbligatori sono contrassegnati *