## Monday, August 19, 2013

### Recursive call optimization

In a previous post, I wrote the map function in F#, compiled it, and disassembled it, and expressed my disappointment that it was recursive.  What I had hoped for was something like tail call optimization.

Often, in my imperative programming career, I have had to choose between implementing a process in a recursive way or a looping way.  In one language that I work with, the number of nested blocks allowed is limited, so I must be very careful about how I build recursive solutions.  I usually view the loop implementation as inferior because it lacks the beauty of the recursive one, but usually it is more efficient.  Since functional programming relies extensively on recursion, however, it is often optimal to turn recursive function calls into loops at compile time.

Functions require memory to process calculations.  For convenience, I'll refer to the units of memory used as variables, even if (strictly speaking) they are not variable.  The effective scope of a variable begins when it is assigned a value, and it ends at the last point that the value is referenced.  Since some control branches of a function may not reference a variable, the scope may cover parts of the expression tree, rather than a linear range of lines of code.
If no variables are in scope at the time of a recursive call, then the recursive call can be implemented as a pair of loops.  The first loop of the pair represents the descent into the recursive call, while the second loop represents the return.
Since most variables are out of scope by the end of a function, a recursive call at the end of the function can usually be implemented as a single loop (since nothing is done on return), which is called tail call optimization.  However, the general rule allows recursive calls to be implemented as a pair of loops anywhere, provided the recursive call falls outside the scope of all other variables.

Here is a contrived example of a recursive C# method that can be optimized in this way:

public int TestFunction(int n) {
// Variable r is defined here, but not in scope yet,
// because its initial value will not be used
int r;

// Start of scope for t
int t = n * n + n;

// Start of scope for j
int j = 2 * t - 13;

// Recursion should stop somewhere
if (t < 100000 && j < 100000) {
// Scope of t and j ends here on this execution branch.
// Scope of r starts here
r = TestFunction(t + j) - 7;
} else {
// Scope of t and j ends here on this execution branch.
// Scope of r starts here
r = t - j;
}

// Scope of r ends here
// Scope of k starts here
int k = 3 * r;

// Scope of k ends here
return k;
}

The optimized implementation of this recursive function is as follows:

public int TestFunction(int ntop) {
int n;
int t;
int j;
int r;
int k;
int loopCount = 0;

// Descent loop
n = ntop;
while(true) {
t = n * n + n;
j = 2 * t - 13;
if (t < 100000 && j < 100000) {
n = t + j;
++loopCount;
} else {
break;
}
}

// Bottom of call processing
r = t - j;
k = 3 * r;

// Return loop
while (loopCount > 0) {
r = k;
r -= 7;
k = 3 * r;
--loopCount;
}

return k;
}