An absolute beginner’s guide to folding in Haskell


(This is a small write-up I did on a forum for a Haskell MOOC).

foldr and foldl are simply means of reducing/accumulating/folding over values of a sequence into a single value. That’s basically it!

To make it more concrete, let’s analyse a specific implementation of foldr and foldl for Lists:

The easier one to understand is actually foldl since it is more intuitive and natural:

foldl :: (b -> a -> b) -> b -> [a] -> b
foldl f acc [] = acc
foldl f acc (x:xs) = foldl f (f acc x) xs

The first thing to do is look at the type. How does it read? In plain English, you can read that as “foldl is a function that takes three arguments: the first argument is a function that takes a value of type b and a value of type a and produces a value of type b, the second argument is a value of type b (this is the accumulator), and the final argument is a list of values of type a. The overall result is a single value of type b”. Makes sense?
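To see that a and b really can be two different types, here is a small sketch using the Prelude's foldl. The names countChars and step are made up for illustration and are not part of the original write-up: the list elements are of type Char (so a = Char), while the accumulator is an Int (so b = Int).

```haskell
-- Hypothetical helper: count the characters of a String with foldl.
-- Here a = Char (the list elements) and b = Int (the accumulator).
countChars :: String -> Int
countChars = foldl step 0
  where
    -- step has the shape b -> a -> b, i.e. Int -> Char -> Int:
    -- it ignores the character itself and just bumps the counter.
    step :: Int -> Char -> Int
    step n _ = n + 1
```

Loaded into GHCi, countChars "hello" should evaluate to 5, and countChars "" to 0 (the untouched initial accumulator).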

With that intuition, let’s look at an example and map that to the definition itself:

Suppose you have a list, xs = [1, 2, 3, 4, 5] and you want to find the sum of all the elements of this list. Then you basically want to reduce the list into a single element under addition, right? So you can define the whole operation as: foldl (+) 0 xs.

Now see how that maps to the definition. The first argument to foldl, which is a function, is +, and this makes sense since + is a binary function that takes two numerical values and produces their sum. The second argument is the “accumulator” (basically the one which keeps accumulating the results as we traverse the list), and we want to start off with 0 here since that is the identity for addition. The final argument is the list of values itself, xs.

Now look at the body of foldl. It has two patterns to match the inputs against:

1). foldl f acc [] = acc

So when we run out of values in our input list, we simply return the accumulator, which had been dutifully collecting the running sum of the elements of the list. So in our case, this would be analogous to something like foldl (+) acc [] = acc.

2). foldl f acc (x:xs) = foldl f (f acc x) xs

This is the more interesting case. Translating it to our example, we might have something like foldl (+) acc (x:xs) = foldl (+) (acc + x) xs. This is the crucial part: note the acc + x. In the recursive call, the next value of acc will be acc + x, right? So we are collecting, in this example, the sums of the elements of the list in the variable acc. Now, the most important bit: note that acc is always the left operand of the function, as in acc + x (which, in function application form, would be (+) acc x). This is the reason why we call it “foldl” or “fold left”: we start at the left end of the list and keep reducing until we reach the end!

So, for example:
foldl (+) 0 [1, 2, 3, 4, 5] can be expanded (conceptually) as:

foldl (+) 0 [1, 2, 3, 4, 5]
= foldl (+) (0+1) [2, 3, 4, 5]
= foldl (+) ((0+1)+2) [3, 4, 5]
= foldl (+) (((0+1)+2)+3) [4, 5]
= foldl (+) ((((0+1)+2)+3)+4) [5]
= foldl (+) (((((0+1)+2)+3)+4)+5) []
= (((((0+1)+2)+3)+4)+5) -- from the first pattern in the definition of `foldl`.
= 15
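If you want to see the successive values of the accumulator rather than only the final one, the Prelude function scanl is a convenient check: it takes the same arguments as foldl but returns every intermediate accumulator. This is only a sketch, and the names runningSums and total are made up for illustration.

```haskell
-- scanl takes the same arguments as foldl, but returns the list of all
-- intermediate accumulator values, starting with the initial one.
runningSums :: [Int]
runningSums = scanl (+) 0 [1, 2, 3, 4, 5]
-- [0,1,3,6,10,15]

-- The last element of the scan is the result of the corresponding fold.
total :: Int
total = last runningSums
-- 15
```

Each entry corresponds to one line of the expansion above: 0, then 0+1, then (0+1)+2, and so on.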

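As you can see, we move from left to right across the list, and keep accumulating the values as we go along. This is also why a left fold can be more efficient than foldr for a job like summing (as we will see): the recursive call is the last thing foldl does, so nothing is left waiting for it to return. One caveat: Haskell is lazy, so the plain foldl may postpone the additions themselves until the very end; the strict variant foldl' from Data.List avoids that.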
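A minimal sketch of the strict left fold, foldl' from Data.List, which has the same type as foldl but evaluates the accumulator at every step instead of leaving it as a growing unevaluated expression. The name sumStrict is made up for illustration.

```haskell
import Data.List (foldl')

-- foldl' gives the same result as foldl on a finite list, but it forces
-- the accumulator before moving on to the next element.
sumStrict :: [Int] -> Int
sumStrict = foldl' (+) 0
```

For a short list such as [1..5] the difference is invisible (both give 15); it starts to matter for memory use only on long lists.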

Now, onto foldr. foldr is very similar to foldl, but whereas foldl folds from left to right, foldr folds from right to left! First, again, let’s look at the definition:

foldr :: (a -> b -> b) -> b -> [a] -> b
foldr f acc [] = acc
foldr f acc (x:xs) = f x (foldr f acc xs)

The type in plain English: “foldr is a function that takes three arguments – the first argument is a binary function that takes a value of type a and a value of type b, and produces a value of type b, the second argument is the “accumulator” of type b, and the final argument is the list of values of type a. The overall result is a single value of type b”.

Now taking the same example, let’s analyse the body:

1). foldr f acc [] = acc

This is identical to the first pattern in the body of foldl. This makes sense: once we have exhausted the list there is nothing left to combine, so we simply return acc. (Note that in foldr the acc argument itself never changes as we recurse; it is the starting value sitting at the right-hand end of the list.)

2). foldr f acc (x:xs) = f x (foldr f acc xs)

Now, this is the tricky part and needs to be analysed carefully. The analogue of this definition for our sum example would be foldr (+) acc (x:xs) = x + (foldr (+) acc xs). What does this mean?

Well, since we want to process the list starting from the rightmost end, we have to update the accumulator from the right and move towards the beginning of the list. This is what that definition basically means. So if you observe carefully, the value of the accumulator is actually given by the expression (foldr f acc xs) (or (foldr (+) acc xs) in our example). Remember that in the case of foldl, the accumulator was always the left operand. Likewise, in the case of foldr, the accumulator is always the right operand. This is why in the expression f x (foldr f acc xs), the second operand is the running state of the accumulator. Note that the full call itself cannot be fully evaluated until the (foldr f acc xs) part has been evaluated. This means that we keep on building up a stack of function calls, and only when the entire list has been consumed can we begin actually updating the value of the accumulator. For a strict operation such as + on a long list, this is why foldr tends to be slower and more memory-hungry than a left fold whose accumulator is evaluated as it goes (foldl' from Data.List); because of laziness, the plain foldl does not automatically get that benefit.
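One way to see which side the accumulator sits on is to fold with the list constructor (:), which is deliberately not symmetric in its two arguments. This is only an illustrative sketch; the names copyList and reverseList are made up.

```haskell
-- With foldr the accumulator is the right operand of the function, so
-- x : acc rebuilds the list in its original order.
copyList :: [a] -> [a]
copyList = foldr (\x acc -> x : acc) []

-- With foldl the accumulator is the left operand; consing each new
-- element onto it yields the list in reverse.
reverseList :: [a] -> [a]
reverseList = foldl (\acc x -> x : acc) []
```

In GHCi, copyList [1, 2, 3] should give back [1,2,3], while reverseList [1, 2, 3] should give [3,2,1].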

To use the same example, let’s evaluate the call foldr (+) 0 [1, 2, 3, 4, 5]:

foldr (+) 0 [1, 2, 3, 4, 5]
= 1 + (foldr (+) 0 [2, 3, 4, 5])
= 1 + (2 + (foldr (+) 0 [3, 4, 5]))
= 1 + (2 + (3 + (foldr (+) 0 [4, 5])))
= 1 + (2 + (3 + (4 + (foldr (+) 0 [5]))))
= 1 + (2 + (3 + (4 + (5 + (foldr (+) 0 []))))) -- we now match with the first pattern in the definition of `foldr`
= 1 + (2 + (3 + (4 + (5 + 0)))) -- we now keep updating the accumulator all the way from right to left
= 1 + (2 + (3 + (4 + 5)))
= 1 + (2 + (3 + 9))
= 1 + (2 + 12)
= 1 + 14
= 15
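As with the left fold, the Prelude has a scanning counterpart, scanr, that exposes the intermediate results of a right fold. A small sketch (the name suffixSums is made up for illustration):

```haskell
-- scanr is the right-to-left counterpart of scanl: each element of the
-- result is the right fold of the corresponding suffix of the list.
suffixSums :: [Int]
suffixSums = scanr (+) 0 [1, 2, 3, 4, 5]
-- [15,14,12,9,5,0]
```

Read from right to left, the entries 0, 5, 9, 12, 14, 15 are exactly the values that appear as the expansion above collapses.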

If you observe carefully, the expressions are parenthesised from right to left whereas in the case of foldl, they were from left to right.
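The two groupings can also be made visible by folding into a String rather than a number, so that the folds print the expression they would compute. This is a sketch for illustration only; showFoldr and showFoldl are made-up names.

```haskell
-- Build the parenthesised expression as a String instead of computing it.
-- foldr: the accumulator (the rest of the expression) is on the right.
showFoldr :: [Int] -> String
showFoldr = foldr (\x acc -> "(" ++ show x ++ "+" ++ acc ++ ")") "0"

-- foldl: the accumulator (the expression so far) is on the left.
showFoldl :: [Int] -> String
showFoldl = foldl (\acc x -> "(" ++ acc ++ "+" ++ show x ++ ")") "0"

-- showFoldr [1..5] is "(1+(2+(3+(4+(5+0)))))"
-- showFoldl [1..5] is "(((((0+1)+2)+3)+4)+5)"
```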

Now, you see that both foldl and foldr gave the same result, 15, for our example. However, this is only because addition is both associative and commutative (just like multiplication), and that’s why it doesn’t matter if we fold from the left or from the right. So, multiplication would also give the same results for both foldl and foldr, but subtraction and division would not, since they are neither associative nor commutative.

To confirm this assertion, just compare the outputs of the following expressions:

Prelude> foldr (+) 0 [1..5]
15
Prelude> foldl (+) 0 [1..5]
15

Prelude> foldr (*) 1 [1..5]
120
Prelude> foldl (*) 1 [1..5]
120

Prelude> foldr (-) 0 [1..5]
3
Prelude> foldl (-) 0 [1..5]
-15

Prelude> foldr (/) 1.0 [1.0, 2.0, 3.0, 4.0, 5.0]
1.875
Prelude> foldl (/) 1.0 [1.0, 2.0, 3.0, 4.0, 5.0]
8.333333333333333e-3
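As a further experiment along the same lines, list concatenation (++) is associative but not commutative ("ab" ++ "cd" differs from "cd" ++ "ab"), and the empty list is its identity. The two folds still agree on it, which suggests that the grouping of the operations, together with a neutral starting value, is what really matters here. A small sketch with made-up names:

```haskell
pieces :: [String]
pieces = ["fold", "ing", " in ", "Haskell"]

-- "fold" ++ ("ing" ++ (" in " ++ ("Haskell" ++ "")))
joinedR :: String
joinedR = foldr (++) "" pieces

-- ((("" ++ "fold") ++ "ing") ++ " in ") ++ "Haskell"
joinedL :: String
joinedL = foldl (++) "" pieces

-- Both are "folding in Haskell".
```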

Let’s just analyse the subtraction example:

foldr (-) 0 [1..5]
= 1 - (foldr (-) 0 [2, 3, 4, 5])
= 1 - (2 - (foldr (-) 0 [3, 4, 5]))
= 1 - (2 - (3 - (foldr (-) 0 [4, 5])))
= 1 - (2 - (3 - (4 - (foldr (-) 0 [5]))))
= 1 - (2 - (3 - (4 - (5 - (foldr (-) 0 [])))))
= 1 - (2 - (3 - (4 - (5 - 0))))
= 1 - (2 - (3 - (4 - 5)))
= 1 - (2 - (3 - (-1)))
= 1 - (2 - 4)
= 1 - (-2)
= 3.

and

foldl (-) 0 [1..5]
= foldl (-) (0-1) [2, 3, 4, 5]
= foldl (-) ((0-1)-2) [3, 4, 5]
= foldl (-) (((0-1)-2)-3) [4, 5]
= foldl (-) ((((0-1)-2)-3)-4) [5]
= foldl (-) (((((0-1)-2)-3)-4)-5) []
= (((((0-1)-2)-3)-4)-5)
= -15.

Et voila!
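The division example can be worked through in the same way. As a sketch, using exact fractions (the Prelude type Rational) instead of Doubles makes the two results easy to check by hand; the names divR and divL are made up for illustration.

```haskell
-- foldr groups from the right: 1 / (2 / (3 / (4 / (5 / 1))))
divR :: Rational
divR = foldr (/) 1 [1, 2, 3, 4, 5]
-- 15 % 8, i.e. 1.875

-- foldl groups from the left: ((((1 / 1) / 2) / 3) / 4) / 5
divL :: Rational
divL = foldl (/) 1 [1, 2, 3, 4, 5]
-- 1 % 120, i.e. 0.008333...
```

GHCi shows a Rational as numerator % denominator, so the two values appear as 15 % 8 and 1 % 120, matching the Double results 1.875 and 8.333333333333333e-3 shown earlier.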

Optional

In case you are familiar with C, the functions may be rendered thus, for instance:

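// The generic foldr function: arr holds n elements of delta bytes each
void *foldr(void *(*fptr)(void*, void*), void *acc, void *arr, size_t n, size_t delta)
{
    if (!n) {
        return acc;
    } else {
        // Step through the array via char*, since arithmetic on void* is not standard C
        return fptr(arr, foldr(fptr, acc, (char*) arr + delta, n-1, delta));
    }
}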

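// The generic foldl function: arr holds n elements of delta bytes each
void *foldl(void *(*fptr)(void*, void*), void *acc, void *arr, size_t n, size_t delta)
{
    if (!n) {
        return acc;
    } else {
        // Step through the array via char*, since arithmetic on void* is not standard C
        return foldl(fptr, fptr(acc, arr), (char*) arr + delta, n-1, delta);
    }
}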

Note how they map to the Haskell definitions directly.

If you are interested, here is a full program that you can run to observe the behaviour and compare it with Haskell. Note that it only simulates the Haskell behaviour as closely as possible: it is not the idiomatic way to do these operations in C, and it leaks memory (the intermediate results are never freed), which we do not care to address for this demo.

The program:

#include <stdio.h>
#include <stdlib.h>

void *add_int(void *x, void *y)
{
    int *res = malloc(sizeof(int));
    *res = *((int*) x) + *((int*) y);

    return (void*) res;
}

void *multiply_int(void *x, void *y)
{
    int *res = malloc(sizeof(int));
    *res = *((int*) x) * *((int*) y);

    return (void*)res;
}

void *subtract_int(void *x, void *y)
{
    int *res = malloc(sizeof(int));
    *res = *((int*) x) - *((int*) y);

    return (void*) res;
}

void *divide_double(void *x, void *y)
{
    double *res = malloc(sizeof(double));
    *res = *((double*) x) / *((double*) y);

    return (void*) res;
}

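// The generic foldr function: arr holds n elements of delta bytes each
void* foldr(void *(*fptr)(void*, void*), void *acc, void *arr, size_t n, size_t delta)
{
    if (!n) {
        return acc;
    } else {
        // Step through the array via char*, since arithmetic on void* is not standard C
        return fptr(arr, foldr(fptr, acc, (char*) arr + delta, n-1, delta));
    }
}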

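// The generic foldl function: arr holds n elements of delta bytes each
void *foldl(void *(*fptr)(void*, void*), void *acc, void *arr, size_t n, size_t delta)
{
    if (!n) {
        return acc;
    } else {
        // Step through the array via char*, since arithmetic on void* is not standard C
        return foldl(fptr, fptr(acc, arr), (char*) arr + delta, n-1, delta);
    }
}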

int sum_foldr(int a[], size_t n)
{
    int sum = 0, acc = 0;
    void *ret = foldr(add_int, &acc, a, n, sizeof(int));
    if (ret && ret != &acc) { // for n == 0 the fold returns &acc itself, which must not be freed
        sum = *((int*) ret);
        free(ret);
    }

    return sum;
}

int product_foldr(int a[], size_t n)
{
    int prod = 1, acc = 1;
    void *ret = foldr(multiply_int, &acc, a, n, sizeof(int));
    if (ret && ret != &acc) { // for n == 0 the fold returns &acc itself, which must not be freed
        prod = *((int*) ret);
        free(ret);
    }

    return prod;
}

int sum_foldl(int a[], size_t n)
{
    int sum = 0, acc = 0;
    void *ret = foldl(add_int, &acc, a, n, sizeof(int));
    if (ret && ret != &acc) { // for n == 0 the fold returns &acc itself, which must not be freed
        sum = *((int*) ret);
        free(ret);
    }

    return sum;
}

int product_foldl(int a[], size_t n)
{
    int prod = 1, acc = 1;
    void *ret = foldl(multiply_int, &acc, a, n, sizeof(int));
    if (ret && ret != &acc) { // for n == 0 the fold returns &acc itself, which must not be freed
        prod = *((int*) ret);
        free(ret);
    }

    return prod;
}

int sub_foldr(int a[], size_t n)
{
    int diff = 0, acc = 0;
    void *ret = foldr(subtract_int, &acc, a, n, sizeof(int));
    if (ret && ret != &acc) { // for n == 0 the fold returns &acc itself, which must not be freed
        diff = *((int*) ret);
        free(ret);
    }

    return diff;
}

int sub_foldl(int a[], size_t n)
{
    int diff = 0, acc = 0;
    void *ret = foldl(subtract_int, &acc, a, n, sizeof(int));
    if (ret && ret != &acc) { // for n == 0 the fold returns &acc itself, which must not be freed
        diff = *((int*) ret);
        free(ret);
    }

    return diff;
}

double div_foldr(double a[], size_t n)
{
    double div = 1.0, acc = 1.0;
    void *ret = foldr(divide_double, &acc, a, n, sizeof(double));
    if (ret && ret != &acc) { // for n == 0 the fold returns &acc itself, which must not be freed
        div = *((double*) ret);
        free(ret);
    }

    return div;
}

double div_foldl(double a[], size_t n)
{
    double div = 1.0, acc = 1.0;
    void *ret = foldl(divide_double, &acc, a, n, sizeof(double));
    if (ret && ret != &acc) { // for n == 0 the fold returns &acc itself, which must not be freed
        div = *((double*) ret);
        free(ret);
    }

    return div;
}

int main(int argc, char **argv)
{
    int a[] = { 1, 2, 3, 4, 5 };
    size_t n = sizeof(a)/sizeof(a[0]);

    double b[] = { 1.0, 2.0, 3.0, 4.0, 5.0 };
    size_t m = sizeof(b)/sizeof(b[0]);

    printf("sum_foldr = %d\n", sum_foldr(a, n));
    printf("product_foldr = %d\n", product_foldr(a, n));

    printf("sum_foldl = %d\n", sum_foldl(a, n));
    printf("product_foldl = %d\n", product_foldl(a, n));

    printf("sub_foldr = %d\n", sub_foldr(a, n));
    printf("div_foldr = %lf\n", div_foldr(b, m));

    printf("sub_foldl = %d\n", sub_foldl(a, n));
    printf("div_foldl = %lf\n", div_foldl(b, m));

    return 0;
}

Running it:

$ gcc -Wall -O2 -o fold fold.c && ./fold
sum_foldr = 15
product_foldr = 120
sum_foldl = 15
product_foldl = 120
sub_foldr = 3
div_foldr = 1.875000
sub_foldl = -15
div_foldl = 0.008333

Exactly the same result as in the case of Haskell.

The basic idea is this – take a simple example, and work through it with pen and paper to really understand how it works!
