A Line number closure in Rust, and an interesting battle with the Rust compiler

It had been a while since I’d played around with Rust, so I decided to implement one of my two favourite programs when learning new languages viz., a line number closure (with mutation, of course) as shown by Hoyte in his seminal book, “Let Over Lambda”. The other program is, of course, implementing a functioning Rational number class (as in the previous blog post, for instance).

Well, with Rust, it’s a very…interesting experience to say the least. The Rust Borrow Checker is a very strange beast indeed. So, let’s have a look at how we can arrive at a working version of the program.

Background

Rust does have powerful support for closures (what respectable language does not today?). However, the interesting part is that closures in Rust are simply syntactic sugar for traits. To be more precise, there are three available traits that a closure can belong to:

  • the Fn trait – this is the least permissive of all, and it allows a read-only closure (in effect). This is ideal when we want to simply have a self-contained immutable closure. For instance,
    fn square<F>(n: i32, callback: F)
        where
        F: Fn(i64) ->() {
        callback(n as i64 * n as i64);
    }
    
    fn main() {
        square(100, |n| {  println!("Received {}", n); });
    }
    

    This duly prints out:

    Received: 10000
    

    This is quite straightforward. As we can see, The type of the callback, which is the closure, is Fn(i64) -> (), which means that it takes a 64-bit integer and does some side-effect (printing to screen in this case). No mutation anywhere, no capturing of environments, and so the Fn trait works superbly here.

  • the FnMut trait is more interesting. This is needed when we need to mutate the environment captured by the closure. As a small example:
    fn process_mutating_closure<F>(mut closure: F)
        where
        F: FnMut(i32) -> () {
        closure(100);
    }
    
    fn main() {
        let mut local = 100;
    
        process_mutating_closure(|x| { 
                  local += 100; 
                  println!("Value = {}", x + local); 
        });
    
    

    which produces:

    Value = 300
    

    All right, this is a bit more complicated than the previous version. So, let’s break it down. What we intend to do is this – the closure that’s created in the main function captures a mutable variable, local, in the enclosing lexical scope. The closure then modifies this captured variable, and uses it for some dummy processing before printing the final value out.

    Since a captured variable is being modified, the type of the closure will be FnMut. This is exactly what’s reflected in the type signature when the closure is passed as an argument to process_mutating_closure.

  • , and finally, the FnOnce trait is the most restrictive of all. It basically transfers ownership of the capture to the closure itself. If you are not familiar with Rust’s wonderful (sarcasm) Ownership and Borrowing world, consider yourself lucky.
    fn move_closure_demo<F: FnOnce(i32) -> ()>(closure: F) {
        closure(100)
    }
    
    fn main() {
        let mut local = 100;
    
        move_closure_demo(move |x| {
            local += 100;
            println!("Result = {}", x + local);
        });
    
        println!("{}", local);
    }
    
    Result = 300
    

    Eh? This code looks almost identical to the previous one (except for the change in the function signature of the closure from FnMut to FnOnce and the move in front of the closure). Well, if you notice, you will also see that the function var, closure in the function move_closure_demo also does not require mut unlike in the previous case.

    What’s happening here is that because we have “moved” the closure, the closure is now the owner of the captured variable, local. Wait a minute, so how come we can still use local in main after it’s been moved? It turns out that because the binding of local is to a Copy-able type (i32 in this case), a copy of the binding’s value is actually moved. However, this won’t work for types that do not implement the Copy trait. Confused? Well, you should be. I mean, wtf?

All right, now let’s get on with the meat of the actual matter.

First attempt

Let’s put our knowledge of how closures are actually syntactic sugar for the aforementioned trifecta of traits to use. Here’s the first shot:

fn gen_line_number() -> (FnMut() -> ()) {
    let mut line_no = 0;

    || {
        line_no += 1;

        println!("Current line number: {}", line_no);
    }
}

fn main() {
    let foo = gen_line_number();

    for _ in 0..5 {
        foo();
    }
    
    println!("");

    let bar = gen_line_number();

    for _ in 0..5 {
        bar();
    }
}

Since we’re modifying the capture var, line_no, we need the closure to be of the trait type, FnMut. Looks okay, let’s take it for a spin:

bash-3.2$ rustc hoyte-blog.rs 
error[E0277]: the trait bound `std::ops::FnMut() + 'static: std::marker::Sized` is not satisfied
 --> hoyte-blog.rs:1:26
  |
1 | fn gen_line_number() -> (FnMut() -> ()) {
  |                          ^^^^^^^^^^^^^ the trait `std::marker::Sized` is not implemented for `std::ops::FnMut() + 'static`
  |
  = note: `std::ops::FnMut() + 'static` does not have a constant size known at compile-time
  = note: the return type of a function must have a statically known size

error[E0308]: mismatched types
 --> hoyte-blog.rs:4:5
  |
4 |     || {
  |     ^ expected trait std::ops::FnMut, found closure
  |
  = note: expected type `std::ops::FnMut() + 'static`
  = note:    found type `[closure@hoyte-blog.rs:4:5: 8:6 line_no:_]`

error[E0277]: the trait bound `std::ops::FnMut(): std::marker::Sized` is not satisfied
  --> hoyte-blog.rs:12:9
   |
12 |     let foo = gen_line_number();
   |         ^^^ the trait `std::marker::Sized` is not implemented for `std::ops::FnMut()`
   |
   = note: `std::ops::FnMut()` does not have a constant size known at compile-time
   = note: all local variables must have a statically known size

error[E0277]: the trait bound `std::ops::FnMut(): std::marker::Sized` is not satisfied
  --> hoyte-blog.rs:20:9
   |
20 |     let bar = gen_line_number();
   |         ^^^ the trait `std::marker::Sized` is not implemented for `std::ops::FnMut()`
   |
   = note: `std::ops::FnMut()` does not have a constant size known at compile-time
   = note: all local variables must have a statically known size

error: aborting due to 4 previous errors

Whoa! Okay, let’s read the error messages in order. Maybe that will help us fix the issue better. Right, so the first one says something about std::marker::Sized not being implemented… blah blah blah. What all this verbosity actually means is that we are returning a closure from the function, and it needs to know the size of the closure at compile time itself.

Fair enough. However, why does it work when we’re passing a closure into a function the same way (instead of returning the closure from it)? Who the hell really knows? Well, the basic problem here (as I figure it) is that we need to wrap the whole damned thing inside a heap pointer (aka the Box type) since we’re returning from the function, and the stack frame for that function is going to get de-allocated, we need to return a heap pointer to the object so that it can be used even after the current function frame’s destroyed.

This is the reason why code like the following works because the callback function, closure cannot outlive the function where the closure was generated (main):

fn foo<F: Fn() -> ()>(closure: F) {
    closure();
}

fn main() {
    foo(|| { println!("Hello, world!");});
}
Hello, world!

Okay, let’s proceed.

Second Attempt

Having gained that great insight from the first version, here’s how the second version looks like:

fn gen_line_number() -> Box<FnOnce() -> ()> {
    let mut line_no = 0;

    Box::new(move || {
        line_no += 1;

        println!("Current line number: {}", line_no);
    })
}

fn main() {
    let foo = gen_line_number();

    for _ in 0..5 {
        foo();
    }
    
    println!("");

    let bar = gen_line_number();

    for _ in 0..5 {
        bar();
    }
}

and…

bash-3.2$ rustc hoyte-blog.rs 
error[E0161]: cannot move a value of type std::ops::FnOnce(): the size of std::ops::FnOnce() cannot be statically determined
  --> hoyte-blog.rs:15:9
   |
15 |         foo();
   |         ^^^

error[E0161]: cannot move a value of type std::ops::FnOnce(): the size of std::ops::FnOnce() cannot be statically determined
  --> hoyte-blog.rs:23:9
   |
23 |         bar();
   |         ^^^

error[E0382]: use of moved value: `*foo`
  --> hoyte-blog.rs:15:9
   |
15 |         foo();
   |         ^^^ value moved here in previous iteration of loop
   |
   = note: move occurs because `*foo` has type `std::ops::FnOnce()`, which does not implement the `Copy` trait

error[E0382]: use of moved value: `*bar`
  --> hoyte-blog.rs:23:9
   |
23 |         bar();
   |         ^^^ value moved here in previous iteration of loop
   |
   = note: move occurs because `*bar` has type `std::ops::FnOnce()`, which does not implement the `Copy` trait

error: aborting due to 4 previous errors

Sigh. Just when things were looking up… theoretically. All right, before we lose our sanity, let’s read all the bloody error messages and try to make some sense of them. Okay, ignoring the first two about statically size something, we see that the other errors are complaining about foo and bar being already moved in the previous iteration of the loop. At least that makes some sort of sense – since we have the closure passed to main and assigned to foo and bar, and, as the error message says, FnOnce does not implement the Copy trait, we cannot change the owner of the closure in subsequent iterations of the for loop.

Let’s rectify that by using the FnMut trait instead, which is actually what we want since we are not using any ownership changes, and we’re mutating the captured local variable, it sounds like the perfect candidate for our use case.

In case you’re wondering about modifying the main function instead to use references inside the for loops, I assure you that it’s not an oversight on my part – down that road lies perdition! (Reflect upon the fact that I have a working copy, and I’m actually working backwards through some sane progression amongst the many that I had to endure along the way!)

Third attempt

Here’s the version with the closure type changed to FnMut:

fn gen_line_number() -> Box<FnMut() -> ()> {
    let mut line_no = 0;

    Box::new(move || {
        line_no += 1;

        println!("Current line number: {}", line_no);
    })
}

fn main() {
    let foo = gen_line_number();

    for _ in 0..5 {
        foo();
    }
    
    println!("");

    let bar = gen_line_number();

    for _ in 0..5 {
        bar();
    }
}

And yes, you guessed it, it simply doesn’t work, but you’ll love the error messages:

bash-3.2$ rustc hoyte-blog.rs 
error: cannot borrow immutable `Box` content `*foo` as mutable
  --> hoyte-blog.rs:15:9
   |
15 |         foo();
   |         ^^^

error: cannot borrow immutable `Box` content `*bar` as mutable
  --> hoyte-blog.rs:23:9
   |
23 |         bar();
   |         ^^^

error: aborting due to 2 previous errors

Hmmm… hmmm? It mentions that when I call foo and bar, it’s trying to make something mutable that’s actually immutable. Eh? Okay, let’s see. The local captured variable was “moved” to live beyond the function’s stack frame, so that’s fine. I’m using FnMut so that I can actually mutate the captured variable, that’s fine. I’m wrapping the whole damned thing inside a Box so that the whole closure can persist beyond the function call, and finally, all I’m trying to do is to increment the mutable captured line_no var inside my for loops. That also sounds fine. So when did the contents of the box become immutable?

All right, let’s see the final version first.

Final version

fn gen_line_number() -> Box<FnMut() -> ()> {
    let mut line_no = 0;

    Box::new(move || {
        line_no += 1;

        println!("Current line number: {}", line_no);
    })
}

fn main() {
    let mut foo = gen_line_number();

    for _ in 0..5 {
        foo();
    }
    
    println!("");

    let mut bar = gen_line_number();

    for _ in 0..5 {
        bar();
    }
}

and running it,

bash-3.2$ rustc hoyte-blog.rs 
bash-3.2$ ./hoyte-blog 
Current line number: 1
Current line number: 2
Current line number: 3
Current line number: 4
Current line number: 5

Current line number: 1
Current line number: 2
Current line number: 3
Current line number: 4
Current line number: 5

Wow! Glory at last! Heh. Well, if you have the eyes of a hawk, you’ll observe that the only change was adding the mut qualifier to the closure handles in main. So what that extremely useful, intuitive, and helpful error message was telling us is that we have a boxed closure containing a captured mutable local var which we’re updating in every iteration of the respective for loops, but unfortunately we have assigned the closures to immutable handles, and so the whole thing just won’t work.

Whew. And seriously, wtf?

Rust appears to be great fun when things go well, but when things go badly (and they will, of course), it can be quite a PITA. Granted, I am not an expert or even an intermediate Rust programmer, but perhaps people’s complaints about the learning curve being seriously fucked up aren’t really exaggerated. If the learning curve had been at least logical, it would be a different matter. As things stand though, the more I venture into the language, the more “patched-together” it appears to be. Also, all the various exceptions to the rules in addition to the myriad rules about exceptional scenarios aren’t really helping, folks. I doubt anything can really be done about it now, though. Heh.

Advertisements
A Line number closure in Rust, and an interesting battle with the Rust compiler

Hoyte’s Line Number closure in a few choice languages

Douglas Hoyte, in his excellent (if a bit fanatical) book, “Let over Lambda” gives a simple and pithy example to demonstrate the concept of closures – little anonymous functions that can capture variables in the environment in which the closure was created.

The example is very simple – create a mini-logging facility by capturing a variable representing the current line number (initially set to 0), incrementing it for every invocation of the closure, and printing it out.

An implementation in Common Lisp might look so –

(defun get-line-logger ()
  (let ((line-num 0))
    #'(lambda (id)
        (incf line-num)
        (format t "[~a] Line number: ~d~%" id line-num))))


(defun logger-demo ()
  (let ((logger-1 (get-line-logger))
        (logger-2 (get-line-logger)))
    (flet ((f (id logger)
             (dotimes (i 5)
               (funcall logger id))
             (terpri)))
      (f "logger-1" logger-1)
      (f "logger-2" logger-2))))

Sample run:

CL-USER> (logger-demo)
[logger-1] Line number: 1
[logger-1] Line number: 2
[logger-1] Line number: 3
[logger-1] Line number: 4
[logger-1] Line number: 5

[logger-2] Line number: 1
[logger-2] Line number: 2
[logger-2] Line number: 3
[logger-2] Line number: 4
[logger-2] Line number: 5

NIL

As expected, we can not only capture local variables in the lexical environment during the time of the closure’s creation, but also modify them independently of any other instances of the closure. This is what makes is a proper closure. Also note that the capture is automatically done (whether we actually use the variables or not is irrelevant).

The Racket version is, unsurprisingly, almost identical not only in syntax, but also semantics:

#lang racket

(define (get-line-logger)
  (let ([line-no 0])
    (lambda (id)
      (set! line-no (+ 1 line-no))
      (fprintf (current-output-port)"[~a] Line number: ~a~%" id line-no))))

(define (logger-demo)
  (let ([logger-1 (get-line-logger)]
        [logger-2 (get-line-logger)])
    (letrec ([f (lambda (id logger)
                  (do
                      ((i 0 (+ i 1)))
                      ((= i 5))
                    (logger id))
                  (newline))])
      (f "logger1" logger-1)
      (f "logger2" logger-2))))

And the behaviour is exactly the same:

hoyte-closure.rkt´╗┐> (logger-demo)
[logger1] Line number: 1
[logger1] Line number: 2
[logger1] Line number: 3
[logger1] Line number: 4
[logger1] Line number: 5

[logger2] Line number: 1
[logger2] Line number: 2
[logger2] Line number: 3
[logger2] Line number: 4
[logger2] Line number: 5

Racket is, in some ways, more elegant than even Common Lisp. I especially love the part where lambdas don’t need any funcallS or applyS to make them run (unlike in Common Lisp). Still, pretty much a branch off the same family tree.

Moving on, let’s try the same in Java, shall we?

import java.util.function.Function;

public class HoyteClosure {
    private static Function<String, Void> getLineLogger() {
        int lineNum = 0;

        return (String id) -> { lineNum++; System.out.printf("[%s] Line number: %d\n",
                                                                 id, lineNum); return null; };
    }

    private static void invokeLogger(String id, Function<String, Void> logger) {
        for (int i = 0; i < 5; i++) {
            logger.apply(id);
        }
        System.out.println();
    }

    public static void main(String[] args) {
        Function<String, Void> logger1 = getLineLogger();
        Function<String, Void> logger2 = getLineLogger();

        invokeLogger("logger1", logger1);
        invokeLogger("logger2", logger2);
    }
}

Okay, looks good. However, if we try to run it, we run into issues immediately:

Timmys-MacBook-Pro:Java8 z0ltan$ javac HoyteClosure.java 
Timmys-MacBook-Pro:Java8 z0ltan$ javac HoyteClosure.java 
HoyteClosure.java:7: error: local variables referenced from a lambda expression must be final or effectively final
        return (String id) -> { lineNum++; System.out.printf("[%s] Line number: %d\n",
                                ^
HoyteClosure.java:8: error: local variables referenced from a lambda expression must be final or effectively final
                                                                 id, lineNum); return null; };
                                                                     ^
2 errors

The problem is that Java really does not have real closures. The lambda support in Java 8 is just syntactic sugar for the good old anonymous class which can read local variables in the environment, but cannot modify them. So what can we do?

To make Java happy, we can create a new object for every logger instance i.e., use instance variables in lieu of local variables so that the modification of local variables is not an issue any more:

import java.util.function.Function;

public class HoyteClosureModified {
    static class Closure {
        private int lineNum;
        
        private  Function<String, Void> getLineLogger() {
            return (String id) -> { lineNum++; System.out.printf("[%s] Line number: %d\n",
                                                                 id, lineNum); return null; };
        }
    }

    private static void invokeLogger(String id, Function<String, Void> logger) {
        for (int i = 0; i < 5; i++) {
            logger.apply(id);
        }
        System.out.println();
    }

    public static void main(String[] args) {
        Function<String, Void> logger1 = new Closure().getLineLogger();
        Function<String, Void> logger2 = new Closure().getLineLogger();

        invokeLogger("logger1", logger1);
        invokeLogger("logger2", logger2);
    }
}

Taking it for a test spin, we get:

Timmys-MacBook-Pro:Java8 z0ltan$ javac HoyteClosureModified.java
Timmys-MacBook-Pro:Java8 z0ltan$ java -cp . HoyteClosureModified
[logger1] Line number: 1
[logger1] Line number: 2
[logger1] Line number: 3
[logger1] Line number: 4
[logger1] Line number: 5

[logger2] Line number: 1
[logger2] Line number: 2
[logger2] Line number: 3
[logger2] Line number: 4
[logger2] Line number: 5

It’s not the same though, is it? The whole point of using a closure was so that we wouldn’t have to do this explicit management of state ourselves. As such, Java doesn’t really have full-blown closures, just a poor man’
s version of it. Better luck next time, Java.

Finally, the same using Ruby. As I have said before, Ruby feels remarkably like a Lisp despite the syntactic differences.

module HoyteClosure 
    class Demo
        def self.get_line_logger
            line_no = 0

            lambda do |id|
                line_no += 1
                puts "[" + id + "] Line number: " + line_no.to_s
            end
        end

        def self.logger_demo(id, logger)
            5.times {
                logger.call(id)
            }
            puts ""
        end

        def self.main
            logger1 = get_line_logger
            logger2 = get_line_logger

            logger_demo("logger1", logger1)
            logger_demo("logger2", logger2)
        end
    end
end

And the final test run:

irb(main):009:0> load "./hoyte_closure.rb"
load "./hoyte_closure.rb"
=> true
irb(main):010:0> HoyteClosure::Demo.main
HoyteClosure::Demo.main
[logger1] Line number: 1
[logger1] Line number: 2
[logger1] Line number: 3
[logger1] Line number: 4
[logger1] Line number: 5

[logger2] Line number: 1
[logger2] Line number: 2
[logger2] Line number: 3
[logger2] Line number: 4
[logger2] Line number: 5

=> nil

Short, non-idiomatic, and sweet!

Hoyte’s Line Number closure in a few choice languages