A PigLatin translator in Common Lisp (and contrasting it with the Racket version)

As my curiosity had been piqued by the PigLatin exercise in Racket, I decided to basically translate that code into Common Lisp. The result was interesting to say the least.

Since the explanation of the program logic has already been done in the Racket version, I will skip that part, and only highlight the relevant differences.

First off, the code:

(defpackage #:piglatin
  (:use :cl :cl-user))


(in-package #:piglatin)


(defun translate (sentence)
  (let ((*readtable* (copy-readtable nil)))
    (setf (readtable-case *readtable*) :preserve)
    (mapcar #'make-symbol
            (mapcar #'english-to-piglatin
                    (mapcar #'string-to-list
                            (mapcar #'symbol-name sentence))))))


(defun english-to-piglatin (word)
  (if (starts-vowel-p word)
      (word-to-vowel-rule word)
      (word-to-consonant-rule word)))


(defun starts-vowel-p (word)
  (member (char-downcase (car word)) '(#\a #\e #\i #\o #\u #\y)))


(defun word-to-vowel-rule (word)
  (coerce (append word '(#\w #\a #\y)) 'string))


(defun word-to-consonant-rule (word)
  (let ((was-capital-p (upper-case-p (car word))))
    (labels ((f (word)
               (if (starts-vowel-p word)
                   (cond (was-capital-p (string-capitalize
                                          (coerce (append word '(#\a #\y)) 'string)))
                         (t (coerce (append word '(#\a #\y)) 'string)))
                   (f (append (cdr word) (list (char-downcase (car word))))))))
      (f word))))


(defun string-to-list (word)
  (coerce word 'list))

And a similar test run:

CL-USER> (load "/Users/z0ltan/Rabota/ProgrammingLanguages/CommonLisp/pig-latin.lisp")
T
CL-USER> (piglatin::translate '(|Hello| |world| |we| |meet| |again|))
(#:|Ellohay| #:|orldway| #:|eway| #:|eetmay| #:|againway|)
CL-USER> (piglatin::translate '(|Cucullus| |non| |facit| |monachum|))
(#:|Uculluscay| #:|onnay| #:|acitfay| #:|onachummay|)

This looks much dirtier than the Racket version’s output, but there is good reason for that.

Some notes:

The first thing you’ll observe is the strange input format for the Common Lisp version. What’s the deal with all the pipes in the input? Well, my understanding is this – the default manner in which the Common Lisp reader reads in symbols is to convert the symbol internally to upper-case. This can easily be verified:

CL-USER> (readtable-case *readtable*)
:UPCASE
CL-USER> (symbol-name 'hello)
"HELLO"
CL-USER> (symbol-name 'Hello)
"HELLO"
CL-USER> (symbol-name 'hElLo)
"HELLO"

The :UPCASE in the first output shows that the default behaviour is to convert the symbol to all upper-case. The other possible values for this are: :preserve, :downcase, and :invert. So how do we get around this problem? The easiest approach in this case is to change the reader behaviour to preserve the case of the input symbol (note that this will only work if the input symbol is ensconced within pipes) using the :preserve keyword. That is exactly what we’re doing in the translate function:

(defun translate (sentence)
  (let ((*readtable* (copy-readtable nil)))
    (setf (readtable-case *readtable*) :preserve)
    (mapcar #'make-symbol
            (mapcar #'english-to-piglatin
                    (mapcar #'string-to-list
                            (mapcar #'symbol-name sentence))))))

Remember that dynamic variables have special meaning in Common Lisp. Here, we simply create a copy of the read-table (read up on this if you are unaware of this concept) and bind it to *readtable*. We have to be really careful when working with special dynamic variables. What we are doing here is simply overwriting the actual read-table with our modified version for the scope of the let expression. This avoids poisoning the read-table globally.

So we simply set the read-table mode to :preserve in the line: (setf (readtable-case *readtable*) :preserve), and then we can proceed almost exactly as in the Racket version. We have a top-down approach in which we convert the symbol list into a string list, convert that list into a list of lists of chars (Common Lisp also shares Racket’s behaviour in the sense that a string is not automatically a list of chars), map our english-to-piglatin function over that list, and finally convert the whole list of processed strings into a list of symbols as the output.

Another point of interest is that unlike Racket, Common Lisp doesn’t have built-in convenience functions such as string->list or list->string. Instead, we have a useful (if a bit quirky) and powerful function called coerce. So in the following code, we are simply coercing a string into a list of characters:

(defun string-to-list (word)
  (coerce word 'list))

The rest of the code is pretty much in the same vein, with an interesting twist. As mentioned earlier, the read-table case-munging only works if we escape the symbol within pipes (such as |Hello|), and so the following input screws up the capitalisation entirely, as can be seen in the output:

CL-USER> (piglatin::translate '(hello world))
(#:|Ellohay| #:|Orldway|)

CL-USER> (piglatin::translate '(Hello WORLd))
(#:|Ellohay| #:|Orldway|)

As you can observe, when we use symbols without escaping them with pipes, the reader automatically converts the to upper-case anyway, so the output remains the same irrespective of the case of the input symbol.

This is quite in line with the overall quirkiness of Common Lisp. In that sense, I feel that Racket (and Scheme, by extension) is a much more functional and consistent language, easier to work with. However, Common Lisp did, and always will hold a soft spot in my heart, especially as it offers me countless more opportunities to explore and learn this magnificent beast!

A PigLatin translator in Common Lisp (and contrasting it with the Racket version)

A PigLatin translator in Racket

This post was inspired by a question posted on /r/programming on reddit. As part of helping the person, I realised I could create one just for fun in Racket!

I started out by reading the rules for Pig Latin on Wikipedia (dead simple rules), and the only irksome bit was ensuring that the capitalisation of the words remained intact. Also, the input and output formats were inspired by the posted question (it would have been much simpler to process a list of strings, or even a string of words directly).

Here is the code, followed by a bit of explanation.

(module piglatin racket
  (provide translate)

(define (translate sentence)
  (map string->symbol
       (map english->piglatin
           (map string->list
                (map symbol->string sentence)))))


(define (english->piglatin word)
  (if (starts-vowel? word)
      (word->vowel-rule word)
      (word->consonant-rule word)))


(define (starts-vowel? word)
  (member (char-downcase (car word)) '(#\a #\e #\i #\o #\u #\y)))


(define (word->vowel-rule word)
  (string-append (list->string word) "way"))


(define (word->consonant-rule word)
  (let [(was-capital? (char-upper-case? (car word)))]
    (letrec [(f (lambda (word)
                 (if (starts-vowel? word)
                     (cond [was-capital? (string-titlecase
                                          (string-append (list->string word)                  "ay"))]
                            [else (string-append (list->string word) "ay")])
                      (f (append (cdr word)
                                 (list (char-downcase (car word))))))))]
      (f word)))))

Explanation:

Let’s break down the code into chunks for easier explanation.

The input format is a list of symbols (not strings) such as: '(hello world). As such I figured that it was best to do some top-down programming here. To that end, the main function (and the only one exposed to the outside world) is the translate function:

(define (translate sentence)
  (map string->symbol
       (map english->piglatin
           (map string->list
                (map symbol->string sentence)))))

What I’m doing here is taking the input, sentence, which is in the specified format, and then I convert that into a list of strings, followed by converting that to a list of list of characters (Racket does not parse strings as lists of characters directly, like Haskell does). Following that, I convert each word in this list to its corresponding PigLatin form, and since the output format is again a list of symbols, map the strings back to symbols.

Then we have basically two rules – one for words that begin with vowels, and another one for words that begin with consonants.

(define (english->piglatin word)
  (if (starts-vowel? word)
      (word->vowel-rule word)
      (word->consonant-rule word)))

The starts-vowel? predicate simply checks if the word begins with a vowel or not (we convert the letter to lowercase for easier checking)

(define (starts-vowel? word)
  (member (char-downcase (car word)) '(#\a #\e #\i #\o #\u #\y)))

The rule for words beginning with a vowels is simply to append a “way” to the string:

(define (word->vowel-rule word)
  (string-append (list->string word) "way"))

For words that begin with consonants though, the rule is a bit more involved. Basically, we have to extract the consonant cluster (one or more letters) at the beginning of the word, append that to the end of the updated word (which now begins with a vowel), and also suffix an “ay” at the end. Note that we also have to take care of proper capitalisation in this case, which we didn’t have to do with the vowel rule:

define (word->consonant-rule word)
  (let [(was-capital? (char-upper-case? (car word)))]
    (letrec [(f (lambda (word)
                 (if (starts-vowel? word)
                     (cond [was-capital? (string-titlecase
                                          (string-append (list->string word)                  "ay"))]
                            [else (string-append (list->string word) "ay")])
                      (f (append (cdr word)
                                 (list (char-downcase (car word))))))))]
      (f word)))))

As you can see, since we are performing recursion to extract the consonant cluster, we maintain the original capitalisation of the word in a local binding called was-capital?. Then we recurse through the word, and in case was-capital> is true, we use the string-titlecase function to capitalise the first letter of the processed word. Note that we could have managed without using this built-in function, but the increase in verbosity for this little gain hardly seems worth the effort.

All right, let’s test it out on a variety of input cases:

> (load "pig-latin.rkt")
> (require 'piglatin)
> (translate '(My))
'(Ymay)
> (translate '(My hovercraft is full of eels))
'(Ymay overcrafthay isway ullfay ofway eelsway)
> (translate '(Always the odd one out))
'(Alwaysway ethay oddway oneway outway)
> (translate '(sticks and stones))
'(icksstay andway onesstay)
> (translate '(Hello world we meet again))
'(Ellohay orldway eway eetmay againway)
>(translate '(Cucullus non facit monachum)) 
'(Uculluscay onnay acitfay onachummay)
> (translate '(The quick brown fox jumped over the lazy dog))
'(Ethay uickqay ownbray oxfay umpedjay overway ethay azylay ogday)
> (translate '(Allo allo))
'(Alloway alloway)


This sort of stuff is what makes dynamic languages so dear to me, especially for prototyping. If I had done it in a static language, it would have taken quite a bit more effort (type inference is wonderful, but it does not actually save you from wrestling with the type system!)

A PigLatin translator in Racket

Goodbye Rust, and Hello, D!

After quite a bit of thought and consideration, I have decided to abandon my study of Rust, and move on to D.

I had been learning Rust for a while now, and I have become quite comfortable with it, but there are a few reasons that prompted me to move on to another systems language that would suit me better. Now while D is by no means perfect, I found the following advantages already:

  • A very consistent and intuitive (to me) core language. I have been learning it for only a couple of days now, and I’ve had quite a few “wow, it works exactly as I expected” moments already.
  • Very welcoming and helpful community that actually focuses on the technical side of things rather than getting sidetracked by social causes
  • A wide variety of compilers (DMD, GDC, LDC) targeted for both gcc and LLVM.
  • A stupendously fast compilation cycle. For small programs, I have found
    the compilation times to be on par with Go’s
  • A much saner syntax (in my very early estimation) that actually is quite readable (Rust’s syntax is not quite so readable for anything less-than-trivial in my opinion), and finally
  • Arrays (arguably the most important data structure) are actually sane, consistent, and very much logically intuitive in D unlike the mess that’s C (and C++).
  • Extremely powerful metaprogramming capabilities (an area I am particularly interested in)
  • A much much safer language than C++ while being much more programmer-friendly than Rust. Rust in that sense offloads the whole mental load of memory management onto the developer (and not in a crazy fun way like C or C++).

My use-cases for a systems programming language are quite simple, really – I don’t intend to do OS-level development (at least not in the foreseeable future), and so the GC dependency of D does not affect me in that sense. My research as well as my interactions with the wonderful folks over at forum.dlang.org have convinced me though that even with the GC, performance is still topnotch for a wide variety of systems-level programming.

As part of my learning process (just two days in to be exact), I thought of implementing two programs just to test out the waters before I dived in, and both of them were implemented with much less hassle (infinitely so to be honest), and they just worked beautifully the first time!

Insertion Sort

This is a barebones implementation of Insertion Sort in D:

// insertion_sort.d
import std.stdio;

void insertionSort(int[] arr) {
    foreach(i; 1..arr.length) {
        int key = arr[i];
        int j = cast(int)i-1;
        
        while (j >= 0 && arr[j] > key) {
            arr[j+1] = arr[j];
            j--;
        }
        arr[j+1] = key;
    }
}

void main() {
    int[] array = [100,-199,0,20,2,3,4,5,0,0,1,200];
    
    writefln("Before sorting: %s", array);
    
    insertionSort(array);
    
    writefln("After sorting: %s", array);
}

And here’s a test run:

sh-4.3$ dmd -run insertion_sort.d                                                                                                                                                   
Before sorting: [100, -199, 0, 20, 2, 3, 4, 5, 0, 0, 1, 200]                                                                                                              
After sorting: [-199, 0, 0, 0, 1, 2, 3, 4, 5, 20, 100, 200]  

Simple, and elegant! In fact, this code is directly readable to and understandable by anyone who has even the least bit of familiarity with C, C++, Java, or any member of the extended ALGOL family!

A line number closure

For my second program, I wanted to try out the line number closure (courtesy Hoyte) in D. As it turns out, either D is very logically designed, or my mind is quite attuned to D (or perhaps both!). It was quite pleasurable to be honest. Of course, at this stage, I have no idea if the code is idiomatic D or not. Anyway, let’s have some fun!

// line_closure.d
import std.stdio;

void delegate() getLineClosure() {
    int lineNum = 0;
    
    void printLineNumber() {
        lineNum++;
        
        writefln("Current line: %s", lineNum);
    }
    
    return &printLineNumber;
}

void main() {
    void delegate() x = getLineClosure();
    
    foreach(i; 0..5) {
        x();
    }
    writeln();
    
    void delegate() y = getLineClosure();
    
    foreach(j; 0..3) {
        y();
    }
}

and let’s take it for a spin:

sh-4.3$ dmd -run line_closure.d                                                                                                                                           
Current line: 1                                                                                                                                                           
Current line: 2                                                                                                                                                           
Current line: 3                                                                                                                                                           
Current line: 4                                                                                                                                                           
Current line: 5                                                                                                                                                           
                                                                                                                                                                          
Current line: 1                                                                                                                                                           
Current line: 2                                                                                                                                                           
Current line: 3    

Beautiful! The code probably deserves a bit of explanation – in D, functions are (as far as I can tell), first-class objects, and we can nest functions inside almost any scope. However, a function that is nested within a function or block scope is given a special name – a delegate. So the return type of the getLineClosure function:

void delegate()

indicates that we are returning a delegate which takes no parameters and returns nothing. Also note that we explicitly return the address of the delegate unlike in C, where the name of the function itself is the function pointer.

For this example to work, the delegate has to be able to capture the local variable, lineNum of course, and so delegates in D essentially function as closures in other Functional Programming languages. Also note the clean syntax of foreach for looping (C style loops are also supported, of course).

In conclusion, one major difference that I observed between Rust and D (I’m much more familiar with Rust, of course) is that I can more fully focus on the problem itself with D whereas with Rust, I have to always be aware of (and worrying about) what I am doing with the bindings/variables in my program, and I don’t think it’s altogether a question of getting familiar and comfortable with that. The ownership and borrowing concepts of Rust are, in and of themselves, trivial. However, the entire onus of managing proper memory behaviour is entirely on the developer. I wonder how much that would actually scale in real life. I suppose we’ll know the answer when people start developing large industry-standard (as much as I despise that term) in Rust.

Goodbye Rust, and Hello, D!

A small vector initialisation macro in Rust

Learning Rust’s macros was almost a sort of déjà vu for me. It is remarkably similar to the new-fangled macro system used in Racket. In the first place, Racket macros are themselves quite different (syntactically and semantically) from Common Lisp macros (which I love, but I have to admit that Racket macros are much more user-friendly), and so it was quite surprising to see Rust’s macro-rules! are almost identical to Racket’s syntax-rules, including the Pattern Matching, and of course, hygiene.

I doubt that’s an historical accident, of course, but the funny bit is that it only adds to Rust’s deep heritage of borrowing not only semantics, but also syntax from other languages (I have already seen traces of SML and Haskell, and now Racket). Nevertheless, the best part for me was that it’s helping me pick up Rust’s macros quite quickly and easily (not an expert by any means), including recursive macros (which are, admittedly, rather difficult to grok in Common Lisp).

So in this short blog, I wanted to post a small macro that I wrote as part of my ongoing study of Rust. It is a simple vector initialisation macro. All this macro does is to take in a binding of the form:

let some_vec = init_vec![size; init_value];

where size determines how many elements we want in the vector, and init_value determines what value the entire vector will be initialised with.

At this juncture, it would be probably be wise to mention another aspect of Rust’s macros that pleasantly surprised me – in the macro call shown above, the square brackets can actually be replaced by a pair of parentheses, or even a pair of flowery brackets (or braces as some folks call them). This is again very similar to Racket’s behaviour. However, we can mix two different styles of brackets. Just to clarify this further, the following three are identical:

let some_vec = init_vec![size; init_value];
let some_vec = init_vec!(size; init_value);
let some_vec = init_vec{size; init_value};

However, something like the following is illegal:

let some_vec = init_vec!{size; init_value];

Also, before showing the code, it is equally interesting to note that the rule also applies to the definition of the macro itself. Instead of flowery brackets, we could as well have chosen to use parentheses to wrap the rule in. In fact, that would even be my preferred style since Common Lisp has made me more or less immune to parentheses! (only half-joking).

Here is the macro and some test code for it:

/// Here we demonstrate a macro that takes in a size, and an init value
/// and returns a vector of that size initialised with the init value.

use std::fmt::Debug;
use std::fmt::{Display, Formatter, Result};

// the main man!
macro_rules! init_vec {
	( $size: expr; $init: expr ) => { {
			let mut temp_vec = Vec::new();

			for _ in 0..$size {
				temp_vec.push($init);
			}

			temp_vec
		};
	} // block necessary here
}

// a custom structure
#[derive(Debug)]
struct Foo {
	bar: f64,
}

impl Foo {
	fn new() -> Foo {
		Foo {
			bar: 0.0,
		}
	}
}

impl Display for Foo {
	fn fmt(&self, f: &mut Formatter) -> Result {
		write!(f, "{{ bar: {} }}", self.bar)
	}
}


fn print_vec<T: Debug + Display>(name: &str, vec: &Vec<T>) {
	println!("{} = {:?}", name, vec);

	print!("The contents are... ");

	for e in vec.iter() {
		print!("{} ", *e);
	}
	println!("\n");
}


fn main() {
	let int_vec = init_vec![10; 99];
	print_vec("int_vec", &int_vec);

	let str_vec = init_vec![5; "hello"];
	print_vec("str_vec", &str_vec);
	
	let mut foo_vec = init_vec![7; Foo::new()];
	print_vec("foo_vec", &foo_vec);

	// we can also use it like any normal vector
	foo_vec[0].bar = 100.0;
	foo_vec[3].bar = 2.71828;
	foo_vec[5].bar = 3.14159;

	println!("After mutation...\n");

	print_vec("foo_vec", &foo_vec);
}

Let’s take it for a quick run:

int_vec = [99, 99, 99, 99, 99, 99, 99, 99, 99, 99]
The contents are... 99 99 99 99 99 99 99 99 99 99 

str_vec = ["hello", "hello", "hello", "hello", "hello"]
The contents are... hello hello hello hello hello 

foo_vec = [Foo { bar: 0 }, Foo { bar: 0 }, Foo { bar: 0 }, Foo { bar: 0 }, Foo { bar: 0 }, Foo { bar: 0 }, Foo { bar: 0 }]
The contents are... { bar: 0 } { bar: 0 } { bar: 0 } { bar: 0 } { bar: 0 } { bar: 0 } { bar: 0 } 

After mutation...

foo_vec = [Foo { bar: 100 }, Foo { bar: 0 }, Foo { bar: 0 }, Foo { bar: 2.71828 }, Foo { bar: 0 }, Foo { bar: 3.14159 }, Foo { bar: 0 }]
The contents are... { bar: 100 } { bar: 0 } { bar: 0 } { bar: 2.71828 } { bar: 0 } { bar: 3.14159 } { bar: 0 } 

Excellent! So it all works pretty much as expected. The code probably merits a bit of explanation – in the macro, we simply define a rule which expects an expression ( $size: expr), followed by a semi-colon, followed by another expression ($init: expr).

Note that expr has a special meaning in Rust. It is what is called a “designator” and defines the type of the data being passed in. There are various such designators available (ident for identifiers. type for types, etc. Refer to the documentation for further details).

So the whole ( $size: expr; $init: expr ) denotes the pattern that, if matched, leads to the logic on the right-hand side of the =>. Now, the right hand-side is where all the action takes place. All we do there is to create a temporary vector, and then insert as many init values as size inside the vector, and finally return the vector. Since there are multiple expression/statements in the rule, we safely ensconce them within a block using the flowery brackets. Note that there is really no type-checking done inside to see if size is a valid numeric type. This may not be possible since macro-expansion should happen very early in the compilation phase. However, I might be wrong (as many times before!), and if I come across such means, I will update this blog accordingly.

So, it was a quick little fun exercise, and it makes Rust so much more appealing to me now especially since it’s been some time since I’ve played around with macros in Common Lisp. Maybe one of these days, eh?

A small vector initialisation macro in Rust

A simple demo of using channels in Rust

Continuing with my study of Rust, I ventured into concurrency using Rust at long last!

The standard library in Rust provides a few basic concurrency primitives, but of them all, my favourite has to be the support for channels. I decided to try out a small program where the factorial of a number is calculated in another thread, and the result piped back into the calling function.

The code is presented as follows:

/// Factorial using channel

use std::io;
use std::str::FromStr;
use std::thread;
use std::sync::mpsc::{channel, Sender};


fn factorial(tx: Sender<u64>, n: u64) {
    fn helper(acc: u64, n: u64) -> u64 {
        if n == 0 { acc } else { helper(acc * n, n - 1) }
    }

    thread::spawn(move || { tx.send(helper(1u64, n)).unwrap(); });
}


fn get_number() -> u64 {
    println!("Enter a number");

    let mut input = String::new();

    io::stdin()
        .read_line(&mut input)
        .expect("could not read input");

    match u64::from_str(input.trim()) {
        Ok(n) => n,
        Err(e) => panic!(e.to_string()),
    }
}


fn main() {
    let n = get_number();

    let (tx, rx) = channel();

    factorial(tx, n);

    println!("The factorial of {} is {}", n, rx.recv().unwrap());
}

Pretty straightforward code. We create a channel using std::sync::mspsc::channel, which returns a (Sender, Receiver) tuple. As the names suggest, Sender is used for transmitting data on the channel, and Receiver is used for receiving data on the same channel.

This approach also precludes the need for waiting on the thread explicitly.

Let’s take it for a spin:

bash-3.2$  rustc channel_factorial.rs && ./channel_factorial
Enter a number
20
The factorial of 20 is 2432902008176640000

Short and sweet!

The contrived example without using a channel might look something like so:

/// concurrent factorial using plain threads

use std::thread;
use std::sync::{Arc, Mutex};


fn factorial(n: u64) -> u64 {
    fn helper(acc: u64, n: u64) -> u64 {
        if n == 0 { acc } else { helper(acc * n, n - 1) }
    }

    helper(1u64, n)
}


fn main() {
    let data = Arc::new(Mutex::new(0u64));

    let data_clone = data.clone(); // bump the RC count

    let handle = thread::spawn(move || {
        let mut fact_cell = data_clone.lock().unwrap();
        *fact_cell = factorial(10);
    });

    match handle.join() {
        Ok(_) => println!("Factorial of 10 = {:?}", data),
        Err(e) => panic!(e),
    }
}

Not too bad, eh? The only difference is that we wrap the actual data inside a Mutex, which is itself wrapped inside an Arc object. This is to allow for shared mutable data using reference counting (since every object has exactly one owner at any point of time).

Inside the spawned thread, we can then get a lock on the cloned data object, and then we update the wrapped data with the factorial value. The problem with this approach is that it is not trivial to extract the value out. Taking it for a test run:

bash-3.2$ rustc concurrent_factorial.rs && ./concurrent_factorial
Factorial of 10 = Mutex { data: 3628800 }

As we can see, we do get the correct result, but unfortunately it’s wrapped up inside the mutex object, and getting it out is not easy since the API itself doesn’t seem to provide any way of doing so. Suggestions on StackOverflow seem to imply wrapping the data inside an Option object beforehand (as in something like

Arc<Mutex<Option<T>>>

), but that road didn’t lead me anywhere nice and safe either. I probably need to learn more about this approach in addition to perusing the documentation for a nice and elegant way out of this jumble of wrapped generic types (if there is one). No wonder I’d rather have the nice Go-esque channels!

UPDATE: Thanks to the good folks over at #rust, I got a simple and elegant solution for the unwrapping issue! Here is the update code:

/// concurrent factorial using plain threads

use std::thread;
use std::sync::{Arc, Mutex};


fn factorial(n: u64) -> u64 {
    fn helper(acc: u64, n: u64) -> u64 {
        if n == 0 { acc } else { helper(acc * n, n - 1) }
    }

    helper(1u64, n)
}


fn main() {
    let data = Arc::new(Mutex::new(0u64));

    let data_clone = data.clone(); // bump the RC count

    let handle = thread::spawn(move || {
        let mut fact_cell = data_clone.lock().unwrap();
        *fact_cell = factorial(10);
    });

    match handle.join() {
        Ok(_) => println!("Factorial of 10 = {:?}", *data.lock().unwrap()),
        Err(e) => panic!(e),
    }
}

The change is this: *data.lock().unwrap() inside the match in main. The explanation given was this: since the data itself is wrapped inside a Mutex within an Arc, it needs to be unwrapped in the reverse order – dereference the Arc to get the Mutex, and then dereference that to get the data! The unwrap is necessary since lock() returns a LockResult object.

The Rust community is indeed one of the most helpful communities on IRC. The problem which had me vexed for an hour or so was solved within minutes!

A simple demo of using channels in Rust

A simple Tree data structure in Haskell

This is part of my attempt to learn Haskell in a proper manner. I have been on and off learning Haskell for quite a few years now, always getting interrupted for some reason. This time, I decided to bite the bullet and get on with it!

As part of that, I started off with my favourite Haskell textbook, Professor Graham Hutton’s “Programming in Haskell”, 2nd Edition. This is arguably THE book that opened my eyes to Functional Programming in general. This book is almost perfect with one small flaw, in my opinion, of course. The bottom-up approach of solving problems is excellent, but it would have been rather nice if a small synopsis of the problem and the proposed approach were given at the beginning instead of having to read the whole code to understand it.

Having finished around half of that book, I realised that it was a bit short on examples, problems, and the downright dirty parts of Haskell, introducing IO only around the middle of the book. That’s when I decided to go for Allen and Moronuki’s book, “Haskell Programming from First Principles” to get up to speed on actual real-world programming in Haskell. The first few chapters are extremely basic (and I do find the tone a bit too prescriptive and pedantic, but that should suit a complete beginner just fine. For practice and hands-on though, it is a most excellent resource.

As part of my reboot of Haskell, I decided to implement a small Tree data structure in Haskell, all with the express aim of getting rid of the cobwebs. I can’t say that I’m too happy with my first attempt even though it clearly works more or less as expected. It feels very rough and creaky and the Null vs Leaf issue is pretty irritating as well. Well, after some more practice, I should be able to start iterating on better versions!

For now, here’s the code. It’s pretty much self-explanatory. The basic Tree type (single type parameter for now!), a helper function to create a Binary Search Tree (easier to confirm the output using preorder traversal), and the three standard binary tree traversal algorithms.

{- A binary tree implementation -}

data Tree a = Null | Leaf a | Node (Tree a) a (Tree a)  deriving Show


createNode :: a -> Tree a
createNode value = Leaf value 


addNode :: (Ord a) => Tree a -> a -> Tree a
addNode tree value = case tree of
                        Null -> createNode value
                        Leaf x -> if value <= x then
                                    Node (createNode value) x Null
                                  else
                                    Node Null x (createNode value)
                        Node l n r -> if value <= n then
                                        Node (addNode l value) n r
                                      else
                                        Node l n (addNode r value)

makeBST :: Ord a => [a] -> Tree a
makeBST [] = Null  
makeBST (n:ns) = addNode (makeBST ns) n
                     


-- inorder traversal (left, root, right)
inorder :: Tree a -> [a]
inorder tree = case tree of
                 Null -> []
                 Leaf x -> [x]
                 Node l n r -> inorder l ++ [n] ++  inorder r


-- preorder traversal (root, left, right)
preorder :: Tree a -> [a]
preorder tree = case tree of
                  Null -> []
                  Leaf x -> [x]
                  Node l n r -> [n] ++ preorder l ++ preorder r


-- postorder traversal (left, right, root)
postorder :: Tree a -> [a]
postorder tree = case tree of
                   Null -> []
                   Leaf x -> [x]
                   Node l n r -> postorder l ++ postorder r ++ [n]

A sample run:

*Main> let bst = makeBST [10,18,12,7,10,11,3,4,4,-2,-3,100]
*Main> bst
Node (Node Null (-3) (Node Null (-2) (Node (Node (Leaf 3) 4 Null) 4 (Node (Node (Node Null 7 (Leaf 10)) 10 Null) 11 (Node Null 12 (Leaf 18)))))) 100 Null
*Main> inorder bst
[-3,-2,3,4,4,7,10,10,11,12,18,100]
*Main> preorder bst
[100,-3,-2,4,4,3,11,10,7,10,12,18]
*Main> postorder bst
[3,4,10,7,10,18,12,11,4,-2,-3,100]
*Main> 

Ah well, it’s quite exciting to see something working as expected, isn’t it? Well, onwards then with Haskell (and then on to Agda and Idris, which I am very much interested in for edificational purposes!)

A simple Tree data structure in Haskell

A Line number closure in Rust, and an interesting battle with the Rust compiler

It had been a while since I’d played around with Rust, so I decided to implement one of my two favourite programs when learning new languages viz., a line number closure (with mutation, of course) as shown by Hoyte in his seminal book, “Let Over Lambda”. The other program is, of course, implementing a functioning Rational number class (as in the previous blog post, for instance).

Well, with Rust, it’s a very…interesting experience to say the least. The Rust Borrow Checker is a very strange beast indeed. So, let’s have a look at how we can arrive at a working version of the program.

Background

Rust does have powerful support for closures (what respectable language does not today?). However, the interesting part is that closures in Rust are simply syntactic sugar for traits. To be more precise, there are three available traits that a closure can belong to:

  • the Fn trait – this is the least permissive of all, and it allows a read-only closure (in effect). This is ideal when we want to simply have a self-contained immutable closure. For instance,
    fn square<F>(n: i32, callback: F)
        where
        F: Fn(i64) ->() {
        callback(n as i64 * n as i64);
    }
    
    fn main() {
        square(100, |n| {  println!("Received {}", n); });
    }
    

    This duly prints out:

    Received: 10000
    

    This is quite straightforward. As we can see, The type of the callback, which is the closure, is Fn(i64) -> (), which means that it takes a 64-bit integer and does some side-effect (printing to screen in this case). No mutation anywhere, no capturing of environments, and so the Fn trait works superbly here.

  • the FnMut trait is more interesting. This is needed when we need to mutate the environment captured by the closure. As a small example:
    fn process_mutating_closure<F>(mut closure: F)
        where
        F: FnMut(i32) -> () {
        closure(100);
    }
    
    fn main() {
        let mut local = 100;
    
        process_mutating_closure(|x| { 
                  local += 100; 
                  println!("Value = {}", x + local); 
        });
    
    

    which produces:

    Value = 300
    

    All right, this is a bit more complicated than the previous version. So, let’s break it down. What we intend to do is this – the closure that’s created in the main function captures a mutable variable, local, in the enclosing lexical scope. The closure then modifies this captured variable, and uses it for some dummy processing before printing the final value out.

    Since a captured variable is being modified, the type of the closure will be FnMut. This is exactly what’s reflected in the type signature when the closure is passed as an argument to process_mutating_closure.

  • , and finally, the FnOnce trait is the most restrictive of all. It basically transfers ownership of the capture to the closure itself. If you are not familiar with Rust’s wonderful (sarcasm) Ownership and Borrowing world, consider yourself lucky.
    fn move_closure_demo<F: FnOnce(i32) -> ()>(closure: F) {
        closure(100)
    }
    
    fn main() {
        let mut local = 100;
    
        move_closure_demo(move |x| {
            local += 100;
            println!("Result = {}", x + local);
        });
    
        println!("{}", local);
    }
    
    Result = 300
    

    Eh? This code looks almost identical to the previous one (except for the change in the function signature of the closure from FnMut to FnOnce and the move in front of the closure). Well, if you notice, you will also see that the function var, closure in the function move_closure_demo also does not require mut unlike in the previous case.

    What’s happening here is that because we have “moved” the closure, the closure is now the owner of the captured variable, local. Wait a minute, so how come we can still use local in main after it’s been moved? It turns out that because the binding of local is to a Copy-able type (i32 in this case), a copy of the binding’s value is actually moved. However, this won’t work for types that do not implement the Copy trait. Confused? Well, you should be. I mean, wtf?

All right, now let’s get on with the meat of the actual matter.

First attempt

Let’s put our knowledge of how closures are actually syntactic sugar for the aforementioned trifecta of traits to use. Here’s the first shot:

fn gen_line_number() -> (FnMut() -> ()) {
    let mut line_no = 0;

    || {
        line_no += 1;

        println!("Current line number: {}", line_no);
    }
}

fn main() {
    let foo = gen_line_number();

    for _ in 0..5 {
        foo();
    }
    
    println!("");

    let bar = gen_line_number();

    for _ in 0..5 {
        bar();
    }
}

Since we’re modifying the capture var, line_no, we need the closure to be of the trait type, FnMut. Looks okay, let’s take it for a spin:

bash-3.2$ rustc hoyte-blog.rs 
error[E0277]: the trait bound `std::ops::FnMut() + 'static: std::marker::Sized` is not satisfied
 --> hoyte-blog.rs:1:26
  |
1 | fn gen_line_number() -> (FnMut() -> ()) {
  |                          ^^^^^^^^^^^^^ the trait `std::marker::Sized` is not implemented for `std::ops::FnMut() + 'static`
  |
  = note: `std::ops::FnMut() + 'static` does not have a constant size known at compile-time
  = note: the return type of a function must have a statically known size

error[E0308]: mismatched types
 --> hoyte-blog.rs:4:5
  |
4 |     || {
  |     ^ expected trait std::ops::FnMut, found closure
  |
  = note: expected type `std::ops::FnMut() + 'static`
  = note:    found type `[closure@hoyte-blog.rs:4:5: 8:6 line_no:_]`

error[E0277]: the trait bound `std::ops::FnMut(): std::marker::Sized` is not satisfied
  --> hoyte-blog.rs:12:9
   |
12 |     let foo = gen_line_number();
   |         ^^^ the trait `std::marker::Sized` is not implemented for `std::ops::FnMut()`
   |
   = note: `std::ops::FnMut()` does not have a constant size known at compile-time
   = note: all local variables must have a statically known size

error[E0277]: the trait bound `std::ops::FnMut(): std::marker::Sized` is not satisfied
  --> hoyte-blog.rs:20:9
   |
20 |     let bar = gen_line_number();
   |         ^^^ the trait `std::marker::Sized` is not implemented for `std::ops::FnMut()`
   |
   = note: `std::ops::FnMut()` does not have a constant size known at compile-time
   = note: all local variables must have a statically known size

error: aborting due to 4 previous errors

Whoa! Okay, let’s read the error messages in order. Maybe that will help us fix the issue better. Right, so the first one says something about std::marker::Sized not being implemented… blah blah blah. What all this verbosity actually means is that we are returning a closure from the function, and it needs to know the size of the closure at compile time itself.

Fair enough. However, why does it work when we’re passing a closure into a function the same way (instead of returning the closure from it)? Who the hell really knows? Well, the basic problem here (as I figure it) is that we need to wrap the whole damned thing inside a heap pointer (aka the Box type) since we’re returning from the function, and the stack frame for that function is going to get de-allocated, we need to return a heap pointer to the object so that it can be used even after the current function frame’s destroyed.

This is the reason why code like the following works because the callback function, closure cannot outlive the function where the closure was generated (main):

fn foo<F: Fn() -> ()>(closure: F) {
    closure();
}

fn main() {
    foo(|| { println!("Hello, world!");});
}
Hello, world!

Okay, let’s proceed.

Second Attempt

Having gained that great insight from the first version, here’s how the second version looks like:

fn gen_line_number() -> Box<FnOnce() -> ()> {
    let mut line_no = 0;

    Box::new(move || {
        line_no += 1;

        println!("Current line number: {}", line_no);
    })
}

fn main() {
    let foo = gen_line_number();

    for _ in 0..5 {
        foo();
    }
    
    println!("");

    let bar = gen_line_number();

    for _ in 0..5 {
        bar();
    }
}

and…

bash-3.2$ rustc hoyte-blog.rs 
error[E0161]: cannot move a value of type std::ops::FnOnce(): the size of std::ops::FnOnce() cannot be statically determined
  --> hoyte-blog.rs:15:9
   |
15 |         foo();
   |         ^^^

error[E0161]: cannot move a value of type std::ops::FnOnce(): the size of std::ops::FnOnce() cannot be statically determined
  --> hoyte-blog.rs:23:9
   |
23 |         bar();
   |         ^^^

error[E0382]: use of moved value: `*foo`
  --> hoyte-blog.rs:15:9
   |
15 |         foo();
   |         ^^^ value moved here in previous iteration of loop
   |
   = note: move occurs because `*foo` has type `std::ops::FnOnce()`, which does not implement the `Copy` trait

error[E0382]: use of moved value: `*bar`
  --> hoyte-blog.rs:23:9
   |
23 |         bar();
   |         ^^^ value moved here in previous iteration of loop
   |
   = note: move occurs because `*bar` has type `std::ops::FnOnce()`, which does not implement the `Copy` trait

error: aborting due to 4 previous errors

Sigh. Just when things were looking up… theoretically. All right, before we lose our sanity, let’s read all the bloody error messages and try to make some sense of them. Okay, ignoring the first two about statically size something, we see that the other errors are complaining about foo and bar being already moved in the previous iteration of the loop. At least that makes some sort of sense – since we have the closure passed to main and assigned to foo and bar, and, as the error message says, FnOnce does not implement the Copy trait, we cannot change the owner of the closure in subsequent iterations of the for loop.

Let’s rectify that by using the FnMut trait instead, which is actually what we want since we are not using any ownership changes, and we’re mutating the captured local variable, it sounds like the perfect candidate for our use case.

In case you’re wondering about modifying the main function instead to use references inside the for loops, I assure you that it’s not an oversight on my part – down that road lies perdition! (Reflect upon the fact that I have a working copy, and I’m actually working backwards through some sane progression amongst the many that I had to endure along the way!)

Third attempt

Here’s the version with the closure type changed to FnMut:

fn gen_line_number() -> Box<FnMut() -> ()> {
    let mut line_no = 0;

    Box::new(move || {
        line_no += 1;

        println!("Current line number: {}", line_no);
    })
}

fn main() {
    let foo = gen_line_number();

    for _ in 0..5 {
        foo();
    }
    
    println!("");

    let bar = gen_line_number();

    for _ in 0..5 {
        bar();
    }
}

And yes, you guessed it, it simply doesn’t work, but you’ll love the error messages:

bash-3.2$ rustc hoyte-blog.rs 
error: cannot borrow immutable `Box` content `*foo` as mutable
  --> hoyte-blog.rs:15:9
   |
15 |         foo();
   |         ^^^

error: cannot borrow immutable `Box` content `*bar` as mutable
  --> hoyte-blog.rs:23:9
   |
23 |         bar();
   |         ^^^

error: aborting due to 2 previous errors

Hmmm… hmmm? It mentions that when I call foo and bar, it’s trying to make something mutable that’s actually immutable. Eh? Okay, let’s see. The local captured variable was “moved” to live beyond the function’s stack frame, so that’s fine. I’m using FnMut so that I can actually mutate the captured variable, that’s fine. I’m wrapping the whole damned thing inside a Box so that the whole closure can persist beyond the function call, and finally, all I’m trying to do is to increment the mutable captured line_no var inside my for loops. That also sounds fine. So when did the contents of the box become immutable?

All right, let’s see the final version first.

Final version

fn gen_line_number() -> Box<FnMut() -> ()> {
    let mut line_no = 0;

    Box::new(move || {
        line_no += 1;

        println!("Current line number: {}", line_no);
    })
}

fn main() {
    let mut foo = gen_line_number();

    for _ in 0..5 {
        foo();
    }
    
    println!("");

    let mut bar = gen_line_number();

    for _ in 0..5 {
        bar();
    }
}

and running it,

bash-3.2$ rustc hoyte-blog.rs 
bash-3.2$ ./hoyte-blog 
Current line number: 1
Current line number: 2
Current line number: 3
Current line number: 4
Current line number: 5

Current line number: 1
Current line number: 2
Current line number: 3
Current line number: 4
Current line number: 5

Wow! Glory at last! Heh. Well, if you have the eyes of a hawk, you’ll observe that the only change was adding the mut qualifier to the closure handles in main. So what that extremely useful, intuitive, and helpful error message was telling us is that we have a boxed closure containing a captured mutable local var which we’re updating in every iteration of the respective for loops, but unfortunately we have assigned the closures to immutable handles, and so the whole thing just won’t work.

Whew. And seriously, wtf?

Rust appears to be great fun when things go well, but when things go badly (and they will, of course), it can be quite a PITA. Granted, I am not an expert or even an intermediate Rust programmer, but perhaps people’s complaints about the learning curve being seriously fucked up aren’t really exaggerated. If the learning curve had been at least logical, it would be a different matter. As things stand though, the more I venture into the language, the more “patched-together” it appears to be. Also, all the various exceptions to the rules in addition to the myriad rules about exceptional scenarios aren’t really helping, folks. I doubt anything can really be done about it now, though. Heh.

A Line number closure in Rust, and an interesting battle with the Rust compiler