Demoing a simple module in Java 9

In the spirit of the upcoming Java 9 release, I decided to port my D version of the “Guessing game” into Java, but with a twist. I thought that it would be a nice touch to develop the game into a nice module, and turn the process of developing it into an approachable starter tutorial for creating, using, and most importantly, understanding the basics of Java 9 modules.

Background

One of the biggest changes in Java 9 is the introduction of support for modules into the language itself. The rationale and reasoning behind it is very well documented on the OpenJDK site under JEP261.

However, if you find that heavy reading, a wonderfully succinct and clear Quick Start
is also provided for developers who want to simply get on with it. This post is based on that tutorial.

Before we get started with the actual game, I just wanted to ruminate a bit on why I think it’s a welcome idea that modules are being introduced into the language itself. A lot of people conflate packages in Java with modularity. Well, in a sense that does make sense. However, I like to view it as more of a namespacing mechanism than anything else.

Modularity, by definition, would imply a stronger separation of concerns – classes and interfaces wrapped in packages wrapped within modules. Java 9 modules achieve precisely that – they control which packages (and thereby which classes) are visible outside the module itself. Whereas previously we could just toss any old JAR file into the classpath and load classes at will, now we can only do so if the module they reside in explicitly expose them. Very nice!

The game

The game is dead simple – the program generates a secret number in the range [1, 100], and it’s the player’s job to guess the number.

The setup

Okay, so let’s start off by creating a directory for our game module.

Macushla:modules z0ltan$ mkdir guessing-game
Macushla:modules z0ltan$ cd guessing-game

Now, let’s create a directory called mods where we will store the generated module, and also a directory called src (following convention, not strictly enforced) which will hold the module source code:

Macushla:guessing-game z0ltan$ mkdir mods
Macushla:guessing-game z0ltan$ mkdir src
Macushla:guessing-game z0ltan$ tree .
.
├── mods
└── src

2 directories, 0 files

The naming convention for modules is identical to that of packages (com.foo.bar), so pay close attention to the next few steps. Inside the src directory, let us create a directory named com.guessing (note: it is a single directory, not nested
directories). Also, inside the com.guessing directory, let us create a file called module-info.java:

Macushla:guessing-game z0ltan$ mkdir src/com.guessing
Macushla:guessing-game z0ltan$ touch src/com.guessing/module-info.java
Macushla:guessing-game z0ltan$ tree .
.
├── mods
└── src
    └── com.guessing
        └── module-info.java

3 directories, 1 file

So here’s how it works – the directory com.guessing represents the module that we are creating for the game, and it is signified by the module declaration that we will store inside module-info.java. The casing and hyphenation might throw you off a bit, but that’s how Java knows that you have declared a module, so make sure to get the name right!

Right, so let’s go on ahead and write up the module declaration inside module-info.java. Open up the file in your favourite text editor, and then enter the following metadata inside it:

 module com.guessing {
    exports com.my.guessing.game;
 }

Let’s take a moment to understand this bit – the module keyword is how you introduce a module declaration. So we’re declaring here that com.guessing is a module, and that we are exposing the package structure, com.my.guessing.game
from this module. What does this mean? Well, it means that any other module that wishes to use our module can only access the classes that are in this package structure. This is how modules implement real modularity where only the code that needs to be visible to the outside world is made so.

Since we have declared that we are exposing the package com.my.guessing.game, it follows that we should have that package structure in our src directory as well. Let’s go ahead and
create it:

Macushla:guessing-game z0ltan$ mkdir -p src/com.guessing/com/my/guessing/game

And let’s create out single source file named TheGame.java inside the same location:

Macushla:guessing-game z0ltan$ touch src/com.guessing/com/my/guessing/game/TheGame.java

So here’s how the project tree looks at this stage:

Macushla:guessing-game z0ltan$ tree .
.
├── mods
└── src
    └── com.guessing
        ├── com
        │   └── my
        │       └── guessing
        │           └── game
        │               └── TheGame.java
        └── module-info.java
        

Excellent! So, let’s fill out the code for the game now, inside TheGame.java

package com.my.guessing.game;

import java.util.Random;
import java.util.Scanner;

public class TheGame {
	private static Random rand;

	public TheGame() {
		rand = new Random();
	}

	public static void main(String[] args) {
		TheGame game = new TheGame();
	
		game.play();
	}

	private void play() {
		try{
			Scanner in = new Scanner(System.in);		

			System.out.println("Welcome to the Guessing Game!");

			// secret number in the range [1, 100]
			int mySecretNumber = rand.nextInt(100) + 1;
			int attempts = 0;

			while (true) {
				attempts++;

				System.out.println("Enter your guess (between 1 and 100 inclusive)");
				int guess = in.nextInt();

				if (guess < mySecretNumber) {
					System.out.println("Too small!");
				} else if (guess == mySecretNumber) {
					System.out.format("You win! You took %d guesses!\n", attempts);
				} else {
					System.out.println("Too big!");
				}
			}
		} catch (Exception ex) {
			System.err.println("Oh no! Something went terribly wrong. Aborting game!");
			System.exit(1);
		}
	}
}

Nice! So now all we have left to do is to compile the game and run it. Ah, but there remains one more step before we can actually do that. Remember the mods directory we had created in the first step? That directory is meant to hold all the modules that we would generate for our project (there can be as many modules as you like in your project, with various modules dependent on various other modules). In our game, we have only one module,
com.guessing. To hold the contents of this module, we need to create the corresponding module directory in the mods directory, like so:

Macushla:guessing-game z0ltan$ mkdir mods/com.guessing
Macushla:guessing-game z0ltan$ tree .
.
├── mods
│   └── com.guessing
└── src
    └── com.guessing
        ├── com
        │   └── my
        │       └── guessing
        │           └── game
        │               └── TheGame.java
        └── module-info.java

8 directories, 2 files

Excellent! Let’s go ahead and compile the game:

Macushla:guessing-game z0ltan$ javac --module-path=mods -d mods/com.guessing \
  src/com.guessing/module-info.java src/com.guessing/com/my/guessing/game/TheGame.java 
Macushla:guessing-game z0ltan$ tree .
.
├── mods
│   └── com.guessing
│       ├── com
│       │   └── my
│       │       └── guessing
│       │           └── game
│       │               └── TheGame.class
│       └── module-info.class
└── src
    └── com.guessing
        ├── com
        │   └── my
        │       └── guessing
        │           └── game
        │               └── TheGame.java
        └── module-info.java

12 directories, 4 files

Okay, the compilation command requires a bit of explanation – the --module-path option is used to indicate the root directory (or directories) where the module (or modules) is to be stored. In our case, that is the mods directory. The -d flag indicates where the generated class files are to be stored. In our case, this is the mods/com.guessing
directory. Finally, we link in all the source files that need to be compiled, and this includes the module-info.java metadata file as well. Note that all this should be automatically done once IDE support is widely available for Java 9 (IntelliJ apparently has some support for Java 9 already,
and Eclipse also has an extra plugin that supports Java 9).

Also observe the contents of the mods directory. It’s an exact replica of the src directory albeit containing the generated classes.

As a final step, let’s run the game! This is how you would run it from the command-line:

Macushla:guessing-game z0ltan$ java --module-path mods -m com.guessing/com.my.guessing.game.TheGame
Welcome to the Guessing Game!
Enter your guess (between 1 and 100 inclusive)
50
Too small!
Enter your guess (between 1 and 100 inclusive)
75
Too big!
Enter your guess (between 1 and 100 inclusive)
63
Too big!
Enter your guess (between 1 and 100 inclusive)
56
You win! You took 4 guesses!

Again, a bit of commentary on the run command – we again use the --module-path option to tell Java where our module resides, and then we pass the actual fully-qualified name of the class to the -m flag, which tells Java which module we are interested in running. Note that there are various short forms (for instance, we can use
-p> instead of --module-path in this case, refer to the documentation for all the flags and options).

And we’re done!

Extras

One additional interesting feature in Java 9 is that we can bundle our game into a nice little app. To do that, we need to understand that Java 9 actually comes with all its functionality divided into modules. For instance, all core functionality resides in the java.base module. Also note that this module is automatically assumed to be used by all custom modules. For instance, the module declaration that we used for this game could very well have been written explicitly as:

module com.guessing {
   requires java.base;
   exports com.guessing.my.game;
}

where the requires module is the opposite of exports. Beware though that having the requires declaration is not enough to be able to use classes from that module. It only makes the exported packages in that module visible in the require-ing module. That means that in our code, we should still have explicit importS for those public packages. For instance, if we want to find out which modules are present
by default, we can view them so:

Macushla:guessing-game z0ltan$ java --list-modules
java.activation@9-ea
java.annotations.common@9-ea
java.base@9-ea
java.compact1@9-ea
java.compact2@9-ea
java.compact3@9-ea
java.compiler@9-ea
java.corba@9-ea
java.datatransfer@9-ea
java.desktop@9-ea
java.httpclient@9-ea
java.instrument@9-ea
java.jnlp@9-ea
java.logging@9-ea
java.management@9-ea
java.naming@9-ea
java.prefs@9-ea
java.rmi@9-ea
java.scripting@9-ea
java.se@9-ea
java.se.ee@9-ea
java.security.jgss@9-ea
java.security.sasl@9-ea
java.smartcardio@9-ea
java.sql@9-ea
java.sql.rowset@9-ea
... (content elided due to space contraints)

All the modules reside in the $JAVA_HOME/jmods directory, but you might be surprised if you check the contents – they won’t correspond to the output we saw above. The reason for this is the modules that come bundled with the JDK are in “jmod” form, which is a compiled format. The jmod tool comes bundled with the JDK, and can be used to create a jmod binary for our custom modules as well. The terminology for the module format that we have used here is “exploded module”. This JEP contains all the gruesomely interesting details.

Right, so where are we going with this? Ah yes, we wanted to bundle our game into a custom app. Let’s do that (but all this jmod discussion will come in handy later on)!

First off, let’s set our JAVA_HOME environment variable for convenience. Note, this section describes how to set it for macOS users only. For windows users, consult your documentation!

On macOS, one problem is that finding the paths of various installed tools can be a real PITA. Thankfully, we have a bundled tool called java_home to help us out:

Macushla:guessing-game z0ltan$ /usr/libexec/java_home -V
Matching Java Virtual Machines (3):
    9, x86_64:	"Java SE 9-ea"	/Library/Java/JavaVirtualMachines/jdk-9.jdk/Contents/Home
    1.8.0_112, x86_64:	"Java SE 8"	/Library/Java/JavaVirtualMachines/jdk1.8.0_112.jdk/Contents/Home
    1.8.0_45, x86_64:	"Java SE 8"	/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home

/Library/Java/JavaVirtualMachines/jdk-9.jdk/Contents/Home

Okay, so on my machine, I have three different versions of the JDK installed. Clearly, I need to use the Java 9 version, so let’s set the environment variable to use that:

Macushla:guessing-game z0ltan$ export JAVA_HOME="$(/usr/libexec/java_home -v 9)"
Macushla:guessing-game z0ltan$ echo $JAVA_HOME
/Library/Java/JavaVirtualMachines/jdk-9.jdk/Contents/Home

Excellent. Now, let’s go ahead and create our app. For this, we use another new tool that comes bundled with the JDK – jlink. As the name suggests, it is a linker, and we use this linker to link our module into Java, and create a custom version of the JVM that contains only our dependencies (in addition to the mandatory java.base module):

Macushla:guessing-game z0ltan$ jlink --module-path $JAVA_HOME/jmods:mlib --add-modules mods/com.guessing \
> --output my_guessing_app
Error: Module com.guessing not found

Module not found? How is it possible? We specified the fully-qualified module name, mods/com.guessing, didn’t we? Well, the reason jlink failed is that it accepts modules in only two formats – either JMOD (as discussed above), or as a JAR file containing our “exploded” module. So before we can bundle our app, there remains just one more step – bundling our app as a JAR or as JMOD. Let’s go for the jAR file
option:


Macushla:guessing-game z0ltan$ jar --create --file=guessing_game@0.0.1.jar \
  --module-version=0.0.1 --main-class=com.my.guessing.game.TheGame -C mods/com.guessing .

Note how we have to specify the module version explicitly. Java 9 modules do not (as yet) support automatic versioning, nor is versioning checked or enforced (as far as I can tell).

And now we can check that the JAR file contains a valid module by passing the --print-module-descriptor
flag to the jar command:

Macushla:guessing-game z0ltan$ jar --print-module-descriptor guessing_game\@0.0.1.jar 
^CMacushla:guessing-game z0ltan$ jar --print-module-descriptor --file guessing_game\@0.0.1.jar 

com.guessing@0.0.1
  requires mandated java.base
  exports com.my.guessing.game
  main-class com.my.guessing.game.TheGame

So it looks like all is in order. Let’s try creating the app now!

Macushla:guessing-game z0ltan$ jlink --module-path $JAVA_HOME/jmods:mlib --add-modules mods/com.guessing/ \
> --output my_guessing_game
Error: Module mods/com.guessing/ not found
Macushla:guessing-game z0ltan$ jlink --module-path $JAVA_HOME/jmods:mlib --add-modules com.guessing/ \
   --output my_guessing_game
Error: Module com.guessing/ not found

What gives? Note that the official convention is to create a directory to contain the JAR or JMOD file, called mlib. The modules that come bundled with the JDK all follow the same convention as well.

So let’s create a directory called mlib in our project root directory:

Macushla:guessing-game z0ltan$ jar --create --file=mlib/guessing_game@0.0.1.jar \
  --module-version=0.0.1 --main-class=com.my.guessing.game.TheGame \
   -C mods/com.guessing .

So the final directory structure looks like so:

Macushla:guessing-game z0ltan$ tree .
.
├── mlib
│   └── guessing_game@0.0.1.jar
├── mods
│   └── com.guessing
│       ├── com
│       │   └── my
│       │       └── guessing
│       │           └── game
│       │               └── TheGame.class
│       └── module-info.class
└── src
    └── com.guessing
        ├── com
        │   └── my
        │       └── guessing
        │           └── game
        │               └── TheGame.java
        └── module-info.java

13 directories, 5 files

The final step is to run the same command that we had tried before:

Macushla:guessing-game z0ltan$ jlink --module-path $JAVA_HOME/jmods:mlib --add-modules com.guessing --output my_guessing_game
Macushla:guessing-game z0ltan$ ls
mlib             mods             my_guessing_game src

Excellent! So we have a directory called my_guessing_game created. Let’s check its
contents:

Macushla:guessing-game z0ltan$ tree my_guessing_game/
my_guessing_game/
├── bin
│   ├── com.guessing
│   ├── java
│   └── keytool
├── conf
│   ├── net.properties
│   └── security
│       ├── java.policy
│       ├── java.security
│       └── policy
│           ├── README.txt
│           ├── limited
│           │   ├── default_US_export.policy
│           │   ├── default_local.policy
│           │   └── exempt_local.policy
│           └── unlimited
│               ├── default_US_export.policy
│               └── default_local.policy
├── lib
│   ├── jli
│   │   └── libjli.dylib
│   ├── jspawnhelper
│   ├── jvm.cfg
│   ├── libjava.dylib
│   ├── libjimage.dylib
│   ├── libjsig.dylib
│   ├── libnet.dylib
│   ├── libnio.dylib
│   ├── libosxsecurity.dylib
│   ├── libverify.dylib
│   ├── libzip.dylib
│   ├── modules
│   ├── security
│   │   ├── blacklist
│   │   ├── blacklisted.certs
│   │   ├── cacerts
│   │   ├── default.policy
│   │   └── trusted.libraries
│   ├── server
│   │   ├── Xusage.txt
│   │   ├── libjsig.dylib
│   │   └── libjvm.dylib
│   └── tzdb.dat
└── release

10 directories, 34 files

Whoa! That’s a lot of files and directories. The good part is that it is now a full-fledged, self-contained Java app containing our game!

Let’s take it for a spin!

Macushla:guessing-game z0ltan$ cd my_guessing_game/
Macushla:my_guessing_game z0ltan$ ls
bin     conf    lib     release
Macushla:my_guessing_game z0ltan$ bin/
com.guessing  java          keytool       
Macushla:my_guessing_game z0ltan$ bin/com.guessing 
Welcome to the Guessing Game!
Enter your guess (between 1 and 100 inclusive)
50
Too big!
Enter your guess (between 1 and 100 inclusive)
25
Too small!
Enter your guess (between 1 and 100 inclusive)
38
Too small!
Enter your guess (between 1 and 100 inclusive)
44
Too small!
Enter your guess (between 1 and 100 inclusive)
47
Too big!
Enter your guess (between 1 and 100 inclusive)
46
Too big!
Enter your guess (between 1 and 100 inclusive)
45
You win! You took 7 guesses!

Noiiiice! However, before you get too excited, note that this is not AOT (Ahead of time compilation)! bin/com.guessing is simply a script file that uses the bundled JVM instance to run the app. We can verify this by checking the contents of com.guessing:

Macushla:bin z0ltan$ cat com.guessing 
#!/bin/sh
JLINK_VM_OPTIONS=
DIR=`dirname $0`
$DIR/java $JLINK_VM_OPTIONS -m com.guessing/com.my.guessing.game.TheGame $@

Still, pretty nice, innit? We have come a long way from your grandfather’s Java, my friends.

One last interesting observation before we collapse from mental overload! Recall the command we had used to check the modules bundled in the default JDK, java --list-modules? Well, let’s try out the same command on the bundled JVM instance in the bin
directory of our app:

Macushla:my_guessing_game z0ltan$ bin/java --list-modules
com.guessing@0.0.1
java.base@9-ea

Heh! As we expected, since our project depends only on two modules, com.guessing, which is our game, and on the mandatory module, java.base, we see that this stripped down JVM only has these two modules, which leads to a very small overall app size as well (if we had dependencies on other modules, then those would be transitively included
as well):

 Macushla:my_guessing_game z0ltan$ cd ..
Macushla:guessing-game z0ltan$ ls
mlib             mods             my_guessing_game src
Macushla:guessing-game z0ltan$ du -m my_guessing_game/
1	my_guessing_game//bin
1	my_guessing_game//conf/security/policy/limited
1	my_guessing_game//conf/security/policy/unlimited
1	my_guessing_game//conf/security/policy
1	my_guessing_game//conf/security
1	my_guessing_game//conf
1	my_guessing_game//lib/jli
1	my_guessing_game//lib/security
13	my_guessing_game//lib/server
31	my_guessing_game//lib
31	my_guessing_game/
Macushla:guessing-game z0ltan$

A mere 31MB with our bundled JVM! Ain’t that nice? Time for a nice beer now!

Demoing a simple module in Java 9

A simple guessing game in D

This short post is inspired by the recent spate of languages introducing themselves through a small game program. As such, here is a version of the guessing game in D.

The objective of the game is very simple – there is a secret number randomly generated per game (in the range [1, 100] for simplicity), and the user has to guess the number to exit the game.

I also wanted to get a bit more practice with the de facto package cum dependency manager for D, dub.

Let’s start off by creating the project:

Macushla:MiniProjects z0ltan$ dub init guessing_game
Package recipe format (sdl/json) [json]: json
Name [guessing_game]: 
Description [A minimal D application.]: A simple guessing game in D
Author name [Timmy Jose]: 
License [proprietary]: MIT
Copyright string [Copyright © 2017, Timmy Jose]: 
Add dependency (leave empty to skip) []: 
Successfully created an empty project in '/Users/z0ltan/Rabota/ProgrammingLanguages/D/MiniProjects/guessing_game'.
Package successfully created in guessing_game

By default, dub also creates a .gitignore file in the root directory of the new project/package.

Here’s how the layout of the default configuration looks like:

Macushla:MiniProjects z0ltan$ cd guessing_game/
Macushla:guessing_game z0ltan$ tree .
.
├── dub.json
└── source
    ├── app.d

1 directory, 2 files

The dub.json file contains the basic configuration metadata for the project, and is for consumption by dub. Its initial contents are the data entered during the creation of the project.

A sample file, app.d is created inside the source directory. These can all be customised, of course.

Okay, so let’s create a new file to contain our guessing game code as a module.

Macushla:MiniProjects z0ltan$ touch source/guessing_game.d

All right, time to code up the game. Here are the contents of source/guessing_game.d

module guessing_game;

import std.stdio;

public void playGame() 
{
	import std.random: uniform;

	immutable uint secretNumber = uniform(1, 101);
	uint guesses = 0;

	writeln("\nLet's play a guessing game. Your task is to guess the secret number (between 1 and 100, inclusive)...");

	import std.conv: ConvException;

	while (true) {
		try {
			uint guess = getNumber("Enter your guess: ");

			++guesses;
			if (guess == secretNumber) {
				writefln("You win! You took %s guesses.", guesses);
				break;
			} else if (guess < secretNumber) {
				writeln("Too small!");
			} else {
				writeln("Too big!");
			}
		} catch (ConvException ex) {
			writeln("Did not get valid input. Try again...");
		}
	}
}


uint getNumber(string prompt) 
{
	writeln(prompt);

	import std.conv: to;
	import std.string: chomp;
		
	return readln().chomp.to!uint;
}

As you can see, this program is trivially readable for anyone who’s worked with C, C++, or indeed Java. The interesting bits to note would be the module system, as well as selective imports of module’s functions inside local lexical scopes.
Thankfully, D comes with a built-in module, std.random in its standard library (Phobos), and we can use the uniform function to generate our random number.

Another interesting bit to note is the following snippet of code:

return readln().chomp.to!uint;

The to! is a template function that is extremely powerful. It can be used to convert almost any type to any other type. In case the conversion fails though, this template function will throw a std.conv.ConvException exception. Note that D does not have checked exceptions (in Java-speak), only unchecked exceptions.

Also note the exception handling mechanism, which is extremely similar to that of Java.

Finally, observe the declaration of playGame as public. The helper function getNumber in source/guessing_game.d, by contrast is private, and not accessible outside the module in which it is defined.

Okay, so this is the guessing game logic. Now let’s take a look at our game’s entry point, source/app.d

import guessing_game;

void main()
{
	guessing_game.playGame();	
}

All we do is import the guessing_game module and then invoke playGame. Note that we could also have written this module like so:

import guessing_game: playGame;

void main() 
{
      playGame();
}     

Let’s try out the game!

Macushla:guessing_game z0ltan$ dub run
Performing "debug" build using dmd for x86_64.
guessing_game ~master: building configuration "application"...
Linking...
Running ./guessing_game 

Let's play a guessing game. Your task is to guess the secret number (between 1 and 100, inclusive)...
Enter your guess: 
50
Too small!
Enter your guess: 
hello
Did not get valid input. Try again...
Enter your guess: 
-1
Did not get valid input. Try again...
Enter your guess: 
75
Too small!
Enter your guess: 
88
Too big!
Enter your guess: 
81
Too big!
Enter your guess: 
78
You win! You took 5 guesses.
A simple guessing game in D

A simple letn macro in Racket

A long time back I had written a small blog about a ‘letn’ macro in Common Lisp (check it out here). Of late, I have been venturing deeper and deeper into Racket, and I am starting to like the language, and more importantly, the ecosystem more and more, especially with the express purpose of implementing languages and expanding my understanding of programming language theory.

This is why I decided to give the letn macro a go in Racket this time. Of course, this may by no means be the best way to implement it, but it was an enjoyable exercise all the same. The basic idea is to transform a form like:

   (letn [a 1 b 2 c 3]
       (+ a b c))

into the corresponding syntactic form:

   (let ([a 1]
         [b 2]
         [c 3])
      (+ a b c))

The only difference between this version and the Common Lisp version, functionally speaking, is that in this version, I expect the input to be well-formed pairs of variables and values. This makes more sense now since we cannot possibly substitute a sane value for a variable with no associated value (null? 0? void?), and expect a sane result.

Anyway, so here’s how it looks:

;;; a letn macro in Racket.

(module letn racket
  (provide letn)


;;; (letn [a 1 b 2 c 3 d 4 e 5 f 6] (+ a b c d e f)) ->
;;;
;;; (let ([a 1]
;;;       [b 2]
;;;       [c 3]
;;;       [d 4]
;;;       [e 5]
;;;       [f 6])
;;;    (+ a b c d e f))


(require (for-syntax racket/list)) ;; for empty

(begin-for-syntax
  (define (process-args-into-pairs lst)
    (letrec ([f (lambda (pairs lst)
                  (if (null? lst)
                      (reverse pairs)
                      (f (cons (list (car lst) (cadr lst)) pairs) (cddr lst))))])
      (f empty lst))))


(define-syntax letn
  (lambda (stx)
    (syntax-case stx ()
      [(_ (params ...) body0 body ...)
       (let ([pairs (process-args-into-pairs (syntax->datum #'(params ...)))])
         (with-syntax ([arg-pairs (datum->syntax stx pairs)])
           #'(let arg-pairs
                 body0 body ...)))]))))

Some notes

: Okay, so begin by defining the macro in a module of its own. Then we come to this interesting snippet of code”

    (require (for-syntax racket/list)) ;; for empty

What this code means is that we wish to use the racket/list module during compilation-time (since macro-expansion is part of the compilation phase). The reason for this is that we use empty, which denotes an empty list, in our program. Of course it would be easier to simply use the literal form, '() and eschew requiring racket/list altogether, but this is simply to demonstrate how we can require modules whose functions and symbols we would need at compile time.

Next up, we have the following code block:

(begin-for-syntax
  (define (process-args-into-pairs lst)
    (letrec ([f (lambda (pairs lst)
                  (if (null? lst)
                      (reverse pairs)
                      (f (cons (list (car lst) (cadr lst)) pairs) (cddr lst))))])
      (f empty lst))))

The begin-for-syntax starts off a new block where we can define functions that we need during compile time itself. In this case, we need a helper function called process-args-into-pairs which simply takes an input of the form

'(a 1 b 2 c 3 d 4 e 5)

and transforms those into a list of lists:

 '((a 1) (b 2) (c 3) (d 4) (e 5))

This is precisely what the actual macro needs during its expansion. Note that since we only have one helper function in this case, we could have used the simpler version, define-for-syntax to define our helper function like so:

    (define-for-syntax (process-args-into-pairs lst)
        (letrec ([f (lambda (pairs lst)
                      (if (null? lst)
                          (reverse pairs)
                          (f (cons (list (car lst) (cadr lst)) pairs) 
                             (cddr lst))))])
              (f empty lst))))

The process-args-into-pairs helper function itself is extremely straightforward – we simply accumulate lists of pairs of objects from the input list, and then return them to the caller. This code works on the understanding (as mentioned before) that we expect the input to consist of well-formed pairs).

Now let’s get down to the meat of the business, the letn macro itself. Here is the code:

(define-syntax letn
  (lambda (stx)
    (syntax-case stx ()
      [(_ (params ...) body0 body ...)
       (let ([pairs (process-args-into-pairs (syntax->datum #'(params ...)))])
         (with-syntax ([arg-pairs (datum->syntax stx pairs)])
           #'(let arg-pairs
                 body0 body ...)))]))))

There are various ways of defining macros in Racket – define-syntax-rule, define-syntax with syntax-rules,, define-syntax with syntax-case, define-syntax with custom transformer functions, syntax-parse,etc.

In this case, I have decided to use syntax-case since it suits the requirements quite nicely – built-in pattern matching is quite nifty!

So here’s how it works – we pattern-match on the supplied syntax (object), and we expect the pattern to be of the form: (letn [*] *). Note that in Racket, square brackets are essentially equivalent to (and are converted to, internally) parentheses. As a side-note, when entering code, however, mixing square brackets and parentheses for the same form is an error.

In the template (the right-hand-side of this case), we bind pairs to the output of the process-args-into-pairs helper function. Remember that all this happens during compile-time itself. Also note that we need to pass a plain list to the helper function. This is the reason why we need to convert the syntax object into a proper datum (done using (syntax->datum #'(params ...)).

Next, we need to construct the actual form that the invocation of the macro will expand into. In Common Lisp, we do all that using quasi-quoting and unquoting (with splicing if needed). However, in Racket, we have a different set of forms that deal with this business. The helper function returns a list of lists of variable-value pairs, and this list needs to be inserted into the let form that we use for the actual body of the template. This is why we need to convert pairs into a syntax object (since Racket works with syntax objects directly almost throughout). This is why we have
(datum->syntax stx pairs).

Finally, we now return the syntax from the template. Note the reader macro, #' which is shorthand for syntax.

Well, that’s about it! In macro, defining recursive macros is pretty easy using ellipses (...). This is used throughout for both the parameters and the body forms of the macro invocation. This also conveniently ensures that we can nest arbitrary forms inside out stonking new letn macro.

All right, let’s take it for a quick spin:

Here is the test code:

#lang racket

(require "letn.rkt")

(define (test-case-1)
  (letn [a 1 b 2]
        (displayln (+ a b))))

(define (test-case-2)
  (letn (a "hello" b "world")
        (displayln (string-append a ", " b))))


(define (test-case-3)
  (letn (a 1 b 2 c 3)
        (displayln "Adding three numbers")
        (displayln (+ a b c))))


(define (test-case-4)
  (letn (a 1 b 2 c 3)
        (letn (d 4 e 5)
              (displayln "Adding nested variables")
              (displayln (+ a b c d e)))))

(define (test-case-5)
  (letn (a 1 b 2 c 3)
        (let ([d 4]
              [e 5])
          (letn (f 6 g 7)
                (displayln "Nested letnS and letS")
                (displayln (+ a b c d e f g))))))

(define (run-all-tests)
  (test-case-1)
  (newline)
  (test-case-2)
  (newline)
  (test-case-3)
  (newline)
  (test-case-4)
  (newline)
  (test-case-5))  

And here is the output of a test run:

letn-test.rkt> (run-all-tests)
3

hello, world

Adding three numbers
6

Adding nested variables
15

Nested letnS and letS
28

Excellent! This was quite a satisfying little exercise. Now time to up the ante and start off with real languages!

A simple letn macro in Racket

A PigLatin translator in Common Lisp (and contrasting it with the Racket version)

As my curiosity had been piqued by the PigLatin exercise in Racket, I decided to basically translate that code into Common Lisp. The result was interesting to say the least.

Since the explanation of the program logic has already been done in the Racket version, I will skip that part, and only highlight the relevant differences.

First off, the code:

(defpackage #:piglatin
  (:use :cl :cl-user))


(in-package #:piglatin)


(defun translate (sentence)
  (let ((*readtable* (copy-readtable nil)))
    (setf (readtable-case *readtable*) :preserve)
    (mapcar #'make-symbol
            (mapcar #'english-to-piglatin
                    (mapcar #'string-to-list
                            (mapcar #'symbol-name sentence))))))


(defun english-to-piglatin (word)
  (if (starts-vowel-p word)
      (word-to-vowel-rule word)
      (word-to-consonant-rule word)))


(defun starts-vowel-p (word)
  (member (char-downcase (car word)) '(#\a #\e #\i #\o #\u #\y)))


(defun word-to-vowel-rule (word)
  (coerce (append word '(#\w #\a #\y)) 'string))


(defun word-to-consonant-rule (word)
  (let ((was-capital-p (upper-case-p (car word))))
    (labels ((f (word)
               (if (starts-vowel-p word)
                   (cond (was-capital-p (string-capitalize
                                          (coerce (append word '(#\a #\y)) 'string)))
                         (t (coerce (append word '(#\a #\y)) 'string)))
                   (f (append (cdr word) (list (char-downcase (car word))))))))
      (f word))))


(defun string-to-list (word)
  (coerce word 'list))

And a similar test run:

CL-USER> (load "/Users/z0ltan/Rabota/ProgrammingLanguages/CommonLisp/pig-latin.lisp")
T
CL-USER> (piglatin::translate '(|Hello| |world| |we| |meet| |again|))
(#:|Ellohay| #:|orldway| #:|eway| #:|eetmay| #:|againway|)
CL-USER> (piglatin::translate '(|Cucullus| |non| |facit| |monachum|))
(#:|Uculluscay| #:|onnay| #:|acitfay| #:|onachummay|)

This looks much dirtier than the Racket version’s output, but there is good reason for that.

Some notes:

The first thing you’ll observe is the strange input format for the Common Lisp version. What’s the deal with all the pipes in the input? Well, my understanding is this – the default manner in which the Common Lisp reader reads in symbols is to convert the symbol internally to upper-case. This can easily be verified:

CL-USER> (readtable-case *readtable*)
:UPCASE
CL-USER> (symbol-name 'hello)
"HELLO"
CL-USER> (symbol-name 'Hello)
"HELLO"
CL-USER> (symbol-name 'hElLo)
"HELLO"

The :UPCASE in the first output shows that the default behaviour is to convert the symbol to all upper-case. The other possible values for this are: :preserve, :downcase, and :invert. So how do we get around this problem? The easiest approach in this case is to change the reader behaviour to preserve the case of the input symbol (note that this will only work if the input symbol is ensconced within pipes) using the :preserve keyword. That is exactly what we’re doing in the translate function:

(defun translate (sentence)
  (let ((*readtable* (copy-readtable nil)))
    (setf (readtable-case *readtable*) :preserve)
    (mapcar #'make-symbol
            (mapcar #'english-to-piglatin
                    (mapcar #'string-to-list
                            (mapcar #'symbol-name sentence))))))

Remember that dynamic variables have special meaning in Common Lisp. Here, we simply create a copy of the read-table (read up on this if you are unaware of this concept) and bind it to *readtable*. We have to be really careful when working with special dynamic variables. What we are doing here is simply overwriting the actual read-table with our modified version for the scope of the let expression. This avoids poisoning the read-table globally.

So we simply set the read-table mode to :preserve in the line: (setf (readtable-case *readtable*) :preserve), and then we can proceed almost exactly as in the Racket version. We have a top-down approach in which we convert the symbol list into a string list, convert that list into a list of lists of chars (Common Lisp also shares Racket’s behaviour in the sense that a string is not automatically a list of chars), map our english-to-piglatin function over that list, and finally convert the whole list of processed strings into a list of symbols as the output.

Another point of interest is that unlike Racket, Common Lisp doesn’t have built-in convenience functions such as string->list or list->string. Instead, we have a useful (if a bit quirky) and powerful function called coerce. So in the following code, we are simply coercing a string into a list of characters:

(defun string-to-list (word)
  (coerce word 'list))

The rest of the code is pretty much in the same vein, with an interesting twist. As mentioned earlier, the read-table case-munging only works if we escape the symbol within pipes (such as |Hello|), and so the following input screws up the capitalisation entirely, as can be seen in the output:

CL-USER> (piglatin::translate '(hello world))
(#:|Ellohay| #:|Orldway|)

CL-USER> (piglatin::translate '(Hello WORLd))
(#:|Ellohay| #:|Orldway|)

As you can observe, when we use symbols without escaping them with pipes, the reader automatically converts the to upper-case anyway, so the output remains the same irrespective of the case of the input symbol.

This is quite in line with the overall quirkiness of Common Lisp. In that sense, I feel that Racket (and Scheme, by extension) is a much more functional and consistent language, easier to work with. However, Common Lisp did, and always will hold a soft spot in my heart, especially as it offers me countless more opportunities to explore and learn this magnificent beast!

A PigLatin translator in Common Lisp (and contrasting it with the Racket version)

A PigLatin translator in Racket

This post was inspired by a question posted on /r/programming on reddit. As part of helping the person, I realised I could create one just for fun in Racket!

I started out by reading the rules for Pig Latin on Wikipedia (dead simple rules), and the only irksome bit was ensuring that the capitalisation of the words remained intact. Also, the input and output formats were inspired by the posted question (it would have been much simpler to process a list of strings, or even a string of words directly).

Here is the code, followed by a bit of explanation.

(module piglatin racket
  (provide translate)

(define (translate sentence)
  (map string->symbol
       (map english->piglatin
           (map string->list
                (map symbol->string sentence)))))


(define (english->piglatin word)
  (if (starts-vowel? word)
      (word->vowel-rule word)
      (word->consonant-rule word)))


(define (starts-vowel? word)
  (member (char-downcase (car word)) '(#\a #\e #\i #\o #\u #\y)))


(define (word->vowel-rule word)
  (string-append (list->string word) "way"))


(define (word->consonant-rule word)
  (let [(was-capital? (char-upper-case? (car word)))]
    (letrec [(f (lambda (word)
                 (if (starts-vowel? word)
                     (cond [was-capital? (string-titlecase
                                          (string-append (list->string word)                  "ay"))]
                            [else (string-append (list->string word) "ay")])
                      (f (append (cdr word)
                                 (list (char-downcase (car word))))))))]
      (f word)))))

Explanation:

Let’s break down the code into chunks for easier explanation.

The input format is a list of symbols (not strings) such as: '(hello world). As such I figured that it was best to do some top-down programming here. To that end, the main function (and the only one exposed to the outside world) is the translate function:

(define (translate sentence)
  (map string->symbol
       (map english->piglatin
           (map string->list
                (map symbol->string sentence)))))

What I’m doing here is taking the input, sentence, which is in the specified format, and then I convert that into a list of strings, followed by converting that to a list of list of characters (Racket does not parse strings as lists of characters directly, like Haskell does). Following that, I convert each word in this list to its corresponding PigLatin form, and since the output format is again a list of symbols, map the strings back to symbols.

Then we have basically two rules – one for words that begin with vowels, and another one for words that begin with consonants.

(define (english->piglatin word)
  (if (starts-vowel? word)
      (word->vowel-rule word)
      (word->consonant-rule word)))

The starts-vowel? predicate simply checks if the word begins with a vowel or not (we convert the letter to lowercase for easier checking)

(define (starts-vowel? word)
  (member (char-downcase (car word)) '(#\a #\e #\i #\o #\u #\y)))

The rule for words beginning with a vowels is simply to append a “way” to the string:

(define (word->vowel-rule word)
  (string-append (list->string word) "way"))

For words that begin with consonants though, the rule is a bit more involved. Basically, we have to extract the consonant cluster (one or more letters) at the beginning of the word, append that to the end of the updated word (which now begins with a vowel), and also suffix an “ay” at the end. Note that we also have to take care of proper capitalisation in this case, which we didn’t have to do with the vowel rule:

define (word->consonant-rule word)
  (let [(was-capital? (char-upper-case? (car word)))]
    (letrec [(f (lambda (word)
                 (if (starts-vowel? word)
                     (cond [was-capital? (string-titlecase
                                          (string-append (list->string word)                  "ay"))]
                            [else (string-append (list->string word) "ay")])
                      (f (append (cdr word)
                                 (list (char-downcase (car word))))))))]
      (f word)))))

As you can see, since we are performing recursion to extract the consonant cluster, we maintain the original capitalisation of the word in a local binding called was-capital?. Then we recurse through the word, and in case was-capital> is true, we use the string-titlecase function to capitalise the first letter of the processed word. Note that we could have managed without using this built-in function, but the increase in verbosity for this little gain hardly seems worth the effort.

All right, let’s test it out on a variety of input cases:

> (load "pig-latin.rkt")
> (require 'piglatin)
> (translate '(My))
'(Ymay)
> (translate '(My hovercraft is full of eels))
'(Ymay overcrafthay isway ullfay ofway eelsway)
> (translate '(Always the odd one out))
'(Alwaysway ethay oddway oneway outway)
> (translate '(sticks and stones))
'(icksstay andway onesstay)
> (translate '(Hello world we meet again))
'(Ellohay orldway eway eetmay againway)
>(translate '(Cucullus non facit monachum)) 
'(Uculluscay onnay acitfay onachummay)
> (translate '(The quick brown fox jumped over the lazy dog))
'(Ethay uickqay ownbray oxfay umpedjay overway ethay azylay ogday)
> (translate '(Allo allo))
'(Alloway alloway)


This sort of stuff is what makes dynamic languages so dear to me, especially for prototyping. If I had done it in a static language, it would have taken quite a bit more effort (type inference is wonderful, but it does not actually save you from wrestling with the type system!)

A PigLatin translator in Racket

Goodbye Rust, and Hello, D!

After quite a bit of thought and consideration, I have decided to abandon my study of Rust, and move on to D.

I had been learning Rust for a while now, and I have become quite comfortable with it, but there are a few reasons that prompted me to move on to another systems language that would suit me better. Now while D is by no means perfect, I found the following advantages already:

  • A very consistent and intuitive (to me) core language. I have been learning it for only a couple of days now, and I’ve had quite a few “wow, it works exactly as I expected” moments already.
  • Very welcoming and helpful community that actually focuses on the technical side of things rather than getting sidetracked by social causes
  • A wide variety of compilers (DMD, GDC, LDC) targeted for both gcc and LLVM.
  • A stupendously fast compilation cycle. For small programs, I have found
    the compilation times to be on par with Go’s
  • A much saner syntax (in my very early estimation) that actually is quite readable (Rust’s syntax is not quite so readable for anything less-than-trivial in my opinion), and finally
  • Arrays (arguably the most important data structure) are actually sane, consistent, and very much logically intuitive in D unlike the mess that’s C (and C++).
  • Extremely powerful metaprogramming capabilities (an area I am particularly interested in)
  • A much much safer language than C++ while being much more programmer-friendly than Rust. Rust in that sense offloads the whole mental load of memory management onto the developer (and not in a crazy fun way like C or C++).

My use-cases for a systems programming language are quite simple, really – I don’t intend to do OS-level development (at least not in the foreseeable future), and so the GC dependency of D does not affect me in that sense. My research as well as my interactions with the wonderful folks over at forum.dlang.org have convinced me though that even with the GC, performance is still topnotch for a wide variety of systems-level programming.

As part of my learning process (just two days in to be exact), I thought of implementing two programs just to test out the waters before I dived in, and both of them were implemented with much less hassle (infinitely so to be honest), and they just worked beautifully the first time!

Insertion Sort

This is a barebones implementation of Insertion Sort in D:

// insertion_sort.d
import std.stdio;

void insertionSort(int[] arr) {
    foreach(i; 1..arr.length) {
        int key = arr[i];
        int j = cast(int)i-1;
        
        while (j >= 0 && arr[j] > key) {
            arr[j+1] = arr[j];
            j--;
        }
        arr[j+1] = key;
    }
}

void main() {
    int[] array = [100,-199,0,20,2,3,4,5,0,0,1,200];
    
    writefln("Before sorting: %s", array);
    
    insertionSort(array);
    
    writefln("After sorting: %s", array);
}

And here’s a test run:

sh-4.3$ dmd -run insertion_sort.d                                                                                                                                                   
Before sorting: [100, -199, 0, 20, 2, 3, 4, 5, 0, 0, 1, 200]                                                                                                              
After sorting: [-199, 0, 0, 0, 1, 2, 3, 4, 5, 20, 100, 200]  

Simple, and elegant! In fact, this code is directly readable to and understandable by anyone who has even the least bit of familiarity with C, C++, Java, or any member of the extended ALGOL family!

A line number closure

For my second program, I wanted to try out the line number closure (courtesy Hoyte) in D. As it turns out, either D is very logically designed, or my mind is quite attuned to D (or perhaps both!). It was quite pleasurable to be honest. Of course, at this stage, I have no idea if the code is idiomatic D or not. Anyway, let’s have some fun!

// line_closure.d
import std.stdio;

void delegate() getLineClosure() {
    int lineNum = 0;
    
    void printLineNumber() {
        lineNum++;
        
        writefln("Current line: %s", lineNum);
    }
    
    return &printLineNumber;
}

void main() {
    void delegate() x = getLineClosure();
    
    foreach(i; 0..5) {
        x();
    }
    writeln();
    
    void delegate() y = getLineClosure();
    
    foreach(j; 0..3) {
        y();
    }
}

and let’s take it for a spin:

sh-4.3$ dmd -run line_closure.d                                                                                                                                           
Current line: 1                                                                                                                                                           
Current line: 2                                                                                                                                                           
Current line: 3                                                                                                                                                           
Current line: 4                                                                                                                                                           
Current line: 5                                                                                                                                                           
                                                                                                                                                                          
Current line: 1                                                                                                                                                           
Current line: 2                                                                                                                                                           
Current line: 3    

Beautiful! The code probably deserves a bit of explanation – in D, functions are (as far as I can tell), first-class objects, and we can nest functions inside almost any scope. However, a function that is nested within a function or block scope is given a special name – a delegate. So the return type of the getLineClosure function:

void delegate()

indicates that we are returning a delegate which takes no parameters and returns nothing. Also note that we explicitly return the address of the delegate unlike in C, where the name of the function itself is the function pointer.

For this example to work, the delegate has to be able to capture the local variable, lineNum of course, and so delegates in D essentially function as closures in other Functional Programming languages. Also note the clean syntax of foreach for looping (C style loops are also supported, of course).

In conclusion, one major difference that I observed between Rust and D (I’m much more familiar with Rust, of course) is that I can more fully focus on the problem itself with D whereas with Rust, I have to always be aware of (and worrying about) what I am doing with the bindings/variables in my program, and I don’t think it’s altogether a question of getting familiar and comfortable with that. The ownership and borrowing concepts of Rust are, in and of themselves, trivial. However, the entire onus of managing proper memory behaviour is entirely on the developer. I wonder how much that would actually scale in real life. I suppose we’ll know the answer when people start developing large industry-standard (as much as I despise that term) in Rust.

Goodbye Rust, and Hello, D!

A small vector initialisation macro in Rust

Learning Rust’s macros was almost a sort of déjà vu for me. It is remarkably similar to the new-fangled macro system used in Racket. In the first place, Racket macros are themselves quite different (syntactically and semantically) from Common Lisp macros (which I love, but I have to admit that Racket macros are much more user-friendly), and so it was quite surprising to see Rust’s macro-rules! are almost identical to Racket’s syntax-rules, including the Pattern Matching, and of course, hygiene.

I doubt that’s an historical accident, of course, but the funny bit is that it only adds to Rust’s deep heritage of borrowing not only semantics, but also syntax from other languages (I have already seen traces of SML and Haskell, and now Racket). Nevertheless, the best part for me was that it’s helping me pick up Rust’s macros quite quickly and easily (not an expert by any means), including recursive macros (which are, admittedly, rather difficult to grok in Common Lisp).

So in this short blog, I wanted to post a small macro that I wrote as part of my ongoing study of Rust. It is a simple vector initialisation macro. All this macro does is to take in a binding of the form:

let some_vec = init_vec![size; init_value];

where size determines how many elements we want in the vector, and init_value determines what value the entire vector will be initialised with.

At this juncture, it would be probably be wise to mention another aspect of Rust’s macros that pleasantly surprised me – in the macro call shown above, the square brackets can actually be replaced by a pair of parentheses, or even a pair of flowery brackets (or braces as some folks call them). This is again very similar to Racket’s behaviour. However, we can mix two different styles of brackets. Just to clarify this further, the following three are identical:

let some_vec = init_vec![size; init_value];
let some_vec = init_vec!(size; init_value);
let some_vec = init_vec{size; init_value};

However, something like the following is illegal:

let some_vec = init_vec!{size; init_value];

Also, before showing the code, it is equally interesting to note that the rule also applies to the definition of the macro itself. Instead of flowery brackets, we could as well have chosen to use parentheses to wrap the rule in. In fact, that would even be my preferred style since Common Lisp has made me more or less immune to parentheses! (only half-joking).

Here is the macro and some test code for it:

/// Here we demonstrate a macro that takes in a size, and an init value
/// and returns a vector of that size initialised with the init value.

use std::fmt::Debug;
use std::fmt::{Display, Formatter, Result};

// the main man!
macro_rules! init_vec {
	( $size: expr; $init: expr ) => { {
			let mut temp_vec = Vec::new();

			for _ in 0..$size {
				temp_vec.push($init);
			}

			temp_vec
		};
	} // block necessary here
}

// a custom structure
#[derive(Debug)]
struct Foo {
	bar: f64,
}

impl Foo {
	fn new() -> Foo {
		Foo {
			bar: 0.0,
		}
	}
}

impl Display for Foo {
	fn fmt(&self, f: &mut Formatter) -> Result {
		write!(f, "{{ bar: {} }}", self.bar)
	}
}


fn print_vec<T: Debug + Display>(name: &str, vec: &Vec<T>) {
	println!("{} = {:?}", name, vec);

	print!("The contents are... ");

	for e in vec.iter() {
		print!("{} ", *e);
	}
	println!("\n");
}


fn main() {
	let int_vec = init_vec![10; 99];
	print_vec("int_vec", &int_vec);

	let str_vec = init_vec![5; "hello"];
	print_vec("str_vec", &str_vec);
	
	let mut foo_vec = init_vec![7; Foo::new()];
	print_vec("foo_vec", &foo_vec);

	// we can also use it like any normal vector
	foo_vec[0].bar = 100.0;
	foo_vec[3].bar = 2.71828;
	foo_vec[5].bar = 3.14159;

	println!("After mutation...\n");

	print_vec("foo_vec", &foo_vec);
}

Let’s take it for a quick run:

int_vec = [99, 99, 99, 99, 99, 99, 99, 99, 99, 99]
The contents are... 99 99 99 99 99 99 99 99 99 99 

str_vec = ["hello", "hello", "hello", "hello", "hello"]
The contents are... hello hello hello hello hello 

foo_vec = [Foo { bar: 0 }, Foo { bar: 0 }, Foo { bar: 0 }, Foo { bar: 0 }, Foo { bar: 0 }, Foo { bar: 0 }, Foo { bar: 0 }]
The contents are... { bar: 0 } { bar: 0 } { bar: 0 } { bar: 0 } { bar: 0 } { bar: 0 } { bar: 0 } 

After mutation...

foo_vec = [Foo { bar: 100 }, Foo { bar: 0 }, Foo { bar: 0 }, Foo { bar: 2.71828 }, Foo { bar: 0 }, Foo { bar: 3.14159 }, Foo { bar: 0 }]
The contents are... { bar: 100 } { bar: 0 } { bar: 0 } { bar: 2.71828 } { bar: 0 } { bar: 3.14159 } { bar: 0 } 

Excellent! So it all works pretty much as expected. The code probably merits a bit of explanation – in the macro, we simply define a rule which expects an expression ( $size: expr), followed by a semi-colon, followed by another expression ($init: expr).

Note that expr has a special meaning in Rust. It is what is called a “designator” and defines the type of the data being passed in. There are various such designators available (ident for identifiers. type for types, etc. Refer to the documentation for further details).

So the whole ( $size: expr; $init: expr ) denotes the pattern that, if matched, leads to the logic on the right-hand side of the =>. Now, the right hand-side is where all the action takes place. All we do there is to create a temporary vector, and then insert as many init values as size inside the vector, and finally return the vector. Since there are multiple expression/statements in the rule, we safely ensconce them within a block using the flowery brackets. Note that there is really no type-checking done inside to see if size is a valid numeric type. This may not be possible since macro-expansion should happen very early in the compilation phase. However, I might be wrong (as many times before!), and if I come across such means, I will update this blog accordingly.

So, it was a quick little fun exercise, and it makes Rust so much more appealing to me now especially since it’s been some time since I’ve played around with macros in Common Lisp. Maybe one of these days, eh?

A small vector initialisation macro in Rust

A simple demo of using channels in Rust

Continuing with my study of Rust, I ventured into concurrency using Rust at long last!

The standard library in Rust provides a few basic concurrency primitives, but of them all, my favourite has to be the support for channels. I decided to try out a small program where the factorial of a number is calculated in another thread, and the result piped back into the calling function.

The code is presented as follows:

/// Factorial using channel

use std::io;
use std::str::FromStr;
use std::thread;
use std::sync::mpsc::{channel, Sender};


fn factorial(tx: Sender<u64>, n: u64) {
    fn helper(acc: u64, n: u64) -> u64 {
        if n == 0 { acc } else { helper(acc * n, n - 1) }
    }

    thread::spawn(move || { tx.send(helper(1u64, n)).unwrap(); });
}


fn get_number() -> u64 {
    println!("Enter a number");

    let mut input = String::new();

    io::stdin()
        .read_line(&mut input)
        .expect("could not read input");

    match u64::from_str(input.trim()) {
        Ok(n) => n,
        Err(e) => panic!(e.to_string()),
    }
}


fn main() {
    let n = get_number();

    let (tx, rx) = channel();

    factorial(tx, n);

    println!("The factorial of {} is {}", n, rx.recv().unwrap());
}

Pretty straightforward code. We create a channel using std::sync::mspsc::channel, which returns a (Sender, Receiver) tuple. As the names suggest, Sender is used for transmitting data on the channel, and Receiver is used for receiving data on the same channel.

This approach also precludes the need for waiting on the thread explicitly.

Let’s take it for a spin:

bash-3.2$  rustc channel_factorial.rs && ./channel_factorial
Enter a number
20
The factorial of 20 is 2432902008176640000

Short and sweet!

The contrived example without using a channel might look something like so:

/// concurrent factorial using plain threads

use std::thread;
use std::sync::{Arc, Mutex};


fn factorial(n: u64) -> u64 {
    fn helper(acc: u64, n: u64) -> u64 {
        if n == 0 { acc } else { helper(acc * n, n - 1) }
    }

    helper(1u64, n)
}


fn main() {
    let data = Arc::new(Mutex::new(0u64));

    let data_clone = data.clone(); // bump the RC count

    let handle = thread::spawn(move || {
        let mut fact_cell = data_clone.lock().unwrap();
        *fact_cell = factorial(10);
    });

    match handle.join() {
        Ok(_) => println!("Factorial of 10 = {:?}", data),
        Err(e) => panic!(e),
    }
}

Not too bad, eh? The only difference is that we wrap the actual data inside a Mutex, which is itself wrapped inside an Arc object. This is to allow for shared mutable data using reference counting (since every object has exactly one owner at any point of time).

Inside the spawned thread, we can then get a lock on the cloned data object, and then we update the wrapped data with the factorial value. The problem with this approach is that it is not trivial to extract the value out. Taking it for a test run:

bash-3.2$ rustc concurrent_factorial.rs && ./concurrent_factorial
Factorial of 10 = Mutex { data: 3628800 }

As we can see, we do get the correct result, but unfortunately it’s wrapped up inside the mutex object, and getting it out is not easy since the API itself doesn’t seem to provide any way of doing so. Suggestions on StackOverflow seem to imply wrapping the data inside an Option object beforehand (as in something like

Arc<Mutex<Option<T>>>

), but that road didn’t lead me anywhere nice and safe either. I probably need to learn more about this approach in addition to perusing the documentation for a nice and elegant way out of this jumble of wrapped generic types (if there is one). No wonder I’d rather have the nice Go-esque channels!

UPDATE: Thanks to the good folks over at #rust, I got a simple and elegant solution for the unwrapping issue! Here is the update code:

/// concurrent factorial using plain threads

use std::thread;
use std::sync::{Arc, Mutex};


fn factorial(n: u64) -> u64 {
    fn helper(acc: u64, n: u64) -> u64 {
        if n == 0 { acc } else { helper(acc * n, n - 1) }
    }

    helper(1u64, n)
}


fn main() {
    let data = Arc::new(Mutex::new(0u64));

    let data_clone = data.clone(); // bump the RC count

    let handle = thread::spawn(move || {
        let mut fact_cell = data_clone.lock().unwrap();
        *fact_cell = factorial(10);
    });

    match handle.join() {
        Ok(_) => println!("Factorial of 10 = {:?}", *data.lock().unwrap()),
        Err(e) => panic!(e),
    }
}

The change is this: *data.lock().unwrap() inside the match in main. The explanation given was this: since the data itself is wrapped inside a Mutex within an Arc, it needs to be unwrapped in the reverse order – dereference the Arc to get the Mutex, and then dereference that to get the data! The unwrap is necessary since lock() returns a LockResult object.

The Rust community is indeed one of the most helpful communities on IRC. The problem which had me vexed for an hour or so was solved within minutes!

A simple demo of using channels in Rust

A simple Tree data structure in Haskell

This is part of my attempt to learn Haskell in a proper manner. I have been on and off learning Haskell for quite a few years now, always getting interrupted for some reason. This time, I decided to bite the bullet and get on with it!

As part of that, I started off with my favourite Haskell textbook, Professor Graham Hutton’s “Programming in Haskell”, 2nd Edition. This is arguably THE book that opened my eyes to Functional Programming in general. This book is almost perfect with one small flaw, in my opinion, of course. The bottom-up approach of solving problems is excellent, but it would have been rather nice if a small synopsis of the problem and the proposed approach were given at the beginning instead of having to read the whole code to understand it.

Having finished around half of that book, I realised that it was a bit short on examples, problems, and the downright dirty parts of Haskell, introducing IO only around the middle of the book. That’s when I decided to go for Allen and Moronuki’s book, “Haskell Programming from First Principles” to get up to speed on actual real-world programming in Haskell. The first few chapters are extremely basic (and I do find the tone a bit too prescriptive and pedantic, but that should suit a complete beginner just fine. For practice and hands-on though, it is a most excellent resource.

As part of my reboot of Haskell, I decided to implement a small Tree data structure in Haskell, all with the express aim of getting rid of the cobwebs. I can’t say that I’m too happy with my first attempt even though it clearly works more or less as expected. It feels very rough and creaky and the Null vs Leaf issue is pretty irritating as well. Well, after some more practice, I should be able to start iterating on better versions!

For now, here’s the code. It’s pretty much self-explanatory. The basic Tree type (single type parameter for now!), a helper function to create a Binary Search Tree (easier to confirm the output using preorder traversal), and the three standard binary tree traversal algorithms.

{- A binary tree implementation -}

data Tree a = Null | Leaf a | Node (Tree a) a (Tree a)  deriving Show


createNode :: a -> Tree a
createNode value = Leaf value 


addNode :: (Ord a) => Tree a -> a -> Tree a
addNode tree value = case tree of
                        Null -> createNode value
                        Leaf x -> if value <= x then
                                    Node (createNode value) x Null
                                  else
                                    Node Null x (createNode value)
                        Node l n r -> if value <= n then
                                        Node (addNode l value) n r
                                      else
                                        Node l n (addNode r value)

makeBST :: Ord a => [a] -> Tree a
makeBST [] = Null  
makeBST (n:ns) = addNode (makeBST ns) n
                     


-- inorder traversal (left, root, right)
inorder :: Tree a -> [a]
inorder tree = case tree of
                 Null -> []
                 Leaf x -> [x]
                 Node l n r -> inorder l ++ [n] ++  inorder r


-- preorder traversal (root, left, right)
preorder :: Tree a -> [a]
preorder tree = case tree of
                  Null -> []
                  Leaf x -> [x]
                  Node l n r -> [n] ++ preorder l ++ preorder r


-- postorder traversal (left, right, root)
postorder :: Tree a -> [a]
postorder tree = case tree of
                   Null -> []
                   Leaf x -> [x]
                   Node l n r -> postorder l ++ postorder r ++ [n]

A sample run:

*Main> let bst = makeBST [10,18,12,7,10,11,3,4,4,-2,-3,100]
*Main> bst
Node (Node Null (-3) (Node Null (-2) (Node (Node (Leaf 3) 4 Null) 4 (Node (Node (Node Null 7 (Leaf 10)) 10 Null) 11 (Node Null 12 (Leaf 18)))))) 100 Null
*Main> inorder bst
[-3,-2,3,4,4,7,10,10,11,12,18,100]
*Main> preorder bst
[100,-3,-2,4,4,3,11,10,7,10,12,18]
*Main> postorder bst
[3,4,10,7,10,18,12,11,4,-2,-3,100]
*Main> 

Ah well, it’s quite exciting to see something working as expected, isn’t it? Well, onwards then with Haskell (and then on to Agda and Idris, which I am very much interested in for edificational purposes!)

A simple Tree data structure in Haskell

A Line number closure in Rust, and an interesting battle with the Rust compiler

It had been a while since I’d played around with Rust, so I decided to implement one of my two favourite programs when learning new languages viz., a line number closure (with mutation, of course) as shown by Hoyte in his seminal book, “Let Over Lambda”. The other program is, of course, implementing a functioning Rational number class (as in the previous blog post, for instance).

Well, with Rust, it’s a very…interesting experience to say the least. The Rust Borrow Checker is a very strange beast indeed. So, let’s have a look at how we can arrive at a working version of the program.

Background

Rust does have powerful support for closures (what respectable language does not today?). However, the interesting part is that closures in Rust are simply syntactic sugar for traits. To be more precise, there are three available traits that a closure can belong to:

  • the Fn trait – this is the least permissive of all, and it allows a read-only closure (in effect). This is ideal when we want to simply have a self-contained immutable closure. For instance,
    fn square<F>(n: i32, callback: F)
        where
        F: Fn(i64) ->() {
        callback(n as i64 * n as i64);
    }
    
    fn main() {
        square(100, |n| {  println!("Received {}", n); });
    }
    

    This duly prints out:

    Received: 10000
    

    This is quite straightforward. As we can see, The type of the callback, which is the closure, is Fn(i64) -> (), which means that it takes a 64-bit integer and does some side-effect (printing to screen in this case). No mutation anywhere, no capturing of environments, and so the Fn trait works superbly here.

  • the FnMut trait is more interesting. This is needed when we need to mutate the environment captured by the closure. As a small example:
    fn process_mutating_closure<F>(mut closure: F)
        where
        F: FnMut(i32) -> () {
        closure(100);
    }
    
    fn main() {
        let mut local = 100;
    
        process_mutating_closure(|x| { 
                  local += 100; 
                  println!("Value = {}", x + local); 
        });
    
    

    which produces:

    Value = 300
    

    All right, this is a bit more complicated than the previous version. So, let’s break it down. What we intend to do is this – the closure that’s created in the main function captures a mutable variable, local, in the enclosing lexical scope. The closure then modifies this captured variable, and uses it for some dummy processing before printing the final value out.

    Since a captured variable is being modified, the type of the closure will be FnMut. This is exactly what’s reflected in the type signature when the closure is passed as an argument to process_mutating_closure.

  • , and finally, the FnOnce trait is the most restrictive of all. It basically transfers ownership of the capture to the closure itself. If you are not familiar with Rust’s wonderful (sarcasm) Ownership and Borrowing world, consider yourself lucky.
    fn move_closure_demo<F: FnOnce(i32) -> ()>(closure: F) {
        closure(100)
    }
    
    fn main() {
        let mut local = 100;
    
        move_closure_demo(move |x| {
            local += 100;
            println!("Result = {}", x + local);
        });
    
        println!("{}", local);
    }
    
    Result = 300
    

    Eh? This code looks almost identical to the previous one (except for the change in the function signature of the closure from FnMut to FnOnce and the move in front of the closure). Well, if you notice, you will also see that the function var, closure in the function move_closure_demo also does not require mut unlike in the previous case.

    What’s happening here is that because we have “moved” the closure, the closure is now the owner of the captured variable, local. Wait a minute, so how come we can still use local in main after it’s been moved? It turns out that because the binding of local is to a Copy-able type (i32 in this case), a copy of the binding’s value is actually moved. However, this won’t work for types that do not implement the Copy trait. Confused? Well, you should be. I mean, wtf?

All right, now let’s get on with the meat of the actual matter.

First attempt

Let’s put our knowledge of how closures are actually syntactic sugar for the aforementioned trifecta of traits to use. Here’s the first shot:

fn gen_line_number() -> (FnMut() -> ()) {
    let mut line_no = 0;

    || {
        line_no += 1;

        println!("Current line number: {}", line_no);
    }
}

fn main() {
    let foo = gen_line_number();

    for _ in 0..5 {
        foo();
    }
    
    println!("");

    let bar = gen_line_number();

    for _ in 0..5 {
        bar();
    }
}

Since we’re modifying the capture var, line_no, we need the closure to be of the trait type, FnMut. Looks okay, let’s take it for a spin:

bash-3.2$ rustc hoyte-blog.rs 
error[E0277]: the trait bound `std::ops::FnMut() + 'static: std::marker::Sized` is not satisfied
 --> hoyte-blog.rs:1:26
  |
1 | fn gen_line_number() -> (FnMut() -> ()) {
  |                          ^^^^^^^^^^^^^ the trait `std::marker::Sized` is not implemented for `std::ops::FnMut() + 'static`
  |
  = note: `std::ops::FnMut() + 'static` does not have a constant size known at compile-time
  = note: the return type of a function must have a statically known size

error[E0308]: mismatched types
 --> hoyte-blog.rs:4:5
  |
4 |     || {
  |     ^ expected trait std::ops::FnMut, found closure
  |
  = note: expected type `std::ops::FnMut() + 'static`
  = note:    found type `[closure@hoyte-blog.rs:4:5: 8:6 line_no:_]`

error[E0277]: the trait bound `std::ops::FnMut(): std::marker::Sized` is not satisfied
  --> hoyte-blog.rs:12:9
   |
12 |     let foo = gen_line_number();
   |         ^^^ the trait `std::marker::Sized` is not implemented for `std::ops::FnMut()`
   |
   = note: `std::ops::FnMut()` does not have a constant size known at compile-time
   = note: all local variables must have a statically known size

error[E0277]: the trait bound `std::ops::FnMut(): std::marker::Sized` is not satisfied
  --> hoyte-blog.rs:20:9
   |
20 |     let bar = gen_line_number();
   |         ^^^ the trait `std::marker::Sized` is not implemented for `std::ops::FnMut()`
   |
   = note: `std::ops::FnMut()` does not have a constant size known at compile-time
   = note: all local variables must have a statically known size

error: aborting due to 4 previous errors

Whoa! Okay, let’s read the error messages in order. Maybe that will help us fix the issue better. Right, so the first one says something about std::marker::Sized not being implemented… blah blah blah. What all this verbosity actually means is that we are returning a closure from the function, and it needs to know the size of the closure at compile time itself.

Fair enough. However, why does it work when we’re passing a closure into a function the same way (instead of returning the closure from it)? Who the hell really knows? Well, the basic problem here (as I figure it) is that we need to wrap the whole damned thing inside a heap pointer (aka the Box type) since we’re returning from the function, and the stack frame for that function is going to get de-allocated, we need to return a heap pointer to the object so that it can be used even after the current function frame’s destroyed.

This is the reason why code like the following works because the callback function, closure cannot outlive the function where the closure was generated (main):

fn foo<F: Fn() -> ()>(closure: F) {
    closure();
}

fn main() {
    foo(|| { println!("Hello, world!");});
}
Hello, world!

Okay, let’s proceed.

Second Attempt

Having gained that great insight from the first version, here’s how the second version looks like:

fn gen_line_number() -> Box<FnOnce() -> ()> {
    let mut line_no = 0;

    Box::new(move || {
        line_no += 1;

        println!("Current line number: {}", line_no);
    })
}

fn main() {
    let foo = gen_line_number();

    for _ in 0..5 {
        foo();
    }
    
    println!("");

    let bar = gen_line_number();

    for _ in 0..5 {
        bar();
    }
}

and…

bash-3.2$ rustc hoyte-blog.rs 
error[E0161]: cannot move a value of type std::ops::FnOnce(): the size of std::ops::FnOnce() cannot be statically determined
  --> hoyte-blog.rs:15:9
   |
15 |         foo();
   |         ^^^

error[E0161]: cannot move a value of type std::ops::FnOnce(): the size of std::ops::FnOnce() cannot be statically determined
  --> hoyte-blog.rs:23:9
   |
23 |         bar();
   |         ^^^

error[E0382]: use of moved value: `*foo`
  --> hoyte-blog.rs:15:9
   |
15 |         foo();
   |         ^^^ value moved here in previous iteration of loop
   |
   = note: move occurs because `*foo` has type `std::ops::FnOnce()`, which does not implement the `Copy` trait

error[E0382]: use of moved value: `*bar`
  --> hoyte-blog.rs:23:9
   |
23 |         bar();
   |         ^^^ value moved here in previous iteration of loop
   |
   = note: move occurs because `*bar` has type `std::ops::FnOnce()`, which does not implement the `Copy` trait

error: aborting due to 4 previous errors

Sigh. Just when things were looking up… theoretically. All right, before we lose our sanity, let’s read all the bloody error messages and try to make some sense of them. Okay, ignoring the first two about statically size something, we see that the other errors are complaining about foo and bar being already moved in the previous iteration of the loop. At least that makes some sort of sense – since we have the closure passed to main and assigned to foo and bar, and, as the error message says, FnOnce does not implement the Copy trait, we cannot change the owner of the closure in subsequent iterations of the for loop.

Let’s rectify that by using the FnMut trait instead, which is actually what we want since we are not using any ownership changes, and we’re mutating the captured local variable, it sounds like the perfect candidate for our use case.

In case you’re wondering about modifying the main function instead to use references inside the for loops, I assure you that it’s not an oversight on my part – down that road lies perdition! (Reflect upon the fact that I have a working copy, and I’m actually working backwards through some sane progression amongst the many that I had to endure along the way!)

Third attempt

Here’s the version with the closure type changed to FnMut:

fn gen_line_number() -> Box<FnMut() -> ()> {
    let mut line_no = 0;

    Box::new(move || {
        line_no += 1;

        println!("Current line number: {}", line_no);
    })
}

fn main() {
    let foo = gen_line_number();

    for _ in 0..5 {
        foo();
    }
    
    println!("");

    let bar = gen_line_number();

    for _ in 0..5 {
        bar();
    }
}

And yes, you guessed it, it simply doesn’t work, but you’ll love the error messages:

bash-3.2$ rustc hoyte-blog.rs 
error: cannot borrow immutable `Box` content `*foo` as mutable
  --> hoyte-blog.rs:15:9
   |
15 |         foo();
   |         ^^^

error: cannot borrow immutable `Box` content `*bar` as mutable
  --> hoyte-blog.rs:23:9
   |
23 |         bar();
   |         ^^^

error: aborting due to 2 previous errors

Hmmm… hmmm? It mentions that when I call foo and bar, it’s trying to make something mutable that’s actually immutable. Eh? Okay, let’s see. The local captured variable was “moved” to live beyond the function’s stack frame, so that’s fine. I’m using FnMut so that I can actually mutate the captured variable, that’s fine. I’m wrapping the whole damned thing inside a Box so that the whole closure can persist beyond the function call, and finally, all I’m trying to do is to increment the mutable captured line_no var inside my for loops. That also sounds fine. So when did the contents of the box become immutable?

All right, let’s see the final version first.

Final version

fn gen_line_number() -> Box<FnMut() -> ()> {
    let mut line_no = 0;

    Box::new(move || {
        line_no += 1;

        println!("Current line number: {}", line_no);
    })
}

fn main() {
    let mut foo = gen_line_number();

    for _ in 0..5 {
        foo();
    }
    
    println!("");

    let mut bar = gen_line_number();

    for _ in 0..5 {
        bar();
    }
}

and running it,

bash-3.2$ rustc hoyte-blog.rs 
bash-3.2$ ./hoyte-blog 
Current line number: 1
Current line number: 2
Current line number: 3
Current line number: 4
Current line number: 5

Current line number: 1
Current line number: 2
Current line number: 3
Current line number: 4
Current line number: 5

Wow! Glory at last! Heh. Well, if you have the eyes of a hawk, you’ll observe that the only change was adding the mut qualifier to the closure handles in main. So what that extremely useful, intuitive, and helpful error message was telling us is that we have a boxed closure containing a captured mutable local var which we’re updating in every iteration of the respective for loops, but unfortunately we have assigned the closures to immutable handles, and so the whole thing just won’t work.

Whew. And seriously, wtf?

Rust appears to be great fun when things go well, but when things go badly (and they will, of course), it can be quite a PITA. Granted, I am not an expert or even an intermediate Rust programmer, but perhaps people’s complaints about the learning curve being seriously fucked up aren’t really exaggerated. If the learning curve had been at least logical, it would be a different matter. As things stand though, the more I venture into the language, the more “patched-together” it appears to be. Also, all the various exceptions to the rules in addition to the myriad rules about exceptional scenarios aren’t really helping, folks. I doubt anything can really be done about it now, though. Heh.

A Line number closure in Rust, and an interesting battle with the Rust compiler