Interop mini-series – Callbacks special! (Common Lisp special) (Part 2b)

This is a continuation of the previous post, the callbacks interlude. I decided to give the section pertaining to Common Lisp its own post, as I think there is good educational value in this part by itself.

We carry on from where we left off last time, continuing with the same number-squaring callback example.

As a quick refresher, the idea is to implement a synchronous callback scenario: the function client invokes another function, squarify, which squares the passed value and then invokes a callback function, callback.

How it’s done in Common Lisp

Let’s start off with our first attempt to implement the solution in Common Lisp.

;;;; Callback demo using the squarify example.

(defpackage :callback-demo-user
  (:use :cl))

(in-package :callback-demo-user)

(defun callback (n)
  (format t "Received: ~d~%" n))

(defun squarify (n cb)
  (funcall cb (* n n)))

(defun client ()
  (let ((n (progn
             (princ "Enter a number: ")
             (read))))
    (squarify n #'callback)))

Running this at the REPL:

CALLBACK-DEMO-USER> (client)
Enter a number: 19
Received: 361
NIL

That’s the direct equivalent of all the demos shown so far. However, since Common Lisp is a functional language (albeit not as pure as, say, Scheme or Haskell), we can certainly do better!

In most functional programming languages, higher-order functions are usually deployed to do the job. So let’s see if we can cook up something nicely functional, like function composition.
Here’s a first attempt:

(defun client ()
  (funcall #'(lambda (n)
               (format t "Received: ~d~%" n))
           (funcall #'(lambda (n)
                        (* n n))
                    (funcall #'(lambda ()
                                 (princ "Enter number: ")
                                 (read))))))

Which produces:

CALLBACK-DEMO-USER> (client)
Enter number: 19
Received: 361
NIL

As expected! Now, as you may know, funcall simply takes a function and zero or more arguments, and applies the function to those arguments. In this case, we simply compose the calls in the proper order so that the types match up: read a number -> square it -> print a message.
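
For instance, here is a quick illustrative sketch of my own at the REPL, showing funcall alongside apply (which takes its arguments as a list):

CALLBACK-DEMO-USER> (funcall #'(lambda (n) (* n n)) 4)
16
CALLBACK-DEMO-USER> (apply #'(lambda (n) (* n n)) '(4))
16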

However, let’s work our way towards a generic compose function that simulates the behaviour of Haskell’s composition operator. The previous function can be improved by defining a version that composes the three functions in the order mentioned (so that the types match up):

The compose function:

(defun compose (fn gn hn)
  #'(lambda (&rest args)
      (funcall fn (funcall gn (apply hn args)))))

And the client to test it:

(defun client ()
  (funcall (compose #'(lambda (x)
                        (format t "Received: ~d~%" x))
                    #'(lambda (x)
                        (* x x))
                    #'(lambda ()
                        (princ "Enter a number: ")
                        (read)))))

And the output is the same:

CALLBACK-DEMO-USER> (client)
Enter a number: 19
Received: 361
NIL

So what’s changed? Well, taking inspiration from the nested funcall version, we defined compose to invoke the functions in the proper order – first read the number, then square it, and finally print it! (Remember that the functions are applied in the reverse of the order in which they are supplied.)

Note that the last function invocation is done using apply instead of funcall because &rest args produces a list of arguments, and funcall does not work with that (unless the function itself takes a list as its parameter, which is not the general case). apply, on the other hand, works very well with lists and destructures them correctly.

How can we make this generic enough, though? We notice the pattern – we invoke apply on the innermost function call, but use funcall for the rest of the call chain. This means we must handle two cases: if a single function is passed in, we simply apply it, and if not, we chain the calls up as discussed. This lends itself to a nice recursive definition, as shown next.

The updated compose function:

(defun compose (&rest funcs)
  (labels ((f (funcs args)
             (if (null (cdr funcs))
                 (apply (car funcs) args)
                 (funcall (car funcs) (f (cdr funcs) args)))))
    #'(lambda (&rest args)
        (f funcs args))))

And the test client for it:

(defun client ()
  (funcall (compose #'(lambda (x)
                        (format t "Received: ~d~%" x))
                    #'(lambda (x)
                        (* x x))
                    #'(lambda ()
                        (princ "Enter number: ")
                        (read)))))

And the output:

CALLBACK-DEMO-USER> (client)
Enter number: 19
Received: 361
NIL

Explanation: What we do is simply generalise the three-function version of compose into a generic function. For this, we define an internal function f that takes the supplied functions and the arguments as input.

f then recursively decomposes the function applications. The base (stopping) condition is when there is only one function left. The (null (cdr funcs)) test then takes care of returning the single apply call that we need, applied, of course, to the args argument.

As the recursion unwinds the call stack, successive funcalls are applied at each stage. This is exactly in line with the algorithm discussed at the end of the last section.
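
To see the unwinding concretely, here is a quick REPL check of my own against the compose just defined (identity is applied first, the squaring lambda second, and 1+ last):

CALLBACK-DEMO-USER> (funcall (compose #'1+
                                      #'(lambda (x) (* x x))
                                      #'identity)
                             3)
10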

Now we are almost home and dry! Pay special attention to the order in which the lambda equivalents of the functions are entered in the client function: they are entered as callback, then squarify, then the reading code, but applied in the opposite order – read first, then square, then print.

We could stop here, but there’s one more change we can make. The current version of compose works exactly as expected, but the order in which the functions must be supplied is the opposite of what a user would intuitively expect. The expected order would be, in English, “read in the number, square it, and then print out a message indicating that the number was received”.

Let’s fix that last bit for our final version of compose.

Final version of compose:

;;; final version of compose
(defun compose (&rest funcs)
  (labels ((f (funcs args)
             (if (null (cdr funcs))
                 (apply (car funcs) args)
                 (funcall (car funcs) (f (cdr funcs) args)))))
    #'(lambda (&rest args)
        (f (reverse funcs) args))))

And the corresponding test code:

;;; test out the final version of compose
(defun client ()
  (funcall (compose #'(lambda ()
                        (princ "Enter a number: ")
                        (read))
                    #'(lambda (x)
                        (* x x))
                    #'(lambda (x)
                        (format t "Received: ~d~%" x)))))

And now let’s test out and see if it works!

CALLBACK-DEMO-USER> (client)
Enter a number: 19
Received: 361
NIL

Success!

The only difference is this line: (f (reverse funcs) args). We simply reverse the order of the received functions while passing them to the recursive function f, and the rest of the code remains exactly the same!
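
A quick sanity check at the REPL (a minimal example of my own) – the squaring function is now supplied first and applied first:

CALLBACK-DEMO-USER> (funcall (compose #'(lambda (x) (* x x)) #'1+) 3)
10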

And, of course, this is purely functional! Sweet, ain’t it?

The compose function could be optimised in multiple ways, converting it to an iterative version for instance, but conceptually this version works exactly as advertised. One such variant is sketched below.
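
Here is one possible iterative sketch (the name compose-iter and the use of reduce are mine, not from the original): it folds the argument through the function list with reduce instead of recursing.

(defun compose-iter (&rest funcs)
  "Iterative variant of the final COMPOSE; assumes at least one function."
  #'(lambda (&rest args)
      ;; apply the first function to the raw argument list,
      ;; then thread the result through the remaining functions
      (reduce #'(lambda (acc fn) (funcall fn acc))
              (cdr funcs)
              :initial-value (apply (car funcs) args))))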

Conclusion

This post illustrates why I love Common Lisp! Even as I make my journey through the world of Common Lisp, my admiration for it only grows. If there is some feature that we would like to incorporate into the language, it can be done in just a few lines of code! No other language truly comes close in terms of expressiveness and extensibility.


Interop mini-series – Callbacks special! (Part 2a)

This post was actually meant to be part of the previous post (Calling C and C++ from Common Lisp).

However, as I began writing the section on “callbacks”, it grew to such an extent that I decided to give it its own post, with a slightly more comprehensive treatment than originally planned!

Contents

  1. What is a callback?
    1. Uses of callbacks
    2. Methods of implementation
  2. Demos
    1. How it’s done in C
    2. How it’s done in C++
    3. How it’s done in Java
    4. How it’s done in Common Lisp
    5. How it’s done in other languages
      1. JavaScript
      2. Python
  3. References

What exactly is a callback?

A callback, in essence, is simply a function that is executed by another function which has a reference of sorts to the first function. Yes, that’s really it!

Uses

One major use is to ensure proper separation of concerns.

Suppose we are writing some client code that makes use of a library, and say that our client function wishes to invoke a library function. Now, this library function executes code that might result in some form of platform specific signalling that will need to be handled in disparate ways depending on the specific signal received. The library writer could not possibly have imagined all the scenarios for such signalling when he was writing the library. So how can this work? Callbacks to the rescue!

So what the library writer did was to hold a reference to a callback function in his own function, and then his function invokes this callback function as and when the need arises (say an error condition or an OS interrupt). The callback function then takes care of all the handling and bookkeeping involved.

This callback function is, of course, expected to be supplied by the client code. This makes sense since the client has the best knowledge of its own domain. This then means that the library writer can make his code as generic as possible, leaving the specifics for the client to manage.

Another common use of callbacks is asynchronous programming. For example, suppose we have a function that needs to be activated when some specific conditions have arisen, and those conditions are decided by some other code. This is a good case to use a callback.

The current function can “register” itself with the condition-generating code, and then that code can invoke a callback in the current function’s module, which can then proceed to completion. Node, in particular, makes extensive use of this approach. The general Observer pattern is, in essence, the generalisation of a callback.

Implementation

Callbacks may be implemented through various means – function pointers, function objects, or lambda abstractions. The important bit is to understand the concept and defer the specifics of the modes of implementation to the language at hand.

Callbacks can be both synchronous or asynchronous (think Node).

So much for the concept. As far as the terminology goes, it is important to remember that the callback itself is the function that is invoked by the function taking it as a parameter. A lot of confusion arises precisely because some people assume that the function taking the function parameter is the callback. Quite the contrary, as we have just seen. The one mnemonic that always works for me is to remember that both the client function and the callback function live in the same conceptual module.

Finally, a caveat – extensive use of callbacks can lead to what is known as “callback hell” (check the Reference section – there is a whole site dedicated to it!). The rule of thumb is to use a callback only when it is absolutely needed. Otherwise, it can lead to code which is both unreadable and unmaintainable.

Demos

Let’s now take a brief look at how the functionality offered by callbacks is implemented in various languages. Of course, there may be different mechanisms for doing so, but I have chosen what I feel to be the idiomatic form in each language under discussion.

For all these examples, we will consider the same example – we have a function (squarify) which takes two parameters – a number and a callback function (callback). squarify simply squares the parameter, and then invokes callback with the squared value.

callback simply prints out the received value with a small message. The whole chain is triggered by another function client, which invokes squarify.

Note that all the examples here are, for the sake of simplicity, synchronous.

How it’s done in C

In C and C++, we make use of function pointers like so:

#include <stdio.h>

void squarify(int, void(*)(int));
void callback(int);
void client();

int main()
{
    client();

    return 0;
}

void client()
{
    int n;
    
    printf("Enter a number: ");
    scanf("%d", &n);

    squarify(n, &callback);
}

void squarify(int n, void (*cb)(int))
{
    (*cb)(n*n);
}

void callback(int n)
{
    printf("Received %d\n", n);
}

And the output:

Timmys-MacBook-Pro:C z0ltan$ gcc -Wall -o callbackdemo callbackdemo.c
 
Timmys-MacBook-Pro:C z0ltan$ ./callbackdemo 
Enter a number: 19
Received 361

Notice how we pass the address of callback to squarify using &callback inside the client function.

How it’s done in C++

The technique used in the C example (declaring the callback as a function pointer parameter to squarify and then passing it the address of callback at runtime) works just the same way in C++.

However, in addition, C++ offers a whole lot more ways of achieving the same result. Let’s explore three of these in the same demo – lambda abstractions, plain functions, and functors.

To this end, we use a std::function object to hold a reference to the callback in squarify. This class template is defined in the <functional> header.

The logic remains unchanged from that used in the C demo.

Note that this code only works in C++11 (or above).

//C++11 or above
#include <iostream>
#include <functional>

// Define the functor class
struct backcall {
    void operator()(int n)
    {
        std::cout << "Received: " << n << std::endl;
    }
};

void squarify(int, std::function<void(int)>);
void callback(int);
void client();

int main()
{
    client();

    return 0;
}

void client()
{
    int n;

    std::cout << "Enter a number: ";
    std::cin >> n;

    // simply pass in a lambda abstraction!
    squarify(n, [](int x) {
        std::cout << "Received: " << x << std::endl;
    });
    
    // or specify a function explicitly
    squarify(n, callback);

    // or pass in a functor!
    squarify(n, backcall());
}

void squarify(int n, std::function<void(int)> cb)
{
    cb(n*n);
}

void callback(int n)
{
    std::cout << "Received: " << n << std::endl;
}

And the output:

Timmys-MacBook-Pro:C++ z0ltan$ g++ -std=c++11 -Wall -o callbackdemo callbackdemo.cpp 

Timmys-MacBook-Pro:C++ z0ltan$ ./callbackdemo 
Enter a number: 19
Received: 361
Received: 361
Received: 361

Et voila!

How it’s done in Java

In Java, the situation is a bit more complicated than usual for many reasons – the lack of pure function objects, extreme verbosity, the lack of pure generic functions, and so on.

However, the code below demonstrates how we would do it pre-Java 8 (and frankly, most code written today still follows this idiomatic approach).

import java.io.InputStreamReader;
import java.io.BufferedReader;
import java.io.IOException;

interface Callback {
    void call(int x);
}

public class CallbackDemo {
    public static void main(String[] args) {
            client(); 
    }

    public static void client() {
        int n;

        Callback cb = new Callback() {
                        @Override
                        public void call(int n) {
                            System.out.println("Received: " + n);
                        }
                    };

        try (BufferedReader reader = new BufferedReader(new InputStreamReader(System.in))) {
                System.out.print("Enter a number: ");
                n = Integer.parseInt(reader.readLine());
                squarify(n, cb);
        } catch (NumberFormatException |IOException ex) {
            ex.printStackTrace();
        }
    }

    public static void squarify(int n, Callback callback) {
        callback.call(n*n);
    }
}
And the output:

Timmys-MacBook-Pro:Java z0ltan$ javac CallbackDemo.java 
Timmys-MacBook-Pro:Java z0ltan$ java -cp . CallbackDemo
Enter a number: 19
Received: 361

The code is mostly self-explanatory. To simulate function pointers/function objects, we make use of what is essentially the equivalent of the C++ functor (backcall) used in the previous demo.

The Callback interface declares a single abstract method called call, which takes a single int parameter; our implementation prints out a small message onto the console.

The squarify function takes an int parameter along with an instance of Callback, and then calls that instance’s call method. (On a side note, this is precisely why C++’s functors are superior to Java’s – C++ has operator overloading, while Java unfortunately does not.)

Now, let’s take a look at how it would be done using Java 8 (and above). The Java 8 version is a marked improvement in terms of readability and conciseness.

Here’s the code:

import java.io.InputStreamReader;
import java.io.BufferedReader;
import java.io.IOException;

import java.util.function.Function;

public class CallbackDemo8 {
    public static void main(String[] args) {
        client();
    }

    public static void client() {
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(System.in))) {
                System.out.print("Enter a number: ");
                int n = Integer.parseInt(reader.readLine());
                
                squarify(n, (x) -> { System.out.println("Received: " + x); return null; });
        } catch (NumberFormatException | IOException ex) {
            ex.printStackTrace();
        }
    }

    public static void squarify(int n, Function<Integer,Void> cb) {
        cb.apply(n*n);
    }
}


And here’s the output:


Timmys-MacBook-Pro:Java z0ltan$ java -cp . CallbackDemo8
Enter a number: 19
Received: 361

We observe a few things that differentiate it from the pre-Java 8 version:

  • The Callback interface is gone, having been replaced by the built-in Function functional interface.
  • The callback function is also gone, and a lambda abstraction replaces it instead.

The lambda expression (x) -> { System.out.println("Received: " + x); return null; } still looks ugly with that return null; statement. It is clearly redundant, but because of the way the Function functional interface is defined, the statement is mandatory.

We could fix that by creating our own functional interface like so:

@FunctionalInterface
interface Function<T> {
	void apply(T o);
}

However, it would reintroduce a custom interface into our code, so not much is gained there! (Incidentally, the built-in java.util.function.Consumer<T> interface, whose accept method returns void, already provides exactly this shape, and would let us drop the return null; without any custom code.)

How it’s done in Common Lisp

A full post containing a detailed discussion of this topic (along with the relevant demos) is available in the next part of this series. Make sure to check that out!

How it’s done in other languages

Let’s implement the same example in a few other languages for our own edification! For the sake of brevity (this post is already quite long!), we will stick to very commonly used languages - JavaScript and Python.

I feel these should be representative of most of the mainstream languages. Haskell is a bit of a different beast, but that is worthy of its own series of posts!

JavaScript

Since client-side JavaScript does not provide any means of taking input from the command line, we will use Node.js for this demo. For console I/O, we will use the readline module that comes bundled with Node.

const readline = require('readline');

const stream = readline.createInterface({
                    input: process.stdin,
                    output: process.stdout
                });

function callback(n) {
    console.log("Received: " + n);
}


function squarify(n, cb) {
    cb(n*n);
}

function client() {
    stream.question("Enter a number: ", function(n) {
        squarify(n, callback);
        stream.close();
        process.stdin.destroy();
    });
}

// run!
client();

And the output:

Timmys-MacBook-Pro:JavaScript z0ltan$ node callback.js 
Enter a number: 19
Received: 361

Again, this is equivalent to the C version – we simply pass the function name (which is a reference to the function object) to squarify as the callback. Note that readline hands the input to the callback as a string; n*n still works because the * operator coerces its operands to numbers.

However, we could do it more idiomatically using a lambda abstraction as follows:

const readline = require('readline');

const stream = readline.createInterface({
                    input: process.stdin,
                    output: process.stdout
                });

function squarify(n, cb) {
    cb(n*n);
}

function client() {
    stream.question("Enter a number: ", function(n) {
        squarify(n, function(x) {
            console.log("Received: " + x);
        });

        stream.close();
        process.stdin.destroy();
    });
}

// run!
client()

Note how the callback function has now been replaced by a lambda abstraction that performs the same operation.

And the output:

Timmys-MacBook-Pro:JavaScript z0ltan$ node callback_demo_lambda.js 
Enter a number: 19
Received: 361

Nice!

Python

In Python, just like in JavaScript, the function name itself is an intrinsic reference to the function object. Functions are, after all, first-class objects in Python, and we can simply pass them around like so:

def callback(n):
    print("Received: " + str(n))

def squarify(n, cb):
    cb(n*n)

def client():
    print("Enter a number: ", end='')
    n = int(input())
    
    squarify(n, callback)

if __name__ == '__main__':
    client()

Note that the code was written in Python 3. However, it will work with minimal changes in Python 2.x as well.

And the output:

Timmys-MacBook-Pro:Python z0ltan$ python3 callback_demo.py 
Enter a number: 19
Received: 361

However, since Python also supports a crude form of lambda abstractions, we could rewrite the demo like so:

def squarify(n, cb):
    cb(n*n)

def client():
    print("Enter a number: ", end='')
    n = int(input())

    squarify(n, lambda x: print("Received: " + str(x)))

if __name__ == '__main__':
    client()

So now we have simply passed the callback function as a lambda abstraction to the squarify function.

And just to verify, the output:

Timmys-MacBook-Pro:Python z0ltan$ python3 callback_demo_lambda.py 
Enter a number: 19
Received: 361

So that’s all for now! Next up, callbacks in Common Lisp, and how we can write a simple function to perform function composition.

References

Here are a few references related to the topics discussed in this blog post that you might find useful:


Interop mini-series – Calling C and C++ code from Common Lisp using CFFI (Part 1)

Starting with this post, the next few posts will dig into interoperability between various languages. The next couple of posts will cover calling C and C++ code from Common Lisp, and how to write callback functions in Common Lisp that plug into code residing in a shared library. This will make use of the cffi library.

Then the following two posts will cover the analogue in the Java world. To this end, we will make use of the JNA project to demonstrate interop between C/C++ and Java.

Finally, this series will (hopefully) conclude with a mini-project of sorts – a completely embedded JVM instance inside a Common Lisp image! A number of demos will illustrate different uses of embedding Java within Common Lisp. This is a bit of an undertaking though, and will definitely take some time to implement, not least because I want to extract the maximum amount of learning from this activity!

Contents

  1. Introducing the cffi library
  2. Demos
    1. Interop with C
    2. Interop with C++
  3. Summary of useful functions
  4. References

Setup used for this tutorial series

In order to keep things sane, I will be sticking to a single platform (unless otherwise noted) with the following configuration for this whole mini-series:

  • Mac OS X El Capitan system
  • 8 cores, 16GB RAM, 1600MHz processor
  • SBCL as my Common Lisp implementation
  • JDK 9
  • Apple LLVM 6.1.0 (with Clang as the frontend) as my C and C++ compiler

Note that even though the compiler used is LLVM, the behaviour is more or less the same as that of standard gcc/g++. The same flags also work for compilation, and the only difference vis-a-vis this tutorial will be how the shared (dynamic) library is created.

Introduction to the CFFI library

Let’s install the cffi library using QuickLisp (if you haven’t done so already) first:

LISP-TO-C-USER> (ql:quickload :cffi)
To load "cffi":
  Load 4 ASDF systems:
    alexandria babel trivial-features uiop
  Install 1 Quicklisp release:
    cffi
; Fetching #<URL "http://beta.quicklisp.org/archive/cffi/2016-03-18/cffi_0.17.1.tgz">
; 234.48KB
==================================================
240,107 bytes in 0.33 seconds (712.70KB/sec)
; Loading "cffi"
[package cffi-sys]................................
[package cffi]....................................
..................................................
[package cffi-features]
(:CFFI)

Now, let’s talk a bit about this library and the features that it provides. Links to the download site and manual are provided in the “References” section.

The cffi library is a cross-platform (across Common Lisp implementations that is) library that supports interop with C (and with C++, but we’ll talk more about that later). What this means is that you can load a native library (dylib, so file, DLL, etc.) and use the functions defined therein within your Lisp code.

The interop is two-way – the general case is that you want to invoke C functions from Lisp code, but you may also want to invoke a function in the library that expects a callback, and you can define this callback as pure Common Lisp code! Nifty, isn’t it?
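
As a small taster of that callback direction (a sketch of mine, not one of this post’s demos): cffi provides the defcallback macro, which turns a Lisp function into a pointer that C code can invoke.

;; a Lisp function exposed as a C-callable int (*)(int)
(cffi:defcallback square-cb :int ((n :int))
  (* n n))

;; (cffi:callback square-cb) yields the foreign pointer to hand over to C code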

The library is very well-designed and personally I find the Lispy nature of the APIs (and generated functions) an extra bonus.

The best way to learn the library is to see it in action, so let’s get on with it!

Note: Platform support for different features varies according to the quirks of the specific Common Lisp implementation. Refer to the cffi documentation for specifics.

Demos

These demos are aimed to be simple and small, and yet somewhat useful in terms of real-world applicability.

I personally feel that purely contrived demos are best avoided since they hardly teach anything well, most of all for the reason that they are extremely boring!

In the first demo, we will see how a C library may be loaded and run from Common Lisp. This will be the most common use case.

In the second demo, we will do the same, but for a C++ library with a C wrapper around the C++ functionality.

Since both demos are defined in the same package, let’s define the package first:

(require 'cffi)

(defpackage :lisp-to-c-user
  (:use :cl :cffi))

(in-package :lisp-to-c-user)

Interop with C

For the C example, I decided to write a small native library that fetches some useful system information – architecture type, model name, memory, number of CPUs, and the number of logical CPUs (cores).

Note that this example works only in Mac OS X. For Linux, the sysctlbyname function can be replaced by sysctl with appropriate changes. For Windows, you will have to check which kernel call provides the same functionality.

We will use the sysctlbyname function to extract these values.

Let’s define the header file first (in system_info.h):

#ifndef __SYSTEM_INFO_H__
#define __SYSTEM_INFO_H__ "system_info.h"

#include <sys/types.h>
#include <sys/sysctl.h>

#ifdef __cplusplus
extern "C" {
#endif

char* get_machine();
char* get_model();
int64_t get_memory();
int32_t get_ncpu();
int32_t get_nlogicalcpu();

#ifdef __cplusplus
}
#endif
#endif

And the corresponding C implementation (in system_info.c):

#include <stdio.h>
#include "system_info.h"

#define MAXSIZE 210

char* get_machine()
{
    static char machine[MAXSIZE];
    size_t len = sizeof(machine);

    sysctlbyname("hw.machine", &machine, &len, NULL, 0);

    return machine;
}

char* get_model()
{
    static char model[MAXSIZE];
    size_t len = sizeof(model);

    sysctlbyname("hw.model", &model, &len, NULL, 0);

    return model;
}

int64_t get_memory()
{
    int64_t mem;
    size_t len = sizeof(mem);

    sysctlbyname("hw.memsize", &mem, &len, NULL, 0);

    return mem;
}

int32_t get_ncpu()
{
    int32_t cpu;
    size_t len = sizeof(cpu);

    sysctlbyname("hw.ncpu", &cpu, &len, NULL, 0);

    return cpu;
}


int32_t get_nlogicalcpu()
{
    int32_t logical_cpu;
    size_t len = sizeof(logical_cpu);

    sysctlbyname("hw.logicalcpu", &logical_cpu, &len, NULL, 0);

    return logical_cpu;
}

int main(int argc, char* argv[])
{
    printf("%s, %s, %lld, %d, %d\n", 
            get_machine(),
            get_model(),
            get_memory(),
            get_ncpu(),
            get_nlogicalcpu());

    return 0;
}

Let’s compile it into a shared library (in this case, Clang + LLVM on Mac OS X. For other compilers such as gcc proper, check the relevant documentation):

Timmys-MacBook-Pro:c_demo_system_info z0ltan$ clang -dynamiclib -o libsysteminfo.dylib system_info.c

Timmys-MacBook-Pro:c_demo_system_info z0ltan$ ls
libsysteminfo.dylib	system_info.c		system_info.h

Excellent! Finally, let’s write the Common Lisp client code to use this library:

;;; C-demo

(define-foreign-library libsysteminfo
  (:darwin "libsysteminfo.dylib")
  (:unix "libsysteminfo.so")
  (t (:default "libsysteminfo.dylib")))

(load-foreign-library 'libsysteminfo)

(defcfun "get_machine" :string)
(defcfun "get_model" :string)
(defcfun "get_memory" :long-long)
(defcfun "get_ncpu" :int)
(defcfun "get_nlogicalcpu" :int)

(defun print-system-info ()
  (let ((arch (get-machine))
        (model (get-model))
        (mem (/ (get-memory) (* 1024 1024 1024)))
        (ncpu (get-ncpu))
        (nlogicalcpu (get-nlogicalcpu)))
    (format t "System Information~%")
    (format t "Arch: ~a, Model: ~a, Mem = ~dGB, CPUs = ~d, Logical CPUs = ~d~%"
            arch model mem ncpu nlogicalcpu)))

(print-system-info)

(close-foreign-library 'libsysteminfo)

And the output:

LISP-TO-C-USER> (print-system-info)
System Information
Arch: x86_64, Model: MacBookPro11,2, Mem = 16GB, CPUs = 8, Logical CPUs = 8
NIL

Explanation: We define the native library by using the cffi:define-foreign-library macro. This macro also allows us to define the specific name of the shared library depending on the OS.

Then we can load the specified library using the cffi:load-foreign-library macro. Take care to observe that the name of the library is quoted. Remembering this can save you a lot of anguish later on.

The next part is interesting – we use the cffi:defcfun macro to expose the C functions present in the library as Lispy functions. For instance, the C function “get_machine”, defined in the libsysteminfo.dylib library, is proxied into the current Lisp image as “get-machine”. There are ways to customise such name translation, but letting the cffi library take care of it is my recommendation.

The general syntax of the defcfun macro is:

	(cffi:defcfun <C-function-name> <return-type>
		<args with types>*)

So the first defcfun indicates that get_machine is a C function that returns a character array (represented by cffi’s local type, :string), and that it doesn’t take any parameters. The cffi library defines a huge set of types that map extremely well to C’s primitive, pointer, and structure types.
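
For a function that does take parameters, the same macro extends naturally. A sketch of mine (assuming the standard C math library is visible in the image, which it typically is): the C function double pow(double, double) could be proxied as:

;; proxy "pow" under the Lisp name C-POW, with two typed parameters
(cffi:defcfun ("pow" c-pow) :double
  (base :double)
  (exponent :double))

;; (c-pow 2.0d0 10.0d0) => 1024.0d0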

Now that we have created proxies for the C functions, we can invoke them, as seen in the print-system-info function, like ordinary Lisp functions.

Finally, we unload the native library using another macro, cffi:close-foreign-library, which also takes a quoted library representation.

Interop with C++

This is the more interesting demo for more than one reason! In this example, let’s try and sort an array of integers using the native library.

Again, let’s write the interface out first (in number_sorting.h):

#ifndef __NUMBER_SORTING_H__ 
#define __NUMBER_SORTING_H__ "number_sorting.h"

void callback_function(int[], int);

extern "C" {
    void sort_numbers(int[], int);
}
#endif

Looks good, but what’s the deal with the callback_function? We’ll get to that in just a moment. For now, let’s flesh out the functionality for this interface (in number_sorting.cpp):

#include <iostream>
#include <vector>
#include <algorithm>
#include <functional>

void sort_vector(std::vector<int>&, int[], int);

void callback_function(int array[], int size)
{
    std::vector<int> vec(size);

    sort_vector(vec, array, size);

    int i = 0;
    for (std::vector<int>::const_iterator it = vec.begin(); it != vec.end(); it++)
        array[i++] = *it;
}
    

// generic helper to print the elements of a vector
template <typename T>
void display_elements(const std::vector<T>& vec)
{
    for (typename std::vector<T>::const_iterator it = vec.begin(); it != vec.end(); it++)
        std::cout << *it << " ";
    std::cout << std::endl;
}


void sort_vector(std::vector<int>& v, int numbers[], int count)
{
    for (int i = 0; i < count; i++)
        v[i] = numbers[i];

    display_elements(v);

    std::sort(v.begin(), v.end(), [](int x, int y) { return x < y; });
}

int main()
{
    std::ios_base::sync_with_stdio(false);

    int sample[] = { 1, 2, 0, -1, 3, 199, 200, 110, -234, 12345 };

    callback_function(sample, sizeof(sample)/sizeof(sample[0]));

    for (int i = 0; i < (int)sizeof(sample)/sizeof(sample[0]); i++)
        std::cout << sample[i] << " ";
    std::cout << std::endl;

    return 0;
}

Hmmm, this seems a bit too convoluted for such a simple example, doesn’t it? Why all the indirection? The reason will become crystal clear once we define the corresponding C file (in number_sorting.c) as well:

#include "number_sorting.h"

void sort_numbers(int numbers[], int n)
{
    callback_function(numbers, n);
}

Explanation: The reasons why we need both number_sorting.c and number_sorting.cpp, both of which implement the same interface (number_sorting.h), are two-fold:

  1. Since we are using some C++-only features such as std::vector, std::sort, and C++11 lambdas, we need to invoke them in a separate function
  2. And the more important reason – C++’s pernicious name-mangling

Now, if we had simply written the entire sorting functionality using integer arrays and sorted with C-like constructs (say, qsort, or a manually written sorting function), we wouldn’t need all this indirection, and we could have simply written the header as:

#ifndef __NUMBER_SORTING_H__ 
#define __NUMBER_SORTING_H__ "number_sorting.h"

extern "C" {
    void sort_numbers(int[], int);
}
#endif

and provided the implementation in number_sorting.cpp alone. That would have worked out fine. However, because we use all those C++ templated constructs as well as functional constructs, if we had used this same header file, we would have got a name-mangling issue, and the function would not be visible to the Common Lisp client!

To get around this, we write a C wrapper (number_sorting.c) which simply invokes the C++ function callback_function defined in number_sorting.cpp. Now you may think that we could have simply embedded callback_function inside the definition of sort_numbers in the C++ file alone, but that would not work either. Check out the reference “How to mix C and C++” in the “References section” for more details.

All right, let’s compile the code and generate the shared library:

Timmys-MacBook-Pro:c++_demo_sorting z0ltan$ clang++ -std=c++11 -stdlib=libc++ -dynamiclib -o libnumbersorting.dylib number_sorting.c number_sorting.cpp
clang: warning: treating 'c' input as 'c++' when in C++ mode, this behavior is deprecated

Timmys-MacBook-Pro:c++_demo_sorting z0ltan$ ls
libnumbersorting.dylib	number_sorting.c	number_sorting.cpp	number_sorting.h

Timmys-MacBook-Pro:c++_demo_sorting z0ltan$ nm -gU libnumbersorting.dylib 
0000000000000a10 T __Z11sort_vectorRNSt3__16vectorIiNS_9allocatorIiEEEEPii
0000000000000730 T __Z17callback_functionPii
0000000000000c00 T _main
0000000000000700 T _sort_numbers

We can also see that the function sort_numbers has not been subjected to name-mangling – it appears as the plain _sort_numbers symbol. Now that we’ve resolved that, let’s flesh out the Common Lisp client and run the demo!

;;; C++-demo

(define-foreign-library libnumbersorting
  (:darwin "libnumbersorting.dylib")
  (:unix "libnumbersorting.so")
  (t (:default "libnumbersorting.dylib")))

(use-foreign-library libnumbersorting)

(defun sort-some-numbers (&optional (n 10))
  (with-foreign-object (numbers :int n)
    (dotimes (i n)
      (setf (mem-aref numbers :int i) (random 100)))
    (let ((before (loop for i below n
                     collect (mem-aref numbers :int i))))
      (format t "Before: ~{~d ~}~%" before))
    (foreign-funcall "sort_numbers" :pointer numbers :int n :void)
    (let ((after (loop for j below n
                    collect (mem-aref numbers :int j))))
      (format t "After: ~{~d ~}~%" after))))

(sort-some-numbers)

(close-foreign-library 'libnumbersorting)

And the output:

LISP-TO-C-USER> (sort-some-numbers 15)
Before: 52 11 18 62 39 89 2 48 48 66 73 89 73 26 97 
After: 2 11 18 26 39 48 48 52 62 66 73 73 89 89 97 
NIL

Cool!

Explanation: This demo differs only slightly from the C demo in terms of Common Lisp code. We define the native library in the same manner, but this time we use another macro, use-foreign-library, instead. This is my preferred way of loading a native library, since I always forget the quoting with load-foreign-library!

Jokes aside, we can see another way of executing a function defined in a native library: cffi:foreign-funcall.

This macro has the following syntax:

	(cffi:foreign-funcall <C-function-name> <args with types>*
		<return-type>)

I tend to prefer foreign-funcall for functions with only side-effects (as in this case), and use defcfun when I need to use the function in the Common Lisp part more than once. YMMV.
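
For instance, the get_machine function from the earlier C demo could equally have been invoked as a one-off, without a defcfun proxy (a sketch, assuming libsysteminfo is still loaded):

LISP-TO-C-USER> (cffi:foreign-funcall "get_machine" :string)
"x86_64"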

The most interesting bit, of course, is the with-foreign-object macro. I won’t bother to show its general syntax, but suffice it to say that this macro is used to allocate, set, and use foreign memory (i.e., memory outside the Lisp image), encapsulated within its body.

In this case, we simply generate a C integer array (note the usage of the type specifier, :int), set the values of the elements of this array using cffi:mem-aref, and read the values back using the same accessor. A stripped-down version of this pattern is sketched below.
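
Here is that pattern as a minimal, self-contained sketch (my own, not from the demo):

;; allocate three foreign ints, write to them, then read them back
(cffi:with-foreign-object (arr :int 3)
  (dotimes (i 3)
    (setf (cffi:mem-aref arr :int i) (* i i)))
  (loop for i below 3
        collect (cffi:mem-aref arr :int i)))
;; => (0 1 4)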

Note that the value of the variable numbers is a pointer, and also that it is available only within the body of the macro. In the next post, we will see how we can work with custom C-style structs.

Useful basic functions

Here is a summarised list of the functions used in the demos in this blog post.

  • cffi:define-foreign-library
  • cffi:load-foreign-library
  • cffi:close-foreign-library
  • cffi:use-foreign-library
  • cffi:defcfun
  • cffi:foreign-funcall
  • cffi:with-foreign-object
  • cffi:mem-aref

References

Some references that you might find useful on this subject matter:


How to compile a Java program embedded in a String? Here’s how! (Better)

I am not quite satisfied with the way I left things in the last post.

There was just a bit too much hand-waving and magic about. Since then, I’ve experimented a bit more, and found out that some of the claims made in the last couple of posts on this topic are blatantly wrong – for instance, this approach of dynamic compilation does require a JDK (even though it does not need javac). The javax.tools package depends on tools.jar, which does not come bundled with the JRE.

This post is meant to make amends for that mistake, plus show a better way of handling compilation of Java code stored inside a String object, without resorting to creating directories dynamically. In this attempt, the flow is as follows:

  • The client code now invokes a custom compiling class loader.
  • The class loader then compiles the code stored inside the string.
  • The code is then loaded by the class loader into the current JVM.
  • Finally, the code is executed by the Client using Reflection.

Contents

  1. Code Listing
    1. The Class Loader
    2. The Compiler
  2. Testing
    1. The Client
    2. Test Run
  3. Conclusion

The Code

The code is organised as follows:

Timmys-MacBook-Pro:Better z0ltan$ tree
.
├── com
│   └── z0ltan
│       ├── compilers
│       │   └── JavaStringCompiler.java
│       └── loaders
│           └── CompilingClassLoader.java
└── org
    └── z0ltan
        └── client
            └── Client.java

The Class Loader

package com.z0ltan.loaders;

import java.io.File;
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import java.nio.file.Files;
import java.nio.file.Paths;

import java.util.regex.Pattern;
import java.util.regex.Matcher;
import java.util.logging.Logger;

import com.z0ltan.compilers.JavaStringCompiler;

public class CompilingClassLoader extends ClassLoader {
    private static final CompilingClassLoader __instance =
        new CompilingClassLoader();

    private static final Logger logger =
        Logger.getLogger(CompilingClassLoader.class.getName());

    private Pattern namePattern;
    private Pattern packagePattern;

    private CompilingClassLoader() {
        this.namePattern =
            Pattern.compile(".*class[ ]+([a-zA-Z0-9$_]+).*");
        this.packagePattern =
            Pattern.compile(".*package[ ]+([a-zA-Z0-9$_.]+).*");
    }

    public static CompilingClassLoader getInstance() {
        return __instance;
    }

    // load the class file after compiling the code
    public Class<?> loadClassFromString(final String program) throws ClassNotFoundException {
        final String className = getClassName(program);
        final String packagePath = getPackagePath(program);

        final String fullClassName;
        if (packagePath != null) {
            fullClassName = packagePath + '.' + className;
        } else {
            fullClassName = className;
        }

        logger.info("Loading " + fullClassName);

        // compile it!
        boolean result =
            JavaStringCompiler.INSTANCE
                .compileStringCode(fullClassName, program);

        if (result) {
            byte[] classBytes = getClassBytes(className);
            if (classBytes != null) {
                logger.info("Loaded " + fullClassName);
                return defineClass(fullClassName, classBytes, 0, classBytes.length);
            } else
                throw new ClassNotFoundException("Unable to load: " + fullClassName +
                                                 ". Reason = failed to load class bytes.");
        } else
            throw new ClassNotFoundException("Unable to load: " + fullClassName +
                                             ". Reason = compilation failed.");
    }

    private String getClassName(final String program) {
        Matcher m = namePattern.matcher(program);

        if (m.matches() && (m.groupCount() == 1)) {
            return m.group(1);
        }
        throw new RuntimeException("Could not find main class to load!");
    }

    private String getPackagePath(final String program) {
        Matcher m = packagePattern.matcher(program);

        if (m.matches() && (m.groupCount() == 1)) {
            return m.group(1);
        }
        return null;
    }        

    private byte[] getClassBytes(final String className) {
        final String classFilePath =
            className.replace('.', File.separatorChar) + ".class";

        try (BufferedInputStream bin = new BufferedInputStream(new FileInputStream(classFilePath));
             ByteArrayOutputStream baos = new ByteArrayOutputStream()) {
            byte[] buffer = new byte[4 * 1024];
            int bytesRead = -1;

            while ((bytesRead = bin.read(buffer)) != -1) {
                baos.write(buffer, 0, bytesRead);
            }

            // delete the class file before returning
            try {
                Files.deleteIfExists(Paths.get(classFilePath));
            } catch (IOException ex) {
                //
            }

            return baos.toByteArray();
        } catch (IOException ex) {
            return null;
        }
    }
}

The code is as simple as it gets. The loadClassFromString method takes in the Java String containing our program, and constructs the full class name by appending the package path (if any).

It then invokes the compiler to generate the .class file. Finally, it deletes the .class file once the bytes have been read from it, thus cleaning up the unnecessary file.

The Compiler

package com.z0ltan.compilers;

import java.net.URI;

import java.io.File;
import java.io.IOException;

import java.util.Collections;
import java.util.Arrays;
import java.util.logging.Logger;

import javax.tools.ToolProvider;
import javax.tools.JavaCompiler;
import javax.tools.DiagnosticCollector;
import javax.tools.Diagnostic;
import javax.tools.StandardJavaFileManager;
import javax.tools.SimpleJavaFileObject;
import javax.tools.JavaFileObject;

public enum JavaStringCompiler {
    INSTANCE;

    private JavaCompiler compiler;
    private DiagnosticCollector<JavaFileObject> collector;
    private StandardJavaFileManager manager;

    private static final Logger logger =
        Logger.getLogger(JavaStringCompiler.class.getName());

    private JavaStringCompiler() {
        this.compiler = ToolProvider.getSystemJavaCompiler();
        this.collector  = new DiagnosticCollector<JavaFileObject>();
        this.manager = compiler.getStandardFileManager(collector, null, null);
    }

    // class to represent a string object as a source file
    class StringCodeObject extends SimpleJavaFileObject {
        private String code;

        StringCodeObject(final String name, final String code) {
            super(URI.create("string:///" + name.replace('.', File.separatorChar) +
                             Kind.SOURCE.extension),
                  Kind.SOURCE);
            this.code = code;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return this.code;
        }
    }

    // Compile the Java code stored inside the string
    public boolean compileStringCode(final String name, final String code) {
        logger.info("Compiling: " + name);

        boolean result = false;
        StringCodeObject source = new StringCodeObject(name, code);

        result = compiler.getTask(null, manager, null, null, null,
                                  Collections.unmodifiableList(Arrays.asList(source))).call();

        // display errors, if any
        for (Diagnostic<? extends JavaFileObject> d : collector.getDiagnostics()) {
            System.err.format("Error at line: %d, in file: %s\n",
                              d.getLineNumber(),
                              d.getSource().toUri());
        }

        try {
            manager.close();
        } catch (IOException ex) {
            //
        }

        logger.info("Finished compiling: " + name);

        return result;
    }
}

This again makes use of the same facilities available in the javax.tools package. However, unlike last time, we now make sure to use the StandardJavaFileManager so that compilation errors are reported correctly. Apparently, even explicitly specifying a DiagnosticCollector instance in the compiler.getTask() call does not work – we need a file manager to be able to catch the errors.

The rest of the code is pretty much the same as last time.

Testing

The testing code again covers the two scenarios – using the default package, and using a more realistic package structure.

The Client

Here is the sample client that we will use to test the code:

// The Client code

package org.z0ltan.client;

import java.lang.reflect.Method;
import java.util.logging.Logger;

import com.z0ltan.loaders.CompilingClassLoader;

public class Client {
    private static final Logger logger =
        Logger.getLogger(Client.class.getName());

    public static void main(String[] args) throws Exception {
        final String simpleProgram = "public class SimpleProgram {" +
            " public static void main(String[] args) {" +
            "    System.out.println(\"Hello from SimpleProgram!\");}}";

        testSimpleProgram(simpleProgram);

        final String complexProgram = "package foo.bar.baz.quux;" +
            "import java.util.Random;" +
            "public class ComplexProgram {"+
            "  public static void main(String[] args) {" +
            "   System.out.println(\"'Sup from Fubar\");}" +
            "public int getRandomNumber() {" +
            "  return (new Random()).nextInt(100);}}";

        testComplexProgram(complexProgram);
    }

    private static void testSimpleProgram(final String simpleProgram) throws Exception {
        logger.info("Testing SimpleProgram");

        Class<?> simpleClazz =
            CompilingClassLoader.getInstance().loadClassFromString(simpleProgram);

        if (simpleClazz != null) {
            Method main = simpleClazz.getDeclaredMethod("main", String[].class);

            if (main != null) {
                main.invoke(null, (Object)null);
            }
        }
        logger.info("Finished testing SimpleProgram");
    }

    private static void testComplexProgram(final String complexProgram) throws Exception {
        logger.info("Testing ComplexProgram");

        Class<?> complexClazz =
            CompilingClassLoader.getInstance().loadClassFromString(complexProgram);

        if (complexClazz != null) {
            Object obj = complexClazz.getConstructor().newInstance();
            if (obj != null) {
                Method main = complexClazz.getDeclaredMethod("main", String[].class);

                if (main != null) {
                    main.invoke(null, (Object)null);
                }

                Method getRandomNumber = complexClazz.getDeclaredMethod("getRandomNumber");
                if (getRandomNumber != null) {
                    int n = (int)getRandomNumber.invoke(obj);
                    System.out.format("Random number = %d\n", n);
                }
            }
        }
        logger.info("Finished testing ComplexProgram");
    }
}

Test Run

Timmys-MacBook-Pro:Better z0ltan$ javac -cp . com/z0ltan/compilers/JavaStringCompiler.java 

Timmys-MacBook-Pro:Better z0ltan$ javac -cp . com/z0ltan/loaders/CompilingClassLoader.java 

Timmys-MacBook-Pro:Better z0ltan$ javac -cp . org/z0ltan/client/Client.java 

Timmys-MacBook-Pro:Better z0ltan$ java -cp . org.z0ltan.client.Client

Sep 16, 2016 2:17:49 PM org.z0ltan.client.Client testSimpleProgram
INFO: Testing SimpleProgram
Sep 16, 2016 2:17:49 PM com.z0ltan.loaders.CompilingClassLoader loadClassFromString
INFO: Loading SimpleProgram
Sep 16, 2016 2:17:50 PM com.z0ltan.compilers.JavaStringCompiler compileStringCode
INFO: Compiling: SimpleProgram
Sep 16, 2016 2:17:50 PM com.z0ltan.compilers.JavaStringCompiler compileStringCode
INFO: Finished compiling: SimpleProgram
Sep 16, 2016 2:17:50 PM com.z0ltan.loaders.CompilingClassLoader loadClassFromString
INFO: Loaded SimpleProgram
Hello from SimpleProgram!
Sep 16, 2016 2:17:50 PM org.z0ltan.client.Client testSimpleProgram
INFO: Finished testing SimpleProgram
Sep 16, 2016 2:17:50 PM org.z0ltan.client.Client testComplexProgram
INFO: Testing ComplexProgram
Sep 16, 2016 2:17:50 PM com.z0ltan.loaders.CompilingClassLoader loadClassFromString
INFO: Loading foo.bar.baz.quux.ComplexProgram
Sep 16, 2016 2:17:50 PM com.z0ltan.compilers.JavaStringCompiler compileStringCode
INFO: Compiling: foo.bar.baz.quux.ComplexProgram
Sep 16, 2016 2:17:50 PM com.z0ltan.compilers.JavaStringCompiler compileStringCode
INFO: Finished compiling: foo.bar.baz.quux.ComplexProgram
Sep 16, 2016 2:17:50 PM com.z0ltan.loaders.CompilingClassLoader loadClassFromString
INFO: Loaded foo.bar.baz.quux.ComplexProgram
'Sup from Fubar
Random number = 63
Sep 16, 2016 2:17:50 PM org.z0ltan.client.Client testComplexProgram
INFO: Finished testing ComplexProgram

And just to verify that no .class files are left in the file system:

Timmys-MacBook-Pro:Better z0ltan$ ls
com		compiler.jar	loader.jar	org

Excellent! That’s much better.

Conclusion

Well, that was much more satisfying than the unnecessary complexity (and extra work!) of the previous version of this program. There is now better separation of concerns and better diagnostics, and the code is much easier to read and understand.

Till next time then, folks!


How to compile a Java program embedded in a String? Here’s how!

Taking a small break from the mini-series on interop between languages (and also working on the main project – embedding a JVM instance in a Common Lisp image), I thought I’d share something that I frankly found amazing – how to compile and run a complete Java program contained inside a Java String! Who needs files any more, eh?

The implications of this are enormous – this means that on any machine where there is a JVM (version 6 or above) installed, even without the JDK itself*, not only can we compile and run arbitrary Java code, but also pass around Java code in strings, and compile and run them on the fly!

*Of course, it’s understood that the compiler code itself must be available in some form – as a class file, or as a Jar file.

Contents

  1. A little background
  2. Implementation
    1. Compile Java code inside a String
    2. Compile Java code inside a String – fixed
  3. References

Background

I had previously written a post on how to compile Java code using the classes in the javax.tools package. That deals with the first part of the equation – how to dynamically compile Java code without resorting to brittle approaches like spawning OS processes, and without any dependency on javac.

You can find that discussion here – Dynamic Compilation of Java code (no JDK).

In this post, we will focus on the second aspect – how to store an entire Java program inside a Java String instance, and then compile and run the code.

Note: Since the javax.tools package was introduced only in Java 6, it’s only natural to deduce that these examples will work only in Java 6 (or above).

All right then, let’s get on with it!

Implementation

The implementation will be in two parts. We’ll first implement a solution that appears to do the job – it can compile a Java program contained inside a string, and in many cases, even run it perfectly.

However, there is a major problem with the first implementation that we will then propose to fix by taking care of the various scenarios that might crop up in real-world use.

A first go

The way we implement this solution is quite similar to the one we used to compile Java source files. I recommend checking out the earlier post (mentioned in the Background section), but in the case of compiling a Java source string, we have just the following steps:

  • Get an instance of JavaCompiler using ToolProvider.
  • Construct the “compilation units” (an Iterable of JavaFileObject instances). In this specific case, we will need to construct a sub-type of SimpleJavaFileObject to pass in as the “compilation unit”.
  • Finally, create a compilation task using the compiler, and invoke call.

Well, that’s the general idea anyway. Let’s code that up now:

import java.io.File;
import java.io.IOException;

import java.util.Collections;
import java.util.Arrays;
import java.util.regex.Pattern;
import java.util.regex.Matcher;
import java.util.logging.Logger;

import java.net.URI;

import javax.tools.ToolProvider;
import javax.tools.JavaCompiler;
import javax.tools.SimpleJavaFileObject;
import javax.tools.JavaFileObject;
import javax.tools.JavaFileObject.Kind;


public enum JavaStringCompilerBasic {
	INSTANCE;
	
	private final JavaCompiler compiler;

	private Pattern namePattern;
		
	private JavaStringCompilerBasic() {
		this.compiler = ToolProvider.getSystemJavaCompiler();
		this.namePattern = Pattern.compile(".*class[ ]+([a-zA-Z0-9$_]+).*");
	}
	
	// class that defines the Java String object as a valid source file
	class EmbeddedJavaSource extends SimpleJavaFileObject {
		private String code;
		
		EmbeddedJavaSource(final String name, final String code) {
			super(URI.create("string:///" + 
							name.replace('.', File.separatorChar) + 
							Kind.SOURCE.extension),
					Kind.SOURCE);
		this.code = code;
		}
		
		@Override
		public CharSequence getCharContent(boolean ignoreEncodingErrors) {
			return this.code;
		}
	}

	/**
	  * Compile the Java code embedded inside the String.
	  * 
	  * @param program Java code
	  */	
	public void compileJavaString(final String program) {
		final String className = getClassName(program);
		
		final EmbeddedJavaSource sourceString 
				= new EmbeddedJavaSource(className, program);
		
		compiler.getTask(null, null, null, null, null, 
			Collections.unmodifiableList(Arrays.asList(sourceString))).call();
	}
	
	private String getClassName(final String program) {
		final Matcher m = namePattern.matcher(program);
		
		if (m.matches() && (m.groupCount() == 1)) {
			return m.group(1);
		} else {
			throw new RuntimeException("Could not extract class name");
		}
	}

	public static void main(String[] args) {
		final String demoProgram 
			= "public class DemoProgram { " +
			  "   public static void main(String[] args) { " +
			  "       System.out.println(\"Hello from DemoProgram!\");" +
			  "    }" +
			  "}"; 	
	
		JavaStringCompilerBasic.INSTANCE.compileJavaString(demoProgram);
	}
}

As you can see, our little demo program is stored inside the demoProgram variable in the main method.


Notes: The main point of interest here is the EmbeddedJavaSource class, which extends SimpleJavaFileObject. In this case, it doesn’t do much apart from creating a pseudo-URI pertaining to the fully-qualified name of the class (within the Java string), and returning the string itself as the source via getCharContent.

Take it for a little spin:

Timmys-MacBook-Pro:Basic z0ltan$ javac -cp . JavaStringCompilerBasic.java 

Timmys-MacBook-Pro:Basic z0ltan$ java -cp . JavaStringCompilerBasic

Timmys-MacBook-Pro:Basic z0ltan$ java -cp . DemoProgram
Hello from DemoProgram!

Brilliant! Everything seems hunky-dory. Well, not quite. Let’s clean out the class files and try another example to see if it is indeed working as we expect.

Here is the snippet we add to main:


final String problemProgram
    = "package com.foo.bar;" +
      "public class ProblemDemoProgram {" +
      "   public static void main(String[] args) {" +
      "      System.out.println(\"Hello from ProblemDemoProgram!\");" +
      "   }" +
      "}";

So what’s the problem? Let’s run it:

Timmys-MacBook-Pro:Basic z0ltan$ java -cp . ProblemDemoProgram
Error: Could not find or load main class ProblemDemoProgram

Ah! That doesn’t look too good! What really happened is this – the compilation went through fine because the compiler doesn’t really care what the package declaration inside the embedded Java program is (the file doesn’t really exist on the file system, after all).

So it goes ahead and generates the .class file anyway. Now, when we try to run it, the java tool reads the byte code, sees a package declaration com.foo.bar, and tries to load up com/foo/bar/ProblemDemoProgram.class which, clearly, doesn’t exist!

So how do we fix it? The solution I propose is to make this use case succeed instead of failing: read the package declaration (if any) from the embedded Java program, create the corresponding directory structure, and move the generated .class file into it.

This will ensure that when java goes looking for com/foo/bar/ProblemDemoProgram.class, it really does find it, load it, and run it.

We’ll see what that solution looks like in the next section.

Final version


import java.io.File;
import java.io.IOException;

import java.nio.file.Files;
import java.nio.file.Paths;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

import java.util.Collections;
import java.util.Arrays;
import java.util.regex.Pattern;
import java.util.regex.Matcher;
import java.util.logging.Logger;

import java.net.URI;

import javax.tools.ToolProvider;
import javax.tools.JavaCompiler;
import javax.tools.SimpleJavaFileObject;
import javax.tools.JavaFileObject;
import static javax.tools.JavaFileObject.Kind;


public enum JavaStringCompilerFinal {
	INSTANCE;
	
	private final JavaCompiler compiler;

	private Pattern namePattern;
	private Pattern pkgPattern;
		
	private JavaStringCompilerFinal() {
		this.compiler = ToolProvider.getSystemJavaCompiler();
		this.namePattern = Pattern.compile(".*class[ ]+([a-zA-Z0-9$_]+).*");
		this.pkgPattern = Pattern.compile(".*package[ ]+([a-zA-Z0-9$_.]+).*");
	}
	
	// class that defines the Java String object as a valid source file
	class EmbeddedJavaSource extends SimpleJavaFileObject {
		private String code;
		
		EmbeddedJavaSource(final String name, final String code) {
			super(URI.create("string:///" + 
							name.replace('.', '/') + 
							Kind.SOURCE.extension),
					Kind.SOURCE);
			this.code = code;
		}
		
		@Override
		public CharSequence getCharContent(boolean ignoreEncodingErrors) {
			return this.code;
		}
	}

	/**
	  * Compile the Java code embedded inside the String.
	  * 
	  * @param program Java code
	  */	
	public void compileJavaString(final String program) {
		String className = getClassName(program);
		
		final String packagePath = getPackagePath(program);
		
		if (packagePath != null) {
			makePackagePaths(packagePath);
			className = packagePath + '.' + className;
		} 
		
		final EmbeddedJavaSource sourceString 
				= new EmbeddedJavaSource(className, program);

		compiler.getTask(null, null, null, null, null, 
				Collections.unmodifiableList(Arrays.asList(sourceString))).call();
				
		// move the compiled class into the created folder
		if (packagePath != null) {
			moveClassIntoPackagePath(className);
		}
	}
	
	private void moveClassIntoPackagePath(final String className) {
		final String sourceFile
			= className.substring(className.lastIndexOf('.') +1)
				+ ".class";

		final String targetFile 
			= className.substring(0, className.lastIndexOf('.'))
						.replace('.', File.separatorChar)
						+ File.separatorChar
						+ sourceFile;
						
		try {
			Files.move(Paths.get(sourceFile), 
					   Paths.get(targetFile), 
					   StandardCopyOption.REPLACE_EXISTING);
		} catch (IOException ex) {
			throw new RuntimeException("Error while moving file: " + sourceFile + " to "
				+ targetFile + ". Message = " + ex.getLocalizedMessage());
		}
	}
	
	private void makePackagePaths(final String pkgPath) {
		final String pkgFilePath = pkgPath.replace('.', File.separatorChar);
		
		try {
			if (!Files.exists(Paths.get(pkgFilePath))) {
				Files.createDirectories(Paths.get(pkgFilePath));
			}
		} catch (IOException ex) {
			throw new RuntimeException("Could not create directories: " +
						pkgFilePath);
		}
	}
	
	private String getPackagePath(final String program) {
		final Matcher m = pkgPattern.matcher(program);
		
		if (m.matches() && (m.groupCount() == 1)) {
			return m.group(1);
		} 
		return null;
	}
			 
	private String getClassName(final String program) {
		final Matcher m = namePattern.matcher(program);
		
		if (m.matches() && (m.groupCount() == 1)) {
			return m.group(1);
		} else {
			throw new RuntimeException("Could not extract class name");
		}
	}
	
	
	public static void main(String[] args) {
		final String demoProgram 
			= "public class DemoProgram { " +
			  "   public static void main(String[] args) { " +
			  "       System.out.println(\"Hello from DemoProgram!\");" +
			  "    }" +
			  "}"; 	
	
		JavaStringCompilerFinal.INSTANCE.compileJavaString(demoProgram);
		
		final String problemProgram
			= "package com.foo.bar;" +
			  "public class ProblemDemoProgram {" +
			  "   public static void main(String[] args) {" +
			  "      System.out.println(\"Hello from ProblemDemoProgram!\");" +
			  "  }" +
			  "}";
			  
		JavaStringCompilerFinal.INSTANCE.compileJavaString(problemProgram);
	}
}

And a sample test run with the same example as in the previous section:

Timmys-MacBook-Pro:Final z0ltan$ javac -cp . JavaStringCompilerFinal.java 

Timmys-MacBook-Pro:Final z0ltan$ java -cp . JavaStringCompilerFinal

Timmys-MacBook-Pro:Final z0ltan$ ls
DemoProgram.class					JavaStringCompilerFinal.class		com
JavaStringCompilerFinal$EmbeddedJavaSource.class	JavaStringCompilerFinal.java

Timmys-MacBook-Pro:Final z0ltan$ java -cp . DemoProgram
Hello from DemoProgram!

Timmys-MacBook-Pro:Final z0ltan$ java -cp . com.foo.bar.ProblemDemoProgram
Hello from ProblemDemoProgram!

Timmys-MacBook-Pro:Final z0ltan$ tree com/
com/
└── foo
    └── bar
        └── ProblemDemoProgram.class

2 directories, 1 file

Woo-hoo! As we can see, the directory structure is created, the .class file moved into the directory, and the code runs beautifully.

Some observations:

The only difference from the first version is the extra code to extract the package details from the program source (using a regex), create the corresponding directories, and move the generated class file into them. The program is then invoked normally by specifying the fully-qualified name of the class.

Of course, the javax.tools package is primarily aimed at tool writers who want to invoke compilers programmatically and build custom compilation tooling on top of Java, but covering that use case is beyond the scope of this blog.

For instance, a better approach would have been to implement a custom JavaFileManager that takes care of creating the directories (if needed), or to use a custom class loader to load the compiled class directly (in case we don’t need the .class file on disk, etc.). The possibilities are endless!

References


Here is basically the only reference that you need (really) to get started with this topic, and run with it!

Till next time then, folks!


Basic Concurrency and Parallelism in Common Lisp – Part 4b (Parallelism using lparallel – Error Handling)

In this final part, we will discuss the very important topic of error handling, how lparallel handles it, and cap off the series with a small benchmarking example that will tie in all the concepts covered thus far.

The demo will help check actual core usage on the machine when using the lparallel library.

Contents

Initial Setup

There is no additional setup required for this tutorial beyond what was done in the previous one.

In case you missed it, please check out the previous post – Parallelism fundamentals using lparallel.

5-minute Error Handling refresher

Before we jump headlong into the demos, here is a quick refresher on Conditions and Restarts in Common Lisp (error handling). In case you are comfortably familiar with this topic, please skip ahead to the next section.

In case you are a novice interested in getting a more comprehensive treatment of Conditions and Restarts in Common Lisp, I recommend two things – firstly, check out my detailed post on the fundamentals of Conditions and Restarts in Common Lisp, and secondly, check out the links in the References section at the end of this post.

For our refresher, let’s take a simple example. We have a custom square root function. To keep things simple, we have a single check to ensure that the argument is zero or positive. We will forgo all other validation.

First we define the relevant error condition:

(defpackage :positive-sqrt-user
  (:use :cl))

(in-package :positive-sqrt-user)

;;; define the error condition
(define-condition negative-error (error)
  ((message :initarg :message :reader error-message)))

Now let’s define the square root function itself. It is a simple implementation of the Newton-Raphson algorithm for finding the square root of a positive number (or zero). We take the first approximation/guess as 1.0d0:

(defconstant +eps+ 1e-9)

(defun square-root (n)
  "Find the square root using the Newton-Raphson method."
  (if (< n 0)
      (error 'negative-error :message "number must be zero or positive"))
  (let ((f 1.0d0))
    (loop
       when (< (abs (- (* f f) n)) +eps+)
       do (return f)
       do (setf f (/ (+ f (/ n f)) 2.0d0)))))

Nothing special there. The function simply loops until the candidate square root is within acceptable limits of the actual square root of the argument. For the sake of completeness, the key step in the algorithm is the following:

(setf f (/ (+ f (/ n f)) 2.0d0))

This is as per the formula for calculating the next square root approximation at each stage:

x_n = \frac{1}{2}\left(x_{n-1} + \frac{n}{x_{n-1}}\right)

In terms of error handling, we can handle the error in three different canonical ways (amongst others).

First, we can catch and process the error directly (similar to the try-catch-finally construct in some other languages):

;;; handle the error directly
(defun test-sqrt-handler-case ()
  (let ((n (progn
             (princ "Enter a number: ")
             (read))))
    (unwind-protect (handler-case (square-root n)
                      (negative-error (o) (format t "Caught ~a~%" (error-message o)) nil))
      (format t "Nothing to clean up!"))))

Testing it out:

POSITIVE-SQRT-USER> (test-sqrt-handler-case)
Enter a number: 200
Nothing to clean up!
14.142135623730955d0

POSITIVE-SQRT-USER> (test-sqrt-handler-case)
Enter a number: -200
Caught number must be zero or positive
Nothing to clean up!
NIL

Or, we could handle it automatically using a restart. Suppose we want to automatically return 1.0d0 as the result if we encounter an invalid argument to square-root; we could do something like this:

;;; automatic restart
(defun test-sqrt-handler-bind ()
  (let ((n (progn
             (princ "Enter a number: ")
             (read))))
    (handler-bind
        ((negative-error #'(lambda (c)
                             (format t "Caught: ~a~%" (error-message c))
                             (invoke-restart 'return-one))))
      (restart-case (square-root n)
        (return-one () 1.0d0)))))

Test run:

POSITIVE-SQRT-USER> (test-sqrt-handler-bind)
Enter a number: 200

14.142135623730955d0
POSITIVE-SQRT-USER> (test-sqrt-handler-bind)
Enter a number: -200
Caught: number must be zero or positive
1.0d0

Of course, the real usefulness of this scheme is realised when we have more restart cases available than these trivial ones.

And finally, we could handle it interactively, which allows us to enter a new value for the argument to square-root. (This interactive mode of development/operation is unique to the Lisp world).

(defun read-new-value ()
  (format *query-io* "Enter a new value: ")
  (force-output *query-io*)
  (multiple-value-list (read)))

;;; Interactive restart
(defun test-sqrt-interactive ()
  (let ((n (progn
             (princ "Enter a number: ")
             (read))))
    (restart-case (square-root n)
      (return-nil () nil)
      (enter-new-value (num)
        :report "Try entering a positive number."
        :interactive (lambda () (read-new-value))
        (square-root num)))))

Test drive!

POSITIVE-SQRT-USER> (test-sqrt-interactive)
Enter a number: 200

14.142135623730955d0

POSITIVE-SQRT-USER> (test-sqrt-interactive)
Enter a number: -200

Condition POSITIVE-SQRT-USER::NEGATIVE-ERROR was signalled.
   [Condition of type NEGATIVE-ERROR]

Restarts:
 0: [RETURN-NIL] RETURN-NIL
 1: [ENTER-NEW-VALUE] Try entering a positive number.
 2: [RETRY] Retry SLIME REPL evaluation request.
 3: [*ABORT] Return to SLIME's top level.
 4: [ABORT] abort thread (#<THREAD "repl-thread" RUNNING {1004E98003}>)

Enter a new value: 200

14.142135623730955d0

Error Handling in lparallel


lparallel provides the lparallel:task-handler-bind construct. This is, for all intents and purposes, equivalent to the handler-bind construct in Common Lisp; however, it is designed for handling errors inside parallel tasks launched using the lparallel library.

The problem


Why is this important? Well, take the following example for instance:

(define-condition foo (error) ())

;;; error handling with handler-bind
(defun test-errors-normal ()
  (handler-bind
      ((foo #'(lambda (c)
                (declare (ignore c))
                (invoke-restart 'print-error-message))))
    (pmap 'vector #'(lambda (x)
              (declare (ignore x))
              (restart-case (error 'foo)
                (print-error-message () "error!")))
          '(1 2 3 4 5))))

We declare a handler-bind in the current thread, and we invoke the restart print-error-message when we encounter an error of type foo.

Then we have a single pmap call inside the handler-bind. Notice that we define the restart-case inside the lambda function passed to pmap.

Now, inside the lambda function, we explicitly signal foo. Our expectation then is that the result of the operation is a vector of size 5, with each element being “error!”, right? Wrong! Here’s what we get instead:

Condition CONDS-RESTARTS-USER::FOO was signalled.
   [Condition of type CONDS-RESTARTS-USER::FOO]

Restarts:
 0: [PRINT-ERROR-MESSAGE] CONDS-RESTARTS-USER::PRINT-ERROR-MESSAGE
 1: [TRANSFER-ERROR] Transfer this error to a dependent thread, if one exists.
 2: [KILL-ERRORS] Kill errors in workers (remove debugger instances).
 3: [ABORT] abort thread (#<THREAD "lparallel" RUNNING {1002A01EF3}>)

So what happened? The transfer-error restart presents a clue. The reason the code didn’t work is that the error was signalled in a different context (inside a task running on a worker thread), whereas we were trying to handle it in the current thread. To fix this, we can modify the code so that the handler-bind is placed inside the lambda function itself, in the same thread context:

;;; error handling with handler-bind modified
(defun test-errors-normal-modified ()
  (pmap 'vector #'(lambda (x)
                    (declare (ignore x))
                    (handler-bind
                        ((foo #'(lambda (c)
                                  (declare (ignore c))
                                  (invoke-restart 'print-error-message))))
                      (restart-case (error 'foo)
                        (print-error-message () "error!"))))
        '(1 2 3 4 5)))

Take it for a spin:

CONDS-RESTARTS-USER> (test-errors-normal-modified)

#("error!" "error!" "error!" "error!" "error!")

And now we see the correct output! However, this approach does not scale. Imagine having 100 tasks, each with its own handler-bind! This is one of the compelling reasons to use what the library provides us – lparallel:task-handler-bind – as we shall see next.

The solution


The lparallel:task-handler-bind version of the code looks like this:

;;; error handling with task-handler-bind
(defun test-errors-lparallel ()
  (task-handler-bind
      ((foo #'(lambda (c)
                (declare (ignore c))
                (invoke-restart 'print-error-message))))
    (pmap 'vector #'(lambda (x)
              (declare (ignore x))
              (restart-case (error 'foo)
                (print-error-message () "error!")))
          '(1 2 3 4 5))))

And the output is exactly what we expect:

CONDS-RESTARTS-USER> (test-errors-lparallel)

#("error!" "error!" "error!" "error!" "error!")

All we did was to replace handler-bind with lparallel:task-handler-bind in the original code!

Note: You can still override the behaviour per task. For instance, (lparallel:task-handler-bind ((error #'invoke-transfer-error)) …) automatically transfers the error to a thread capable of providing a proper restart for the error condition (if one exists), while (lparallel:task-handler-bind ((error #'invoke-debugger)) …) always triggers the debugger inside the worker (good for interactive mode).
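
As a quick sketch of the first variant (reusing the foo condition defined above), the transferred error can then be handled in the submitting thread with a plain handler-case:

;;; transfer worker errors to the thread awaiting the result,
;;; and handle them there like any ordinary condition
(defun test-errors-transfer ()
  (handler-case
      (task-handler-bind ((error #'invoke-transfer-error))
        (pmap 'vector #'(lambda (x)
                          (declare (ignore x))
                          (error 'foo))
              '(1 2 3 4 5)))
    (foo () "caught foo outside the workers!")))

;;; evaluating (test-errors-transfer) should return the string above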

Let’s move on now to the demo to complete this whole series!

Demos


The best way of observing performance differences between parallel and non-parallel operations is through a real example (albeit a simple one).

Prime number generation


The code:

;;;; A benchmarking demo using prime number generation.

(defpackage :benchmarking-demo
  (:use :cl :lparallel))

(in-package :benchmarking-demo)

;;; error conditions
(define-condition prime-number-error (error) ())

(defun primep (x)
  (cond ((<= x 0)
         (error 'prime-number-error))
        ((= x 1)
         nil)
        ((= x 2)
         t)
        (t (loop for i from 2 to (floor (sqrt x))
              when (zerop (mod x i))
              do (return nil)
              finally (return t)))))

;;; prime number generation
(defun gen-prime-numbers (start end)
  (premove-if-not #'(lambda (x)
                      (restart-case (if (primep x) t nil)
                        (just-continue () nil)))
                  (loop for i from start to end
                     collect i)))

(defun prime-client ()
  (task-handler-bind
      ((prime-number-error #'(lambda (c)
                               (declare (ignore c))
                               (invoke-restart 'just-continue))))
    (dotimes (i 1000000000000)
      (gen-prime-numbers (1+ i) (+ i 1000000))
      (incf i 1000000))))

This is a direct implementation of the basic prime number test – check divisibility from 2 up to sqrt(number). I’m basically creating 1e6 chunks of 1e6 numbers each for the prime number test.

premove-if-not simply keeps the prime numbers from the list that is created from the start and end arguments to gen-prime-numbers.
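
As a quick sanity check of the cognate itself, here is a tiny sketch – premove-if-not behaves exactly like cl:remove-if-not, only in parallel:

;;; keep only the elements satisfying the predicate
(premove-if-not #'evenp '(1 2 3 4 5 6))
;;; => (2 4 6)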

Core Usage during Prime Number generation using lparallel

The code took a long, long time to run, and I could hear the poor machine hissing in protest (I killed the process after 15 minutes), but on the bright side, all the cores were fully loaded the whole time. Note that I don’t collect the generated numbers into a list – that would definitely have crashed SLIME in any case if I had let it run on.

I had contemplated doing another demo with matrix multiplication, but from an educational perspective, this single demo seems to have done the job, so I’ll skip matrix multiplication for now.

References


Some additional useful references (definitely check out the video in the second link. Patrick Stein’s tutorial using a simple range class example is most excellent):

That concludes this series on Concurrency and Parallelism using Common Lisp! Next up, we will discuss another extremely important topic – interop between languages. That will also be a mini-series of sorts, and I might throw in a random but useful post in between (depending on what interests me at that point!).

Till then, happy hacking!


A modest attempt at simulating a page printing script using plain JavaScript

This is a small nano-project that I did recently just for fun! The whole thing took no more than 5 minutes to code up, and another 10 minutes to test out (you’ll understand why when you see the code!).

The idea came about when I used to be on Quora. I had decided to quit Quora due to the fast-falling quality of the site, and the clearly biased moderation in place there, especially with regards to American politics. Anyway, the details are not important.

It so happened that I had accumulated quite a few bookmarked answers and questions during my time there. I decided to download them (preferably in PDF format) to my local machine. Since Quora doesn’t offer any such facility, I decided to search for a tool online. Five minutes of Googling led me to a nice Chrome extension.

It worked well enough for my needs, but it got me thinking. How the tool operated (from a user’s perspective) was as follows:

  • Log on to Quora
  • Click on the timestamp on an answer
  • A “Download” link appears below the answer in a new page
  • Upon clicking this link, the Chrome print dialog pops out, allowing saving the page in PDF

Pretty nifty, isn’t it? After a couple of minutes of thought, I realised that the task wasn’t really that complicated for two reasons:

  1. This was a Chrome extension, and Chrome has supported “Print to PDF” for quite some time now, and
  2. The extension required the user to explicitly click on the timestamp of an answer. It did not seem to work for questions as a whole

My reasoning went like this: simulating the script on a local system, or on a website that I own is quite easy. I can simply add event listeners to specific buttons and links, and invoke window.print() to open the Chrome print dialog.

However, modifying a third-party site wouldn’t be that easy. Since the user has to click on the timestamp of an answer to generate the “Download” link, clearly the script is trying to get hold of some span or div DOM node that it can append its link object to. However, there can be many such nodes with dynamically generated IDs.

This implies that Chrome’s extension APIs must provide functionality to check for click events, and to get a handle to the specific node currently activated. That then makes it trivial to append the link node, add an event listener, and simply trigger window.print()!

Following this logic, I created a simple demo on my local system that uses a minimal static page with a button. On clicking this button, a download link appears, and when this hyperlink is clicked, it pops out the Chrome print dialog!

Let’s see it in action!

Demo

Simply save the code (shown below in “The Code” section) into a file with an .html extension.

Now fire it up in Chrome (it works on Safari as well, haven’t tested it on other browsers, but should work on most modern browsers that at least support addEventListener).


Click on the button.


Nice! Try out the “Download” hyperlink.


Excellent! Not only does the print dialog pop up, the “Download” link has now disappeared.

The Code

The code is absolutely dead simple, and was cooked up purely for this demo. So, don’t go looking for loopholes!🙂

<!DOCTYPE html>
<html lang="en">
<head>
    <title>Testing window.print()</title>
<style type="text/css">
        h1 {
            text-align:center;
        }

        #centeredDiv, .myDiv, .hidden, .visible {
            text-align:center;
        }

        .hidden {
            display:none;
        }

        .visible {
            display:block;
        }
    </style>

</head>
<body>
<h1>This is a print page demo</h1>
<div id="centeredDiv">

To generate the download link, click on the button below!

        <button id="myButton">Click me</button></div>
<script type="text/javascript">
    function init() {
        var myButton = document.getElementById("myButton");
        if (myButton) {
            myButton.addEventListener("click", function() {
                var centeredDiv = document.getElementById("centeredDiv");
                if (centeredDiv) {
                    var link = document.createElement("a");
                    link.setAttribute("id", "myLink");
                    link.setAttribute("class", "visible");
                    link.setAttribute("href", "#");

                    var text = document.createTextNode("Download");
                    link.appendChild(text);

                    centeredDiv.appendChild(link);
                }

                var myLink = document.getElementById("myLink");
                if (myLink) {
                    myLink.addEventListener("click", function() {
                        myLink.className = "hidden";
                        window.print();
                    });
                }
            });
        }
    }

    var body = document.getElementsByTagName("body")[0];
    body.onload = init;
</script>
</body>
</html>

Explanation: The code is dead straightforward. We just have a button inside a div. Once the body of the page has been loaded, we append an event listener to the button.

When the button is clicked, we create an anchor element dynamically, set up its attributes, and append it to the same div that contains the button. We also attach a click listener to this new anchor so that when it is clicked, we set its class to hidden, and invoke window.print(), which brings up the browser’s print dialog.

The CSS rules simply define when the dynamically created anchor object is visible (when created), or hidden (once clicked). Really, that’s all there is to it!

Wrap-up

Well, this was a small fun project that didn’t take much time, and also exercised my (admittedly rusty) JavaScript skills!

I encourage you to take inspiration from daily situations and implement small and simple solutions to problems you face every day. These can be great fun and a good refresher for your own skills. It just goes to show that the best project ideas come from personal needs rather than from a list compiled by someone else.

Have fun!


Basic Concurrency and Parallelism in Common Lisp – Part 4a (Parallelism using lparallel – fundamentals)

In these concluding parts of this mini-series, we will have a taste of parallel programming in Common Lisp using the lparallel library.

It is important to note that lparallel also provides extensive support for asynchronous programming, and is not a purely parallel programming library. As stated before, parallelism is merely an abstract concept in which tasks are conceptually independent of one another.

Contents

Installation

lparallel can be installed using Quicklisp. In case you are not sure about how Quicklisp works, please check my previous post on how to set up a Common Lisp environment.

Let’s check if lparallel is available for download using Quicklisp:

CL-USER> (ql:system-apropos "lparallel")
#<SYSTEM lparallel / lparallel-20160825-git / quicklisp 2016-08-25>
#<SYSTEM lparallel-bench / lparallel-20160825-git / quicklisp 2016-08-25>
#<SYSTEM lparallel-test / lparallel-20160825-git / quicklisp 2016-08-25>
; No value

Looks like it is. Let’s go ahead and install it:

CL-USER> (ql:quickload :lparallel)
To load "lparallel":
  Load 2 ASDF systems:
    alexandria bordeaux-threads
  Install 1 Quicklisp release:
    lparallel
; Fetching #<URL "http://beta.quicklisp.org/archive/lparallel/2016-08-25/lparallel-20160825-git.tgz">
; 76.71KB
==================================================
78,551 bytes in 0.62 seconds (124.33KB/sec)
; Loading "lparallel"
[package lparallel.util]..........................
[package lparallel.thread-util]...................
[package lparallel.raw-queue].....................
[package lparallel.cons-queue]....................
[package lparallel.vector-queue]..................
[package lparallel.queue].........................
[package lparallel.counter].......................
[package lparallel.spin-queue]....................
[package lparallel.kernel]........................
[package lparallel.kernel-util]...................
[package lparallel.promise].......................
[package lparallel.ptree].........................
[package lparallel.slet]..........................
[package lparallel.defpun]........................
[package lparallel.cognate].......................
[package lparallel]
(:LPARALLEL)

And that’s all it took! Now let’s see how this library actually works.

The lparallel library


The lparallel library is built on top of the Bordeaux threading library (see previous post for more on this library).

As mentioned in the previous post, parallelism and concurrency can be (and usually are) implemented using the same means – threads, processes, etc. The difference lies in the concepts they represent.

Note that not all the examples shown in this post are necessarily parallel. Asynchronous constructs such as Promises and Futures are, in particular, more suited to concurrent programming than parallel programming.

The modus operandi of using the lparallel library (for a basic use case) is as follows:

  • Create an instance of what the library calls a kernel using lparallel:make-kernel. The kernel is the component that schedules and executes tasks.
  • Design the code in terms of futures, promises and other higher level functional concepts. To this end, lparallel provides support for channels, promises, futures, and cognates.
  • Perform operations using what the library calls cognates, which are simply functions which have equivalents in the Common Lisp language itself. For instance, the lparallel:pmap function is the parallel equivalent of the Common Lisp map function.
  • Finally, close the kernel created in the first step using lparallel:end-kernel.

Note that the onus of ensuring that the tasks being carried out are logically parallelisable as well as taking care of all mutable state is on the developer.
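
To make that workflow concrete before we dive into the details, here is a minimal end-to-end sketch (the kernel name and the worker count of 4 are arbitrary choices for illustration – we will determine a sensible worker count in the next section):

(ql:quickload :lparallel)

;;; step 1: create the kernel that will schedule and execute our tasks
(setf lparallel:*kernel* (lparallel:make-kernel 4 :name "sketch-kernel"))

;;; steps 2 and 3: use a cognate - pmap is the parallel analogue of cl:map
(lparallel:pmap 'vector #'(lambda (x) (* x x)) '(1 2 3 4 5))
;;; => #(1 4 9 16 25)

;;; step 4: shut the kernel down
(lparallel:end-kernel :wait t)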

Demos


First, let’s get hold of the number of threads that we are going to use for our parallel examples. Ideally, we’d like to have a 1:1 match between the number of worker threads and the number of available cores.

We can use the wonderful cffi library to this end. I plan to have a detailed blog post for this extremely useful library soon, but for now, let’s get on with it:

Install CFFI:

CL-USER> (ql:quickload :cffi)
To load "cffi":
  Load 4 ASDF systems:
    alexandria babel trivial-features uiop
  Install 1 Quicklisp release:
    cffi
; Fetching #<URL "http://beta.quicklisp.org/archive/cffi/2016-03-18/cffi_0.17.1.tgz">
; 234.48KB
==================================================
240,107 bytes in 5.98 seconds (39.22KB/sec)
; Loading "cffi"
[package cffi-sys]................................
[package cffi]....................................
..................................................
[package cffi-features]
(:CFFI)

Write C code to get the number of logical cores on the machine:

#include <stdio.h>
#include <sys/types.h>
#include <sys/sysctl.h>

int32_t get_core_count(void);

int main()
{
    printf("%d\n", get_core_count());

    return 0;
}

int32_t get_core_count()
{
    const char* s = "hw.logicalcpu";
    int32_t core_count;
    size_t len = sizeof(core_count);

    sysctlbyname(s, &core_count, &len, NULL, 0);
    
    return core_count;
}

Bundle the C code into a shared library (note, I am using Mac OS X which comes bundled with Clang. For pure gcc, refer to the relevant documentation):

Timmys-MacBook-Pro:Parallelism z0ltan$ clang -dynamiclib get_core_count.c -o libcorecount.dylib

Invoke the function from Common Lisp:

CL-USER> (cffi:use-foreign-library "libcorecount.dylib")
#<CFFI:FOREIGN-LIBRARY LIBCORECOUNT.DYLIB-853 "libcorecount.dylib">
CL-USER> (cffi:foreign-funcall "get_core_count" :int)
8

We can see that the result is 8 cores on the machine (which is correct), and this can be verified from the command line as well:

Timmys-MacBook-Pro:Parallelism z0ltan$ sysctl -n "hw.logicalcpu"
8

Common Setup


In this example, we will go through the initial setup bit, and also show some useful information once the setup is done.

Load the library:

CL-USER> (ql:quickload :lparallel)
To load "lparallel":
  Load 1 ASDF system:
    lparallel
; Loading "lparallel"

(:LPARALLEL)

Initialise the lparallel kernel:

CL-USER> (setf lparallel:*kernel* (lparallel:make-kernel 8 :name "custom-kernel"))
#<LPARALLEL.KERNEL:KERNEL :NAME "custom-kernel" :WORKER-COUNT 8 :USE-CALLER NIL :ALIVE T :SPIN-COUNT 2000 {1003141F03}>

Note that the *kernel* global variable can be rebound – this allows multiple kernels to co-exist during the same run. For instance, here is a small sketch (the kernel name and worker count are arbitrary) that runs a task against a second, temporary kernel by rebinding the special variable; it uses the channel API that we will meet properly in a later section:
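
(let ((lparallel:*kernel* (lparallel:make-kernel 2 :name "temp-kernel")))
  (unwind-protect
       (let ((channel (lparallel:make-channel)))
         (lparallel:submit-task channel #'+ 1 2)
         (lparallel:receive-result channel)) ; => 3
    ;; this ends the temporary kernel; *kernel* reverts to its previous value
    (lparallel:end-kernel :wait t)))

Now, some useful information about the kernel: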

CL-USER> (defun show-kernel-info ()
           (let ((name (lparallel:kernel-name))
                 (count (lparallel:kernel-worker-count))
                 (context (lparallel:kernel-context))
                 (bindings (lparallel:kernel-bindings)))
             (format t "Kernel name = ~a~%" name)
             (format t "Worker threads count = ~d~%" count)
             (format t "Kernel context = ~a~%" context)
             (format t "Kernel bindings = ~a~%" bindings)))
           
             
WARNING: redefining COMMON-LISP-USER::SHOW-KERNEL-INFO in DEFUN
SHOW-KERNEL-INFO

CL-USER> (show-kernel-info)
Kernel name = custom-kernel
Worker threads count = 8
Kernel context = #<FUNCTION FUNCALL>
Kernel bindings = ((*STANDARD-OUTPUT* . #<SLIME-OUTPUT-STREAM {10044EEEA3}>)
                   (*ERROR-OUTPUT* . #<SLIME-OUTPUT-STREAM {10044EEEA3}>))
NIL

End the kernel (this is important since *kernel* does not get garbage collected until we explicitly end it):

CL-USER> (lparallel:end-kernel :wait t)
(#<SB-THREAD:THREAD "custom--kernel" FINISHED values: NIL {100723FA83}>
 #<SB-THREAD:THREAD "custom--kernel" FINISHED values: NIL {100723FE23}>
 #<SB-THREAD:THREAD "custom--kernel" FINISHED values: NIL {10072581E3}>
 #<SB-THREAD:THREAD "custom--kernel" FINISHED values: NIL {1007258583}>
 #<SB-THREAD:THREAD "custom--kernel" FINISHED values: NIL {1007258923}>
 #<SB-THREAD:THREAD "custom--kernel" FINISHED values: NIL {1007258CC3}>
 #<SB-THREAD:THREAD "custom--kernel" FINISHED values: NIL {1007259063}>
 #<SB-THREAD:THREAD "custom--kernel" FINISHED values: NIL {1007259403}>)

Let’s move on to some more examples of different aspects of the lparallel library.

For these demos, we will be using the following initial setup from a coding perspective:

(require 'lparallel)
(require 'bt-semaphore)

(defpackage :lparallel-user
  (:use :cl :lparallel :lparallel.queue :bt-semaphore))

(in-package :lparallel-user)

;;; initialise the kernel
(defun init ()
  (setf *kernel* (make-kernel 8 :name "channel-queue-kernel")))

(init)

So we will be using a kernel with 8 worker threads (one for each CPU core on the machine).

And once we’re done with all the examples, the following code will be run to close the kernel and free all used system resources:

;;; shut the kernel down
(defun shutdown ()
  (end-kernel :wait t))

(shutdown)

Using channels and queues


First some definitions are in order.

A task is a job that is submitted to the kernel. It is simply a function object along with its arguments.

A channel in lparallel is similar to the same concept in Go. A channel is simply a means of communication with a worker thread. In our case, it is one particular way of submitting tasks to the kernel.

A channel is created in lparallel using lparallel:make-channel. A task is submitted using lparallel:submit-task, and the results received via lparallel:receive-result.

For instance, we can calculate the square of a number as:

(defun calculate-square (n)
  (let* ((channel (lparallel:make-channel))
         (res nil))
    (lparallel:submit-task channel #'(lambda (x)
                                       (* x x))
                           n)
    (setf res (lparallel:receive-result channel))
    (format t "Square of ~d = ~d~%" n res)))

And the output:

LPARALLEL-USER> (calculate-square 100)
Square of 100 = 10000
NIL

Now let’s try submitting multiple tasks to the same channel. In this simple example, we simply create three tasks that compute the square, the cube, and the fourth power of the supplied input respectively.

Note that in case of multiple tasks, the output will be in non-deterministic order:

(defun test-basic-channel-multiple-tasks ()
  (let ((channel (make-channel))
        (res '()))
    (submit-task channel #'(lambda (x)
                             (* x x))
                 10)
    (submit-task channel #'(lambda (y)
                             (* y y y))
                 10)
    (submit-task channel #'(lambda (z)
                             (* z z z z))
                 10)
     (dotimes (i 3 res)
       (push (receive-result channel) res))))

And the output:

LPARALLEL-USER> (dotimes (i 3)
                  (print (test-basic-channel-multiple-tasks)))

(100 1000 10000) 
(100 1000 10000) 
(10000 1000 100) 
NIL

lparallel also provides support for creating a blocking queue in order to enable message passing between worker threads. A queue is created using lparallel.queue:make-queue.

Some useful functions for using queues are:

  • lparallel.queue:make-queue: create a FIFO blocking queue
  • lparallel.queue:push-queue: insert an element into the queue
  • lparallel.queue:pop-queue: pop an item from the queue
  • lparallel.queue:peek-queue: inspect value without popping it
  • lparallel.queue:queue-count: the number of entries in the queue
  • lparallel.queue:queue-full-p: check if the queue is full
  • lparallel.queue:queue-empty-p: check if the queue is empty
  • lparallel.queue:with-locked-queue: lock the queue during access

A basic demo showing basic queue properties:

    (defun test-queue-properties ()
      (let ((queue (make-queue :fixed-capacity 5)))
        (loop
           when (queue-full-p queue)
           do (return)
           do (push-queue (random 100) queue))
        (print (queue-full-p queue))
        (loop
           when (queue-empty-p queue)
           do (return)
           do (print (pop-queue queue)))
        (print (queue-empty-p queue)))
      nil)
    

    Which produces:

    LPARALLEL-USER> (test-queue-properties)
    
    T 
    17 
    51 
    55 
    42 
    82 
    T 
    NIL
    

    Note: lparallel.queue:make-queue is a generic interface which is actually backed by different types of queues. For instance, in the previous example, the actual type of the queue is lparallel.vector-queue since we specified it to be of fixed size using the :fixed-capacity keyword argument.

    The documentation doesn’t actually specify what keyword arguments we can pass to lparallel.queue:make-queue, so let’s find that out in a different way:

    LPARALLEL-USER> (describe 'lparallel.queue:make-queue)
    LPARALLEL.QUEUE:MAKE-QUEUE
      [symbol]
    
    MAKE-QUEUE names a compiled function:
      Lambda-list: (&REST ARGS)
      Derived type: FUNCTION
      Documentation:
        Create a queue.
        
        The queue contents may be initialized with the keyword argument
        `initial-contents'.
        
        By default there is no limit on the queue capacity. Passing a
        `fixed-capacity' keyword argument limits the capacity to the value
        passed. `push-queue' will block for a full fixed-capacity queue.
      Source file: /Users/z0ltan/quicklisp/dists/quicklisp/software/lparallel-20160825-git/src/queue.lisp
    
    MAKE-QUEUE has a compiler-macro:
      Source file: /Users/z0ltan/quicklisp/dists/quicklisp/software/lparallel-20160825-git/src/queue.lisp
    ; No value
    

    So, as we can see, it supports the following keyword arguments – :fixed-capacity and :initial-contents.

    Now, if we do specify :fixed-capacity, then the actual type of the queue will be lparallel.vector-queue, and if we skip that keyword argument, the queue will be of type lparallel.cons-queue (which is a queue of unlimited size), as can be seen from the output of the following snippet:

    (defun check-queue-types ()
      (let ((queue-one (make-queue :fixed-capacity 5))
            (queue-two (make-queue)))
        (format t "queue-one is of type: ~a~%" (type-of queue-one))
        (format t "queue-two is of type: ~a~%" (type-of queue-two))))
    
    
    LPARALLEL-USER> (check-queue-types)
    queue-one is of type: VECTOR-QUEUE
    queue-two is of type: CONS-QUEUE
    NIL
    

    Of course, you can always create instances of the specific queue types yourself, but it is always better, when you can, to stick to the generic interface and let the library create the proper type of queue for you.

    Now, let’s just see the queue in action!

    (defun test-basic-queue ()
      (let ((queue (make-queue))
            (channel (make-channel))
            (res '()))
        (submit-task channel #'(lambda ()
                         (loop for entry = (pop-queue queue)
                            when (queue-empty-p queue)
                            do (return)
                            do (push (* entry entry) res))))
        (dotimes (i 100)
          (push-queue i queue))
        (receive-result channel)
        (format t "~{~d ~}~%" res)))
    

    Here we submit a single task that repeatedly scans the queue till it’s empty, pops the available values, and pushes their squares into the res list. (Note that, as written, the loop returns as soon as the queue is empty after a pop, so the square of the last value popped never makes it into res – observe that 9801 is missing from the output below.)

    And the output:

    LPARALLEL-USER> (test-basic-queue)
    9604 9409 9216 9025 8836 8649 8464 8281 8100 7921 7744 7569 7396 7225 7056 6889 6724 6561 6400 6241 6084 5929 5776 5625 5476 5329 5184 5041 4900 4761 4624 4489 4356 4225 4096 3969 3844 3721 3600 3481 3364 3249 3136 3025 2916 2809 2704 2601 2500 2401 2304 2209 2116 2025 1936 1849 1764 1681 1600 1521 1444 1369 1296 1225 1156 1089 1024 961 900 841 784 729 676 625 576 529 484 441 400 361 324 289 256 225 196 169 144 121 100 81 64 49 36 25 16 9 4 1 0 
    NIL
    

    Killing tasks


    A small note mentioning the lparallel:kill-tasks function would be apropos at this juncture. This function is useful in those cases when tasks are unresponsive. The lparallel documentation clearly states that it must only be used as a last resort.

    All tasks which are created are by default assigned a category of :default. The dynamic variable *task-category* holds this value, and can be dynamically bound to different values (as we shall see).

    ;;; kill default tasks
    (defun test-kill-all-tasks ()
      (let ((channel (make-channel))
            (stream *query-io*))
        (dotimes (i 10)
          (submit-task channel #'(lambda (x)
                                   (sleep (random 10))
                                   (format stream "~d~%" (* x x))) (random 10)))
        (sleep (random 2))
        (kill-tasks :default)))
    

    Sample run:

    LPARALLEL-USER> (test-kill-all-tasks)
    16
    1
    8
    WARNING: lparallel: Replacing lost or dead worker.
    WARNING: lparallel: Replacing lost or dead worker.
    WARNING: lparallel: Replacing lost or dead worker.
    WARNING: lparallel: Replacing lost or dead worker.
    WARNING: lparallel: Replacing lost or dead worker.
    WARNING: lparallel: Replacing lost or dead worker.
    WARNING: lparallel: Replacing lost or dead worker.
    WARNING: lparallel: Replacing lost or dead worker.
    

    Since we had created 10 tasks, all the 8 kernel worker threads were presumably busy with a task each. When we killed tasks of category :default, all these threads were killed as well and had to be regenerated (which is an expensive operation). This is part of the reason why lparallel:kill-tasks must be avoided.

    Now, in the example above, all running tasks were killed since all of them belonged to the :default category. Suppose we wish to kill only specific tasks, we can do that by binding *task-category* when we create those tasks, and then specifying the category when we invoke lparallel:kill-tasks.

    For example, suppose we have two categories of tasks – tasks which square their arguments, and tasks which cube theirs. Let’s assign them the categories 'squaring-tasks and 'cubing-tasks respectively. Let’s then kill tasks of a randomly chosen category, 'squaring-tasks or 'cubing-tasks.

    Here is the code:

    ;;; kill tasks of a randomly chosen category
    (defun test-kill-random-tasks ()
      (let ((channel (make-channel))
            (stream *query-io*))
        (let ((*task-category* 'squaring-tasks))
          (dotimes (i 5)
            (submit-task channel #'(lambda (x)
                                     (sleep (random 5))
                                     (format stream "~%[Squaring] ~d = ~d" x (* x x))) i)))
        (let ((*task-category* 'cubing-tasks))
          (dotimes (i 5)
            (submit-task channel #'(lambda (x)
                                     (sleep (random 5))
                                     (format stream "~%[Cubing] ~d = ~d" x (* x x x))) i)))
        (sleep 1)
        (if (evenp (random 10))
            (progn
              (print "Killing squaring tasks")
              (kill-tasks 'squaring-tasks))
            (progn
              (print "Killing cubing tasks")
              (kill-tasks 'cubing-tasks)))))
    

    And here is a sample run:

    LPARALLEL-USER> (test-kill-random-tasks)
    
    [Cubing] 2 = 8
    [Squaring] 4 = 16
    [Cubing] 4
     = [Cubing] 643 = 27
    "Killing squaring tasks" 
    4
    WARNING: lparallel: Replacing lost or dead worker.
    WARNING: lparallel: Replacing lost or dead worker.
    WARNING: lparallel: Replacing lost or dead worker.
    WARNING: lparallel: Replacing lost or dead worker.
    
    [Cubing] 1 = 1
    [Cubing] 0 = 0
    
    LPARALLEL-USER> (test-kill-random-tasks)
    
    [Squaring] 1 = 1
    [Squaring] 3 = 9
    "Killing cubing tasks" 
    5
    WARNING: lparallel: Replacing lost or dead worker.
    WARNING: lparallel: Replacing lost or dead worker.
    WARNING: lparallel: Replacing lost or dead worker.
    
    [Squaring] 2 = 4
    WARNING: lparallel: Replacing lost or dead worker.
    WARNING: lparallel: Replacing lost or dead worker.
    
    [Squaring] 0 = 0
    [Squaring] 4 = 16
    

    Using promises and futures


    Promises and Futures provide support for Asynchronous Programming.

    In lparallel-speak, a lparallel:promise is a placeholder for a result which is fulfilled by providing it with a value. The promise object itself is created using lparallel:promise, and the promise is given a value using the lparallel:fulfill macro.

    To check whether the promise has been fulfilled yet or not, we can use the lparallel:fulfilledp predicate function.
    Finally, the lparallel:force function is used to extract the value out of the promise. Note that this function blocks until the operation is complete.

    Let’s solidify these concepts with a very simple example first:

    (defun test-promise ()
      (let ((p (promise)))
        (loop
           do (if (evenp (read))
                  (progn
                    (fulfill p 'even-received!)
                    (return))))
        (force p)))
    

    Which generates the output:

    LPARALLEL-USER> (test-promise)
    5
    1
    3
    10
    EVEN-RECEIVED!

    Explanation: This simple example keeps looping until an even number has been entered. The promise is fulfilled inside the loop using lparallel:fulfill, and the value is then returned from the function by forcing it with lparallel:force.

    
    Now, let’s take a bigger example. Assuming that we don’t want to wait for the promise to be fulfilled, and would instead like the current thread to do some useful work in the meantime, we can explicitly delegate the fulfilment of the promise to an external thread, as seen in the next example.

    Suppose we have a function that squares its argument and, for the sake of argument, takes a lot of time doing so. From our client code, we want to invoke it, and then wait till the squared value is available.
    
    
    (defun promise-with-threads ()
      (let ((p (promise))
            (stream *query-io*)
            (n (progn
                 (princ "Enter a number: ")
                 (read))))
        (format t "In main function...~%")
        (bt:make-thread
         #'(lambda ()
             (sleep (random 10))
             (format stream "Inside thread... fulfilling promise~%")
             (fulfill p (* n n))))
        (bt:make-thread
         #'(lambda ()
             (loop
                when (fulfilledp p)
                do (return)
                do (progn
                     (format stream "~d~%" (random 100))
                     (sleep (* 0.01 (random 100)))))))
        (format t "Inside main function, received value: ~d~%" (force p))))
    

    And the output:

    LPARALLEL-USER> (promise-with-threads)
    Enter a number: 19
    In main function...
    44
    59
    90
    34
    30
    76
    Inside thread... fulfilling promise
    Inside main function, received value: 361
    NIL
    

    Explanation: There is nothing much in this example. We create a promise object p, and we spawn off a thread that sleeps for some random time and then fulfills the promise by giving it a value.

    Meanwhile, in the main thread, we spawn off another thread that keeps checking whether the promise has been fulfilled. If not, it prints some random number and continues checking. Once the promise has been fulfilled, we extract the value using lparallel:force in the main thread, as shown.

    This shows that promises can be fulfilled by different threads while the code that created the promise need not wait for the promise to be fulfilled. This is especially important since, as mentioned before, lparallel:force is a blocking call. We want to delay forcing the promise until the value is actually available.

    Another point to note when using promises is that once a promise has been fulfilled, invoking force on the same object will always return the same value. That is to say, a promise can be successfully fulfilled only once.

    For instance:

    (defun multiple-fulfilling ()
      (let ((p (promise)))
        (dotimes (i 10)
          (fulfill p (random 100))
          (format t "~d~%" (force p)))))
    

    Which produces:

    LPARALLEL-USER> (multiple-fulfilling)
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    NIL
    

    So how does a future differ from a promise?

    A lparallel:future is simply a promise that is run in parallel, and as such, it does not block the main thread the way a default use of lparallel:promise would. It is executed in its own thread (by the lparallel library, of course).

    Here is a simple example of a future:

    (defun test-future ()
      (let ((f (future
                 (sleep (random 5))
                 (print "Hello from future!"))))
        (loop
           when (fulfilledp f)
           do (return)
           do (sleep (* 0.01 (random 100)))
             (format t "~d~%" (random 100)))
        (format t "~d~%" (force f))))
    

    And the output:

    LPARALLEL-USER> (test-future)
    5
    19
    91
    11
    Hello from future!
    NIL
    

    Explanation: This is very similar to the promise-with-threads example. Observe two differences, however – first of all, the lparallel:future macro has a body as well. This allows the future to fulfill itself! What this means is that as soon as the body of the future is done executing, lparallel:fulfilledp will always return true for the future object.

    Secondly, the future itself is spawned off on a separate thread by the library, so it does not interfere with the execution of the current thread – very much unlike promises, as could be seen in the promise-with-threads example (which needed an explicit thread for the fulfilling code in order to avoid blocking the current thread).

    The most interesting bit is that (even in terms of the actual theory propounded by Dan Friedman and others), a Future is conceptually something that fulfills a Promise. That is to say, a promise is a contract that some value will be generated sometime in the future, and a future is precisely that “something” that does that job.

    What this means is that even when using the lparallel library, the basic use of a future would be to fulfill a promise. This means that hacks like promise-with-threads need not be made by the user.

    Let’s take a small example to demonstrate this point (a pretty contrived example, I must admit!).

    Here’s the scenario: we want to read in a number and calculate its square. So we offload this work to another function, and continue with our own work. When the result is ready, we want it to be printed on the console without any intervention from us.

    Here’s how the code looks:

    ;;; Callback example using promises and futures
    (defun callback-promise-future-demo ()
      (let* ((p (promise))
             (stream *query-io*)
             (n (progn
                  (princ "Enter a number: ")
                  (read)))
             (f (future
                  (sleep (random 10))
                  (fulfill p (* n n))
                  (force (future
                           (format stream "Square of ~d = ~d~%" n (force p)))))))
        (loop
           when (fulfilledp f)
           do (return)
           do (sleep (* 0.01 (random 100))))))
    

    And the output:

    LPARALLEL-USER> (callback-promise-future-demo)
    Enter a number: 19
    Square of 19 = 361
    NIL
    

    Explanation: All right, so first off, we create a promise to hold the squared value when it is generated. This is the p object. The input value is stored in the local variable n.

    Then we create a future object f. This future simply squares the input value and fulfills the promise with this value. Finally, since we want to print the output in its own time, we force an anonymous future which simply prints the output string as shown.

    Note that this is very similar to the situation in an environment like Node, where we pass callback functions to other functions with the understanding that the callback will be called when the invoked function is done with its work.

    Finally note that the following snippet is still fine (even if it uses the blocking lparallel:force call because it’s on a separate thread):


    (force (future
             (format stream "Square of ~d = ~d~%" n (force p))))

    To summarise, the general idiom of usage is: define objects which will hold the results of asynchronous computations in promises, and use futures to fulfill those promises.
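
    Here is a minimal sketch of that idiom (same package setup as before):

    ;;; the promise holds the result; the future computes the value and
    ;;; fulfills the promise; forcing the promise blocks until it is ready
    (defun idiom-sketch (n)
      (let ((p (promise)))
        (future (fulfill p (* n n)))
        (force p)))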

    Using cognates


    Cognates are arguably the raison d’être of the lparallel library. These constructs are what truly provide parallelism in lparallel. Note, however, that most (if not all) of these constructs are built on top of futures and promises.

    To put it in a nutshell, cognates are simply functions that are intended to be the parallel equivalents of their Common Lisp counterparts. However, there are a few extra lparallel cognates that have no Common Lisp equivalents.

    At this juncture, it is important to know that cognates come in two basic flavours:

    1. Constructs for fine-grained parallelism: defpun, plet, plet-if, etc.
    2. Explicit functions and macros for performing parallel operations - pmap, preduce, psort, pdotimes, etc.

    In the first case we don’t have much explicit control over the operations themselves. We mostly rely on the fact that the library itself will optimise and parallelise the forms to whatever extent it can. In this post, we will focus on the second category of cognates.
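
    For a quick flavour of the first category, here is a toy sketch (assuming a kernel is set up, as in the earlier examples): lparallel:plet may evaluate its init-forms in parallel, so the whole form should take around one second rather than two.

    ;;; plet - fine-grained parallelism
    (defun test-plet ()
      (plet ((x (progn (sleep 1) 1))
             (y (progn (sleep 1) 2)))
        (+ x y)))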

    Take, for instance, the cognate function lparallel:pmap: it is exactly the same as its Common Lisp counterpart, map, but it runs in parallel. Let’s demonstrate that through an example.

    Suppose we had a list of random strings of length varying from 3 to 10, and we wished to collect their lengths in a vector.

    Let’s first set up the helper functions that will generate the random strings:

    (defvar *chars*
      (remove-duplicates
       (sort
        (loop for c across "The quick brown fox jumps over the lazy dog"
           when (alpha-char-p c)
           collect (char-downcase c))
        #'char<)))   
    
    (defun get-random-strings (&optional (count 100000))
      "Generate random strings of lengths between 3 and 10."
      (loop repeat count
         collect
           (concatenate 'string
                        (loop repeat (+ 3 (random 8))
                           collect (nth (random (length *chars*)) *chars*)))))
    

    And here’s how the Common Lisp map version of the solution might look:

    ;;; map demo
    (defun test-map ()
      (map 'vector #'length (get-random-strings 100)))
    

    And let’s have a test run:

    LPARALLEL-USER> (test-map)
    #(7 5 10 8 7 5 3 4 4 10)
    

    And here’s the lparallel:pmap equivalent:

    ;;;pmap demo
    (defun test-pmap ()
      (pmap 'vector #'length (get-random-strings 100)))
    

    which produces:

    LPARALLEL-USER> (test-pmap)
    #(8 7 6 7 6 4 5 6 5 7)
    LPARALLEL-USER> 
    

    As you can see from the definitions of test-map and test-pmap, the syntax of map and lparallel:pmap is exactly the same (well, almost - lparallel:pmap takes a few extra optional arguments).

    Here are some useful cognate functions and macros (all of them are functions except where explicitly marked otherwise). Note that there are quite a few cognates, and I have chosen a few to try and represent every category through an example:

    • lparallel:pmap:
      Parallel version of map.

      Note that all the mapping functions (lparallel:pmap, lparallel:pmapc, lparallel:pmapcar, etc.) take two special keyword arguments - :size, specifying the number of elements of the input sequence(s) to process, and :parts, which specifies the number of parallel parts to divide the sequence(s) into.

      ;;; pmap - function
      (defun test-pmap ()
        (let ((numbers (loop for i below 10
                          collect i)))
          (pmap 'vector #'(lambda (x)
                            (* x x))
                :parts (length numbers)
                numbers)))
      

      Sample run:

      LPARALLEL-USER> (test-pmap)
      
      #(0 1 4 9 16 25 36 49 64 81)
      
    • lparallel:por:
      Parallel version of or. It returns a non-nil element from amongst its arguments (if there is one); however, due to the parallel nature of this macro, which element is returned varies from run to run.

      ;;; por - macro
      (defun test-por ()
        (let ((a 100)
              (b 200)
              (c nil)
              (d 300))
          (por a b c d)))
      

      Sample run:

      LPARALLEL-USER> (dotimes (i 10)
                        (print (test-por)))
      
      300 
      300 
      100 
      100 
      100 
      300 
      100 
      100 
      100 
      100 
      NIL
      

      In the case of the normal or operator, it would always have returned the first non-nil element, viz. 100.
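
      For comparison, a quick REPL check of the sequential operator:

      LPARALLEL-USER> (or 100 200 nil 300)
      100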

    • lparallel:pdotimes:
      Parallel version of dotimes. Note that this macro also takes an optional :parts argument.

      ;;; pdotimes - macro
      (defun test-pdotimes ()
        (pdotimes (i 5)
          (declare (ignore i))
          (print (random 100))))
      

      Sample run:

      LPARALLEL-USER> (test-pdotimes)
      
      39 
      29 
      81 
      42 
      56 
      NIL
      
    • lparallel:pfuncall:
      Parallel version of funcall.

      ;;; pfuncall - macro
      (defun test-pfuncall ()
        (pfuncall #'* 1 2 3 4 5))
      

      Sample run:

      LPARALLEL-USER> (test-pfuncall)
      
      120
      
    • lparallel:preduce:
      Parallel version of reduce.

      This very important function also takes two optional keyword arguments - :parts (same meaning as explained earlier), and :recurse. If :recurse is non-nil, it recursively applies lparallel:preduce to its arguments; otherwise it defaults to using reduce.

      ;;; preduce - function
      (defun test-preduce ()
        (let ((numbers (loop for i from 1 to 100
                          collect i)))
          (preduce #'+ 
                   numbers
                   :parts (length numbers)
                   :recurse t)))
      

      Sample run:

      LPARALLEL-USER> (test-preduce)
      
      5050
      
    • lparallel:premove-if-not:
      Parallel version of remove-if-not. This is essentially equivalent to “filter” in Functional Programming parlance.

      ;;; premove-if-not 
      (defun test-premove-if-not ()
        (let ((numbers (loop for i from 1 to 100
                          collect i)))
          (premove-if-not #'evenp numbers)))
      

      Sample run:

      LPARALLEL-USER> (test-premove-if-not)
      
      (2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50 52 54
       56 58 60 62 64 66 68 70 72 74 76 78 80 82 84 86 88 90 92 94 96 98 100)
      
    • lparallel:pevery:
      Parallel version of every.

      ;;; pevery - function
      (defun test-pevery ()
        (let ((numbers (loop for i from 1 to 100
                          collect i)))
          (list (pevery #'evenp numbers)
                (pevery #'integerp numbers))))
      

      Sample run:

      LPARALLEL-USER> (test-pevery)
      
      (NIL T)
      

      In this example, we are performing two checks - firstly, whether all the numbers in the range [1,100] are even, and secondly, whether all the numbers in the same range are integers.

    • lparallel:pcount:
      Parallel version of count.

      ;;; pcount - function
      (defun test-pcount ()
        (let ((chars "The quick brown fox jumps over the lazy dog"))
          (pcount #\e chars)))
      

      Sample run:

      LPARALLEL-USER> (test-pcount)
      
      3
      
    • lparallel:psort:
      Parallel version of sort.

      ;;; psort - function
      (defstruct person
        name
        age)
      
      (defun test-psort ()
        (let* ((names (list "Rich" "Peter" "Sybil" "Basil" "Candy" "Slava" "Olga"))
               (people (loop for name in names
                          collect (make-person :name name :age (+ (random 20) 20)))))
          (print "Before sorting...")
          (print people)
          (fresh-line)
          (print "After sorting...")
          (psort
           people
           #'(lambda (x y)
               (< (person-age x)
                  (person-age y)))
           :test #'=)))
      

      Sample run:

      LPARALLEL-USER> (test-psort)
      
      "Before sorting..." 
      (#S(PERSON :NAME "Rich" :AGE 38) #S(PERSON :NAME "Peter" :AGE 24)
       #S(PERSON :NAME "Sybil" :AGE 20) #S(PERSON :NAME "Basil" :AGE 22)
       #S(PERSON :NAME "Candy" :AGE 23) #S(PERSON :NAME "Slava" :AGE 37)
       #S(PERSON :NAME "Olga" :AGE 33)) 
      
      "After sorting..." 
      (#S(PERSON :NAME "Sybil" :AGE 20) #S(PERSON :NAME "Basil" :AGE 22)
       #S(PERSON :NAME "Candy" :AGE 23) #S(PERSON :NAME "Peter" :AGE 24)
       #S(PERSON :NAME "Olga" :AGE 33) #S(PERSON :NAME "Slava" :AGE 37)
       #S(PERSON :NAME "Rich" :AGE 38))
      

      In this example, we first define a structure of type person for storing information about people. Then we create a list of 7 people with randomly generated ages (between 20 and 39). Finally, we sort them by age in non-decreasing order.
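
      As an aside, since lparallel:psort mirrors the keyword arguments of cl:sort, the comparison in test-psort could equivalently be written with :key (a sketch, assuming the same people list as above):

      ;;; equivalent comparison using :key
      (psort people #'< :key #'person-age)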

    References


    There are, of course, a lot more functions, objects, and idiomatic ways of performing parallel computations using the lparallel library. This post barely scratches the surface of those. However, the general flow of operation is amply demonstrated here, and for further reading, you may find the following resources useful:

    In the final part of this series, we will discuss an extremely important topic common to all that was covered in this post - error handling.


Basic Concurrency and Parallelism in Common Lisp – Part 3 (Concurrency using Bordeaux and SBCL threads)

In case you haven’t, I’d recommend checking out the previous post to ensure that you have the correct setup configured and ready to follow this tutorial.

You need to have the following setup at minimum:

  • A standard Common Lisp implementation other than CLISP (which doesn’t have any thread support), preferably a compiled one
  • The Bordeaux threading library
  • SBCL with thread support (for the SBCL specific examples)
  • Quicklisp (preferred)

What is Concurrency? What is Parallelism?

Concurrency is a way of running different, possibly related, tasks seemingly simultaneously. What this means is that even on a single processor machine, you can simulate simultaneity using threads (for instance) and context-switching them.

In the case of system (native OS) threads, the scheduling and context switching is ultimately determined by the OS. This is the case with Java threads and Common Lisp threads.

In the case of “green” threads, that is to say threads that are completely managed by the program, the scheduling can be completely controlled by the program itself. Erlang is a great example of this approach.

So what is the difference between Concurrency and Parallelism? Parallelism is usually defined in a very strict sense to mean independent tasks being run in parallel, simultaneously, on different processors or on different cores. In this narrow sense, you really cannot have parallelism on a single-core, single-processor machine.

It rather helps to differentiate between these two related concepts on a more abstract level – concurrency primarily deals with providing the illusion of simultaneity to clients so that the system doesn’t appear locked when a long running operation is underway. GUI systems are a wonderful example of this kind of system. Concurrency is therefore concerned with providing good user experience and not necessarily concerned with performance benefits.

Java’s Swing toolkit and JavaScript are both single-threaded, and yet they can give the appearance of simultaneity because of the context switching behind the scenes. Of course, concurrency is implemented using multiple threads/processes in most cases.

Parallelism, on the other hand, is mostly concerned with pure performance gains. For instance, if we are given a task to find the squares of all the even numbers in a given range, we could divide the range into chunks which are then run in parallel on different cores or different processors, and then the results can be collated together to form the final result. This is an example of Map-Reduce in action.
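
To make this concrete, here is a toy sketch of that very task using the Bordeaux threads API introduced later in this post (the helper functions are hypothetical, and a serious version would of course use a parallelism library instead): the “map” step computes the two halves of the range on two threads, and the “reduce” step collates the partial results.

(defun squares-in-range (lo hi)
  "Collect the squares of the even numbers in [lo, hi)."
  (loop for i from lo below hi
     when (evenp i)
     collect (* i i)))

(defun parallel-even-squares ()
  "Map: two threads compute the two halves; reduce: append the results."
  (let* ((left nil)
         (right nil)
         (t1 (bt:make-thread (lambda () (setf left (squares-in-range 0 50)))))
         (t2 (bt:make-thread (lambda () (setf right (squares-in-range 50 100))))))
    (bt:join-thread t1)
    (bt:join-thread t2)
    (append left right)))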

So now that we have separated the abstract meaning of Concurrency from that of Parallelism, we can talk a bit about the actual mechanisms used to implement them. This is where most of the confusion arises for a lot of people. They tend to tie down abstract concepts to specific means of implementing them. In essence, both abstract concepts may be implemented using the same mechanisms! For instance, we may implement concurrent features and parallel features using the same basic thread mechanism in Java. It’s only the conceptual intertwining or independence of tasks at an abstract level that makes the difference for us.

For instance, if we have a task where part of the work can be done on a different thread (possibly on a different core/processor), but the spawning thread is logically dependent on the results of the spawned thread (and as such has to “join” on it), it is still Concurrency!

So the bottomline is this – Concurrency and Parallelism are different concepts, but their implementations may be done using the same mechanisms — threads, processes, etc.

Checking for thread support in Common Lisp

Regardless of the Common Lisp implementation, there is a standard way to check for thread support availability:

CL-USER> (member :thread-support *FEATURES*)
(:THREAD-SUPPORT :SWANK :QUICKLISP :ASDF-PACKAGE-SYSTEM :ASDF3.1 :ASDF3 :ASDF2
 :ASDF :OS-MACOSX :OS-UNIX :NON-BASE-CHARS-EXIST-P :ASDF-UNICODE :64-BIT
 :64-BIT-REGISTERS :ALIEN-CALLBACKS :ANSI-CL :ASH-RIGHT-VOPS :BSD
 :C-STACK-IS-CONTROL-STACK :COMMON-LISP :COMPARE-AND-SWAP-VOPS
 :COMPLEX-FLOAT-VOPS :CYCLE-COUNTER :DARWIN :DARWIN9-OR-BETTER :FLOAT-EQL-VOPS
 :FP-AND-PC-STANDARD-SAVE :GENCGC :IEEE-FLOATING-POINT :INLINE-CONSTANTS
 :INODE64 :INTEGER-EQL-VOP :LINKAGE-TABLE :LITTLE-ENDIAN
 :MACH-EXCEPTION-HANDLER :MACH-O :MEMORY-BARRIER-VOPS :MULTIPLY-HIGH-VOPS
 :OS-PROVIDES-BLKSIZE-T :OS-PROVIDES-DLADDR :OS-PROVIDES-DLOPEN
 :OS-PROVIDES-PUTWC :OS-PROVIDES-SUSECONDS-T :PACKAGE-LOCAL-NICKNAMES
 :PRECISE-ARG-COUNT-ERROR :RAW-INSTANCE-INIT-VOPS :SB-DOC :SB-EVAL :SB-LDB
 :SB-PACKAGE-LOCKS :SB-SIMD-PACK :SB-SOURCE-LOCATIONS :SB-TEST :SB-THREAD
 :SB-UNICODE :SBCL :STACK-ALLOCATABLE-CLOSURES :STACK-ALLOCATABLE-FIXED-OBJECTS
 :STACK-ALLOCATABLE-LISTS :STACK-ALLOCATABLE-VECTORS
 :STACK-GROWS-DOWNWARD-NOT-UPWARD :SYMBOL-INFO-VOPS :UD2-BREAKPOINTS :UNIX
 :UNWIND-TO-FRAME-AND-CALL-VOP :X86-64)

If there were no thread support, the expression would simply evaluate to NIL.

Depending on the specific library being used, we may also have different ways of checking for concurrency support, which may be used instead of the common check mentioned above.

For instance, in our case, we are interested in using the Bordeaux library. To check whether there is support for threads using this library, we can check the bt:*supports-threads-p* variable, like so.

First let’s load up the Bordeaux library using Quicklisp:

CL-USER> (ql:quickload :bt-semaphore)
To load "bt-semaphore":
  Load 1 ASDF system:
    bt-semaphore
; Loading "bt-semaphore"

(:BT-SEMAPHORE)

Now we can see whether the *supports-threads-p* global variable is set to NIL (no support) or T (support available):

CL-USER> bt:*supports-threads-p*
T

Okay, now that we’ve got that out of the way, let’s test out both the platform-independent library (Bordeaux) as well as the platform-specific support (SBCL in this case).

To do this, let us work our way through a number of simple examples:

  1. Basics — list current thread, list all threads, get thread name
  2. Update a global variable from a thread
  3. Print a message onto the top-level using a thread
  4. Print a message onto the top-level — fixed
  5. Print a message onto the top-level — better
  6. Modify a shared resource from multiple threads
  7. Modify a shared resource from multiple threads — fixed using locks
  8. Modify a shared resource from multiple threads — using atomic operations
  9. Joining on a thread, destroying a thread example

Bordeaux threads

The Bordeaux library provides a platform independent way to handle basic threading on multiple Common Lisp implementations. The interesting bit is that it itself does not really create any native threads — it relies entirely on the underlying implementation to do so.

On the other hand, it does provide some useful extra features via its own abstractions over the lower-level threads.

Also, you can see from the demo programs that a lot of the Bordeaux functions seem quite similar to those used in SBCL. I don’t really think that this is a coincidence.

You can refer to the documentation for more details (check the “Wrap-up” section).

Demo

  • Basics — list current thread, list all threads, get thread name:
    ;;; Print the current thread, all the threads, and the current thread's name
    
    (defun print-thread-info ()
      (let* ((curr-thread (bt:current-thread))
             (curr-thread-name (bt:thread-name curr-thread))
             (all-threads (bt:all-threads)))
        (format t "Current thread: ~a~%~%" curr-thread)
        (format t "Current thread name: ~a~%~%" curr-thread-name)
        (format t "All threads:~% ~{~a~%~}~%" all-threads))
      nil)
    

    And the output:

    CL-USER> (print-thread-info)
    Current thread: #<THREAD "repl-thread" RUNNING {10043B8003}>
    
    Current thread name: repl-thread
    
    All threads:
     #<THREAD "repl-thread" RUNNING {10043B8003}>
    #<THREAD "auto-flush-thread" RUNNING {10043B7DA3}>
    #<THREAD "swank-indentation-cache-thread" waiting on: #<WAITQUEUE  {1003A28103}> {1003A201A3}>
    #<THREAD "reader-thread" RUNNING {1003A20063}>
    #<THREAD "control-thread" waiting on: #<WAITQUEUE  {1003A19E53}> {1003A18C83}>
    #<THREAD "Swank Sentinel" waiting on: #<WAITQUEUE  {1003790043}> {1003788023}>
    #<THREAD "main thread" RUNNING {1002991CE3}>
    
    NIL
    
  • Update a global variable from a thread:
    (defparameter *counter* 0)
    
    (defun test-update-global-variable ()
      (bt:make-thread
       (lambda ()
         (sleep 1)
         (incf *counter*)))
      *counter*)
    

    We create a new thread using bt:make-thread, which takes a lambda abstraction as a parameter. Note that this lambda abstraction cannot take any parameters.

    Another point to note is that, unlike in some other languages (Java, for instance), there is no separation between creating the thread object and starting/running it. In this case, as soon as the thread is created, it is executed.

    The output:

    CL-USER> (test-update-global-variable)
    
    0
    CL-USER> *counter*
    1
    

    As we can see, because the main thread returned immediately, the initial value of *counter* is 0, and then around a second later, it gets updated to 1 by the anonymous thread.

  • Print a message onto the top-level using a thread:
    ;;; Print a message onto the top-level using a thread
    
    (defun print-message-top-level-wrong ()
      (bt:make-thread
       (lambda ()
         (format *standard-output* "Hello from thread!")))
      nil)
    

    And the output:

    CL-USER> (print-message-top-level-wrong)
    NIL
    

    So what went wrong? The problem is one of variable binding. The first argument to format here is *standard-output*, the global (special) variable that refers to the top-level, which is the Common Lisp term for the main console stream (passing t as format’s first argument designates the same stream). So we could have expected the output to be shown on the main console screen.

    The same code would have run fine had we not run it in a separate thread. What happens is that each new thread gets its own fresh set of dynamic bindings. Even *standard-output*, which, being a global variable, we would assume is shared by all threads, is rebound inside each thread: the REPL thread binds it to the REPL’s output stream, but inside the newly created thread it refers to the global value instead. This is similar to the concept of ThreadLocal storage in Java.

  • Print a message onto the top-level — fixed:

    So how do we fix the problem of the previous example? By binding the top-level stream at the time of thread creation, of course. Pure lexical scoping to the rescue!

    ;;; Print a message onto the top-level using a thread — fixed
    
    (defun print-message-top-level-fixed ()
      (let ((top-level *standard-output*))
        (bt:make-thread
         (lambda ()
           (format top-level "Hello from thread!"))))
      nil)
    

    Which produces:

    CL-USER> (print-message-top-level-fixed)
    Hello from thread!
    NIL
    

    Phew! However, there is another way of producing the same result using a very interesting reader macro as we’ll see next.

  • Print a message onto the top-level — read-time eval macro:

    Let’s take a look at the code first:

    ;;; Print a message onto the top-level using a thread - reader macro
    
    (eval-when (:compile-toplevel)
      (defun print-message-top-level-reader-macro ()
        (bt:make-thread
         (lambda ()
           (format #.*standard-output* "Hello from thread!")))
        nil))
    
    (print-message-top-level-reader-macro)
    

    And the output:

    CL-USER> (print-message-top-level-reader-macro)
    Hello from thread!
    NIL
    

    So it works! But what’s the deal with the eval-when, and what is that strange “#.” symbol before *standard-output*?

    eval-when controls when the evaluation of Lisp expressions takes place. We can have three situations - :compile-toplevel, :load-toplevel, and :execute.
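
    For instance (a toy example), the following top-level form would be evaluated when the file is compiled, when the compiled file is loaded, and when the source is evaluated directly:

    (eval-when (:compile-toplevel :load-toplevel :execute)
      (format t "Evaluated at compile, load, and execute time.~%"))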

    The “#.” symbol is what is called a “reader macro”. (I will be posting a whole post (or maybe a series of posts!) on reader macros in the future). A reader (or read) macro is called so because it has special meaning to the Common Lisp reader, which is the component responsible for reading in Common Lisp expressions and making sense of them. This specific reader macro ensures that the form following it is evaluated at read time - here, the value of *standard-output* is captured as soon as the form is read.

    Capturing the value at read time ensures that the original binding of *standard-output* (the REPL’s stream) is baked into the form, so that when the thread is run, the output is shown on the correct top-level.
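
    To see read-time evaluation in isolation, here is a small REPL interaction - the (+ 1 2) is evaluated by the reader itself, before compilation or evaluation proper even begins:

    CL-USER> '(1 2 #.(+ 1 2))
    (1 2 3)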

    Now this is where the eval-when bit comes into play. By wrapping the whole function definition inside the eval-when, and ensuring that it is evaluated at compile time rather than compiled into the output file, the correct value of *standard-output* is captured. If we had skipped the eval-when, we would see the following error:

      error: 
        don't know how to dump #<SWANK/GRAY::SLIME-OUTPUT-STREAM {100439EEA3}> (default MAKE-LOAD-FORM method called).
        ==>
          #<SWANK/GRAY::SLIME-OUTPUT-STREAM {100439EEA3}>
        
      note: The first argument never returns a value.
      note: 
        deleting unreachable code
        ==>
          "Hello from thread!"
        
    
    Compilation failed.
    

    And that makes sense: since the value of *standard-output* was captured at read time, SBCL would have to dump the literal stream object into the compiled file, and it does not know how to re-create such an object at load time (hence the MAKE-LOAD-FORM complaint). Since the first argument to format cannot be compiled, the rest of the call is flagged as unreachable code.

    Note that if the same code had been run on the REPL directly, there would be no problem since the resolution of all the symbols would be done correctly by the REPL thread.

    I have already posted a comprehensive post on eval-when and other advanced Common Lisp constructs. You can find them with a simple search in my blog (use the search box).

  • Modify a shared resource from multiple threads:

    Suppose we have the following setup with a minimal bank-account class (no error checks):

    ;;; Modify a shared resource from multiple threads
    
    (defclass bank-account ()
      ((id :initarg :id
           :initform (error "id required")
           :accessor :id)
       (name :initarg :name
             :initform (error "name required")
             :accessor :name)
       (balance :initarg :balance
                :initform 0
                :accessor :balance)))
    
    (defgeneric deposit (account amount)
      (:documentation "Deposit money into the account"))
    
    (defgeneric withdraw (account amount)
      (:documentation "Withdraw amount from account"))
    
    (defmethod deposit ((account bank-account) (amount real))
      (incf (:balance account) amount))
    
    (defmethod withdraw ((account bank-account) (amount real))
      (decf (:balance account) amount))
    

    And we have a simple client which apparently does not believe in any form of synchronisation:

    (defparameter *rich*
      (make-instance 'bank-account
                     :id 1
                     :name "Rich"
                     :balance 0))
    
    (defun demo-race-condition ()
      (loop repeat 100
         do
           (bt:make-thread
            (lambda ()
              (loop repeat 10000 do (deposit *rich* 100))
              (loop repeat 10000 do (withdraw *rich* 100))))))
    

    This is all we are doing – create a new bank account instance (balance 0), and then create 100 threads, each of which deposits an amount of 100 a total of 10000 times, and then withdraws the same amount the same number of times. So the final balance should be the same as the opening balance, which is 0, right? Let’s check and see.

    On a sample run, we might get the following results:

    CL-USER> (:balance *rich*)
    0
    CL-USER> (dotimes (i 5)
               (demo-race-condition))
    NIL
    CL-USER> (:balance *rich*)
    22844600
    

    Whoa! The reason for this discrepancy is that incf and decf are not atomic operations – they consist of multiple sub-operations, and the order in which they are executed across threads is not in our control.

    This is what is called a “race condition” – multiple threads contending for the same shared resource, with at least one modifying thread which, more likely than not, reads a stale value of the object while modifying it. How do we fix it? One simple way is to use locks (a mutex in this case; semaphores could be used for more complex situations).
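
    We can see the read-modify-write nature of incf directly by macroexpanding it on a simple variable place (output from SBCL; the expansion for a place like (:balance account) is more involved, but follows the same pattern of a separate read, add, and write):

    CL-USER> (macroexpand-1 '(incf x 100))
    (SETQ X (+ X 100))
    T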

  • Modify a shared resource from multiple threads — fixed using locks:

    Let’s reset the balance for the account back to 0 first:

    CL-USER> (setf (:balance *rich*) 0)
    0
    CL-USER> (:balance *rich*)
    0
    

    Now let’s modify the demo-race-condition function to access the shared resource using locks (created using bt:make-lock and used as shown):

    (defvar *lock* (bt:make-lock))
    
    (defun demo-race-condition-locks ()
      (loop repeat 100
         do
           (bt:make-thread
            (lambda ()
              (loop repeat 10000 do (bt:with-lock-held (*lock*)
                                      (deposit *rich* 100)))
              (loop repeat 10000 do (bt:with-lock-held (*lock*)
                                      (withdraw *rich* 100)))))))
    

    And let’s do a bigger sample run this time around:

    CL-USER> (dotimes (i 100)
               (demo-race-condition-locks))
    NIL
    CL-USER> (:balance *rich*)
    0
    

    Excellent! Now this is better. Of course, one has to remember that using a mutex like this is bound to affect performance. There is a better way in quite a few circumstances — using atomic operations when possible. We’ll cover that next.

  • Modify a shared resource from multiple threads — using atomic operations:

    Atomic operations are operations that are guaranteed by the system to occur as a single conceptual transaction, i.e., all the sub-operations of the main operation take place together, without any interference from outside. The operation succeeds completely or fails completely; there is no middle ground, and there is no inconsistent state.

    Another advantage is that performance is far superior to using locks to protect access to the shared state. We will see this difference in the actual demo run.

    The Bordeaux library does not provide any real support for atomics, so we will have to depend on the specific implementation support for that. In our case, that is SBCL, and so we will have to defer this demo to the SBCL section.

  • Joining on a thread, destroying a thread example:

    To join on a thread, we use the bt:join-thread function, and for destroying a thread (not a recommended operation), we can use the bt:destroy-thread function.

    A simple demo:

    (defmacro until (condition &body body)
      "Execute BODY repeatedly until CONDITION evaluates to true."
      `(loop
          until ,condition
          do (progn ,@body)))
    
    (defun join-destroy-thread ()
      (let* ((s *standard-output*)
            (joiner-thread (bt:make-thread
                            (lambda ()
                              (loop for i from 1 to 10
                                 do
                                   (format s "~%[Joiner Thread]  Working...")
                                   (sleep (* 0.01 (random 100)))))))
            (destroyer-thread (bt:make-thread
                               (lambda ()
                                 (loop for i from 1 to 1000000
                                    do
                                      (format s "~%[Destroyer Thread] Working...")
                                      (sleep (* 0.01 (random 10000))))))))
        (format t "~%[Main Thread] Waiting on joiner thread...")
        (bt:join-thread joiner-thread)
        (format t "~%[Main Thread] Done waiting on joiner thread")
        (if (bt:thread-alive-p destroyer-thread)
            (progn
              (format t "~%[Main Thread] Destroyer thread alive... killing it")
              (bt:destroy-thread destroyer-thread))
            (format t "~%[Main Thread] Destroyer thread is already dead"))
        (until (not (bt:thread-alive-p destroyer-thread))
               (format t "[Main Thread] Waiting for destroyer thread to die...")
               (sleep 0.1))
        (format t "~%[Main Thread] Destroyer thread dead")
        (format t "~%[Main Thread] Adios!~%")))
    

    And the output on a run:

    CL-USER> (join-destroy-thread)
    
    [Joiner Thread]  Working...
    [Destroyer Thread] Working...
    [Main Thread] Waiting on joiner thread...
    [Joiner Thread]  Working...
    [Joiner Thread]  Working...
    [Joiner Thread]  Working...
    [Joiner Thread]  Working...
    [Joiner Thread]  Working...
    [Joiner Thread]  Working...
    [Joiner Thread]  Working...
    [Joiner Thread]  Working...
    [Joiner Thread]  Working...
    [Main Thread] Done waiting on joiner thread
    [Main Thread] Destroyer thread alive... killing it
    [Main Thread] Destroyer thread dead
    [Main Thread] Adios!
    NIL
    

    The until macro simply loops around until the condition becomes true. The rest of the code is pretty much self-explanatory — the main thread waits for the joiner-thread to finish, but it immediately destroys the destroyer-thread.

    Again, it is not recommended to use bt:destroy-thread. Any conceivable situation which requires this function can probably be done better with another approach.

    Now let’s move on to some more comprehensive examples which tie together all the concepts discussed thus far.

Useful functions

Here is a summary of the functions, macros and global variables which were used in the demo examples along with some extras. These should cover most of the basic programming scenarios:

  • bt:*supports-threads-p* (to check for basic thread support)
  • bt:make-thread (create a new thread)
  • bt:current-thread (return the current thread object)
  • bt:all-threads (return a list of all running threads)
  • bt:thread-alive-p (checks if the thread is still alive)
  • bt:thread-name (return the name of the thread)
  • bt:join-thread (join on the supplied thread)
  • bt:interrupt-thread (interrupt the given thread)
  • bt:destroy-thread (attempt to abort the thread)
  • bt:make-lock (create a mutex)
  • bt:with-lock-held (use the supplied lock to protect critical code)

SBCL threads

SBCL provides support for native threads via its sb-thread package. These are very low-level functions, but we can build our own abstractions on top of these as shown in the demo examples.

You can refer to the documentation for more details (check the “Wrap-up” section).

Demos

You can see from the examples below that there is a strong correspondence between Bordeaux and SBCL Thread functions. In most cases, the only difference is the change of package name from bt to sb-thread.

It is evident that the Bordeaux thread library was more or less based on the SBCL implementation. As such, explanations will be provided only in those cases where there is a major difference in syntax or semantics.

  • Basics — list current thread, list all threads, get thread name:
    The code:

    ;;; Print the current thread, all the threads, and the current thread's name
    
    (defun print-thread-info ()
      (let* ((curr-thread sb-thread:*current-thread*)
             (curr-thread-name (sb-thread:thread-name curr-thread))
             (all-threads (sb-thread:list-all-threads)))
        (format t "Current thread: ~a~%~%" curr-thread)
        (format t "Current thread name: ~a~%~%" curr-thread-name)
        (format t "All threads:~% ~{~a~%~}~%" all-threads))
      nil)
    

    And the output:

    CL-USER> (print-thread-info)
    Current thread: #<THREAD "repl-thread" RUNNING {10043B8003}>
    
    Current thread name: repl-thread
    
    All threads:
     #<THREAD "repl-thread" RUNNING {10043B8003}>
    #<THREAD "auto-flush-thread" RUNNING {10043B7DA3}>
    #<THREAD "swank-indentation-cache-thread" waiting on: #<WAITQUEUE  {1003A28103}> {1003A201A3}>
    #<THREAD "reader-thread" RUNNING {1003A20063}>
    #<THREAD "control-thread" waiting on: #<WAITQUEUE  {1003A19E53}> {1003A18C83}>
    #<THREAD "Swank Sentinel" waiting on: #<WAITQUEUE  {1003790043}> {1003788023}>
    #<THREAD "main thread" RUNNING {1002991CE3}>
    
    NIL
    
  • Update a global variable from a thread:

    The code:

    ;;; Update a global variable from a thread
    
    (defparameter *counter* 0)
    
    (defun test-update-global-variable ()
      (sb-thread:make-thread
       (lambda ()
         (sleep 1)
         (incf *counter*)))
      *counter*)
    

    And the output:

    CL-USER> (test-update-global-variable)
    0
    
  • Print a message onto the top-level using a thread:

    The code:

    ;;; Print a message onto the top-level using a thread
    
    (defun print-message-top-level-wrong ()
      (sb-thread:make-thread
       (lambda ()
         (format *standard-output* "Hello from thread!")))
      nil)
    

    And the output:

    CL-USER> (print-message-top-level-wrong)
    NIL
    
  • Print a message onto the top-level — fixed:

    The code:

    ;;; Print a message onto the top-level using a thread - fixed
    
    (defun print-message-top-level-fixed ()
      (let ((top-level *standard-output*))
        (sb-thread:make-thread
         (lambda ()
           (format top-level "Hello from thread!"))))
      nil)
    

    And the output:

    CL-USER> (print-message-top-level-fixed)
    Hello from thread!
    NIL
    
  • Print a message onto the top-level — better

    The code:

    ;;; Print a message onto the top-level using a thread - reader macro
    
    (eval-when (:compile-toplevel)
      (defun print-message-top-level-reader-macro ()
        (sb-thread:make-thread
         (lambda ()
           (format #.*standard-output* "Hello from thread!")))
        nil))
    

    And the output:

    CL-USER> (print-message-top-level-reader-macro)
    Hello from thread!
    NIL
    
  • Modify a shared resource from multiple threads:

    The code:

    ;;; Modify a shared resource from multiple threads
    
    (defclass bank-account ()
      ((id :initarg :id
           :initform (error "id required")
           :accessor :id)
       (name :initarg :name
             :initform (error "name required")
             :accessor :name)
       (balance :initarg :balance
                :initform 0
                :accessor :balance)))
    
    (defgeneric deposit (account amount)
      (:documentation "Deposit money into the account"))
    
    (defgeneric withdraw (account amount)
      (:documentation "Withdraw amount from account"))
    
    (defmethod deposit ((account bank-account) (amount real))
      (incf (:balance account) amount))
    
    (defmethod withdraw ((account bank-account) (amount real))
      (decf (:balance account) amount))
    
    (defparameter *rich*
      (make-instance 'bank-account
                     :id 1
                     :name "Rich"
                     :balance 0))
    
    (defun demo-race-condition ()
      (loop repeat 100
         do
           (sb-thread:make-thread
            (lambda ()
              (loop repeat 10000 do (deposit *rich* 100))
              (loop repeat 10000 do (withdraw *rich* 100))))))
    

    And the output:

    CL-USER> (:balance *rich*)
    0
    CL-USER> (demo-race-condition)
    NIL
    CL-USER> (:balance *rich*)
    3987400
    
  • Modify a shared resource from multiple threads — fixed using locks:

    The code:

    (defvar *lock* (sb-thread:make-mutex))
    
    (defun demo-race-condition-locks ()
      (loop repeat 100
         do
           (sb-thread:make-thread
            (lambda ()
              (loop repeat 10000 do (sb-thread:with-mutex (*lock*)
                                      (deposit *rich* 100)))
              (loop repeat 10000 do (sb-thread:with-mutex (*lock*)
                                      (withdraw *rich* 100)))))))
    

    The only difference here is that instead of make-lock as in Bordeaux, we have make-mutex and that is used along with the macro with-mutex as shown in the example.

    And the output:

    CL-USER> (:balance *rich*)
    0
    CL-USER> (demo-race-condition-locks)
    NIL
    CL-USER> (:balance *rich*)
    0
    
  • Modify a shared resource from multiple threads — using atomic operations:

    First, the code:

    ;;; Modify a shared resource from multiple threads - atomics
    
    (defgeneric atomic-deposit (account amount)
      (:documentation "Atomic version of the deposit method"))
    
    (defgeneric atomic-withdraw (account amount)
      (:documentation "Atomic version of the withdraw method"))
    
    (defmethod atomic-deposit ((account bank-account) (amount real))
      (sb-ext:atomic-incf (car (cons (:balance account) nil)) amount))
    
    (defmethod atomic-withdraw ((account bank-account) (amount real))
      (sb-ext:atomic-decf (car (cons (:balance account) nil)) amount))
    
    (defun demo-race-condition-atomics ()
      (loop repeat 100
         do (sb-thread:make-thread
             (lambda ()
               (loop repeat 10000 do (atomic-deposit *rich* 100))
               (loop repeat 10000 do (atomic-withdraw *rich* 100))))))
    

    And the output:

    CL-USER> (dotimes (i 5)
               (format t "~%Opening: ~d" (:balance *rich*))
               (demo-race-condition-atomics)
               (format t "~%Closing: ~d~%" (:balance *rich*)))
    
    Opening: 0
    Closing: 0
    
    Opening: 0
    Closing: 0
    
    Opening: 0
    Closing: 0
    
    Opening: 0
    Closing: 0
    
    Opening: 0
    Closing: 0
    NIL
    

    As you can see, SBCL’s atomic operations are a bit quirky. The two macros used here, sb-ext:atomic-incf and sb-ext:atomic-decf, have the following signatures:


    Macro: atomic-incf [sb-ext] place &optional diff

    and


    Macro: atomic-decf [sb-ext] place &optional diff

    The interesting bit is that the “place” parameter must be one of the following (as per the documentation):

    • a defstruct slot with declared type (unsigned-byte 64), or an aref of a (simple-array (unsigned-byte 64) (*)) - the type sb-ext:word can be used for these purposes
    • the car or cdr (equivalently, the first or rest) of a cons
    • a variable defined using defglobal with a proclaimed type of fixnum

    This is the reason for the bizarre construct used in the atomic-deposit and atomic-withdraw methods.
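
    A caveat worth noting: (car (cons (:balance account) nil)) builds a fresh cons on every call, so the atomic increment lands on that temporary cell rather than on the account’s balance slot (which is, conveniently, also why the balance reads 0 throughout the run - it is never really modified). A more faithful sketch, assuming we are free to keep the balance in a cons cell owned by the account (the atomic-account structure and its helpers below are hypothetical, not part of the original demo), might look like this:

    ;;; Sketch: giving SB-EXT:ATOMIC-INCF a real place to update
    (defstruct atomic-account
      (name "")
      ;; the balance lives in the CAR of this cons - one of the place
      ;; types that SB-EXT:ATOMIC-INCF/ATOMIC-DECF support (the stored
      ;; value must remain a fixnum)
      (balance-cell (cons 0 nil)))

    (defun atomic-deposit* (account amount)
      (sb-ext:atomic-incf (car (atomic-account-balance-cell account)) amount))

    (defun atomic-withdraw* (account amount)
      (sb-ext:atomic-decf (car (atomic-account-balance-cell account)) amount))

    (defun atomic-balance (account)
      (car (atomic-account-balance-cell account)))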

    One major incentive to use atomic operations as much as possible is performance. Let’s run the demo-race-condition-locks and demo-race-condition-atomics functions 100 times each and check the difference in performance (if any):

    With locks:

    CL-USER> (time
               (loop repeat 100
                  do (demo-race-condition-locks)))
    Evaluation took:
      57.711 seconds of real time
      431.451639 seconds of total run time (408.014746 user, 23.436893 system)
      747.61% CPU
      126,674,011,941 processor cycles
      3,329,504 bytes consed
      
    NIL
    

    With atomics:

    CL-USER> (time
               (loop repeat 100
                  do (demo-race-condition-atomics)))
    Evaluation took:
      2.495 seconds of real time
      8.175454 seconds of total run time (6.124259 user, 2.051195 system)
      [ Run times consist of 0.420 seconds GC time, and 7.756 seconds non-GC time. ]
      327.66% CPU
      5,477,039,706 processor cycles
      3,201,582,368 bytes consed
      
    NIL
    

    
    The results? The locks version took around 57 seconds of real time, whereas the lockless atomics version took around 2.5 seconds! This is a massive difference indeed!

  • Joining on a thread, destroying a thread example:

    The code:

    ;;; Joining on and destroying a thread
    
    (defmacro until (condition &body body)
      "Execute BODY repeatedly until CONDITION evaluates to true."
      `(loop
          until ,condition
          do (progn ,@body)))
    
    (defun join-destroy-thread ()
      (let* ((s *standard-output*)
            (joiner-thread (sb-thread:make-thread
                            (lambda ()
                              (loop for i from 1 to 10
                                 do
                                   (format s "~%[Joiner Thread]  Working...")
                                   (sleep (* 0.01 (random 100)))))))
            (destroyer-thread (sb-thread:make-thread
                               (lambda ()
                                 (loop for i from 1 to 1000000
                                    do
                                      (format s "~%[Destroyer Thread] Working...")
                                      (sleep (* 0.01 (random 10000))))))))
        (format t "~%[Main Thread] Waiting on joiner thread...")
        (sb-thread:join-thread joiner-thread)
        (format t "~%[Main Thread] Done waiting on joiner thread")
        (if (sb-thread:thread-alive-p destroyer-thread)
            (progn
              (format t "~%[Main Thread] Destroyer thread alive... killing it")
              (sb-thread:terminate-thread destroyer-thread))
            (format t "~%[Main Thread] Destroyer thread is already dead"))
        (until (not (sb-thread:thread-alive-p destroyer-thread))
               (format t "[Main Thread] Waiting for destroyer thread to die...")
               (sleep 0.1))
        (format t "~%[Main Thread] Destroyer thread dead")
        (format t "~%[Main Thread] Adios!~%")))
    

    And the output:

    CL-USER> (join-destroy-thread)
    
    [Joiner Thread]  Working...
    [Destroyer Thread] Working...
    [Main Thread] Waiting on joiner thread...
    [Joiner Thread]  Working...
    [Joiner Thread]  Working...
    [Joiner Thread]  Working...
    [Joiner Thread]  Working...
    [Joiner Thread]  Working...
    [Joiner Thread]  Working...
    [Joiner Thread]  Working...
    [Joiner Thread]  Working...
    [Joiner Thread]  Working...
    [Main Thread] Done waiting on joiner thread
    [Main Thread] Destroyer thread alive... killing it
    [Main Thread] Destroyer thread dead
    [Main Thread] Adios!
    NIL
    

Useful functions

Here is a summarised list of the functions, macros and global variables used in the examples along with some extras:

  • (member :thread-support *features*) (check thread support)
  • sb-thread:make-thread (create a new thread)
  • sb-thread:*current-thread* (holds the current thread object)
  • sb-thread:list-all-threads (return a list of all running threads)
  • sb-thread:thread-alive-p (checks if the thread is still alive)
  • sb-thread:thread-name (return the name of the thread)
  • sb-thread:join-thread (join on the supplied thread)
  • sb-thread:interrupt-thread (interrupt the given thread)
  • sb-thread:destroy-thread (attempt to abort the thread)
  • sb-thread:make-mutex (create a mutex)
  • sb-thread:with-mutex (use supplied lock to protect critical code)

Wrap-up

As you can see, concurrency support is rather primitive in Common Lisp, but that’s primarily due to the glaring absence of this important feature from the ANSI Common Lisp specification. That does not detract in the least from the support provided by Common Lisp implementations, nor from wonderful libraries like Bordeaux.

You should follow up on your own by reading a lot more on this topic. I share some of my own references here:

Next up, the final post in this mini-series: parallelism in Common Lisp using the lparallel library.


Basic Concurrency and Parallelism in Common Lisp – Part 2 (Bootstrapping a threading-capable Mac OS X SBCL instance from source)

The Problem

The main (and practically only) issue with the default binary distribution available for SBCL on Mac OS X is that it does not come with built-in support for threads:

CL-USER> (format t "~a, ~a~%" (lisp-implementation-type)
                 (lisp-implementation-version))
SBCL, 1.2.11
NIL
CL-USER> (member :thread-support *FEATURES*)
NIL

As we can see, the SBCL version is 1.2.11 and there is no support for threads.

Assuming that the Bordeaux library has already been installed, load it up:

CL-USER> (ql:quickload :bt-semaphore)
To load "bt-semaphore":
  Load 1 ASDF system:
    bt-semaphore
; Loading "bt-semaphore"

(:BT-SEMAPHORE)

This is a bit surprising – if there is no thread support on the platform, shouldn’t the Bordeaux library fail during loading itself? Well, it does not, and it is precisely this fact that one has to be careful about – just because a library downloads and loads successfully does not mean that it is going to run as expected:

CL-USER> bt:*supports-threads-p*
NIL

The Solution

The reason why threading is off by default is that thread support on the Mac OS platform is an experimental feature in SBCL. In order to work with threads on Mac OS X, we need to build SBCL from source with thread support.

The way we do this is by bootstrapping the new compiled version of the source code using the existing SBCL installation. The following steps should make it amply clear:

  • First, we need to download the latest source for SBCL:
    Timmys-MacBook-Pro:Software z0ltan$ wget https://sourceforge.net/projects/sbcl/files/sbcl/1.3.8/sbcl-1.3.8-source.tar.bz2
    --2016-08-29 05:16:40--  https://sourceforge.net/projects/sbcl/files/sbcl/1.3.8/sbcl-1.3.8-source.tar.bz2
    Resolving sourceforge.net... 216.34.181.60
    Connecting to sourceforge.net|216.34.181.60|:443... connected.
    HTTP request sent, awaiting response... 302 Found
    Location: https://sourceforge.net/projects/sbcl/files/sbcl/1.3.8/sbcl-1.3.8-source.tar.bz2/download [following]
    --2016-08-29 05:16:41--  https://sourceforge.net/projects/sbcl/files/sbcl/1.3.8/sbcl-1.3.8-source.tar.bz2/download
    Connecting to sourceforge.net|216.34.181.60|:443... connected.
    HTTP request sent, awaiting response... 302 Found
    Location: http://downloads.sourceforge.net/project/sbcl/sbcl/1.3.8/sbcl-1.3.8-source.tar.bz2?r=&ts=1472428003&use_mirror=liquidtelecom [following]
    --2016-08-29 05:16:43--  http://downloads.sourceforge.net/project/sbcl/sbcl/1.3.8/sbcl-1.3.8-source.tar.bz2?r=&ts=1472428003&use_mirror=liquidtelecom
    Resolving downloads.sourceforge.net... 216.34.181.59
    Connecting to downloads.sourceforge.net|216.34.181.59|:80... connected.
    HTTP request sent, awaiting response... 302 Found
    Location: http://liquidtelecom.dl.sourceforge.net/project/sbcl/sbcl/1.3.8/sbcl-1.3.8-source.tar.bz2 [following]
    --2016-08-29 05:16:45--  http://liquidtelecom.dl.sourceforge.net/project/sbcl/sbcl/1.3.8/sbcl-1.3.8-source.tar.bz2
    Resolving liquidtelecom.dl.sourceforge.net... 197.155.77.8
    Connecting to liquidtelecom.dl.sourceforge.net|197.155.77.8|:80... connected.
    HTTP request sent, awaiting response... 200 OK
    Length: 5754599 (5.5M) [application/octet-stream]
    Saving to: ‘sbcl-1.3.8-source.tar.bz2’
    
    sbcl-1.3.8-source.tar.bz2                          100%[==============================================================================================================>]   5.49M   600KB/s    in 17s     
    
    2016-08-29 05:17:02 (333 KB/s) - ‘sbcl-1.3.8-source.tar.bz2’ saved [5754599/5754599]
    

    Version 1.3.8 is the latest stable version of the source code distribution. If you want to stay bleeding edge, you can download the latest development version from here – https://sourceforge.net/p/sbcl/sbcl/ci/master/tree/

  • Now we unzip, untar, and compile the source code by passing in two flags – --fancy and --with-sb-thread:
    Timmys-MacBook-Pro:Software z0ltan$ bzip2 -cd sbcl-1.3.8-source.tar.bz2 | tar xf -
    Timmys-MacBook-Pro:Software z0ltan$ ls
    ASDF				sbcl-1.2.11-x86-64-darwin	sbcl-1.3.8			sbcl-1.3.8-source.tar.bz2
    Timmys-MacBook-Pro:Software z0ltan$ 
    

    Excellent! Now we have the sbcl-1.3.8 directory which contains the source code and scripts that we need to use next.

    A brief note on the mentioned flags – when you pass in --fancy, you actually enable all the extended features of SBCL. In our case, we’re only interested in threads, so we could try passing in only the --with-sb-thread flag. However, that would possibly not work (at least on my local system it didn’t!).

    The reason is that --fancy enables the extended features even on platforms where they might be unstable, which is precisely the case here. So my recommendation is to pass in both flags as shown:

    Timmys-MacBook-Pro:sbcl-1.3.8 z0ltan$ sudo sh make.sh --fancy --with-sb-thread
    
    <lots of build messages - elided>
    
    The build seems to have finished successfully, including       18 (out of       18)
    contributed modules. If you would like to run more extensive tests on
    the new SBCL, you can try:
    
      cd tests && sh ./run-tests.sh
    
      (All tests should pass on x86/Linux, x86/FreeBSD4, and ppc/Darwin. On other platforms some failures are currently expected; patches welcome as always.)
    
    To build documentation:
    
      cd doc/manual && make
    
    To install SBCL (more information in INSTALL):
    
      sh install.sh
    
    //build started:  Mon Aug 29 05:38:52 IST 2016
    //build finished: Mon Aug 29 05:42:42 IST 2016
    

    The build should take a few minutes to run (just under four minutes on my machine, going by the build timestamps above).

  • Finally, we can install the binaries into the default locations (which can be changed using the INSTALL_ROOT environment variable — refer to the documentation for specifics):
    Timmys-MacBook-Pro:sbcl-1.3.8 z0ltan$ sudo sh install.sh
    Password:
    cp /Users/z0ltan/Software/sbcl-1.3.8/obj/sbcl-home/contrib//uiop.fasl /Users/z0ltan/Software/sbcl-1.3.8/obj/sbcl-home/contrib//asdf.fasl "/usr/local/lib/sbcl/contrib/"
    cp /Users/z0ltan/Software/sbcl-1.3.8/obj/sbcl-home/contrib//sb-aclrepl.fasl /Users/z0ltan/Software/sbcl-1.3.8/obj/sbcl-home/contrib//sb-aclrepl.asd "/usr/local/lib/sbcl/contrib/"
    cp /Users/z0ltan/Software/sbcl-1.3.8/obj/sbcl-home/contrib//sb-bsd-sockets.fasl /Users/z0ltan/Software/sbcl-1.3.8/obj/sbcl-home/contrib//sb-bsd-sockets.asd "/usr/local/lib/sbcl/contrib/"
    cp /Users/z0ltan/Software/sbcl-1.3.8/obj/sbcl-home/contrib//sb-cltl2.fasl /Users/z0ltan/Software/sbcl-1.3.8/obj/sbcl-home/contrib//sb-cltl2.asd "/usr/local/lib/sbcl/contrib/"
    cp /Users/z0ltan/Software/sbcl-1.3.8/obj/sbcl-home/contrib//sb-concurrency.fasl /Users/z0ltan/Software/sbcl-1.3.8/obj/sbcl-home/contrib//sb-concurrency.asd "/usr/local/lib/sbcl/contrib/"
    cp /Users/z0ltan/Software/sbcl-1.3.8/obj/sbcl-home/contrib//sb-cover.fasl /Users/z0ltan/Software/sbcl-1.3.8/obj/sbcl-home/contrib//sb-cover.asd "/usr/local/lib/sbcl/contrib/"
    cp /Users/z0ltan/Software/sbcl-1.3.8/obj/sbcl-home/contrib//sb-executable.fasl /Users/z0ltan/Software/sbcl-1.3.8/obj/sbcl-home/contrib//sb-executable.asd "/usr/local/lib/sbcl/contrib/"
    cp /Users/z0ltan/Software/sbcl-1.3.8/obj/sbcl-home/contrib//sb-gmp.fasl /Users/z0ltan/Software/sbcl-1.3.8/obj/sbcl-home/contrib//sb-gmp.asd "/usr/local/lib/sbcl/contrib/"
    cp /Users/z0ltan/Software/sbcl-1.3.8/obj/sbcl-home/contrib//sb-grovel.fasl /Users/z0ltan/Software/sbcl-1.3.8/obj/sbcl-home/contrib//sb-grovel.asd "/usr/local/lib/sbcl/contrib/"
    cp /Users/z0ltan/Software/sbcl-1.3.8/obj/sbcl-home/contrib//sb-introspect.fasl /Users/z0ltan/Software/sbcl-1.3.8/obj/sbcl-home/contrib//sb-introspect.asd "/usr/local/lib/sbcl/contrib/"
    cp /Users/z0ltan/Software/sbcl-1.3.8/obj/sbcl-home/contrib//sb-md5.fasl /Users/z0ltan/Software/sbcl-1.3.8/obj/sbcl-home/contrib//sb-md5.asd "/usr/local/lib/sbcl/contrib/"
    cp /Users/z0ltan/Software/sbcl-1.3.8/obj/sbcl-home/contrib//sb-mpfr.fasl /Users/z0ltan/Software/sbcl-1.3.8/obj/sbcl-home/contrib//sb-mpfr.asd "/usr/local/lib/sbcl/contrib/"
    cp /Users/z0ltan/Software/sbcl-1.3.8/obj/sbcl-home/contrib//sb-posix.fasl /Users/z0ltan/Software/sbcl-1.3.8/obj/sbcl-home/contrib//sb-posix.asd "/usr/local/lib/sbcl/contrib/"
    cp /Users/z0ltan/Software/sbcl-1.3.8/obj/sbcl-home/contrib//sb-queue.fasl /Users/z0ltan/Software/sbcl-1.3.8/obj/sbcl-home/contrib//sb-queue.asd "/usr/local/lib/sbcl/contrib/"
    cp /Users/z0ltan/Software/sbcl-1.3.8/obj/sbcl-home/contrib//sb-rotate-byte.fasl /Users/z0ltan/Software/sbcl-1.3.8/obj/sbcl-home/contrib//sb-rotate-byte.asd "/usr/local/lib/sbcl/contrib/"
    cp /Users/z0ltan/Software/sbcl-1.3.8/obj/sbcl-home/contrib//sb-rt.fasl /Users/z0ltan/Software/sbcl-1.3.8/obj/sbcl-home/contrib//sb-rt.asd "/usr/local/lib/sbcl/contrib/"
    cp /Users/z0ltan/Software/sbcl-1.3.8/obj/sbcl-home/contrib//sb-simple-streams.fasl /Users/z0ltan/Software/sbcl-1.3.8/obj/sbcl-home/contrib//sb-simple-streams.asd "/usr/local/lib/sbcl/contrib/"
    cp /Users/z0ltan/Software/sbcl-1.3.8/obj/sbcl-home/contrib//sb-sprof.fasl /Users/z0ltan/Software/sbcl-1.3.8/obj/sbcl-home/contrib//sb-sprof.asd "/usr/local/lib/sbcl/contrib/"
    
    SBCL has been installed:
     binary /usr/local/bin/sbcl
     core and contribs in /usr/local/lib/sbcl/
    
    Documentation:
     man /usr/local/share/man/man1/sbcl.1
    

Now that we have a working SBCL instance compiled from source, let’s fire up SLIME and test it out!

Checking the SBCL version:

CL-USER> (lisp-implementation-type)
"SBCL"
CL-USER> (lisp-implementation-version)
"1.3.8"

Load the Bordeaux library and check if threading support is now enabled:

CL-USER> (ql:quickload :bt-semaphore)
To load "bt-semaphore":
  Load 1 ASDF system:
    bt-semaphore
; Loading "bt-semaphore"

(:BT-SEMAPHORE)
CL-USER> bt:*supports-threads-p*
T
CL-USER> (member :thread-support *FEATURES*)
(:THREAD-SUPPORT :SWANK :QUICKLISP :ASDF-PACKAGE-SYSTEM :ASDF3.1 :ASDF3 :ASDF2
 :ASDF :OS-MACOSX :OS-UNIX :NON-BASE-CHARS-EXIST-P :ASDF-UNICODE :64-BIT
 :64-BIT-REGISTERS :ALIEN-CALLBACKS :ANSI-CL :ASH-RIGHT-VOPS :BSD
 :C-STACK-IS-CONTROL-STACK :COMMON-LISP :COMPARE-AND-SWAP-VOPS
 :COMPLEX-FLOAT-VOPS :CYCLE-COUNTER :DARWIN :DARWIN9-OR-BETTER :FLOAT-EQL-VOPS
 :FP-AND-PC-STANDARD-SAVE :GENCGC :IEEE-FLOATING-POINT :INLINE-CONSTANTS
 :INODE64 :INTEGER-EQL-VOP :LINKAGE-TABLE :LITTLE-ENDIAN
 :MACH-EXCEPTION-HANDLER :MACH-O :MEMORY-BARRIER-VOPS :MULTIPLY-HIGH-VOPS
 :OS-PROVIDES-BLKSIZE-T :OS-PROVIDES-DLADDR :OS-PROVIDES-DLOPEN
 :OS-PROVIDES-PUTWC :OS-PROVIDES-SUSECONDS-T :PACKAGE-LOCAL-NICKNAMES
 :PRECISE-ARG-COUNT-ERROR :RAW-INSTANCE-INIT-VOPS :SB-DOC :SB-EVAL :SB-LDB
 :SB-PACKAGE-LOCKS :SB-SIMD-PACK :SB-SOURCE-LOCATIONS :SB-TEST :SB-THREAD
 :SB-UNICODE :SBCL :STACK-ALLOCATABLE-CLOSURES :STACK-ALLOCATABLE-FIXED-OBJECTS
 :STACK-ALLOCATABLE-LISTS :STACK-ALLOCATABLE-VECTORS
 :STACK-GROWS-DOWNWARD-NOT-UPWARD :SYMBOL-INFO-VOPS :UD2-BREAKPOINTS :UNIX
 :UNWIND-TO-FRAME-AND-CALL-VOP :X86-64)

Brilliant! Finally, let’s test out a simple example to make sure that threads actually do work now:

CL-USER> (defun test-sbcl-thread ()
           (let ((top-level *query-io*))
             (sb-thread:make-thread
              (lambda ()
                (format top-level "Hello from ~a~%"
                        (sb-thread:thread-name sb-thread:*current-thread*))))))
TEST-SBCL-THREAD
CL-USER> (test-sbcl-thread)
Hello from NIL
#<SB-THREAD:THREAD FINISHED values: NIL {1005132733}>

and the same example using the Bordeaux library:

CL-USER> (defun test-bordeaux-thread ()
           (let ((top-level *query-io*))
             (bt:make-thread
              (lambda ()
                (format top-level "Hello from ~a~%"
                        (bt:thread-name (bt:current-thread)))))))
TEST-BORDEAUX-THREAD
CL-USER> (test-bordeaux-thread)
Hello from Anonymous thread
#<SB-THREAD:THREAD "Anonymous thread" FINISHED values: NIL {1002C1A7B3}>

Success! As you can see, it’s really rather simple to build SBCL from source (even if you simply want to upgrade). There are a few other tweaks that you can perform when installing SBCL from source whereas with a binary installation, you are pretty much stuck in terms of functionality.

One advantage of this approach is that we can even bootstrap SBCL using an existing installation of almost any of the major Common Lisp implementations. Again, the documentation is the final word on the matter.

Next up in this mini-series of blogs, basic concurrency in Common Lisp using SBCL threads and the Bordeaux threads library.
