Creating custom Java 8 Stream collectors

The streams feature (which makes heavy use of Functional Interfaces) is arguably the strongest feature in Java 8 (and above). The introduction of lambdas in Java in the same release to target Functional Interfaces also meant that creating chained operations in a functional style has never been easier in Java.

That being said, there are plenty of examples in the official docs that show how streams can be “collected” (effectively reduced or folded, depending on your preferred terminology) into Collection types – List, Set, or even Map. Now, I must make it absolutely clear at this stage that I love Java streams, and barring the absence (or rather, deprecation) of the zip iterator, the API is almost comprehensive. However, there are no examples in the docs to show how we might collect a series of intermediate operations into a custom type. Of course, the helper class Collectors has several helper methods such as groupingBy, partitioningBy, filtering, and reducing, but they either return a Map or expect a reducible expression, which may not always be what we have, as explained next.

Recently, I did a project in which I needed to process a stream of integers acting as indices into a simple wrapper around a List of integers, and then ultimately collect the updated values of the list into a new instance of the custom type. (The lack of zip forced me to take quite the peripatetic approach to finally make things work. Perhaps more on that in a later blogpost.) It was quite an interesting experience that sparked interest in exploring how much further we could push the collect mechanism. (If you are interested in checking out the code for the mentioned example, you can find it here – Functional Nim.)

For some more examples of custom Collector implementations, you can check out my Github page.

Use Case

For the purposes of this blog, to keep things simple, let us consider a hypothetical example. Suppose we have a Point class with the following structure:

package com.z0ltan.custom.collectors.types;

public class Point {
	private int x;
	private int y;

	public Point(final int x, final int y) {
		this.x = x;
		this.y = y;
	}

	public int x() {
		return this.x;
	}

	public int y() {
		return this.y;
	}

	@Override
	public int hashCode() {
		return (this.x + this.y) % 31;
	}

	@Override
	public boolean equals(Object other) {
		if (other == null || !(other instanceof Point)) {
			return false;
		}

		Point p = (Point) other;

		return p.x == this.x && p.y == this.y;
	}

	@Override
	public String toString() {
		return "Point { x = " + x + ", y = " + y + " }";
	}
}

and we have a List of such points. Now suppose we wish to collate the information into a custom Points object which has the following structure:

package com.z0ltan.custom.collectors.types;

import java.util.Set;

public class Points {
	private Set<Integer> xs;
	private Set<Integer> ys;

	public Points(final Set<Integer> xs, final Set<Integer> ys) {
		this.xs = xs;
		this.ys = ys;
	}

	@Override
	public int hashCode() {
		return (this.xs.hashCode() + this.ys.hashCode()) % 31;
	}

	@Override
	public boolean equals(Object other) {
		if (other == null || !(other instanceof Points)) {
			return false;
		}

		Points p = (Points) other;

		return p.xs.equals(this.xs) && p.ys.equals(this.ys);
	}

	@Override
	public String toString() {
		return "Points { xs = " + xs + ", ys = " + ys + " }";
	}
}

As can be seen, we wish to retain only the unique x and y coordinate values into our final object.

A simple and logical way would be to collect the items from this stream (collect being a “terminal operation” defined in the Stream interface) into our Points object using a custom collector. Before we can do that, let us first understand what is involved in implementing a custom collector.

How to implement a custom collector

In brief, there are two forms of operations involved in Java streams – non-terminal or intermediate operations, which produce streams of their own, and terminal operations, which effectively stop the pipeline resulting in a final result of some sort.

As mentioned before, in most cases, the helper class, Collectors, provides enough functionality to cater to almost any requirement imaginable. However, in cases such as these, where we want to collect data into a custom type, we might be better off defining our own custom collector.

To do that, let us examine the signature of the collect method in the Stream interface. In fact, we will find that there are two versions of this method available to us:

 <R, A> R collect(Collector<? super T, A, R> collector)

and

 <R> R collect(Supplier<R> supplier,
               BiConsumer<R, ? super T> accumulator,
               BiConsumer<R, R> combiner)

So which one do we use? Well, we can actually use either one for our purposes. In fact, the former is preferable if we have some involved logic, and we wish to encapsulate all of that in a nice class. Functionally speaking, the latter does much the same job, except that it has no separate finishing step: the container created by the supplier is itself the result. This can be further clarified by examining the Collector interface (showing only the abstract methods):

public interface Collector<T, A, R> {
        Supplier<A> supplier();
        BiConsumer<A, T> accumulator();
        BinaryOperator<A> combiner();
        Function<A, R> finisher();
        Set<Collector.Characteristics> characteristics();
}

As can be seen, if we do implement the Collector interface, we will have to implement essentially the same methods as those used by the second version of the collect method. In addition, we have a couple of extra methods which are not only interesting, but quite vital if we wish to implement the interface:

  • finisher: This is the main method that we need to implement as per our collection logic. This is the actual part where the accumulated values of the stream are massaged together into the final return value. The type parameters give a big hint in this regard – the return type, R, is the same as that returned by the overall collect method.
  • characteristics: This is where we need to be careful. The enum has three variants – CONCURRENT, IDENTITY_FINISH, and UNORDERED. The bottom line is this – declare UNORDERED only if the final value does not depend on the encounter order of the values in the stream, and declare CONCURRENT only if the accumulator can safely be called from multiple threads on the same result container. If the result does depend on ordering, declare neither. In the case of collecting values into a custom non-collection type, I don’t see any scenario where you would want to use IDENTITY_FINISH (unless you are a big fan of unsolicited ClassCastExceptions).

    In short, this variant indicates that the finisher function is essentially an identity function, meaning that it can be skipped, and the currently accumulated value returned as the overall result (which is precisely what we wish to avoid).
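To see why IDENTITY_FINISH is a poor fit when the accumulation type differs from the result type, here is a minimal sketch. The class and method names (IdentityFinishDemo, misdeclared) are made up for illustration. The collector accumulates into a List<Integer>, promises a String, and wrongly claims that its finisher is an identity function, so the stream hands back the raw list and the cast at the call site should fail.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Stream;

public class IdentityFinishDemo {
    // Hypothetical collector: accumulates into a List<Integer> but promises a String,
    // while (wrongly) declaring that the finisher can be skipped.
    static Collector<Integer, List<Integer>, String> misdeclared() {
        return Collector.of(
                ArrayList::new,                                        // supplier
                List::add,                                             // accumulator
                (left, right) -> { left.addAll(right); return left; }, // combiner
                list -> list.toString(),                               // finisher (skipped!)
                Collector.Characteristics.IDENTITY_FINISH);
    }

    // Returns true if collecting with the misdeclared collector ends in a ClassCastException.
    static boolean misdeclaredCollectorThrows() {
        try {
            // The finisher is skipped, so the List itself is returned and cast to String.
            String text = Stream.of(1, 2, 3).collect(misdeclared());
            return text == null;
        } catch (ClassCastException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("ClassCastException thrown: " + misdeclaredCollectorThrows());
    }
}
```

Dropping the IDENTITY_FINISH argument lets the finisher run, and the same pipeline then yields a String as declared.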

One final comment to understand the collect method once and for all – what all those terms mean!

  • Supplier: This creates the new, empty mutable container into which the elements of the stream will be accumulated.
  • Accumulator: This is where each element of the stream is folded into the running container (which may be of a different type from the elements themselves) – “reduced”, or “folded”, in Functional terms.
  • Combiner: Similar to the accumulator, but the two values being merged are of the same type – both are partial result containers, as produced when a stream is processed in parallel. In most cases, this type would be a collection type, and finally,
  • Finisher: This is the meat of the whole collector. This is where the actual custom logic goes to turn the accumulated (and combined) container into the final result of the given return type.
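The first three roles can be sketched with the three-argument form of collect, which has no finisher at all. This is only an illustration: Coordinates is a hypothetical mutable holder standing in for the Points type, and each point is modelled as an int[] {x, y} to keep the example self-contained.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class ThreeArgCollectDemo {
    // Hypothetical mutable holder; it is both the accumulation container and the result.
    static final class Coordinates {
        final Set<Integer> xs = new TreeSet<>();
        final Set<Integer> ys = new TreeSet<>();

        // Accumulator step: fold one stream element into this container.
        void add(int[] point) {
            xs.add(point[0]);
            ys.add(point[1]);
        }

        // Combiner step: fold another partial container into this one.
        void merge(Coordinates other) {
            xs.addAll(other.xs);
            ys.addAll(other.ys);
        }
    }

    static Coordinates collect(List<int[]> points) {
        return points.stream().collect(
                Coordinates::new,    // supplier: a fresh, empty container
                Coordinates::add,    // accumulator
                Coordinates::merge); // combiner (only needed for parallel streams)
    }

    public static void main(String[] args) {
        Coordinates c = collect(Arrays.asList(
                new int[] {1, 2}, new int[] {1, 2}, new int[] {3, 4},
                new int[] {4, 3}, new int[] {2, 5}, new int[] {2, 5}));
        // prints "xs = [1, 2, 3, 4], ys = [2, 3, 4, 5]"
        System.out.println("xs = " + c.xs + ", ys = " + c.ys);
    }
}
```

Because there is no finishing step here, the container type has to double as the result type, which is exactly the restriction that a full Collector lifts.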

Now that we’ve analysed the signature of the collect method, we must be in a position to realise that we can actually create custom collectors in multiple ways:

  • Using the static Collector.of methods in the Collector interface by supplying the correct supplier, accumulator, combiner, finisher, and Collector characteristics,
  • By creating a class that implements the Collector interface itself, and thereby providing implementations of the same supplier, accumulator, combiner, finisher, and Collector characteristics,
  • By simply creating an anonymous class conforming to the Collector interface, and providing the same inputs as in the previous two cases, or
  • Using any combination of the above.
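As a rough sketch of the first option, here is what a Collector.of based collector might look like. Summary is a hypothetical immutable result type standing in for Points, and points are again modelled as int[] {x, y}; neither is part of the project code shown later.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collector;

public class CollectorOfDemo {
    // Hypothetical immutable result type, standing in for the article's Points class.
    static final class Summary {
        final Set<Integer> xs;
        final Set<Integer> ys;

        Summary(Set<Integer> xs, Set<Integer> ys) {
            this.xs = Collections.unmodifiableSet(xs);
            this.ys = Collections.unmodifiableSet(ys);
        }
    }

    // Builds the collector from four functions; no characteristics are declared.
    static Collector<int[], List<int[]>, Summary> toSummary() {
        return Collector.of(
                ArrayList::new,                                        // supplier
                List::add,                                             // accumulator
                (left, right) -> { left.addAll(right); return left; }, // combiner
                points -> {                                            // finisher
                    Set<Integer> xs = new TreeSet<>();
                    Set<Integer> ys = new TreeSet<>();
                    for (int[] p : points) {
                        xs.add(p[0]);
                        ys.add(p[1]);
                    }
                    return new Summary(xs, ys);
                });
    }

    static Summary summarise(List<int[]> points) {
        return points.stream().collect(toSummary());
    }

    public static void main(String[] args) {
        Summary s = summarise(Arrays.asList(
                new int[] {1, 2}, new int[] {3, 4}, new int[] {4, 3}, new int[] {2, 5}));
        // prints "xs = [1, 2, 3, 4], ys = [2, 3, 4, 5]"
        System.out.println("xs = " + s.xs + ", ys = " + s.ys);
    }
}
```

This form is handy for one-off collectors; a dedicated class tends to read better once the finisher grows beyond a few lines.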

To keep matters simple, let us create a custom class that implements the Collector interface. This will not only make things easier to understand, but also allow us to maintain code cleanliness.

Now let’s proceed with the implementation of the given use case to solidify these concepts.

Implementation and Demo

Let’s create a simple Maven project called custom-collector:

Macushla:Blog z0ltan$ mvn archetype:generate -DgroupId=com.z0ltan.custom.collectors -DartifactId=custom-collector -DarchetypeArtifactId=maven-archetype-quickstart -DinteractiveMode=false
[INFO] Scanning for projects...
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] Building Maven Stub Project (No POM) 1
[INFO] ------------------------------------------------------------------------

               <elided>

[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 30.967 s
[INFO] Finished at: 2017-07-11T20:57:45+05:30
[INFO] Final Memory: 18M/62M
[INFO] ------------------------------------------------------------------------

After customising the project to our heart’s desire, let’s fill in our custom collector class:

package com.z0ltan.custom.collectors.collectors;

import java.util.ArrayList;
import java.util.Collections;
import java.util.EnumSet;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.BiConsumer;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Collector;

import com.z0ltan.custom.collectors.types.Point;
import com.z0ltan.custom.collectors.types.Points;

public class PointToPointsCollector implements Collector<Point, List<Point>, Points> {
	@Override
	public Supplier<List<Point>> supplier() {
		return ArrayList::new;
	}

	@Override
	public BiConsumer<List<Point>, Point> accumulator() {
		return List::add;
	}

	@Override
	public BinaryOperator<List<Point>> combiner() {
		return (acc, ps) -> {
			acc.addAll(ps);
			return acc;
		};
	}

	@Override
	public Function<List<Point>, Points> finisher() {
		return (points) -> {
			final Set<Integer> xs = new HashSet<>();
			final Set<Integer> ys = new HashSet<>();
			
			for (Point p : points) {
				xs.add(p.x());
				ys.add(p.y());
			}
			
			return new Points(xs, ys);
		};
	}

	@Override
	public Set<java.util.stream.Collector.Characteristics> characteristics() {
		return Collections.unmodifiableSet(EnumSet.of(Collector.Characteristics.UNORDERED));
	}
}

We use ArrayList::new (method references are another excellent feature in Java 8 and beyond) for our Supplier since we start off with a blank slate, and for the Accumulator, we use List::add since the last section made it clear that the accumulator’s only job is to keep collecting items into a running value of another type (a List in this case).

Then we have the Combiner which is implemented by the little lambda expression:

   (acc, ps) -> { acc.addAll(ps); return acc; }

As mentioned in the previous section, the combiner simply flattens the collections together into a single collection. In case of confusion, always look to the type signature for clarity.

And finally, we have the Finisher:

return (points) -> {
			final Set<Integer> xs = new HashSet<>();
			final Set<Integer> ys = new HashSet<>();
			
			for (Point p : points) {
				xs.add(p.x());
				ys.add(p.y());
			}
			
			return new Points(xs, ys);
		};

At this point of the stream pipeline, the points variable holds the list of accumulated Point objects. All we do then is to create an instance of the Points class by using the data available in the points variable. The whole point (if you will forgive the pun) is that this method will have logic peculiar to your specific use case, so the implementation will vary tremendously (which is more than can be said about the others – supplier, accumulator, and combiner).

And finally, here is our main class:

package com.z0ltan.custom.collectors;

import java.util.Arrays;
import java.util.List;

import com.z0ltan.custom.collectors.collectors.PointToPointsCollector;
import com.z0ltan.custom.collectors.types.Point;
import com.z0ltan.custom.collectors.types.Points;

public class Main {
	public static void main(String[] args) {
		final List<Point> points = Arrays.asList(new Point(1, 2), new Point(1, 2), new Point(3, 4), new Point(4, 3),
				new Point(2, 5), new Point(2, 5));

		// the result of our custom collector
		final Points pointsData = points.stream().collect(new PointToPointsCollector());

		System.out.printf("\npoints = %s\n", points);
		System.out.printf("\npoints data = %s\n", pointsData);
	}
}

Well, let’s run it and see the output!

Macushla:custom-collector z0ltan$ mvn package && java -jar target/custom-collector-1.0-SNAPSHOT.jar
[INFO] Scanning for projects...
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] Building custom-collector 1.0-SNAPSHOT
        <elided>
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 2.627 s
[INFO] Finished at: 2017-07-11T21:31:19+05:30
[INFO] Final Memory: 16M/55M
[INFO] ------------------------------------------------------------------------

points = [Point { x = 1, y = 2 }, Point { x = 1, y = 2 }, Point { x = 3, y = 4 }, Point { x = 4, y = 3 }, Point { x = 2, y = 5 }, Point { x = 2, y = 5 }]

points data = Points { xs = [1, 2, 3, 4], ys = [2, 3, 4, 5] }

Success!

Interop mini-series – Callbacks special! (Common Lisp special) (Part 2b)

This is a continuation of the previous post callbacks interlude. I decided to give the section pertaining to Common Lisp its own post as I think there is some good educational value in this part itself.

We carry on from where we left off last time. We continue with the same squaring number callback example.

As a quick refresher, the idea is to implement a synchronous callback scenario. The function client invokes another function squarify which squares the passed value and invokes a callback function callback.

How it’s done in Common Lisp

Let’s start off with our first attempt to implement the solution in Common Lisp.

;;;; Callback demo using the squarify example.

(defpackage :callback-demo-user
  (:use :cl))

(in-package :callback-demo-user)

(defun callback(n)
  (format t "Received: ~d~%" n))

(defun squarify(n cb)
  (funcall cb (* n n)))

(defun client ()
  (let ((n (progn
             (princ "Enter a number: ")
             (read))))
    (squarify n #'callback)))

CALLBACK-DEMO-USER> (client)
Enter a number: 19
Received: 361
NIL

That’s the direct equivalent of all the demos shown so far. However, since Common Lisp is a functional language (albeit not as pure as, say, Scheme or Haskell), we can certainly do better!

In most Functional Programming languages, higher order functions are usually deployed to do the job. So let’s see if we can cook up something nicely functional like function composition.
Here’s a first attempt:

(defun client()
  (funcall #'(lambda (n)
               (format t "Received: ~d~%" n))
           (funcall #'(lambda (n)
                        (* n n))
                    (funcall #'(lambda ()
                                 (princ "Enter number: ")
                                 (read))))))

Which produces:

CALLBACK-DEMO-USER> (client)
Enter number: 19
Received: 361
NIL

As expected! Now, as you may know, funcall simply takes a function and some arguments (optional), and applies the function to those arguments. In this case, we simply compose them in the proper order so that the types match up: read a number -> square it -> print message.

However, let’s work our way to a generic compose function that simulates the behaviour of Haskell’s composition operator. The previous function can be improved by defining a new version that composes the three functions in the mentioned order (so as to match types):

The compose function:

(defun compose (fn gn hn)
  #'(lambda (&rest args)
      (funcall fn (funcall gn (apply hn args)))))

And the client to test it:

(defun client ()
  (funcall (compose #'(lambda (x)
                        (format t "Received: ~d~%" x))
                    #'(lambda (x)
                        (* x x))
                    #'(lambda ()
                        (princ "Enter a number: ")
                        (read)))))

And the output is the same:

CALLBACK-DEMO-USER> (client)
Enter a number: 19
Received: 361
NIL

So what’s changed? Well, taking inspiration from the nested funcall version, we defined compose to invoke the functions in the proper order – first read the number, then square it, and finally print it! (Remember that the functions are applied in the reverse of the order in which they are entered.)

Note that the last function invocation is done using apply instead of funcall because &rest args produces a list of arguments, and funcall would pass that list along as a single argument (which only works if the function itself takes a list as its parameter, but that is not the general case). apply, on the other hand, works very well with lists and spreads them into individual arguments correctly.

How can we make this generic enough though? We notice the pattern – we invoke apply on the innermost function call, but we use funcall for the rest of the function call chain. This means that we must handle two cases – if there is a single function passed in, we should simply use apply on that, and if not, we should take care to chain them up as discussed. This lends itself to a nice recursive definition as shown next.

The updated compose function:

(defun compose (&rest funcs)
  (labels ((f (funcs args)
             (if (null (cdr funcs))
                 (apply (car funcs) args)
                 (funcall (car funcs) (f (cdr funcs) args)))))
    #'(lambda (&rest args)
        (f funcs args))))

And the test client for it:

(defun client ()
  (funcall (compose #'(lambda (x)
                        (format t "Received: ~d~%" x))
                    #'(lambda (x)
                        (* x x))
                    #'(lambda ()
                        (princ "Enter number: ")
                        (read)))))

And the output:

CALLBACK-DEMO-USER> (client)
Enter number: 19
Received: 361
NIL

Explanation: What we do is simply generalise the three-function version of compose into a generic function. For this, we define an internal function f that takes the supplied functions and the arguments as input.

f then recursively decomposes the function applications. The base condition (stopping condition) is when there is only one function left. The (if (null (cdr funcs)) bit then takes care to return the only apply call that we need, and that is of course, applied to the args argument.

As the recursion unwinds the call stack, successive funcalls are applied at each stage. This is exactly in line with the algorithm discussed at the end of the last section.

Now we are almost home and dry! Pay special attention to the order in which the lambda equivalents of the functions are entered in the client function. They are supplied in the following order – callback, squarify, and then the reader (the body of client) – and they are applied in the reverse of that order.

We could stop here, but there’s one more change that we can make. The current version of compose works absolutely as expected, but the order of supplying functions is the opposite of what we would intuitively expect as a user. The expected order would be, in English, “read in the number, square it, and then print out a message indicating that the number was received”.

Let’s fix that last bit for our final version of compose.

Final version of compose:

;;; final version of compose
(defun compose(&rest funcs)
  (labels ((f (funcs args)
             (if (null (cdr funcs))
                 (apply (car funcs) args)
                 (funcall (car funcs) (f (cdr funcs) args)))))
    #'(lambda (&rest args)
        (f (reverse funcs) args))))

And the corresponding test code:

;;; test out the final version of compose
(defun client ()
  (funcall (compose #'(lambda ()
                        (princ "Enter a number: ")
                        (read))
                    #'(lambda (x)
                        (* x x))
                    #'(lambda (x)
                        (format t "Received: ~d~%" x)))))

And now let’s test out and see if it works!

CALLBACK-DEMO-USER> (client)
Enter a number: 19
Received: 361
NIL

Success!

The only difference is this line: (f (reverse funcs) args). We simply reverse the order of the received functions while passing it to the recursive function f, and the rest of the code remains exactly the same!
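For readers who followed the Java posts, the same left-to-right ordering can be sketched with Function.andThen. This is only a loose analogue and not a translation of the Lisp code: the compose helper below is hypothetical, it assumes that every function maps a type to itself, and it does not handle a first function that takes zero or several arguments the way &rest args does.

```java
import java.util.function.Function;

public class ComposeDemo {
    // Left-to-right composition: the first function supplied is applied first.
    @SafeVarargs
    static <T> Function<T, T> compose(Function<T, T>... funcs) {
        Function<T, T> result = Function.identity();
        for (Function<T, T> f : funcs) {
            // andThen applies the functions collected so far, then f.
            result = result.andThen(f);
        }
        return result;
    }

    public static void main(String[] args) {
        Function<Integer, Integer> pipeline = compose(x -> x + 1, x -> x * x);
        System.out.println(pipeline.apply(3)); // prints 16, i.e. (3 + 1) squared
    }
}
```

Swapping andThen for compose in the loop would give the right-to-left order of the earlier versions.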

And, of course, this is purely functional! Sweet, ain’t it?

The compose function could be optimised in multiple ways – converting it to an iterative version for instance, but conceptually, this works exactly as advertised.

Conclusion

This post illustrates why I love Common Lisp! Even as I make my journey through the world of Common Lisp, my admiration for it only grows. If there is some feature that we would like to incorporate into the language, it can be done in just a few lines of code! No other language truly comes close in terms of expressiveness and extensibility.

Interop mini-series – Callbacks special! (Part 2a)

This post was actually meant to be part of the previous post (Calling C and C++ from Common Lisp).

However, as I began writing the section on “callbacks”, it started growing to such an extent that I decided to give it its own post with a slightly more comprehensive treatment than originally planned!

Contents

  1. What is a callback?
    1. Uses of callbacks
    2. Methods of implementation
  2. Demos
    1. How it’s done in C
    2. How it’s done in C++
    3. How it’s done in Java
    4. How it’s done in Common Lisp
    5. How it’s done in other languages
      1. JavaScript
      2. Python
  3. References

What exactly is a callback?

A callback, in essence, is simply a function that is executed by another function which has a reference of sorts to the first function. Yes, that’s really it!

Uses

One major use is to ensure proper separation of concerns.

Suppose we are writing some client code that makes use of a library, and say that our client function wishes to invoke a library function. Now, this library function executes code that might result in some form of platform specific signalling that will need to be handled in disparate ways depending on the specific signal received. The library writer could not possibly have imagined all the scenarios for such signalling when he was writing the library. So how can this work? Callbacks to the rescue!

So what the library writer did was to hold a reference to a callback function in his own function, and then his function invokes this callback function as and when the need arises (say an error condition or an OS interrupt). The callback function then takes care of all the handling and bookkeeping involved.

This callback function is, of course, expected to be supplied by the client code. This makes sense since the client has the best knowledge of its own domain. This then means that the library writer can make his code as generic as possible, leaving the specifics for the client to manage.

Another common use of callbacks is asynchronous programming. For example, suppose we have a function that needs to be activated when some specific conditions have arisen, and those conditions are decided by some other code. This is a good case to use a callback.

The current function can “register” itself with the condition-generating code, and then that code can invoke a callback in the current function’s module, which can then proceed to completion. Node, in particular, makes extensive use of this approach. The general Observer pattern is, in essence, the generalisation of a callback.
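The registration idea can be sketched in a few lines of Java. The names here (EventSource, register, fire) are hypothetical, and the callbacks are delivered synchronously to keep the sketch simple; an asynchronous source would invoke them later from another thread or an event loop.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class RegistrationDemo {
    // Hypothetical "condition-generating" component that remembers the registered callbacks.
    static final class EventSource {
        private final List<Consumer<String>> listeners = new ArrayList<>();

        void register(Consumer<String> listener) {
            listeners.add(listener);
        }

        // Invokes every registered callback with the event, in registration order.
        void fire(String event) {
            for (Consumer<String> listener : listeners) {
                listener.accept(event);
            }
        }
    }

    // Registers two callbacks, fires two events, and returns what the first callback saw.
    static List<String> run() {
        List<String> received = new ArrayList<>();
        EventSource source = new EventSource();
        source.register(received::add);
        source.register(e -> System.out.println("Received: " + e));
        source.fire("ready");
        source.fire("done");
        return received;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The source knows nothing about what its listeners do with an event, which is the separation of concerns described above.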

Implementation

Callbacks may be implemented through various means – function pointers, function objects, or lambda abstractions. The important bit is to understand the concept and defer the specifics of the modes of implementation to the language at hand.

Callbacks can be either synchronous or asynchronous (think Node).

So much for the concept. As far as the terminology goes, it is important to remember that the callback itself is the actual function that is invoked by the function that takes the callback as the parameter. A lot of confusion arises precisely because some people tend to assume that the function taking the function parameter is the callback function. Quite the contrary, as we have just seen. The one mnemonic that always works for me is to remember that both the client function and the callback function are in the same conceptual module.

Finally, a caveat – extensive use of callbacks can lead to what is known as “callback hell” (check the Reference section – there is a whole site dedicated to it!). The rule of thumb is to use a callback only when it is absolutely needed. Otherwise, it can lead to code which is both unreadable and unmaintainable.

Demos

Let’s now take a brief look at how the functionality offered by callbacks is implemented in various languages. Of course, there may be different mechanisms for doing so, but I have chosen what I feel to be the idiomatic form in each language under discussion.

For all these examples, we will consider the same example – we have a function (squarify) which takes two parameters – a number and a callback function (callback). squarify simply squares the parameter, and then invokes callback with the squared value.

callback simply prints out the received value with a small message. The whole chain is triggered by another function client, which invokes squarify.

Note that all the examples here are, for the sake of simplicity, synchronous.

How it’s done in C

In C and C++, we make use of function pointers like so:

#include <stdio.h>

void squarify(int, void(*)(int));
void callback(int);
void client();

int main()
{
    client();

    return 0;
}

void client()
{
    int n;
    
    printf("Enter a number: ");
    scanf("%d", &n);

    squarify(n, &callback);
}

void squarify(int n, void (*cb)(int))
{
    (*cb)(n*n);
}

void callback(int n)
{
    printf("Received %d\n", n);
}

And the output:

Timmys-MacBook-Pro:C z0ltan$ gcc -Wall -o callbackdemo callbackdemo.c
 
Timmys-MacBook-Pro:C z0ltan$ ./callbackdemo 
Enter a number: 19
Received 361

Notice how we pass the address of callback to squarify using &callback inside the client function.

How it’s done in C++

The technique used in the C example (declaring the callback as a function pointer parameter to squarify and then passing it the address of callback at runtime) will work just the same way in C++ as well.

However, in addition, C++ offers a whole lot more ways of achieving the same result. Let’s explore three of these in the same demo – lambda abstractions, plain functions, and functors (function objects).

To this end, we use a std::function object to hold a reference to the callback in squarify. This class type is specified in the <functional> header.

The logic remains unchanged from that used in the C demo.

Note that this code only works in C++11 (or above).

//C++11 or above
#include <iostream>
#include <functional>

// Define the functor class
struct backcall {
    void operator()(int n) const
    {
        std::cout << "Received: " << n << std::endl;
    }
};

void squarify(int, std::function<void(int)>);
void callback(int);
void client();

int main()
{
    client();

    return 0;
}

void client()
{
    int n;

    std::cout << "Enter a number: ";
    std::cin >> n;

    // simply pass in a lambda abstraction!
    squarify(n, [](int x) 
				{ std::cout << "Received: " 
					<< x << std::endl; 
			});
    
    // or specify a function explicitly
    squarify(n, callback);

    // or pass in a functor!
    squarify(n, backcall());
}

void squarify(int n, std::function<void(int)> cb)
{
    cb(n*n);
}

void callback(int n)
{
    std::cout << "Received: " << n << std::endl;
}

And the output:

Timmys-MacBook-Pro:C++ z0ltan$ g++ -std=c++11 -Wall -o callbackdemo callbackdemo.cpp 

Timmys-MacBook-Pro:C++ z0ltan$ ./callbackdemo 
Enter a number: 19
Received: 361
Received: 361
Received: 361

Et voila!

How it’s done in Java

In Java, the situation is a bit more complicated than usual for many reasons – lack of pure function objects, extreme verboseness, lack of pure generic functions, etc.

However, the code below demonstrates how we would do it pre-Java 8 (and frankly, most code written today still follows this idiomatic approach).

import java.io.InputStreamReader;
import java.io.BufferedReader;
import java.io.IOException;

interface Callback {
    void call(int x);
}

public class CallbackDemo {
    public static void main(String[] args) {
            client(); 
    }

    public static void client() {
        int n;

        Callback cb = new Callback() {
                        @Override
                        public void call(int n) {
                            System.out.println("Received: " + n);
                        }
                    };

        try (BufferedReader reader = new BufferedReader(new InputStreamReader(System.in))) {
                System.out.print("Enter a number: ");
                n = Integer.parseInt(reader.readLine());
                squarify(n, cb);
        } catch (NumberFormatException |IOException ex) {
            ex.printStackTrace();
        }
    }

    public static void squarify(int n, Callback callback) {
        callback.call(n*n);
    }
}

Timmys-MacBook-Pro:Java z0ltan$ javac CallbackDemo.java 
Timmys-MacBook-Pro:Java z0ltan$ java -cp . CallbackDemo
Enter a number: 19
Received: 361

The code is mostly self-explanatory. To simulate function pointers/function objects, we simply make use of essentially what’s equivalent to the C++ functor (backcall) used in the previous demo.

The Callback interface declares a single abstract method called call which takes a single int parameter. The anonymous class in client implements it to print out a small message onto the console.

The squarify function takes an int parameter along with an instance of Callback, and then calls that instance’s call function. (On a side note, this is precisely why even C++’s functors are superior to Java’s. C++ has operator overloading, Java unfortunately does not).

Now, let’s take a look at how it would be done using Java 8 (and above). The Java 8 version is a marked improvement in terms of readability and conciseness.

Here’s the code:

import java.io.InputStreamReader;
import java.io.BufferedReader;
import java.io.IOException;

import java.util.function.Function;

public class CallbackDemo8 {
    public static void main(String[] args) {
        client();
    }

    public static void client() {
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(System.in))) {
                System.out.print("Enter a number: ");
                int n = Integer.parseInt(reader.readLine());
                
                squarify(n, (x) -> { System.out.println("Received: " + x); return null; });
        } catch (NumberFormatException | IOException ex) {
            ex.printStackTrace();
        }
    }

    public static void squarify(int n, Function<Integer,Void> cb) {
        cb.apply(n*n);
    }
}


And here’s the output:


Timmys-MacBook-Pro:Java z0ltan$ java -cp . CallbackDemo8
Enter a number: 19
Received: 361

We observe a few things that differentiate it from the pre-Java 8 version:

  • The Callback interface is gone, having been replaced by the built-in Function functional interface.
  • The callback function is also gone, and a lambda abstraction replaces it instead.

The lambda expression (x) -> { System.out.println("Received: " + x); return null; } still looks ugly with that return null; statement. It is clearly redundant, but because Function's apply method is declared to return a value (here, Void), this statement is mandatory.

We could fix that by creating our own functional interface like so:

@FunctionalInterface
interface Function<T> {
	void apply(T o);
}

However, it would reintroduce a custom interface in our code. So, not much gained there!
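As an aside, the java.util.function package also ships void-returning interfaces such as Consumer<T> and IntConsumer, so the return null; can be dropped without writing a custom interface. The following is only a minimal sketch of that variation, not the code used for the demo above; the class name ConsumerCallbackDemo is made up for illustration.

```java
import java.util.function.IntConsumer;

// Hypothetical variation on CallbackDemo8: the callback is an IntConsumer,
// whose single abstract method is void accept(int).
class ConsumerCallbackDemo {
    // Squares n and hands the result to the callback.
    static void squarify(int n, IntConsumer cb) {
        cb.accept(n * n);
    }

    public static void main(String[] args) {
        // No "return null;" needed, since accept returns void.
        squarify(19, x -> System.out.println("Received: " + x));
    }
}
```

The trade-off is the same one discussed later for Function: the caller still has to know that the method to invoke is called accept.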

How it’s done in Common Lisp

A full post containing a detailed discussion on this topic (along with the relevant demos) is available here in the next part of this series. Make sure to check that out!

How it’s done in other languages

Let’s implement the same example in a few other languages for our own edification! For the sake of brevity (this post is already quite long!), we will stick to very commonly used languages - JavaScript and Python.

I feel these should be representative of most of the mainstream languages. Haskell is a bit of a different beast, but that is worthy of its own series of posts!

JavaScript

Since client-side JavaScript does not provide any means of reading input from the command line, we will use Node.js for this demo. For I/O from the console, we will use the readline module that now comes bundled with Node.

const readline = require('readline');

const stream = readline.createInterface({
                    input: process.stdin,
                    output: process.stdout
                });

function callback(n) {
    console.log("Received: " + n);
}


function squarify(n, cb) {
    cb(n*n);
}

function client() {
    stream.question("Enter a number: ", function(n) {
        squarify(n, callback);
        stream.close();
        process.stdin.destroy();
    });
}

// run!
client();

And the output:

Timmys-MacBook-Pro:JavaScript z0ltan$ node callback.js 
Enter a number: 19
Received: 361

Again, this is equivalent to the C version. We simply pass the function name (which is a reference to the function object) to the squarify function as the callback function.

However, we could do it more idiomatically using a lambda abstraction as follows:

const readline = require('readline');

const stream = readline.createInterface({
                    input: process.stdin,
                    output: process.stdout
                });

function squarify(n, cb) {
    cb(n*n);
}

function client() {
    stream.question("Enter a number: ", function(n) {
        squarify(n, function(x) {
            console.log("Received: " + x);
        });

        stream.close();
        process.stdin.destroy();
    });
}

// run!
client()

Note how the callback function has now been replaced by a lambda abstraction that does the same operation.

And the output:

Timmys-MacBook-Pro:JavaScript z0ltan$ node callback_demo_lambda.js 
Enter a number: 19
Received: 361

Nice!

Python

In Python, just like in JavaScript, the function name itself is an intrinsic reference to the function object. Functions are, after all, first-class objects in Python, and we can simply pass it around like so:

def callback(n):
    print("Received: " + str(n))

def squarify(n, cb):
    cb(n*n)

def client():
    print("Enter a number: ", end='')
    n = int(input())
    
    squarify(n, callback)

if __name__ == '__main__':
    client()

Note that the code was written in Python 3. However, it will easily work with minimal changes with Python 2.x as well.

And the output:

Timmys-MacBook-Pro:Python z0ltan$ python3 callback_demo.py 
Enter a number: 19
Received: 361

However, since Python also supports a crude form of lambda abstractions, we could rewrite the demo like so:

def squarify(n, cb):
    cb(n*n)

def client():
    print("Enter a number: ", end='')
    n = int(input())

    squarify(n, lambda x: print("Received: " + str(x)))

if __name__ == '__main__':
    client()

So now we have simply passed the callback function as a lambda abstraction to the squarify function.

And just to verify, the output:

Timmys-MacBook-Pro:Python z0ltan$ python3 callback_demo_lambda.py 
Enter a number: 19
Received: 361

So that’s all for now! Next up, callbacks in Common Lisp, and how we can write a simple function to perform function composition.

References

Here are a few references related to the topics discussed in this blog post that you might find useful:

Interop mini-series – Callbacks special! (Part 2a)

My favourite feature in Java 9 – JShell (and other ruminations)

It used to be the case that Java releases were few and far between. Back in the olden days, releases used to happen so infrequently that many users were forced to write libraries to compensate for the lack of features in Java. I still recall how Generics itself was a big feature that was included only in Java 5 after Java 1.4 had been around for a long time. The problem with that was that a lot of code had already been written which made use of “raw” (in Java Generics parlance) types, and the Java folks could not afford to break all that code! Thus started this incongruous (if understandable) tradition in Java – introduce features but never break any existing code. The issue with this approach is that while it saves hundreds of thousands of man-hours of effort, it does introduce inherent limitations in the feature being introduced. This is the reason why Java Generics are such a mess, and now why Lambdas in Java are less than ideal (in fact I’d argue they’re pretty much non-features and we’ll see why in future posts). Notice also that Lambdas in Java were introduced as late as Java 8, and arguably primarily in order to support Streams. Thankfully, major Java releases have been occurring with greater frequency in recent years, and surely that is a good thing. Of course, I don’t want a situation where releases happen with such frequency that codebases get broken on a regular basis – that’d be even worse.

Java 9 has been in the offing for some time now, and I believe that its planning had already begun even before Java 8 was released. Java 8 introduced some much needed features – basic Lambda support, Streams (my favourite feature in Java 8!), and default methods in interfaces. The interesting bit about default methods in interfaces is that that feature was introduced to support the new Streams feature in Java 8. How do you figure? Let’s take a simple example – the List interface in the java.util package. Till Java 8, this interface did not have a stream() method. If stream() had been added as an ordinary abstract method in JDK 8, every existing class implementing List (and practically every codebase has some, directly or through libraries) would have stopped compiling until it supplied an implementation. If, on the other hand, this method was not part of List, then streams could not be supported on List! To solve this conundrum, JDK 8 made ‘stream’ a default method, that is, an interface method that carries its own implementation. Legacy implementations of List simply inherit that implementation (which is, to be pedantic, actually defined in the base Collection interface with the signature: default Stream<E> stream()) and keep working unchanged, whereas new code that makes use of this new feature hums along nicely as well. Hackish, but it works.
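To make the default method mechanism concrete, here is a minimal sketch. The NameSource interface and LegacyNameSource class are hypothetical stand-ins for Collection and an old implementation of it; they are not JDK types.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical interface standing in for a pre-Java 8 library type.
interface NameSource {
    List<String> names();

    // Added "later" as a default method: it carries its own body,
    // so existing implementers do not have to change.
    default Stream<String> stream() {
        return names().stream();
    }
}

// "Legacy" implementation written with no knowledge of stream().
class LegacyNameSource implements NameSource {
    @Override
    public List<String> names() {
        return Arrays.asList("Peter", "Timmy");
    }
}

class DefaultMethodDemo {
    // New code can call stream() on any NameSource, legacy or not.
    static long countNames(NameSource source) {
        return source.stream().count();
    }

    public static void main(String[] args) {
        System.out.println(countNames(new LegacyNameSource())); // prints 2
    }
}
```

Had stream() been declared without the default keyword and body, LegacyNameSource would no longer compile.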

Anyway, to get on with it, Java 9 introduces some very interesting new features as well. The full list can be seen here: JDK 9 features. However, a couple of major features stand out in this list – JShell (and the JShell API), and support for Modules. The latter feature is technically a separate project that will be bundled with the main Java release, and is far more complicated than I would have liked. I reserve further comments on that till such time as Java 9 itself is released. However, I absolutely love JShell and feel it’s arguably the best productivity-boosting feature ever released in Java. Of course, I’m talking about the JShell tool itself (which comes bundled with Java 9. Download EA (Early Access) releases of Java 9 here – JDK 9 Downloads).

Anyone who has ever worked with languages with ecosystems that include at least a REPL (Read-Eval-Print-Loop for the Luddites) will agree that once you get used to that mode of working, it feels severely constraining to get back to the traditional Code-Compile-run cycle. The best environment in this regard is provided by Lisps – Common Lisp in particular ensures a very interactive image-based ecosystem that is in a league of its own. Scheme and Racket also provide interactive development environments (DrRacket, for instance), but they’re still inferior (in my opinion) to Common Lisp environments (SBCL + Emacs + SLIME is what I use myself). Python also provides a decent REPL system, and is arguably the best of the mainstream languages in that regard. Traditionally, dynamic languages have had it easy when it comes to REPLs by their very nature while statically-typed languages have not been able to provide something comparable. Haskell is a very good example of a language that defies this rule. Haskell is a very strongly-statically-typed language and yet it has a very nice REPL. Plus, of course, the wonderful Hindley-Milner type inference system in place ensures that you hardly ever have to type in the types yourself (pun intended). Scala is also a strongly-typed static language that has a decent REPL (on par with Haskell). More traditional static languages like C++ and Java haven’t had a REPL in years, apart from some projects that have attempted to provide one in the form of libraries – that never does work as well though as coming bundled with the language, of course. With the introduction of JShell in Java 9, I believe Java has at least one feature that’s clearly superior to C++. In terms of the other “functional” features such as lambdas and closures, not at all. More on this comparison in later posts. Now, let’s jump into JShell and play around a bit!

When you install the JDK 9 EA bits, the “jshell” executable comes bundled in the “bin/” folder by default. JShell also comes with a new set of JShell APIs that can be used to essentially create our own custom shells, but I’m more interested in the interactive JShell tool for now. To run jshell, simply run it in a shell, and you should see something like the following:

Timmys-MacBook-Pro:Blogs z0ltan$ jshell
|  Welcome to JShell -- Version 9-ea
|  For an introduction type: /help intro

jshell>

Now, let’s see. As far as I have evaluated it, some points to be noted while developing with this REPL are:

  • No semi-colons required (well, almost none).
  • No mandatory class wrappers to execute code – plain Java code works fine.
  • Ability to use the generated variable names ($N, where N is an integer) to refer to objects (a la Scala).
  • Ability to import custom packages (if the JARs are in the classpath).
  • Ability to save code snippets to file.
  • Ability to persist state between sessions (very important for interactive development).
  • Ability to open external files and load them.
  • Ability to list all session-declared variables, methods, classes, and history.
  • And most importantly, code completion using the tab key (eat that, Python!).

Of course, that is not an exhaustive list. There are a lot more features that are available, and you can see all the options using the “/help” command.

Now let’s get down to some hacking! (For those unfamiliar with Lambdas and Streams in Java, I’ll post a series of introductory blogs on those topics in the near future. For now, bear with me as the main point of this blog is to demonstrate JShell’s support for interactive development.)

0). First let us import the classes that we are interested in:

jshell> import java.util.*

jshell> import java.util.stream.*

jshell> import java.util.function.*

1). Let us now create a list of names, convert them to uppercase, and then print them out:

jshell> List<String> names =
            Arrays.asList("Peter", "Timmy", "Gennady", "Petr", "Slava")
names ==> [Peter, Timmy, Gennady, Petr, Slava]

jshell> names.stream()
             .map((name) -> name.toUpperCase())
             .forEach(System.out::println)
PETER
TIMMY
GENNADY
PETR
SLAVA

jshell> names.forEach(System.out::println)
Peter
Timmy
Gennady
Petr
Slava

Note that the original list is not modified at all since streams are functional (in most respects). Intermediate stream operations such as map return a new stream with the necessary processing applied, and leave the source collection untouched. Also note how the “names” variable is printed out in human-readable form.

2). With the same list of names, let’s sort them out in non-decreasing order (lexicographically speaking):

jshell> Collections.sort(names, (f, s) -> f.compareTo(s))

jshell> names
names ==> [Gennady, Peter, Petr, Slava, Timmy]

Note that Collections.sort() is a mutating operation.
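If mutating the list is not desirable, the same ordering can be obtained from a stream, which leaves the source alone. This is a small sketch rather than part of the JShell session above; the class and method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class SortedCopyDemo {
    // Returns a new list in natural (lexicographic) order;
    // the list passed in is not modified.
    static List<String> sortedCopy(List<String> names) {
        return names.stream()
                    .sorted()
                    .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names =
            Arrays.asList("Peter", "Timmy", "Gennady", "Petr", "Slava");

        System.out.println(sortedCopy(names)); // [Gennady, Peter, Petr, Slava, Timmy]
        System.out.println(names);             // [Peter, Timmy, Gennady, Petr, Slava]
    }
}
```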

3). Now that this list has been sorted, let us do a series of operations: let’s filter out the names that start with G, convert the rest to uppercase, and concatenate them to a single string, and return that value:

jshell> names.stream()
             .filter((s) -> !(s.startsWith("G") || s.startsWith("g")))
             .map(String::toUpperCase)
             .collect(Collectors.joining())
$9 ==> "PETERPETRSLAVATIMMY"

Observe how map(String::toUpperCase) is essentially the same as map((r) -> r.toUpperCase()), and also how the return type, which is a string, is automatically assigned to a generated variable, $9, which can now be used in further processing. For instance, if we wished to find the length of this concatenated string, we could do something like (it’s obviously a contrived example):

jshell> Function<String, Void> func =
                 (s) -> { System.out.println(s.length()); return null; }
func ==> $Lambda$16/1427810650@35d176f7

jshell> func.apply($9)
19
$11 ==> null

A couple of interesting observations here: first off, note how we use the semi-colons inside the body of the lambda expression – this is required in JShell when you have multiple statements in the body or when you use an explicit return statement. Moreover, if a class or function is being defined explicitly, a semi-colon is also required. If you want to play safe, you might as well use semi-colons everywhere.

Secondly, we assign the lambda expression to a Function variable. Function<T, R> is a “functional interface”, which basically means that it is an interface with a single abstract (non-default) method (such interfaces are also called SAM, or Single Abstract Method, interfaces). Any interface that conforms to this convention is a functional interface. However, the problem is that we can’t just invoke the function object. We have to know that this interface’s method name is “apply”, and thus use that explicitly. More on this in the series of planned posts on Lambdas and Streams in Java 8.
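To illustrate the point about method names, here is a short sketch using three of the built-in functional interfaces from java.util.function. Each one is invoked through a differently named method; the class name SamNamesDemo is made up for this example.

```java
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.function.Supplier;

class SamNamesDemo {
    static String describe(String s) {
        Function<String, Integer> length = String::length; // invoked via apply
        Predicate<String> isEmpty = String::isEmpty;       // invoked via test
        Supplier<String> prefix = () -> "len=";            // invoked via get

        // There is no uniform "call" syntax; each interface names its own method.
        return prefix.get() + length.apply(s) + ", empty=" + isEmpty.test(s);
    }

    public static void main(String[] args) {
        System.out.println(describe("PETERPETRSLAVATIMMY")); // len=19, empty=false
    }
}
```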

4). That’s about strings. Now let’s see some more examples with other types. Let’s generate an infinite stream of positive integers, take the first 100 of them, keep only the even ones, and then compute their sum and maximum:

jshell> IntStream.iterate(1, (n) -> n+1)
                  .limit(100).filter((d) -> d%2 == 0)
                  .summaryStatistics()
$12 ==> IntSummaryStatistics{count=50, sum=2550, min=2, average=51.000000, max=100}

jshell> System.out.format("Max: %d, Sum: %d\n", $12.getMax(), $12.getSum())
Max: 100, Sum: 2550
$13 ==> java.io.PrintStream@612679d6

The “iterate” method takes an initial starting value and a lambda expression that acts as a generator function for the series.
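For reference, the same pipeline can be written as an ordinary class outside of JShell. This is just a sketch with made-up class and method names; the stream calls are the same ones used in the session above.

```java
import java.util.IntSummaryStatistics;
import java.util.stream.IntStream;

class EvenStatsDemo {
    // Statistics over the even numbers among the first `limit` positive integers.
    static IntSummaryStatistics evenStats(int limit) {
        return IntStream.iterate(1, n -> n + 1) // 1, 2, 3, ... (infinite)
                        .limit(limit)           // truncate to the first `limit` values
                        .filter(d -> d % 2 == 0) // keep only the evens
                        .summaryStatistics();
    }

    public static void main(String[] args) {
        IntSummaryStatistics stats = evenStats(100);
        System.out.format("Max: %d, Sum: %d%n", stats.getMax(), stats.getSum());
    }
}
```

Note that limit must come before any terminal operation, since the stream produced by iterate never ends on its own.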

5). Finally, let’s observe some of the aforementioned JShell features in action:

  • Viewing all the imports:
    jshell> /imports
    |    import java.io.*
    |    import java.math.*
    |    import java.net.*
    |    import java.util.concurrent.*
    |    import java.util.prefs.*
    |    import java.util.regex.*
    |    import java.util.*
    |    import java.util.stream.*
    |    import java.util.function.*
    
  • Open and load a source file:
    jshell> /open /Users/z0ltan/HelloWorld.java
    |  Warning:
    |  Modifier 'public'  not permitted in top-level declarations, ignored
    |  public class HelloWorld {
    |  ^----^
    
    jshell> HelloWorld main = new HelloWorld()
    main ==> HelloWorld@757942a1
    
    jshell> main.main(null)
    Hello, world!
    
  • Save the current session:
    jshell> /save -all /Users/z0ltan/session1
    
    jshell>
    

    This session file can then be opened using the “open” command when launching a fresh jshell session.

  • Finally, to exit the current session:
    jshell> /exit
    |  Goodbye
    

Well, that’s all for a very basic introduction to JShell in Java 9! I didn’t go into much detail about why such interactivity is useful. I can list out a few reasons why (without going deeper, which I defer to future posts):

  • Small snippets of code can be tested without the whole Code->Save->Compile->Run->Debug cycle that one would normally have to do.
  • Top down and Bottom up programming can be done in a REPL. For instance:
    jshell> void printFactorial(int n) {
       ...> System.out.format("factorial(%d) = %d\n", n, factorial(n));
       ...> }
    |  created method printFactorial(int), however, it cannot be invoked until method factorial(int) is declared
    
    jshell> long factorial(int num) {
       ...>   long f = 1L;
       ...>   for (int i = 1; i <= num; i++) 
       ...>      f *= i;
       ...>   return f;
       ...> }
    |  created method factorial(int)
    
    jshell> printFactorial(10)
    factorial(10) = 3628800
    

    In this case, we could define functions (or methods, if you want to be pedantic about it) that reference functions that haven’t been defined as yet. This is an example of top-down programming.

  • There is much less thought impedance when developing in a REPL. Of course, Java as a whole is not as suited to interactive development as, say, Common Lisp is, but it is nevertheless invaluable. In other words, it doesn’t hamper flow as much as the traditional compile cycle, and this definitely helps boost creativity and productivity.

 

 

 

Next up, I will discuss lambda support in Java (8 and above) and how it compares to similar support in other languages – C++, Python, Common Lisp, and Haskell.
