A PigLatin translator in Common Lisp (and contrasting it with the Racket version)

As my curiosity had been piqued by the PigLatin exercise in Racket, I decided to basically translate that code into Common Lisp. The result was interesting to say the least.

Since the explanation of the program logic has already been done in the Racket version, I will skip that part, and only highlight the relevant differences.

First off, the code:

(defpackage #:piglatin
  (:use :cl :cl-user))


(in-package #:piglatin)


(defun translate (sentence)
  (let ((*readtable* (copy-readtable nil)))
    (setf (readtable-case *readtable*) :preserve)
    (mapcar #'make-symbol
            (mapcar #'english-to-piglatin
                    (mapcar #'string-to-list
                            (mapcar #'symbol-name sentence))))))


(defun english-to-piglatin (word)
  (if (starts-vowel-p word)
      (word-to-vowel-rule word)
      (word-to-consonant-rule word)))


(defun starts-vowel-p (word)
  (member (char-downcase (car word)) '(#\a #\e #\i #\o #\u #\y)))


(defun word-to-vowel-rule (word)
  (coerce (append word '(#\w #\a #\y)) 'string))


(defun word-to-consonant-rule (word)
  (let ((was-capital-p (upper-case-p (car word))))
    (labels ((f (word)
               (if (starts-vowel-p word)
                   (cond (was-capital-p (string-capitalize
                                          (coerce (append word '(#\a #\y)) 'string)))
                         (t (coerce (append word '(#\a #\y)) 'string)))
                   (f (append (cdr word) (list (char-downcase (car word))))))))
      (f word))))


(defun string-to-list (word)
  (coerce word 'list))

And a similar test run:

CL-USER> (load "/Users/z0ltan/Rabota/ProgrammingLanguages/CommonLisp/pig-latin.lisp")
T
CL-USER> (piglatin::translate '(|Hello| |world| |we| |meet| |again|))
(#:|Ellohay| #:|orldway| #:|eway| #:|eetmay| #:|againway|)
CL-USER> (piglatin::translate '(|Cucullus| |non| |facit| |monachum|))
(#:|Uculluscay| #:|onnay| #:|acitfay| #:|onachummay|)

This looks much dirtier than the Racket version’s output, but there is good reason for that.

Some notes:

The first thing you’ll observe is the strange input format for the Common Lisp version. What’s the deal with all the pipes in the input? Well, my understanding is this – the default manner in which the Common Lisp reader reads in symbols is to convert the symbol internally to upper-case. This can easily be verified:

CL-USER> (readtable-case *readtable*)
:UPCASE
CL-USER> (symbol-name 'hello)
"HELLO"
CL-USER> (symbol-name 'Hello)
"HELLO"
CL-USER> (symbol-name 'hElLo)
"HELLO"

The :UPCASE in the first output shows that the default behaviour is to convert the symbol to all upper-case. The other possible values for this are: :preserve, :downcase, and :invert. So how do we get around this problem? The easiest approach in this case is to change the reader behaviour to preserve the case of the input symbol (note that this will only work if the input symbol is ensconced within pipes) using the :preserve keyword. That is exactly what we’re doing in the translate function:

(defun translate (sentence)
  (let ((*readtable* (copy-readtable nil)))
    (setf (readtable-case *readtable*) :preserve)
    (mapcar #'make-symbol
            (mapcar #'english-to-piglatin
                    (mapcar #'string-to-list
                            (mapcar #'symbol-name sentence))))))

Remember that dynamic variables have special meaning in Common Lisp. Here, we simply create a copy of the read-table (read up on this if you are unaware of this concept) and bind it to *readtable*. We have to be really careful when working with special dynamic variables. What we are doing here is simply overwriting the actual read-table with our modified version for the scope of the let expression. This avoids poisoning the read-table globally.

So we simply set the read-table mode to :preserve in the line: (setf (readtable-case *readtable*) :preserve), and then we can proceed almost exactly as in the Racket version. We have a top-down approach in which we convert the symbol list into a string list, convert that list into a list of lists of chars (Common Lisp also shares Racket’s behaviour in the sense that a string is not automatically a list of chars), map our english-to-piglatin function over that list, and finally convert the whole list of processed strings into a list of symbols as the output.

Another point of interest is that unlike Racket, Common Lisp doesn’t have built-in convenience functions such as string->list or list->string. Instead, we have a useful (if a bit quirky) and powerful function called coerce. So in the following code, we are simply coercing a string into a list of characters:

(defun string-to-list (word)
  (coerce word 'list))

The rest of the code is pretty much in the same vein, with an interesting twist. As mentioned earlier, the read-table case-munging only works if we escape the symbol within pipes (such as |Hello|), and so the following input screws up the capitalisation entirely, as can be seen in the output:

CL-USER> (piglatin::translate '(hello world))
(#:|Ellohay| #:|Orldway|)

CL-USER> (piglatin::translate '(Hello WORLd))
(#:|Ellohay| #:|Orldway|)

As you can observe, when we use symbols without escaping them with pipes, the reader automatically converts the to upper-case anyway, so the output remains the same irrespective of the case of the input symbol.

This is quite in line with the overall quirkiness of Common Lisp. In that sense, I feel that Racket (and Scheme, by extension) is a much more functional and consistent language, easier to work with. However, Common Lisp did, and always will hold a soft spot in my heart, especially as it offers me countless more opportunities to explore and learn this magnificent beast!

A PigLatin translator in Common Lisp (and contrasting it with the Racket version)

A PigLatin translator in Racket

This post was inspired by a question posted on /r/programming on reddit. As part of helping the person, I realised I could create one just for fun in Racket!

I started out by reading the rules for Pig Latin on Wikipedia (dead simple rules), and the only irksome bit was ensuring that the capitalisation of the words remained intact. Also, the input and output formats were inspired by the posted question (it would have been much simpler to process a list of strings, or even a string of words directly).

Here is the code, followed by a bit of explanation.

(module piglatin racket
  (provide translate)

(define (translate sentence)
  (map string->symbol
       (map english->piglatin
           (map string->list
                (map symbol->string sentence)))))


(define (english->piglatin word)
  (if (starts-vowel? word)
      (word->vowel-rule word)
      (word->consonant-rule word)))


(define (starts-vowel? word)
  (member (char-downcase (car word)) '(#\a #\e #\i #\o #\u #\y)))


(define (word->vowel-rule word)
  (string-append (list->string word) "way"))


(define (word->consonant-rule word)
  (let [(was-capital? (char-upper-case? (car word)))]
    (letrec [(f (lambda (word)
                 (if (starts-vowel? word)
                     (cond [was-capital? (string-titlecase
                                          (string-append (list->string word)                  "ay"))]
                            [else (string-append (list->string word) "ay")])
                      (f (append (cdr word)
                                 (list (char-downcase (car word))))))))]
      (f word)))))

Explanation:

Let’s break down the code into chunks for easier explanation.

The input format is a list of symbols (not strings) such as: '(hello world). As such I figured that it was best to do some top-down programming here. To that end, the main function (and the only one exposed to the outside world) is the translate function:

(define (translate sentence)
  (map string->symbol
       (map english->piglatin
           (map string->list
                (map symbol->string sentence)))))

What I’m doing here is taking the input, sentence, which is in the specified format, and then I convert that into a list of strings, followed by converting that to a list of list of characters (Racket does not parse strings as lists of characters directly, like Haskell does). Following that, I convert each word in this list to its corresponding PigLatin form, and since the output format is again a list of symbols, map the strings back to symbols.

Then we have basically two rules – one for words that begin with vowels, and another one for words that begin with consonants.

(define (english->piglatin word)
  (if (starts-vowel? word)
      (word->vowel-rule word)
      (word->consonant-rule word)))

The starts-vowel? predicate simply checks if the word begins with a vowel or not (we convert the letter to lowercase for easier checking)

(define (starts-vowel? word)
  (member (char-downcase (car word)) '(#\a #\e #\i #\o #\u #\y)))

The rule for words beginning with a vowels is simply to append a “way” to the string:

(define (word->vowel-rule word)
  (string-append (list->string word) "way"))

For words that begin with consonants though, the rule is a bit more involved. Basically, we have to extract the consonant cluster (one or more letters) at the beginning of the word, append that to the end of the updated word (which now begins with a vowel), and also suffix an “ay” at the end. Note that we also have to take care of proper capitalisation in this case, which we didn’t have to do with the vowel rule:

define (word->consonant-rule word)
  (let [(was-capital? (char-upper-case? (car word)))]
    (letrec [(f (lambda (word)
                 (if (starts-vowel? word)
                     (cond [was-capital? (string-titlecase
                                          (string-append (list->string word)                  "ay"))]
                            [else (string-append (list->string word) "ay")])
                      (f (append (cdr word)
                                 (list (char-downcase (car word))))))))]
      (f word)))))

As you can see, since we are performing recursion to extract the consonant cluster, we maintain the original capitalisation of the word in a local binding called was-capital?. Then we recurse through the word, and in case was-capital> is true, we use the string-titlecase function to capitalise the first letter of the processed word. Note that we could have managed without using this built-in function, but the increase in verbosity for this little gain hardly seems worth the effort.

All right, let’s test it out on a variety of input cases:

> (load "pig-latin.rkt")
> (require 'piglatin)
> (translate '(My))
'(Ymay)
> (translate '(My hovercraft is full of eels))
'(Ymay overcrafthay isway ullfay ofway eelsway)
> (translate '(Always the odd one out))
'(Alwaysway ethay oddway oneway outway)
> (translate '(sticks and stones))
'(icksstay andway onesstay)
> (translate '(Hello world we meet again))
'(Ellohay orldway eway eetmay againway)
>(translate '(Cucullus non facit monachum)) 
'(Uculluscay onnay acitfay onachummay)
> (translate '(The quick brown fox jumped over the lazy dog))
'(Ethay uickqay ownbray oxfay umpedjay overway ethay azylay ogday)
> (translate '(Allo allo))
'(Alloway alloway)


This sort of stuff is what makes dynamic languages so dear to me, especially for prototyping. If I had done it in a static language, it would have taken quite a bit more effort (type inference is wonderful, but it does not actually save you from wrestling with the type system!)

A PigLatin translator in Racket