Although I prefer Racket, there are a few idioms from Clojure I like. I’m particularly infatuated with the threading macros, -> and ->>.
I was surprised how little documentation I could find for these. So although I’m writing this for Racketeers, it’s possible a few Clojure folks might find it interesting, too.
Note: You can use them in Racket with Asumu Takikawa’s #lang clojure or my own #lang rackjure. In the latter, the threading macros are named with a ~ instead of a - (for example, ~> rather than ->) so as not to conflict with Racket’s use of -> for contracts. I’ll use the ~> names in this blog post.
First, the threading macros have nothing to do with concurrency. Instead, the macros “thread” values through a chain of function calls.
Here’s a simple example. You might write code like this:
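For a concrete (and purely illustrative) case, suppose the goal is to take the length of a byte string, format it in hex, and turn that into bytes. The byte string and the particular steps are assumptions, chosen just to have something runnable:

```racket
#lang racket

;; Nested calls: the data starts in the middle (#"foobar") and
;; flows outward through each enclosing function.
(define nested
  (string->bytes/utf-8 (number->string (bytes-length #"foobar") 16)))

nested ; => #"6"
```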
The nesting makes it difficult to see the data flow. You need to read not just right-to-left, but also inside-out. And although we Racketeers don’t exactly hate parentheses, there are an awful lot of them.
Here’s that with the ~> “thread first” macro:
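As a sketch, here is an illustrative pipeline (length of a byte string, formatted as hex, then converted to bytes) written with ~>. So that it runs in plain Racket without installing anything, it defines a tiny stand-in for ~>. That stand-in is an assumption for illustration and covers fewer cases than the macro in #lang rackjure:

```racket
#lang racket

;; Simplified stand-in for rackjure's ~> (illustration only,
;; not the real implementation).
(define-syntax ~>
  (syntax-rules ()
    [(_ x) x]
    [(_ x (f arg ...) more ...) (~> (f x arg ...) more ...)]
    [(_ x f more ...) (~> (f x) more ...)]))

;; Read top to bottom: start with the data, then each transformation.
(define threaded
  (~> #"foobar"
      (bytes-length)
      (number->string 16)
      (string->bytes/utf-8)))

threaded ; => #"6"
```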
It expands into exactly the same code as above, but is easier to read as a series of transformations. The ~> macro takes the result of each function, and “threads” it in as the first argument to the next function.
In Clojure, the comma is whitespace, so you can use it as a visual reminder of where the argument is being inserted:
(-> (sha1 in)
    (bytes->hex-string ,)
    (bytes-append , #".")
    (displayln ,))
In Racket, I suppose you could use comments like so:
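One way to sketch that is with empty block comments, #||#, as the marker. The pipeline and the small stand-in for ~> below are assumptions so the snippet runs in plain Racket:

```racket
#lang racket

;; Simplified stand-in for rackjure's ~> (illustration only).
(define-syntax ~>
  (syntax-rules ()
    [(_ x) x]
    [(_ x (f arg ...) more ...) (~> (f x arg ...) more ...)]
    [(_ x f more ...) (~> (f x) more ...)]))

;; Each #||# is an empty comment marking where the value is inserted.
(define marked
  (~> #"foobar"
      (bytes-length #||#)
      (number->string #||# 16)
      (string->bytes/utf-8 #||#)))

marked ; => #"6"
```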
But that’s a bit noisy. I’ve found that with a little experience you don’t really need that. It’s fine to say:
Plus, when any function takes just one argument, you can omit the parentheses:
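Sketched in plain Racket, with an illustrative pipeline and a simplified stand-in for ~> (both assumptions, not rackjure’s actual implementation), the marker-free form and the version with bare function names look like this:

```racket
#lang racket

;; Simplified stand-in for rackjure's ~> (illustration only).
(define-syntax ~>
  (syntax-rules ()
    [(_ x) x]
    [(_ x (f arg ...) more ...) (~> (f x arg ...) more ...)]
    [(_ x f more ...) (~> (f x) more ...)]))

;; No markers: every step is still written as a call form.
(define with-parens
  (~> #"foobar"
      (bytes-length)
      (number->string 16)
      (string->bytes/utf-8)))

;; One-argument steps can be bare function names.
(define without-parens
  (~> #"foobar"
      bytes-length
      (number->string 16)
      string->bytes/utf-8))

(equal? with-parens without-parens) ; => #t
```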
That’s really clean.
For certain programs it’s really helpful to use this style, which emphasizes data flow. If you’re nesting functions with -> in the name, like string->bytes/utf-8 (or functions that could be named that way), that’s often a good indicator that ~> might be natural.
As I searched for more information about the threading macro, I came across the idea of concatenative programming. Think of Unix command line pipes. Think of postfix HP calculators. Think of postfix languages like Forth, including newer ones like Cat.
If we Lispers can accept prefix notation, why not postfix? Like extremists on opposite sides of an issue, we actually have more in common with each other than with the infix majority, because we care more than they do. (I am mostly joking, but not entirely.)
However you view it, the emphasis is on a chain of transformations. Maybe it’s my background in music and audio processing, but I find that a very clear and natural way to think about many problems.
Caveat: Don’t forget it’s a macro
Occasionally you’ll have some function that doesn’t take the interesting bit as the first argument:
You might think, whatevs, I’ll just supply an anonymous function that takes a list as the first argument. I’ll use (lambda (xs) (map add1 xs)) or more elegantly (curry map add1):
Why won’t this work? Because ~> is a macro and it expands to:
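To see the rewrite without tripping a syntax error, here is a data-level model of what ~> does to its forms. The helper thread-first is hypothetical (it is not part of rackjure) and works on quoted s-expressions rather than real syntax. The input (list 1 2 3) is likewise just an assumed example:

```racket
#lang racket

;; Hypothetical helper: mimic the ~> rewrite on quoted forms.
;; Each step receives the accumulated expression as its first argument.
(define (thread-first x forms)
  (for/fold ([acc x]) ([form (in-list forms)])
    (if (pair? form)
        (list* (car form) acc (cdr form)) ; (f a ...) becomes (f acc a ...)
        (list form acc))))                ; f becomes (f acc)

(define lambda-attempt
  (thread-first '(list 1 2 3) '((lambda (xs) (map add1 xs)))))

(define curry-attempt
  (thread-first '(list 1 2 3) '((curry map add1))))

lambda-attempt ; => '(lambda (list 1 2 3) (xs) (map add1 xs))
curry-attempt  ; => '(curry (list 1 2 3) map add1)
```

Neither result is what was intended: the list has been spliced into the lambda form itself, or handed to curry as if it were the function.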
Unfortunately in such a situation you’ll have to define the function outside the ~> macro [1]:
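A minimal sketch of that workaround, assuming a list of numbers as the input and using a simplified stand-in for ~> so it runs in plain Racket (the name add1-all is made up for the example):

```racket
#lang racket

;; Simplified stand-in for rackjure's ~> (illustration only).
(define-syntax ~>
  (syntax-rules ()
    [(_ x) x]
    [(_ x (f arg ...) more ...) (~> (f x arg ...) more ...)]
    [(_ x f more ...) (~> (f x) more ...)]))

;; Build the function first, outside the macro...
(define add1-all (curry map add1))

;; ...then thread through its name, which is rewritten
;; to (add1-all (list 1 2 3)).
(define result
  (~> (list 1 2 3)
      add1-all))

result ; => '(2 3 4)
```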
The main take-away is that ~> is a syntax transform, not a function application. Often you won’t need to care, but sometimes it will matter.
Sometimes the threading macros are referred to as the thrush combinator. They’re not, really, because the threading macros are macros, as we just saw. What they do have in common is specifying the functions in order of execution: in data-flow order. In contrast, compose specifies the functions in math-notation order, the same order in which we’d write them as nested expressions.
In other words, one implementation of thrush is simply:
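One way to write that, as a sketch: reverse the argument order and hand the functions to compose. The exact definition is an assumption, and the pipeline is only an illustration:

```racket
#lang racket

;; thrush: like compose, but the functions are listed in data-flow order.
(define (thrush . fs)
  (apply compose (reverse fs)))

(define hex-length-bytes
  (thrush bytes-length
          (lambda (n) (number->string n 16))
          string->bytes/utf-8))

(hex-length-bytes #"foobar") ; => #"6"
```

Because thrush is an ordinary function, every step has to be a function value, which is why the number->string step needs a lambda here.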
Here are the permutations:
Argument Order   Function   Macro
--------------   --------   -----
Math Notation    compose
Data Flow        thrush     ~>
Combined with applicable dicts
Sometimes I need to deal with JSON. Often this has nested object literals, a.k.a. hash tables a.k.a. dictionaries.
This can be a bit painful in Racket:
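For instance, with a small made-up nested hash standing in for parsed JSON (the keys and the value 42 are assumptions for illustration):

```racket
#lang racket

;; Stand-in for a parsed JSON object: {"a": {"b": {"c": 42}}}
(define some-dict
  (hash 'a (hash 'b (hash 'c 42))))

;; Each level of nesting needs its own dict-ref, innermost first.
(define value
  (dict-ref (dict-ref (dict-ref some-dict 'a) 'b) 'c))

value ; => 42
```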
But #lang rackjure has applicable dicts. When an application form has two elements, and the second one is a dict?, then this expands to a dict-ref using the first element as the key:
('a d) ;; when (dict? d) is #t, expands to...
(dict-ref d 'a)
As a result, that nested dict-ref example can be written as:
('c ('b ('a some-dict)))
That’s a bit shorter, but of course this feels “backwards and inside-out”. So, ~> to the rescue:
(~> some-dict 'a 'b 'c)
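Without applicable dicts, a similar shape is still possible in plain Racket by threading through explicit dict-ref calls. This is only a sketch: the nested hash is made up, and the ~> defined here is a simplified stand-in rather than rackjure’s macro:

```racket
#lang racket

;; Simplified stand-in for rackjure's ~> (illustration only).
(define-syntax ~>
  (syntax-rules ()
    [(_ x) x]
    [(_ x (f arg ...) more ...) (~> (f x arg ...) more ...)]
    [(_ x f more ...) (~> (f x) more ...)]))

(define some-dict
  (hash 'a (hash 'b (hash 'c 42))))

;; Each step expands to (dict-ref <previous result> 'key).
(define value
  (~> some-dict
      (dict-ref 'a)
      (dict-ref 'b)
      (dict-ref 'c)))

value ; => 42
```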
Welcome your feedback
Those are my reflections on the threading macros. If you’d like to add something, or correct some horrible misunderstanding, feel free to leave a comment.
[1] Update: Although it’s a bit ugly, you could also wrap any function-creating expressions with an extra set of parens. For example, ((lambda (xs) (map add1 xs))) and ((curry map add1)).
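A quick sketch of why the extra parens help, again with a simplified stand-in for ~> and an assumed input list so that it runs in plain Racket. The outer parens make the step a call form whose function position holds the whole lambda or curry expression, so the threaded value lands after it as the argument:

```racket
#lang racket

;; Simplified stand-in for rackjure's ~> (illustration only).
(define-syntax ~>
  (syntax-rules ()
    [(_ x) x]
    [(_ x (f arg ...) more ...) (~> (f x arg ...) more ...)]
    [(_ x f more ...) (~> (f x) more ...)]))

;; ((curry map add1)) is rewritten to ((curry map add1) (list 1 2 3)).
(define via-curry
  (~> (list 1 2 3)
      ((curry map add1))))

;; Same idea with a lambda in function position.
(define via-lambda
  (~> (list 1 2 3)
      ((lambda (xs) (map add1 xs)))))

(list via-curry via-lambda) ; => '((2 3 4) (2 3 4))
```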