Destructuring lists with match
Let’s say you need to destructure a list with
match, using a pattern that specifies a “rest” or “more” element. Be careful. You probably want to use
Update 2014–06–25: Sam Tobin-Hochstadt pushed a commit that changes this. Fast! Thanks! So if you’re building from HEAD or using a nightly build, this is now moot. However if you’re writing code for a library that may be used with Racket 6.0.1 or older, it remains relevant.
The long version
I answered a question on Stack Overflow. Basically, it was about how to select every 1st and 4th element from a list, repeatedly. My answer was equivalent to what we shall name
1 2 3 4 5
It passed all these tests:
1 2 3 4 5 6 7 8 9 10 11
(define (test f) (local-require rackunit) ;; Your example: (check-equal? (f '(0 1 2 3 4 5 6 7 8 9 10)) '(0 3 4 7 8)) ;; Other tests: (check-equal? (f '()) '()) (check-equal? (f '(0)) '(0)) (check-equal? (f '(0 1)) '(0)) (check-equal? (f '(0 1 2)) '(0)) (check-equal? (f '(0 1 2 3)) '(0 3)) (check-equal? (f '(0 1 2 3 4)) '(0 3 4)))
Later, I saw that uselpa posted a generalized, elegant answer using
1 2 3 4 5 6
At which point a few things happened.
I was impressed by that answer. Wished I’d answered that way. Consoled myself, well, at least my bespoke pattern-specific version is probably somewhat faster. Right?
After feeling smug for a moment, I had a nagging feeling. Well shit. I should measure. See how much faster.
I measured. I was horrified how wrong I was.
1 2 3 4 5 6 7 8 9
(define bench (let ([xs (build-list 10000 values)]) (λ (f) (for ([i 3]) (collect-garbage)) (printf "Timing ~a ... " (object-name f)) (time (void (f xs)))))) (bench slow) ;; Timing slow ... cpu time: 256 real time: 266 gc time: 46
partition-filter version was:
;; Timing partition-filter ... cpu time: 3 real time: 4 gc time: 0
I rewrote it using
car, etc. — i.e. what I imagined
match would expand to. That was as fast as I’d expected.
I scratched my head for a bit. On a hunch I tried a slightly different
match pattern, which we’ll name
1 2 3 4 5 6 7 8 9 10
All along I’d believed they’re equivalent. They’re not.2
In fact, if you double the test list length (from 10,000 to 20,000) the time quadruples (from 256 to about 1000).
As Jay McCarthy explained to me on IRC,
(list this more ...) needs to test that the input is a
cons up the elements to bind to
(list* this more) doesn’t care if the input is a list — whatever it is just gets bound to
list* match pattern is like the
1 2 3
I should write a blog post about this (done).
I should submit a PR with a “Tip!” for the
matchdocumentation, in case this is non-obvious to anyone else.
I wonder if
matchcould optimize the special case of
(list this more ...)to be
(list* this more). Maybe there are subtleties (or not-so-subtleties) I’m overlooking. Update 2014–06–25: [done].
Although my corrected version was indeed faster than
pattern-filter, it’s almost immeasurably so on a list of 10,000 items. At 100,000 it starts to matter. In many applications, it would be N/A. In general, I think uselpa’s answer is best.
Also I changed the second pattern from
(list a _ ...)to
(cons a _), although that test is only hit at the end so it hardly effects the run time in this case. ↩
By the way, same issue with quasiquote patterns, which for instance I frequently use to destruct x-expressions. Instead of
`(,this ,that ,more ...)use
`(,this ,that . ,more). ↩