Peek, Ibid
I just saw a blog post Ibid. by Proctor in which he describes the Ibid
problem
posed by Gene Kim on the Functional Geekery podcast, Episode 128. Clojurists will
remember Gene Kim from his excellent Clojure/conj talk:
My Love Letter To Clojure: And A Datomic Experience Report.
On the podcast, Gene says that he had a hard time writing a simple Clojure function that takes a list of author names (strings) possibly containing references to the special value “Ibid.” (as used in citations) and returns the list with the “Ibid.” references replaced by the previous author’s name. In his post, Proctor offers a couple of solutions in Clojure and other languages. Just for record, I would like to suggest what I think is the idiomatic Clojure answer. (This is an improvement over the comment I left for Proctor.)
(defn ibid [authors]
(reduce (fn [result auth]
(conj result (if (= auth "Ibid.") (peek result) auth)))
[]
authors))
(ibid ["Gene Kim", "Jez Humble", "Ibid.", "Gene Kim", "Ibid.", "Ibid.",
"Nicole Forsgren", "Micheal Nygard", "Ibid."])
;;=> ["Gene Kim" "Jez Humble" "Jez Humble" "Gene Kim" "Gene Kim" "Gene Kim"
;; "Nicole Forsgren" "Micheal Nygard" "Micheal Nygard"]
Anytime you need to run through a sequence of items to generate a new result, consider using
reduce
. The reducing “step” function takes two arguments: the intermediate result and the
next item from the collection. It should return a new result value. The reduce
call also needs
an initial value and the collection to reduce over.
My ibid
collects the intermediate results in a vector for two reasons. First,
it’s a natural way to preserve the order as vectors “conj” to the end. Sometimes you’ll see
code that collects results into a list and calls reverse
at the end to get the desired
order. If you need a reverse
, you probably should have used a vector in the first place.
The second reason to use a vector is to get efficient access to the last item. In my
opinion, the function peek
doesn’t get enough attention. It’s much faster than last
–
O(1) vs. O(n). If you find yourself needing last
, consider re-working your code to
use a vector so you can peek
instead.
In ibid
, the (peek result)
is the previous known author. This is exactly what we need
to replace the “Ibid.” reference. Even if we have multiple “Ibid.” references in a row, the
intermediate result will have the proper author as we go.
Proctor’s implementation was careful to check for a bad reference to “Ibid.” where there was
no previous author. In that situation, Proctor would throw an exception. My
ibid
will end up with nil
for the bad reference. This is mostly a matter of
taste, but I would say that the usual Clojure philosophy is “garbage in, garbage out” so
don’t complicate your code with low-level error detection. Test your inputs and results on the
edges and keep your internal functions as simple as possible. In this case, the error can
only occur if the first item is “Ibid.”. This suggests a simpler safety test which would
fit nicely as a precondition.
(defn ibid-safe [authors]
{:pre [(not= (first authors) "Ibid.")]}
(ibid authors))
(ibid ["Ibid."])
;;=> [nil]
(ibid-safe ["Ibid."])
Execution error (AssertionError) at miner.ibid/ibid-safe (ibid.clj:94).
Assert failed: (not= (first authors) "Ibid.")
I think this example is a good illustration of how to use reduce
. As a bonus, we get a
useful reminder about calling peek
(instead of last
) on vectors. By the way, if peek
is unfamiliar to you, be sure to look up its cousin pop
as well.