I watched the Apropos Clojure #18 video cast the other day. The group worked on an example problem of finding the longest palindromic substring of a given string. A palindrome is a string of characters that’s the same in reverse order, such as “abba”.

The panelists started with a classic generate and test approach. Their solution looked something like this:

(defn palindrome? [s]
  (= (seq s) (reverse s)))

(defn substrings [s]
  (let [mx (inc (count s))]
    (for [start (range mx)
          end (range (inc start) mx)]
      (subs s start end))))

(defn longest-palindrome [s]
  "Return the longest substring of s that is a palindrome"
  (apply max-key count (filter palindrome? (substrings s))))

I have a couple of suggestions for improvements. The first idea is to build the substrings in length order, longest first. A slight rearrangement of the for comprehension will do the trick.

(defn substrings1 [s]
  (let [cnt (count s)]
    (for [len (range cnt 0 -1)
          start (range (inc (- cnt len)))
          :let [end (+ start len)]]
      (subs s start end))))

(defn longest-palindrome1 [s]
  "Return the longest substring of s that is a palindrome"
  (first (filter palindrome? (substrings1 s))))

With longest-palindrome1, the first palindrome found is known to be the longest as the candidates are generated in length order. The filter and substrings1 functions are lazy so the process terminates as soon the first palindromic substring is generated. The smaller substrings are not realized. In my tests, longest-palindrome1 was 2-3 times faster than the original.

The second improvement is to use a faster palindrome test. It’s not obvious, but clojure.string/reverse is pretty fast (due to Java interop) and definitley beats using Clojure sequences of characters. (I use “str” as a namespace alias for “clojure.string”.)

(defn palindrome2? [s]
  (= s (str/reverse s)))

(defn longest-palindrome2 [s]
  (first (filter palindrome2? (substrings1 s))))

The longest-palindrome2 runs about 10 times faster than longest-palindrome1. That’s a nice performance win for a small change.

The fastest implementation I came up with uses Java interop to access characters within the original string without creating new strings. Only the longest palindromic substring needs to be realized. The ^String type annotations help the Clojure compiler pick the correct Java methods.

(defn substr-pal? [^String s start end]
  (loop [front start back (dec end)]
    (or (>= front back)
        (and (= (.charAt s front) (.charAt s back))
             (recur (inc front) (dec back))))))

(defn longest-palindrome3 [^String s]
  (let [cnt (.length s)]
    (first (for [len (range cnt 0 -1)
                 start (range (inc (- cnt len)))
                 :let [end (+ start len)]
                 :when (substr-pal? s start end)]
             (subs s start end)))))

You can find my code in this gist.

I wrote a benchmark function run-benchmarks which calls Criterium to get timings. Here are the results on my machine:

  • Testing miner.pal/longest-palindrome, Execution time mean : 633.145389 µs

  • Testing miner.pal/longest-palindrome1, Execution time mean : 222.759123 µs

  • Testing miner.pal/longest-palindrome2, Execution time mean : 22.415052 µs

  • Testing miner.pal/longest-palindrome3, Execution time mean : 7.496435 µs