Eric Normand asked his readers to submit Clojure implementations of the Sieve of Eratosthenes. You can see the benchmark code and results on the Purely Functional site.

I wrote a basic version in Clojure, which was directly translated from the pseudocode on Wikipedia. Using transients greatly improved performance.

(defn classic-sieve
  "Returns sequence of primes less than N"
  (loop [nums (transient (vec (range n))) i 2]
     (> (* i i) n) (remove nil? (nnext (persistent! nums)))
     (nums i) (recur (loop [nums nums j (* i i)]
                       (if (< j n)
                         (recur (assoc! nums j nil) (+ j i))
                     (inc i))
     :else (recur nums (inc i)))))

Valentin Waeselynck submitted a fast implementation that kept track of composites using a boolean-array. In a moment, I’m going to steal that idea to improve my version of the sieve. But first, I want to point out an excellent journal article by Melissa E. O’Neill: “The Genuine Sieve of Eratosthenes”, which uses Haskell code examples. She explains some of the tricks that can make the sieve faster. (Hat tip to Wikipedia for the link.)

The first well known improvement is to treat 2 as a special case. That means we only have to look at odd numbers as potential primes. The second trick is to start marking off multiples of a prime p from its square p2. Any composite less than p2 will have a smaller prime factor so we know it must have been marked off already. As we’re skipping even numbers, we can step by 2p to mark just the odd multiples.

(defn bar-primes
  "Returns sequence of primes less than N"
  [^long n]
  (if (<= n 2)
    (let [nsqrt (long (Math/sqrt n))]
      (loop [cmps (boolean-array n) i 3]
         (> i nsqrt) (loop [b 3 ps (transient [2])]
                        (if (< b n)
                          (recur (+ b 2) (if (aget cmps b) ps (conj! ps b)))
                          (persistent! ps)))
         (aget cmps i) (recur cmps (+ i 2))
         :else (recur (let [step (* 2 i)]
                        (loop [cmps cmps j (* i i)]
                          (if (< j n)
                            (recur (do (aset cmps j true) cmps) (+ j step))
                      (+ i 2)))))))

I kept the code organized in a functional pattern even though we’re using a mutable Java array of booleans, named cmps (short for “composites”). An earlier version of the code was written with a Clojure vector for the composite bookkeeping so it was pretty simple to substitute the boolean-array without changing the logic much.

We’re trying to be fast so we don’t mind wasting a bit of memory for even numbers that we never look at. The final result is calculated in a loop where a filter or reduce would have worked. For better performance, I used *unchecked-math* and ^long type hints. In the end, bar-primes was more than 30 times faster than the original classic-sieve on my machine running Eric’s benchmark. Thanks to Valentin Waeselynck for the boolean-array idea.

The O’Neill article goes on the discuss how to implement an incremental functional sieve with an infinite, lazy result. Naturally, you can’t mark off all the multiples without a fixed limit so you need to do it just in time as you build a result. Along these lines, I remember a nice implementation by Christophe Grand from several year ago in his blog post: “Everybody loves the Sieve of Eratosthenes”.

Inspired by some of these examples, I came up with the following lazy version. I wanted to allow for a limit as well so it could fit into Eric’s benchmarks so there’s a bit of extra code to support a limit argument. With no argument, the result is an infinite, lazy list. Use take to take what you need.

;; [org.clojure/ "0.2.4"]
(require '[ :as im])

(defn oprimes
  ([] (let [update-sieve! (fn [sieve c step]
                            (let [c2 (+ c step)]
                              (if (sieve c2)
                                (recur sieve c2 step)
                                (assoc! sieve c2 step))))

            (fn next-primes [sieve candidate]
              (if-let [step (sieve candidate)]
                (recur (-> sieve
                           (update-sieve! candidate step)
                           ;; better performance without dissoc!, but more memory
                           #_ (dissoc! candidate))
                       (+ candidate 2))
                (cons candidate
                      (lazy-seq (next-primes
                                 (assoc! sieve (* candidate candidate)
                                         (* 2 candidate))
                                 (+ candidate 2)))))) ]

        (cons 2 (lazy-seq (next-primes (transient (im/int-map)) 3)))))

   (take-while #(< % limit) (oprimes))) )

I’m using an int-map which is an alternative hashmap specialized for integer keys. (Thanks to Zack Tellman for the contrib library.) This yields better performance and promises to use less memory. (I didn’t test the memory usage, but I will take that on faith.) If you just want to try the code in a REPL without the int-map library, you can change (im/int-map) to an empty map {} and it should work fine.

The int-map keeps track of composite numbers that have been marked off. The key is the composite number and the value is a “step” to the next composite along the original prime’s succession of multiples. As we find a prime, we cons it on to the lazy sequence of the next-primes. The updated sieve marks the square of the prime and sets the step to twice the prime (skipping even numbers). If we see a candidate with a step already set, we know it’s a composite so we update the sieve by marking off the next composite in that line, pushing the step along in an incremental fashion. The next value is just the sum of the candidate and its step, but if that next location is already marked (because it’s a multiple of another prime), we increment by the step again until we find an unmarked number. Within next-primes we know that once we’ve seen the composite, we never have to check it again so it could be dissoc!-ed, but you’ll notice a line where I disable the call to dissoc!. That’s purely for performance, at the likely cost of wasting memory that could have been reclaimed after dissoc!-ing the obsolete entries.

The lazy oprimes is a bit slower than my original classic-sieve on Eric’s benchmarks, but it’s nice to have a lazy implementation when you need it. The raw performance of bar-primes is probably not worth the effort that went into writing it, but it was a fun experiment for me. I hope someone else finds it useful. Thanks again to everyone who shared their code so that I could pick up a few new tricks.