There are two main ways to build strings in Clojure: str and format. str essentially does string concatenation like this:
(str "This is a sentence with " some " variables ")
I find str forms somewhat unreadable, especially on one line. They require the reader to mentally keep track of quote marks and spaces around variables.
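As a small illustration of how str behaves, and of the spacing problem described above, here is a minimal sketch. The var who is a hypothetical name introduced only for this example.

```clojure
;; `who` is a hypothetical value used only for illustration.
(def who "Ethel")

;; str converts each argument to a string and joins them with no separator.
(str "Hello, " who "!")   ;=> "Hello, Ethel!"

;; Non-string values such as numbers are converted as well.
(str "born in " 1858)     ;=> "born in 1858"

;; nil contributes an empty string.
(str "a" nil "b")         ;=> "ab"

;; Forgetting the space inside the quotes is easy to miss when reading.
(str "Hello," who)        ;=> "Hello,Ethel"
```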
On the other hand, format offers a fully-featured string interpolation function using Java’s Formatter class:
(format "This string has another string: %s in it and a number: %.2f"
        "hello!"
        30.1)
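A few more specifiers, sketched under the assumption of a typical JVM default locale (format uses the default locale, so digits and decimal separators can differ elsewhere). The values are made up for illustration.

```clojure
;; %s accepts any value, %d expects an integer.
(format "%s was born in %d" "Ethel" 1858)   ;=> "Ethel was born in 1858"

;; A width pads on the left; a leading - pads on the right instead.
(format "[%5d] [%-6s]" 42 "ab")             ;=> "[   42] [ab    ]"

;; Passing an explicit locale through Java interop pins the decimal separator.
(String/format java.util.Locale/ROOT "%.2f" (to-array [30.1]))   ;=> "30.10"

;; Unlike str, format checks argument types: %d rejects a double.
(defn try-bad-format
  "Hypothetical helper: returns :bad-conversion when %d is given a double."
  []
  (try
    (format "%d" 30.1)
    (catch java.util.IllegalFormatConversionException _
      :bad-conversion)))
```

Calling (try-bad-format) returns :bad-conversion, which is worth remembering when switching a call from str to format.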
In Ruby we might use string interpolation thus:
def str_interpolate name, profession, born
  puts "The person named #{name} works as a #{profession} and was born in #{born}"
end
str_interpolate "Ethel Smyth", "Composer", 1858
The advantage of string interpolation like this is how readable the code can be.
The problem with preferring format over str is the difference in performance. format is a lot more complex, and if all you’re doing is string concatenation, it won’t do the job as quickly as str does (which uses a StringBuilder under the hood).
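To make the StringBuilder remark concrete, here is a rough sketch of the kind of work a multi-argument str call amounts to. It is not the actual clojure.core source, and sb-concat is a hypothetical name used only here.

```clojure
(defn sb-concat
  "Hypothetical helper: appends every argument to a single StringBuilder,
  roughly what a multi-argument str call does."
  [& parts]
  (let [sb (StringBuilder.)]
    (doseq [p parts]
      ;; (str p) turns nil into "" and anything else into its string form
      (.append sb (str p)))
    (.toString sb)))

(sb-concat "born in " 1858)   ;=> "born in 1858"
```

format, by contrast, hands the work to java.util.Formatter, which has to interpret the format string and convert each argument according to its specifier, so more work per call is to be expected.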
Benchmarks!
The following code uses the Criterium library aliased to bench.
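For anyone following along, the setup might look something like the sketch below. The namespace name and the three input values are assumptions (the values are borrowed from the Ruby example); the post does not show how name, profession and born were defined. It needs Criterium on the classpath.

```clojure
(ns example.bench
  ;; name is excluded because the benchmark inputs below reuse that symbol
  (:refer-clojure :exclude [name])
  (:require [criterium.core :as bench]))

;; Assumed benchmark inputs, not taken from the original benchmarks.
(def name "Ethel Smyth")
(def profession "Composer")
(def born 1858)

;; bench/bench runs the full benchmark; bench/quick-bench is a faster,
;; less thorough variant that is handy while experimenting.
(bench/quick-bench (str name profession born))
```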
First let’s define a function using str and benchmark it:
(defn str-concat-fun
  [name profession born]
  (str "The person named "
       name
       " works as a "
       profession
       " and was born in "
       born))
(bench/bench (str-concat-fun name profession born))
Which results in (256ns):
Evaluation count : 241104540 in 60 samples of 4018409 calls.
Execution time mean : 256.236563 ns
Execution time std-deviation : 4.617404 ns
Execution time lower quantile : 250.226629 ns ( 2.5%)
Execution time upper quantile : 266.339191 ns (97.5%)
Overhead used : 1.166050 ns
Found 5 outliers in 60 samples (8.3333 %)
low-severe 5 (8.3333 %)
Variance from outliers : 7.7764 % Variance is slightly inflated by outliers
Now let’s check the format version:
(defn format-fun
  [name profession born]
  (format "The person named %s works as a %s and was born in %d"
          name
          profession
          born))
(bench/bench (format-fun name profession born))
Which results in (1.7µs):
Evaluation count : 34997760 in 60 samples of 583296 calls.
Execution time mean : 1.703759 µs
Execution time std-deviation : 36.732362 ns
Execution time lower quantile : 1.663752 µs ( 2.5%)
Execution time upper quantile : 1.779579 µs (97.5%)
Overhead used : 1.166050 ns
Found 2 outliers in 60 samples (3.3333 %)
low-severe 2 (3.3333 %)
Variance from outliers : 9.4397 % Variance is slightly inflated by outliers
That’s not good, but not entirely unexpected. format is roughly 6.6 times slower than str (1.7µs versus 256ns) for string concatenation tasks like this.
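A timing comparison only means something if both functions do the same job, so a quick sanity check like the following can help. It repeats the two definitions so it runs on its own, and the input values are assumptions borrowed from the Ruby example.

```clojure
(defn str-concat-fun
  [name profession born]
  (str "The person named " name
       " works as a " profession
       " and was born in " born))

(defn format-fun
  [name profession born]
  (format "The person named %s works as a %s and was born in %d"
          name profession born))

;; Both should produce the same string for the same (assumed) inputs.
;; Note that %d requires born to be an integer, whereas str accepts anything.
(= (str-concat-fun "Ethel Smyth" "Composer" 1858)
   (format-fun "Ethel Smyth" "Composer" 1858))
```

On a typical default locale this comparison evaluates to true.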
One of the advantages of Clojure is the promise of powerful, expressive abstractions and not having to compromise on those abstractions to achieve performance. In a blog post about string interpolation in Clojure, Chas Emerick proposes a macro for simple string interpolation that would behave much like Ruby’s does. This has made its way into the core.incubator project and can be used in projects today.
To require it, add core.incubator to your project’s dependencies and add the following to any namespace that needs it:
(ns example...
  (:require [clojure.core.strint :refer [<<]]))
So now we can define a function like the others using this new macro:
(defn interpolation-fun
  [name profession born]
  (<< "The person named ~{name} works as a ~{profession} and was born in ~{born}"))
(bench/bench (interpolation-fun name profession born))
And this results in (272ns):
Evaluation count : 222317940 in 60 samples of 3705299 calls.
Execution time mean : 271.580867 ns
Execution time std-deviation : 2.117763 ns
Execution time lower quantile : 267.593081 ns ( 2.5%)
Execution time upper quantile : 274.826087 ns (97.5%)
Overhead used : 1.166050 ns
Not bad! Ever so slightly slower than the str version (about 15ns, or 6%), but I feel that is a performance penalty well worth paying for more expressive string interpolation.
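A likely reason the numbers are so close is that a macro can take the template apart at compile time and leave only a plain str call to run. The sketch below shows that idea with a hypothetical, much simplified macro named interp. It is not the core.incubator implementation, it only understands ~{...} placeholders, and it requires a literal string.

```clojure
(require '[clojure.string :as string])

(defmacro interp
  "Hypothetical, simplified stand-in for <<. Splits the literal template at
  macro-expansion time and expands to a (str ...) call, so nothing is parsed
  at runtime."
  [template]
  (let [;; text between placeholders; limit -1 keeps trailing empty strings
        literals (string/split template #"~\{[^}]*\}" -1)
        ;; the code inside each ~{...}, read as a Clojure form
        forms    (map (comp read-string second)
                      (re-seq #"~\{([^}]*)\}" template))]
    `(str ~@(interleave literals forms) ~(last literals))))

(let [name "Ethel Smyth"
      born 1858]
  (interp "~{name} was born in ~{born}"))
;=> "Ethel Smyth was born in 1858"
```

Under this sketch the call expands to (str "" name " was born in " born ""), which is why the runtime cost would be essentially that of the hand-written str version.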