Notes from Russ Olsen’s Getting Clojure.
1: Hello, Clojure
- Convention: use `;;` for comments that take up a whole line, use `;` for comments on the same line as code
- Prefix/Polish notation - i.e., `(verb argument argument argument …)`
- Division (`/`) can produce either: `java.lang.Long` (integer value returned); or `clojure.lang.Ratio` (precise, non-integer value returned)
- Use `quot` to get Java-like (truncated) integer division
- Numeric type promotions: adding an integer (`java.lang.Long`) and a floating point number (`java.lang.Double`) returns a floating point number
- Compilation is single-pass, top to bottom of a namespace
- The `declare` macro gets around the top-to-bottom characteristic of compilation but it is considered unconventional
- `def` and `defn` draw from the same well of names (`defn` uses `def` under the hood) - i.e., constants and functions cannot share the same symbol
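
A quick sketch of the division and promotion rules above, assuming a stock Clojure REPL. The var names (`exact`, `inexact`, `truncated`, `promoted`) are made up for illustration.

```clojure
;; Dividing two integers gives a Long when the result is exact, a Ratio otherwise.
(def exact (/ 8 4))        ; 2   (java.lang.Long)
(def inexact (/ 8 3))      ; 8/3 (clojure.lang.Ratio)

;; quot truncates, like integer division in Java.
(def truncated (quot 8 3)) ; 2

;; Mixing a Long and a Double promotes the result to a Double.
(def promoted (+ 1 2.5))   ; 3.5

(println (class exact) (class inexact) (class promoted))
```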
2: Vectors and Lists
- Collections can be type-heterogeneous
- It is possible to call vectors like functions to perform an index lookup (e.g., `(my-vector 3)`) - this is analogous to Python’s square bracket syntax (e.g., `my_vector[3]`)
- As data structures are immutable, functions like `rest` always return a new collection (a sequence, in the case of `rest`) rather than changing the original
- `conj`
  - Short for conjoin
  - Adds to the end of a vector, adds to the beginning of a list
  - Returns the same type of collection that it was given (a vector for a vector, a list for a list)
- `cons`
  - Short for construction
  - Always adds to the beginning of a collection
  - Always returns a sequence
- Empty lists do not need a `'` prefix (there is no ambiguity with a function call)
- Vectors are analogous to arrays, lists are analogous to (and implemented as) linked lists
- Vectors are, generally speaking, faster when getting an item at an index
- Lists are, generally speaking, faster and less memory-disruptive when adding to the front
- Lists are implemented as three-memory-slot linked lists:
  - The value of the item
  - A pointer to the next item
  - A count of the number of items in the list - this enables `count` to execute quickly rather than traversing the entire list
- Under the hood Clojure chunks vectors and stores them in memory as shallow trees - this means that adding to a vector doesn’t involve copying everything each time
- Immutable = persistent in Clojure parlance when speaking about data structures (“persistent” here has nothing to do with database storage)
- Vectors are more common than lists in the wild
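
A small sketch of how `conj` and `cons` differ, assuming a stock Clojure REPL. The names `v`, `l` and friends are illustrative only.

```clojure
(def v [1 2 3])
(def l '(1 2 3))

(def v-conj (conj v 4))  ; [1 2 3 4] - added at the end, still a vector
(def l-conj (conj l 0))  ; (0 1 2 3) - added at the front, still a list
(def v-cons (cons 0 v))  ; (0 1 2 3) - always the front, always a sequence

;; Calling a vector like a function performs an index lookup.
(def third-item (v 2))   ; 3

;; The original vector is untouched by any of the above.
(println v v-conj l-conj v-cons third-item)
```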
3: Maps, Keywords, and Sets
- Like vectors, maps can be called as functions to perform lookups, with the desired key passed as an argument
- Keywords are interned strings - i.e., only one copy of each distinct string value is stored in memory
- Vectors can be thought of as maps where each key is just an index
- Some functions that are usually thought of as relating to maps can also be used on vectors (e.g., `assoc` but not `dissoc`)
  - `assoc` can only be used to replace a value at an index of a vector or append to a vector by passing in the value of the last element’s index plus one; you cannot `assoc` at an arbitrary index
  - Generally speaking, if you are using `assoc` for a vector you should be using a different data structure, though
- `sorted-map` can be used to guarantee the order of a map’s keys - creates a `PersistentTreeMap` under the hood
- Calling a set as a function as a test for set membership is absolutely fine but not advisable for maps - a returned `nil` might indicate the absence of a value, OR it might be the value of the key you just looked up
- Standard sequence functions (e.g., `first`, `rest`, `count`) also work on maps and treat the map as a collection of two-element vectors
  - E.g., calling `first` on a map would return `[:key value]`
  - But bear in mind that most maps are unordered!
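
A sketch of the `nil` ambiguity with map lookups and of `assoc` on vectors, assuming a stock Clojure REPL. The `book` map and its keys are invented for the example; `contains?` and the three-argument `get` are one common way around the ambiguity.

```clojure
(def book {:title "Getting Clojure" :subtitle nil})

(def present-but-nil (book :subtitle))    ; nil - the key exists, its value is nil
(def absent (book :isbn))                 ; nil - the key does not exist
(def has-subtitle? (contains? book :subtitle)) ; true
(def has-isbn? (contains? book :isbn))         ; false
(def with-default (get book :isbn :not-found)) ; :not-found

;; assoc on a vector: replace at an existing index, or append at (count v).
(def replaced (assoc [:a :b :c] 1 :x))    ; [:a :x :c]
(def appended (assoc [:a :b :c] 3 :d))    ; [:a :b :c :d]

;; Any index beyond that throws.
(def out-of-range?
  (try
    (assoc [:a :b :c] 5 :z)
    false
    (catch IndexOutOfBoundsException _ true)))
```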
4: Logic
- The `=` function is built on the idea of structural equality - two things are equal if their values are equal
- `and` and `or` do short-circuit evaluation
- Only `false` and `nil` are falsy, everything else is truthy
- `do` and `when` are useful for bundling side-effecting expressions together
- Using `:else` or `:default` as the final “predicate” in a `cond` expression isn’t special syntax, it is just idiomatic: keywords are truthy, so this test will always pass and the last expression will be evaluated
- `case` is a more opinionated version of `cond`, analogous to Java’s switch-case statement
- `catch` forms within a `try` can return a value but `finally` will not
- `and` does not always return a boolean!
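
A sketch of the truthiness rules above, assuming a stock Clojure REPL. The function `sign-of` and the var names are hypothetical.

```clojure
;; and/or return one of their arguments, not necessarily a boolean.
(def and-result (and 1 :two))       ; :two
(def and-short (and 1 nil :never))  ; nil - stops at the first falsy value
(def or-result (or nil false :x))   ; :x

;; 0 and empty collections are truthy; only false and nil are falsy.
(def zero-is (if 0 :truthy :falsy)) ; :truthy

;; :else is just a truthy keyword, not special syntax.
(defn sign-of [n]
  (cond
    (neg? n) :negative
    (zero? n) :zero
    :else :positive))

;; The catch clause supplies the value of the try; finally's value is discarded.
(def try-result
  (try
    (/ 1 0)
    (catch ArithmeticException _ :caught)
    (finally (println "cleanup runs either way"))))
```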
5: More Capable Functions
- Arguments bound to the symbol following `&` in a function’s arguments will be in a collection
- We can create a `defmethod` with `:default` to handle all other cases - note that unlike the last predicate in a `cond` expression where the use of `:default` (or `:else`) is just convention (when any truthy value would do), here `:default` is a special keyword that is required
- We can use Clojure’s `loop`-`recur` construct to write explicitly tail-recursive functions - this means that we can use recursion to process arbitrarily long collections without worrying about stack overflows
  - This is a Clojure construct - the JVM does not perform tail-call optimisation
  - This can simply be thought of as a loop - each iteration passes its values on to the next, so stack frames do not pile up
- Use `:pre` and `:post` to raise a runtime error if the arguments or return value of a function are not what you expect - the checking expression(s) can be anything; any falsy result will cause a `java.lang.AssertionError`
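
A sketch pulling these together, assuming a stock Clojure REPL with assertions enabled (the default). `count-extras`, `sound` and `sum-to` are hypothetical names.

```clojure
;; Everything after & arrives as a collection.
(defn count-extras [required & others]
  (count others))

;; A multimethod with a :default fallback.
(defmulti sound :species)
(defmethod sound :dog [_] "Woof")
(defmethod sound :default [_] "(silence)")

;; loop/recur with :pre and :post conditions.
(defn sum-to [n]
  {:pre [(integer? n) (>= n 0)]
   :post [(>= % n)]}
  (loop [i 0 total 0]
    (if (> i n)
      total
      (recur (inc i) (+ total i)))))

(println (count-extras :a :b :c) (sound {:species :cat}) (sum-to 100000))
```

Calling `(sum-to -1)` fails the `:pre` check and throws a `java.lang.AssertionError`.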
6: Functional Things
- Functions are values that can be passed around like strings or longs
- A function (anonymous or otherwise) can capture the bindings that are in scope where it is defined - this is known as a closure
- `apply` unwraps a sequence and applies the provided function to the arguments as if they were provided as individual arguments
  - `(apply + [1 2 3])` expands to `(+ 1 2 3)`
- `partial` takes a function and some “default” arguments to that function and returns a function
  - The returned function can then be called with the missing argument(s)
- A lambda (i.e., an anonymous function) can be represented two ways in Clojure:
  - A function literal - i.e., `#()`
  - Using `fn`
- The functional programmer’s Prime Directive: try to write functions that don’t care about the context in which they are called
- Life is a lot easier without side effects - try to write pure functions when you can
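
A sketch of closures, `apply`, `partial` and the two lambda forms, assuming a stock Clojure REPL. All names here (`make-adder`, `add-five`, and so on) are invented for illustration.

```clojure
;; The returned fn closes over n.
(defn make-adder [n]
  (fn [x] (+ x n)))

(def add-five (make-adder 5))

;; partial pre-fills leading arguments.
(def add-ten (partial + 10))

;; apply spreads a collection into individual arguments.
(def total (apply + [1 2 3]))                 ; 6

;; Two ways to write an anonymous function.
(def squares (map #(* % %) [1 2 3]))          ; (1 4 9)
(def cubes (map (fn [x] (* x x x)) [1 2 3]))  ; (1 8 27)
```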
7: Let
- `let` and `fn` can be used together to build function-returning functions that come with values pre-set - the bindings from the `let` form remain available even after the (anonymous) function has been returned
- `when-let` is good for binding a value and using it if it is truthy, or returning `nil` otherwise
- A `let` form is an example of a local binding in the context of lexical scope
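
A sketch of `let` plus `fn`, and of `when-let`, assuming a stock Clojure REPL. `make-greeter` and `first-even` are hypothetical helpers.

```clojure
;; prefix is computed once and stays available to the returned function.
(defn make-greeter [greeting]
  (let [prefix (str greeting ", ")]
    (fn [name] (str prefix name "!"))))

(def hello (make-greeter "Hello"))

;; when-let binds evens only if the value is truthy; otherwise the form is nil.
(defn first-even [numbers]
  (when-let [evens (seq (filter even? numbers))]
    (first evens)))

(println (hello "Russ") (first-even [1 3 4 6]) (first-even [1 3]))
```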
8: Def, Symbols and Vars
- `def` binds a symbol to a value
- `defn` is just `def` plus `fn`
- In some sense, symbols are just values in Clojure - very much like keywords except keywords always evaluate to themselves, whereas a symbol generally evaluates to some other value
  - Except when you use a single quote to prevent evaluation and get the symbol itself - e.g., `(str 'my-symbol)` gives the symbol’s name as a string
- A var is created when using `def`; it represents the binding between symbol and value
  - Access the var itself (rather than its value) using `#'` - e.g., `(str #'my-symbol)`
  - Vars come fully qualified with the namespace - e.g., `#'my-namespace/my-symbol`
- Generally speaking, leave your vars alone once they have been `def`ed
- `binding` can be used like `let` to temporarily change the value of a `^:dynamic` var within a (narrow) scope - the new value is seen by everything called from inside the `binding` form
  - Convention states that dynamic vars should begin and end with `*` (“earmuffs”) - e.g., `(def ^:dynamic *my-dynamic-var* "initial value")`
- `let` does not actually create vars, even within its scope
- `def` your constants and `defn` your functions and then LEAVE THEM ALONE
- Use `set!` to update a dynamic var within a binding
- Inbuilt dynamic vars for the REPL:
  - `*1` - gets the result of the last expression
  - `*2` - gets the one before that, etc.
  - `*e` - gets the last exception
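
A sketch of symbols, vars and dynamic binding, assuming a stock Clojure REPL. `*log-prefix*` and `log-line` are hypothetical names.

```clojure
(def ^:dynamic *log-prefix* "INFO")

(defn log-line [msg]
  (str *log-prefix* ": " msg))

;; A quoted symbol is just a value; #' gives the var behind a symbol.
(def symbol-name (str 'log-line))   ; "log-line"
(def the-var #'log-line)

(def normal (log-line "started"))   ; "INFO: started"

;; binding changes the value only for code run inside the form.
(def rebound
  (binding [*log-prefix* "DEBUG"]
    (log-line "started")))          ; "DEBUG: started"

;; set! works on a dynamic var only while it is bound.
(def updated
  (binding [*log-prefix* "DEBUG"]
    (set! *log-prefix* "TRACE")
    (log-line "started")))          ; "TRACE: started"

;; Outside the binding the root value is unchanged.
(def afterwards (log-line "done"))  ; "INFO: done"
```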
9: Namespaces
- Conceptually, a namespace can be thought of as a lookup table of vars, indexed by their symbols
- In the REPL, Clojure boots us into the default namespace, `user`
- `*ns*` is the inbuilt dynamic var that keeps track of the current namespace
  - E.g., `*ns* => #object[clojure.lang.Namespace 0x504aa3c9 "user"]`
- `(ns …)` in the REPL will create a new namespace and set the name passed as the current namespace
  - If you pass the name of an existing namespace to `ns`, it will take you into that namespace, rather than creating a new one
- When you use `refer` for accessing code from another namespace, you are essentially importing all of the vars from that namespace into yours - proceed with caution
- Use an extra colon to create namespace-qualified keywords (good for avoiding collisions)
- Any environment that is running Clojure code will essentially `require` everything from the `clojure.core` namespace on boot - this is why we don’t have to do anything to access fundamental Clojure functions
- As far as Clojure is concerned, namespace names are arbitrary
  - E.g., `clojure.core.data` has no relationship to and does not belong to `clojure.core`
- Use `:reload` within `require` when REPLing to avoid having to go back to your editor to reload a namespace
- `defonce` defines a var exactly once, regardless of `:reload` (useful for side effects)
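
A sketch of an aliased `require`, namespace-qualified keywords and `defonce`, assuming a stock Clojure REPL. The alias `string` and the vars below are illustrative choices.

```clojure
;; Require with an alias rather than referring everything in.
(require '[clojure.string :as string])

(def shouted (string/upper-case "hello"))  ; "HELLO"

;; One colon: a plain keyword. Two colons: qualified with the current namespace.
(def plain :status)
(def qualified ::status)

;; defonce only evaluates its value the first time.
(defonce start-count (atom 0))
(swap! start-count inc)
(defonce start-count (atom 100))  ; ignored - the var already has a value

(println shouted plain qualified @start-count)
```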
10: Sequences
- Under the hood sequences are implemented using an adapter pattern
- All collections in Clojure can be viewed through the `ISeq` interface (via `seq`) - allowing you to use any sequence functions on them - and then under the hood implement those functions in collection-specific ways
- `(seq a-collection)` is an idiomatic way to determine if a collection is empty - `seq` returns `nil` (falsy) if `a-collection` is empty
- `rest`, `next` and `cons` always return sequences
- Seqable describes anything that can be turned into a sequence
- Think of `reduce` as a way of getting a single value out of a collection - it doesn’t have to be used to combine the elements in a standard way
- While sequence functions are incredibly useful and powerful, sometimes you might want to retain the original type of collection that you started with
- `conj` is a good example of a collection function that returns the same type that you pass in
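
A sketch of the sequence functions above, assuming a stock Clojure REPL. The var names are made up for the example.

```clojure
;; seq as an emptiness test.
(def empty-check (seq []))      ; nil
(def non-empty (seq [1 2]))     ; (1 2)

;; rest and cons hand back sequences, whatever went in.
(def rest-of-vector (rest [1 2 3]))    ; (2 3)
(def consed (cons 0 [1 2 3]))          ; (0 1 2 3)

;; reduce boils a collection down to one value, which may itself be a collection.
(def biggest (reduce max [3 9 4]))     ; 9
(def doubled (reduce (fn [acc x] (conj acc (* 2 x))) [] '(1 2 3))) ; [2 4 6]

;; into (built on conj) keeps the type of the target collection.
(def back-to-vector (into [] (map inc [1 2 3])))  ; [2 3 4]
```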
11: Lazy Sequences
- “The three chief virtues of a programmer are laziness, impatience and hubris.” - Larry Wall
- Sequences are an abstraction in Clojure; we should make no assumptions about the underlying data type, only that the functions you use will do what they say they will do
- A lazy sequence is one that waits to be asked before it generates its elements
- An unbounded sequence is a lazy sequence that, in theory, can go on for ever
- All unbounded sequences are lazy, but not all lazy sequences are unbounded
- `doall` and `doseq` get rid of laziness and force results to be computed now - i.e., they are eager
- `count` and `sort` are eager (obviously); `filter`, by contrast, is lazy
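
A sketch that makes laziness visible by counting calls, assuming a stock Clojure REPL. The atom `calls` and the other names are hypothetical.

```clojure
;; An unbounded sequence: safe as long as we only take what we need.
(def naturals (iterate inc 0))
(def first-five (take 5 naturals))   ; (0 1 2 3 4)

;; Count how many times the mapping function actually runs.
(def calls (atom 0))
(def lazy-squares
  (map (fn [x] (swap! calls inc) (* x x)) [1 2 3]))

(def calls-before @calls)            ; 0 - nothing has been computed yet
(def realized (doall lazy-squares))  ; forces every element
(def calls-after @calls)             ; 3
```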
12: Destructuring
- `_` is just convention for saying “I don’t really care about this value”
- Sequential destructuring can be applied to anything that is seqable, even strings
- `:keys` is special syntax for associative destructuring that lets the compiler know that we are going to use the names of the map’s keywords for our local names
- Destructuring is purely for function parameter lists and `let`; you cannot use it directly after a `def`
- Use `:as` to retain access to the original map passed into a function
- Use `:or` to set sensible defaults
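
A sketch of sequential and associative destructuring, assuming a stock Clojure REPL. `describe-book` and its keys are invented for illustration.

```clojure
;; Sequential destructuring works on anything seqable, including strings.
(def first-and-third
  (let [[a _ c] "xyz"]
    [a c]))                         ; [\x \z]

;; :keys names the locals, :or supplies defaults, :as keeps the whole map.
(defn describe-book [{:keys [title author pages] :or {pages 0} :as book}]
  (str title " by " author ", " pages " pages, " (count book) " keys"))

(println (describe-book {:title "Getting Clojure" :author "Russ Olsen"}))
```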
13: Records and Protocols
- Records can be thought of as opinionated maps
- Records come with two factory functions for constructing instances:
  - `->RecordName` - takes the field values as positional arguments
  - `map->RecordName` - takes a map of arguments (with the correct keywords)
- You can use any map functions on records, but be careful - some operations will result in changing your record into a `clojure.lang.PersistentArrayMap`
- Records are faster to lookup than maps
- Two records are equal if the values of their fields are equal (structural equality)
- `this` is used à la Java to represent the current instance of the class
- `this` needs to be the first parameter in protocol methods (for methods that take more than one parameter)
- Use `extend-protocol` to implement a new protocol on any existing type… if you want
- Analogies with OOP:
  - Records: classes and objects
  - Protocols: interfaces
  - Note the lack of inheritance, though!
- `reify` lets you create a partial implementation of a protocol (likely for testing) that will work without implementing all of the protocol methods
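
A sketch of a record, a protocol, `extend-protocol` and `reify`, assuming a stock Clojure REPL. `Describable`, `describe`, `retitle` and `Book` are hypothetical names.

```clojure
(defprotocol Describable
  (describe [this])
  (retitle [this new-title]))

(defrecord Book [title author]
  Describable
  (describe [this] (str title " by " author))
  (retitle [this new-title] (assoc this :title new-title)))

;; The two generated factory functions.
(def positional (->Book "Getting Clojure" "Russ Olsen"))
(def from-map (map->Book {:title "Getting Clojure" :author "Russ Olsen"}))

;; Adding a key keeps the record; removing a declared field gives a plain map.
(def still-record (assoc positional :pages 10))
(def now-a-map (dissoc positional :author))

;; Extend the protocol to an existing type.
(extend-protocol Describable
  String
  (describe [this] (str "the string " this)))

;; A partial, throwaway implementation - only describe is provided.
(def stub
  (reify Describable
    (describe [this] "a stub")))
```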
14: Tests
- Use `clojure.test/run-tests` to run all tests in a namespace from the REPL
- Require `org.clojure/test.check` (external dependency) for property-based testing
- Generators are cool
- Use `clojure.test/are` for parametrised testing
- At least one test per namespace isn’t a bad idea, just to ensure it actually exists and there are no egregious errors in it
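
A sketch of `is`, `are` and `run-tests` using only `clojure.test` (no `test.check`), assuming a stock Clojure REPL. The test name `addition-test` is made up.

```clojure
(require '[clojure.test :refer [deftest is are run-tests successful?]])

(deftest addition-test
  ;; A single assertion.
  (is (= 4 (+ 2 2)))
  ;; A parametrised table: each row fills in x, y and expected.
  (are [x y expected] (= expected (+ x y))
    1 1 2
    2 3 5
    -1 1 0))

;; Runs every test in the current namespace and returns a summary map.
(def results (run-tests))

(println (successful? results))
```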
15: Spec
- Conceptually, you can think of `clojure.spec` as a convenience wrapper around regex
- `spec/or` requires an even number of forms - keywords are required to help provide coherent failure messages
- Use `:req` for required keys in a map and `:opt` for optional
- Add the `-un` suffix to either if the keys in the map to validate are non-namespace-qualified (i.e., just regular maps with keywords for keys)
  - Without `-un`, for a map to be valid, the keys listed would need to be fully qualified
  - The list of keys provided to `spec/def` needs to be fully qualified when using `:req-un` because the resulting spec will be associated with whichever namespace `spec/def` is called from
- Don’t use `fspec`/`instrument` in production - they will slow things down a lot
- `clojure.spec.test.alpha/check` provides a powerful way to do generative testing
  - You can use it just during the testing phase while leaving instrumentation off in production
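
A sketch of `s/or` and `s/keys` with the `-un` variants, assuming Clojure 1.9 or later (where `clojure.spec.alpha` ships with the language). The spec names `::title`, `::pages`, `::id` and `::book` are invented for the example.

```clojure
(require '[clojure.spec.alpha :as s])

(s/def ::title string?)
(s/def ::pages pos-int?)

;; s/or takes keyword/spec pairs; the keyword labels which branch matched.
(s/def ::id (s/or :numeric int? :named keyword?))

;; The specs are listed fully qualified, but with -un the map being checked
;; uses plain keys such as :title.
(s/def ::book (s/keys :req-un [::title] :opt-un [::pages]))

(def ok? (s/valid? ::book {:title "Getting Clojure"}))            ; true
(def missing-title? (s/valid? ::book {:pages 10}))                ; false
(def bad-pages? (s/valid? ::book {:title "Getting Clojure" :pages -1})) ; false
(def conformed-id (s/conform ::id 42))                            ; [:numeric 42]
```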
16: Interoperating with Java
- Java packages are analogous to Clojure namespaces
- `(ClassName. arg1 arg2)` to call a class’s constructor
- `(.getWhatever my-object)` to invoke Java methods on Java objects
- `(.-fieldName my-object)` to access an object’s field directly
- There is no need to import `java.lang` - its classes are imported automatically
- Use Clojure-y syntax to call static Java functions: `(File/createTempFile "my-file" ".txt")`
- Methods and static methods can be invoked like functions in Clojure code, but they are not functions
  - Things you cannot do with Java methods:
    - Bind an arbitrary symbol to one
    - Pass one in as an argument to a function
  - They are special forms
- Java objects are, unless clearly stated otherwise, mutable
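
A sketch of the interop forms above, assuming a stock Clojure REPL on a JVM that can write to its temp directory. The var names are illustrative. Wrapping a method call in an anonymous function is one common way to pass it around.

```clojure
(import 'java.io.File)

;; Constructor call and instance method call.
(def notes-file (File. "notes.txt"))
(def file-name (.getName notes-file))      ; "notes.txt"

;; Static method calls; java.lang classes such as Math need no import.
(def absolute (Math/abs -3))               ; 3
(def temp (File/createTempFile "my-file" ".txt"))
(.deleteOnExit temp)

;; A method is not a function value, so wrap it in one to pass it to map.
(def upper (map #(.toUpperCase ^String %) ["a" "b"]))  ; ("A" "B")
```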
17: Threads, Promises and Futures
- By default on the JVM, the `main` thread runs your code
- Java’s `Thread` class accepts anything that implements the `Runnable` interface (Clojure functions do)
- Threads are independent engines of computation - we don’t know how long one will take relative to another
- Race conditions can occur any time you have two or more threads making changes to a shared resource
- Threads keep dynamic vars in thread-local storage, so even though they are mutable, the value within a binding (within a thread) is not accessible by another thread
- Spinning up a new thread with Java interop is a fine way to get some work done in the background, but if you want to actually get the result of an operation carried out on a separate thread, use a promise
- `deref`ing a promise that has not yet been delivered to will cause the calling thread (often the main thread) to wait until there is a result
- Think of a future as a promise that brings along its own thread
- There is no need to `deliver` a result to a future, Clojure does this for you in a separate thread
- Deref-ing a future, like a promise, also blocks the calling thread until the result is ready
- If you want to get some background processing done, use a future, not a promise - futures take care of spinning up and down new threads themselves
- It’s generally better to use a higher level abstraction like `future` when explicitly using separate threads for work, but if you need something lower level, make use of thread pools in `java.util.concurrent.Executors`
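
A sketch of a raw thread delivering to a promise, next to the equivalent future, assuming a stock Clojure REPL. The names `answer`, `worker` and `background-sum` are hypothetical.

```clojure
;; A raw thread plus a promise to carry the result back.
(def answer (promise))
(def worker (Thread. ^Runnable (fn [] (deliver answer (* 6 7)))))
(.start worker)
(def from-promise @answer)          ; blocks until delivered: 42

;; A future bundles the thread and the promise together.
(def background-sum (future (reduce + (range 1000))))
(def from-future @background-sum)   ; blocks until done: 499500
```

In a standalone script (rather than a REPL) it may be worth calling `(shutdown-agents)` at the end, since the thread pool behind `future` can otherwise keep the JVM alive for a while.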
18: State
- Use atoms to manage mutable state
- `swap!` is completely thread-safe!
- `swap!`
  - Gets the current value of the atom
  - Calls the passed-in function on the value
  - Checks that the value in the atom is the same as when it started the process
  - If it is the same, stores and returns the updated value; if it is not the same, starts the process again with the later value of the atom
- `swap!` executes entirely on the thread that calls it
- `ref`s are the grownup version of `atom`s
- If you need to manage the state of multiple somewhat related values, a `ref` might be a good option
- We can use `dosync` to wrap updates (using `alter`) to a `ref` in a database-like transaction
- Atom-updating functions are a bad place to add side-effecting code - depending on what other threads are trying to do, this function might try to update the atom multiple times (the same applies to refs)
- Updates to an `agent` happen asynchronously: `send` will return immediately
  - The function used to update the agent will be queued along with any other functions that are trying to update it
  - Your function will actually update the agent sometime in the future
- Agent-updating functions are called exactly once
- It is also possible to wrap multiple `send`s in a `dosync` to create a transaction
- When in doubt, reach for an `atom` over the other two options
- If an update to an `atom` or `ref` fails, an exception is thrown; if an update to an `agent` fails, nothing happens on your current thread - the exception is thrown by the thread that takes care of updating the `agent` asynchronously, and will only be seen on the main thread the next time you try to update the agent
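
A sketch of the three state containers side by side, assuming a stock Clojure REPL. The names `hits`, `checking`, `savings` and `tally` are invented for the example.

```clojure
;; Atom: four threads each increment 1000 times; swap! retries on conflict.
(def hits (atom 0))
(let [workers (doall (repeatedly 4 #(future (dotimes [_ 1000] (swap! hits inc)))))]
  (run! deref workers))             ; wait for every worker to finish

;; Refs: two related values updated together in one transaction.
(def checking (ref 100))
(def savings (ref 0))
(dosync
  (alter checking - 30)
  (alter savings + 30))

;; Agent: send returns immediately; await blocks until the update has run.
(def tally (agent 0))
(send tally + 5)
(await tally)

(println @hits @checking @savings @tally)
```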
19: Read and Eval
- Code is data
- Clojure code is represented by Clojure data structures - this property is homoiconicity
- `read` turns characters into data structures, `eval` turns data structures into code
- Use `meta` to find a function’s metadata
- There is probably no need to ever use `eval` in the wild
- `eval` does not know about local `let` bindings
- Only use `read` if you trust the data; use `clojure.edn/read` for dodgy data
20: Macros
- Remember that the first thing a Clojure function does is evaluate its arguments (this can include side effects!)
- Clojure code is just Clojure data
- Macros are just functions that are part of the compilation process
- Macros are applied to the code, not the runtime data
- Macros make themselves felt at two distinct times:
  - When the macro is being expanded - i.e., the macro is writing some code
  - When the code that the macro wrote is executed
- Do not expect to see any mention of a macro’s name in stack traces - just an auto-generated function name that the macro will have expanded to
- Use `#` as a suffix for symbols if worried about name collisions (this will generate a unique name at expand-time)
- Use `macroexpand-1` to check what a macro plus its arguments will expand to
- Do not try to use macros as ordinary functions (or when an ordinary function would do)!
- Using an ordinary function should be your default choice!
- Use a macro only after repeating yourself a lot; or
- When bumping up against Clojure’s evaluation rules (à la the `arithmetic-if` example)
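
A sketch of what an `arithmetic-if` macro might look like, assuming a stock Clojure REPL. This is a reconstruction for illustration, not necessarily the book's exact code. The atom `evaluated` is a hypothetical helper for showing that only one branch runs.

```clojure
;; n# becomes a unique generated symbol, so it cannot clash with user code.
(defmacro arithmetic-if [n pos zero neg]
  `(let [n# ~n]
     (cond
       (pos? n#) ~pos
       (zero? n#) ~zero
       :else ~neg)))

;; Unlike a function call, only the chosen branch is ever evaluated.
(def evaluated (atom []))
(def result
  (arithmetic-if 5
    (do (swap! evaluated conj :pos) :positive)
    (do (swap! evaluated conj :zero) :zero)
    (do (swap! evaluated conj :neg) :negative)))

;; Inspect the code the macro writes.
(println (macroexpand-1 '(arithmetic-if 5 :positive :zero :negative)))
```

Written as an ordinary function, all three branch expressions would be evaluated before the function body ran, which is the evaluation rule the macro sidesteps.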