Getting Clojure—Russ Olsen

Notes from Russ Olsen’s Getting Clojure.

Code notes

1: Hello, Clojure

Convention: use ;; for comments that take up a whole line, use ; for comments on the same line as code
Prefix/Polish notation - i.e., (verb argument argument argument …)
Division (/) can produce either:
- java.lang.Long (integer value returned); or
- clojure.lang.Ratio (precise, non-integer value returned)
Use quot to get Java-like (truncated) integer division
Numeric type promotions: adding an integer (java.lang.Long) and floating point number (java.lang.Double) returns a floating point number
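A few REPL expressions illustrate the division and promotion rules above. The numbers are arbitrary examples chosen for these notes, not taken from the book.

```clojure
;; Integer division that comes out even returns a Long.
(/ 8 2)            ; => 4
(class (/ 8 2))    ; => java.lang.Long

;; Integer division with a remainder returns an exact Ratio.
(/ 8 3)            ; => 8/3
(class (/ 8 3))    ; => clojure.lang.Ratio

;; quot truncates, like Java's integer division.
(quot 8 3)         ; => 2

;; Mixing a Long and a Double promotes the result to a Double.
(+ 1 2.0)          ; => 3.0
```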
Compilation is single-pass, top to bottom of a namespace
- The declare macro gets around the top-to-bottom characteristic of compilation but it is considered unconventional
def and defn draw from the same well of names (defn uses def under the hood) - i.e., constants and functions cannot share the same symbol

2: Vectors and Lists

Collections can be type-heterogeneous
It is possible to call vectors like functions to perform an index lookup (e.g., (my-vector 3)) - this is analogous to Python’s square bracket syntax (e.g., my_vector[3])
As data structures are immutable, functions like rest never modify the collection they are given - they return a new value (a sequence, in the case of rest)
conj
- Short for conjoin
- Adds to the end of a vector, adds to the beginning of a list
- Returns the same type of collection it was called with (a vector for a vector, a list for a list)
cons
- Short for construction
- Always adds to the beginning of a collection
- Always returns a sequence
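The difference between conj and cons is easiest to see side by side. This is a small sketch with made-up values:

```clojure
;; conj adds wherever it is cheapest for the collection type.
(conj [1 2 3] 4)             ; => [1 2 3 4]  (end of a vector)
(conj '(1 2 3) 4)            ; => (4 1 2 3)  (front of a list)

;; cons always adds to the front and always gives back a sequence.
(cons 4 [1 2 3])             ; => (4 1 2 3)

(vector? (conj [1 2 3] 4))   ; => true
(vector? (cons 4 [1 2 3]))   ; => false
(seq? (cons 4 [1 2 3]))      ; => true
```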
Empty lists do not need a ' prefix (there is no ambiguity with a function call)
Vectors are analogous to arrays, lists are analogous to (and implemented as) linked lists
Vectors are, generally speaking, faster when getting an item at an index
Lists are, generally speaking, faster and less memory-disruptive when adding items (at the front)
Lists are implemented as three-memory-slot linked lists:
- The value of the item
- A pointer to the next item
- A count of the number of items in the list - this enables count to execute quickly rather than traversing the entire list
Under the hood Clojure chunks vectors and stores them in memory as shallow trees - this means that adding to a vector doesn’t involve copying everything each time
Immutable = persistent in Clojure parlance when speaking about data structures (“persistent” here has nothing to do with database storage)
Vectors are more common than lists in the wild

3: Maps, Keywords, and Sets

Like vectors, maps can be called as functions to perform lookups, with the desired key passed as an argument
Keywords are interned strings - i.e., only one copy of each distinct string value is stored in memory
Vectors can be thought of as maps where each key is just an index
- Some functions that are usually thought of as relating to maps can also be used on vectors (e.g., assoc but not dissoc)
- assoc can only be used to replace a value at an index of a vector or append to a vector by passing in the value of the last element’s index plus one; you cannot assoc at an arbitrary index
- Generally speaking, if you are using assoc for a vector you should be using a different data structure, though
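A short sketch of what assoc will and will not do on a vector (the values are illustrative only):

```clojure
;; Replace the value at an existing index.
(assoc [:a :b :c] 1 :x)    ; => [:a :x :c]

;; Append by using the index one past the last element.
(assoc [:a :b :c] 3 :d)    ; => [:a :b :c :d]

;; Any index beyond that throws an IndexOutOfBoundsException.
(try
  (assoc [:a :b :c] 5 :z)
  (catch IndexOutOfBoundsException _
    :out-of-bounds))       ; => :out-of-bounds
```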
sorted-map can be used to guarantee the order of a map’s keys - creates a PersistentTreeMap under the hood
Calling a set as a function to test for membership is absolutely fine, but doing the same with a map is not advisable - a nil result might indicate the absence of the key, OR it might be the value stored under the key you just looked up
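The ambiguity can be seen with a map that really does store nil. The map below is a made-up example, and contains? is one way (an assumption of these notes, not necessarily the book's recommendation) to tell the two cases apart:

```clojure
(def book {:title "Emma" :subtitle nil})

(book :subtitle)             ; => nil (key present, value is nil)
(book :isbn)                 ; => nil (key absent)

;; contains? checks for the key itself, so it can tell the cases apart.
(contains? book :subtitle)   ; => true
(contains? book :isbn)       ; => false

;; Calling a set returns the element if it is a member, nil otherwise.
(#{:fiction :poetry} :poetry)   ; => :poetry
(#{:fiction :poetry} :drama)    ; => nil
```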
Standard sequence functions (e.g., first, rest, count) also work on maps and treat the key-value pairs as collections of two element vectors
- E.g., calling first on a map would return [:key value]
- But bear in mind that most maps are unordered!

4: Logic

The = function is built on the idea of structural equality - two things are equal if their values are equal
and and or do short-circuit evaluation
Only false and nil are falsy, everything else is truthy
do and when are useful for bundling side-effecting expressions together
Using :else or :default as the final “predicate” in a cond expression isn’t special syntax, it is just idiomatic: keywords are truthy, so this test always passes and the last expression is evaluated
case is a more opinionated version of cond, analogous to Java’s switch-case statement
catch forms within a try can return a value but finally will not
and does not always return a boolean!
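The truthiness rules in this chapter can be checked at the REPL. A minimal sketch; describe-temperature and its thresholds are hypothetical:

```clojure
;; and returns the last value if everything is truthy,
;; otherwise the first falsy value it meets.
(and 1 :two "three")     ; => "three"
(and 1 nil "three")      ; => nil

;; or returns the first truthy value.
(or nil false :found)    ; => :found

;; :else is just a truthy keyword acting as a catch-all test.
(defn describe-temperature [degrees]
  (cond
    (< degrees 10) :cold
    (< degrees 25) :mild
    :else          :hot))

(describe-temperature 30)   ; => :hot
```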

5: More Capable Functions

Arguments bound to the symbol following & in a function’s arguments will be in a collection
We can create a defmethod with :default to handle all other cases - note that unlike the last predicate in a cond expression where the use of :default (or :else) is just convention (when any truthy value would do), here :default is a special keyword that is required
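As a sketch of the :default method, here is a hypothetical multimethod called area that dispatches on a :shape key (the names are invented for these notes):

```clojure
;; Dispatch on the value stored under :shape.
(defmulti area :shape)

(defmethod area :circle [{:keys [radius]}]
  (* Math/PI radius radius))

(defmethod area :square [{:keys [side]}]
  (* side side))

;; Called for any dispatch value without a method of its own.
(defmethod area :default [shape]
  :unknown-shape)

(area {:shape :square :side 3})   ; => 9
(area {:shape :triangle})         ; => :unknown-shape
```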
We can use Clojure’s loop-recur construct to write explicitly tail-recursive functions - this means that we can use recursion to process arbitrarily long collections without worrying about stack overflows
- This is a Clojure construct - the JVM does not perform tail-call optimisation
- This can simply be thought of as a loop - recur rebinds the loop's locals to new values and jumps back to the top, so no new stack frame is added per iteration
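A minimal loop-recur sketch; sum-to is a hypothetical function written for these notes:

```clojure
(defn sum-to
  "Adds up the integers from 1 to n using loop/recur."
  [n]
  (loop [i 1
         total 0]
    (if (> i n)
      total
      ;; recur must be in tail position; it re-enters the loop
      ;; with new values for i and total.
      (recur (inc i) (+ total i)))))

(sum-to 10)        ; => 55
(sum-to 100000)    ; => 5000050000, with no stack overflow
```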
Use :pre and :post to raise runtime errors if the arguments to, or the return value of, a function are not what you expect - the checking expression(s) can be anything; any falsy result will cause a java.lang.AssertionError
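A sketch of pre- and post-conditions on a hypothetical safe-divide function (the name and checks are assumptions for illustration):

```clojure
(defn safe-divide [numerator denominator]
  {:pre  [(number? numerator) (not (zero? denominator))]
   :post [(number? %)]}          ; % stands for the return value
  (/ numerator denominator))

(safe-divide 10 2)     ; => 5

;; A failed condition throws java.lang.AssertionError.
(try
  (safe-divide 10 0)
  (catch AssertionError _
    :precondition-failed))   ; => :precondition-failed
```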

6: Functional Things

Functions are values that can be passed around like strings or longs
A function (anonymous or otherwise) can use the bindings that were in scope where it was defined, even after that scope has exited - a function that captures its surrounding lexical scope like this is known as a closure
apply unwraps a sequence and applies the provided function to the arguments as if they were provided as individual arguments
- (apply + [1 2 3]) is equivalent to (+ 1 2 3)
partial takes a function and some “default” arguments to that function and returns a function
- The returned function can then be called with the missing argument(s)
A lambda (i.e., an anonymous function) can be represented two ways in Clojure:
- A function literal - i.e., #()
- Using fn
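The following sketch puts apply, partial and the two lambda forms next to each other; add-ten is a hypothetical name:

```clojure
;; apply spreads the collection out as individual arguments.
(apply + [1 2 3])                 ; => 6

;; partial fixes the first argument(s) and returns a new function.
(def add-ten (partial + 10))
(add-ten 5)                       ; => 15

;; Two ways of writing the same anonymous function.
(map #(* 2 %) [1 2 3])            ; => (2 4 6)
(map (fn [n] (* 2 n)) [1 2 3])    ; => (2 4 6)
```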
The functional programmer’s Prime Directive: try to write functions that don’t care about the context in which they are called
Life is a lot easier without side effects - try to write pure functions when you can

7: Let

let and fn can be used together to build function-returning functions that come with values pre-set - the scope of the let form continues even after the (anonymous) function has been returned
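A small sketch of a function-returning function; make-discounter and half-price are hypothetical names invented for these notes:

```clojure
(defn make-discounter
  "Returns a function that applies the given percentage discount."
  [percent]
  (let [multiplier (- 1.0 (/ percent 100.0))]
    ;; The returned fn still sees multiplier after make-discounter returns.
    (fn [price]
      (* price multiplier))))

(def half-price (make-discounter 50))

(half-price 20.0)    ; => 10.0
```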
when-let is good for binding the result of an expression and then evaluating the body with it if that result is truthy, or returning nil otherwise
A let form is an example of a local binding in the context of lexical scope

8: Def, Symbols and Vars

def binds a symbol to a value
defn is just def plus fn
In some sense, symbols are just values in Clojure - very much like keywords except keywords always evaluate to themselves, whereas a symbol generally evaluates to some other value
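- Except when you quote the symbol with a single quote: quoting stops evaluation and gives you the symbol itself, which you can then, e.g., turn into a string with (str 'my-symbol)
A var is created when using def; it represents the binding between symbol and value
- Access the var itself (rather than its value) using #' - e.g., #'my-symbol; deref the var to get its value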
- Vars come fully qualified with the namespace - e.g., #'my-namespace/my-symbol
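The relationship between a symbol, its var and its value can be explored at the REPL. In this sketch, favourite-number is a made-up name, and the namespace shown in the comment depends on where the code is evaluated:

```clojure
(def favourite-number 42)

favourite-number             ; => 42 (the symbol evaluates to the value)
'favourite-number            ; => favourite-number (the symbol itself)
(str 'favourite-number)      ; => "favourite-number"

#'favourite-number           ; => the var, e.g. #'user/favourite-number
(var? #'favourite-number)    ; => true
(deref #'favourite-number)   ; => 42
```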
Generally speaking, leave your vars alone once they have been defed
binding can be used like let to temporarily change the value of a ^:dynamic var - the new value is visible to everything called while the binding form is executing (dynamic rather than lexical scope) and is undone on exit
- Convention states that dynamic vars should begin and end with * (“earmuffs”) - e.g., (def ^:dynamic *my-dynamic-var* "initial value")
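A minimal sketch of binding in action; *greeting* and greet are hypothetical names:

```clojure
(def ^:dynamic *greeting* "hello")

(defn greet [person]
  (str *greeting* ", " person))

(greet "Ada")                     ; => "hello, Ada"

;; Inside binding, greet sees the temporary value...
(binding [*greeting* "howdy"]
  (greet "Ada"))                  ; => "howdy, Ada"

;; ...and the original value is back afterwards.
(greet "Ada")                     ; => "hello, Ada"
```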
let does not actually create vars, even within its scope
def your constants and defn your functions and then LEAVE THEM ALONE
Use set! to update a dynamic var within a binding
Inbuilt dynamic vars for the REPL:
- *1 - gets the result of the last expression
- *2 - gets the one before that, etc.
- *e - gets the last exception

9: Namespaces

Conceptually, a namespace can be thought of as a lookup table of vars, indexed by their symbols
In the REPL, Clojure boots us into the default namespace, user
*ns* is the inbuilt dynamic var that keeps track of the current namespace
- E.g., *ns* => #object[clojure.lang.Namespace 0x504aa3c9 "user"]
(ns …) in the REPL will create a new namespace and set the name passed as the current namespace
- If you pass the name of an existing namespace to ns, it will take you into that namespace, rather than creating a new one
When you use refer for accessing code from another namespace, you are essentially importing all of the vars from that namespace into yours - proceed with caution
Use an extra colon to create namespace-qualified keywords (good for avoiding collisions)
Any environment that is running Clojure code will essentially require everything from the clojure.core namespace on boot - this is why we don’t have to do anything to access fundamental Clojure functions
As far as Clojure is concerned, namespace names are arbitrary
- E.g., clojure.core.data has no relationship to and does not belong to clojure.core
Use :reload within require when REPLing to avoid having to go back to your editor to reload a namespace
- defonce defines a var exactly once, regardless of reload (useful for side effects)

10: Sequences

Under the hood sequences are implemented using an adapter pattern
Every Clojure collection can be turned into a sequence (something implementing the ISeq interface) by seq - this is what allows you to use any sequence function on them - and each collection then implements that sequence in its own collection-specific way under the hood
(seq a-collection) is an idiomatic way to determine if a collection has anything in it - seq returns nil (falsy) if a-collection is empty
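The seq idiom looks like this in practice; describe-collection is a hypothetical helper:

```clojure
(seq [])          ; => nil
(seq [1 2 3])     ; => (1 2 3)
(seq {})          ; => nil

(defn describe-collection [coll]
  ;; seq returns nil for an empty collection, so it works as a test.
  (if (seq coll)
    :has-items
    :empty))

(describe-collection [1 2 3])   ; => :has-items
(describe-collection '())       ; => :empty
```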
rest, next and cons always return sequences
Seqable describes anything that can be turned into a sequence
Think of reduce as a way of getting a single value out of a collection - it doesn’t have to be used to combine the elements in a standard way
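Two reduce calls, one conventional and one less so, sketch the idea; the word list is made up:

```clojure
;; The standard use: combine the elements into a total.
(reduce + 0 [1 2 3 4])          ; => 10

;; A less standard use: pick out the longest word.
(reduce (fn [longest word]
          (if (> (count word) (count longest))
            word
            longest))
        ""
        ["map" "reduce" "into"])   ; => "reduce"
```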
While sequence functions are incredibly useful and powerful, sometimes you might want to retain the original type of collection that you started with
- conj is a good example of a collection function that returns the same type that you pass in

11: Lazy Sequences

The three chief virtues of a programmer are laziness, impatience and hubris. - Larry Wall
Sequences are an abstraction in Clojure; we should make no assumptions about the underlying data type, only that the functions you use will do what they say they will do
A lazy sequence is one that waits to be asked before it generates its elements
An unbounded sequence is a lazy sequence that, in theory, can go on for ever
All unbounded sequences are lazy, but not all lazy sequences are unbounded
doall and doseq get rid of laziness and force results to be computed now - i.e., they are eager
count and sort are eager (they have to see every element), but filter, like map, is lazy
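A short sketch of laziness with an unbounded sequence; squares and doubled are names invented for these notes:

```clojure
;; iterate produces an unbounded lazy sequence: 1, 2, 3, ...
;; map over it is lazy too, so nothing is computed yet.
(def squares (map #(* % %) (iterate inc 1)))

;; Only the five requested elements are ever realised.
(take 5 squares)                 ; => (1 4 9 16 25)

;; doall forces a (finite!) lazy sequence to be realised now.
(def doubled (doall (map #(* 2 %) [1 2 3])))
doubled                          ; => (2 4 6)
```

Do not call count or doall on squares itself: being unbounded, it would never finish.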

12: Destructuring

_ is just convention for saying “I don’t really care about this value”
Sequential destructuring can be applied to anything that is seqable, even strings
:keys is special syntax for associative destructuring that lets the compiler know that we are going to use the names of the map’s keywords for our local bindings
Destructuring is purely for function parameter lists and let, you cannot use it directly after a def
Use :as to retain access to the original map passed into a function
Use :or to set sensible defaults
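The destructuring options in this chapter can be combined. In this sketch, describe-book and the book data are hypothetical:

```clojure
;; Sequential destructuring works on anything seqable, even strings.
(let [[first-char _ third-char] "abc"]
  [first-char third-char])          ; => [\a \c]

;; :keys names the locals, :or supplies defaults, :as keeps the whole map.
(defn describe-book
  [{:keys [title author year] :or {year "unknown"} :as book}]
  (str title " by " author " (" year "), "
       (count book) " keys supplied"))

(describe-book {:title "Emma" :author "Austen"})
;; => "Emma by Austen (unknown), 2 keys supplied"
```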

13: Records and Protocols

Records can be thought of as opinionated maps
Records come with two factory functions for constructing instances:
- ->RecordName - takes a list of arguments
- map->RecordName - takes a map of arguments (with the correct keywords)
You can use any map functions on records, but be careful - some operations (e.g., dissoc-ing one of the record's declared fields) will result in your record being changed into a plain map (e.g., a clojure.lang.PersistentArrayMap)
Records are faster to lookup than maps
Two records are equal if the values of their fields are equal (structural equality)
this is used, by convention and à la Java, as the name of the parameter that holds the current instance of the record
The instance (this) is always the first parameter of a protocol method, followed by any other parameters the method takes
Use extend-protocol to implement a new protocol on any existing type… if you want
Analogies with OOP:
- Records: classes and objects
- Protocols: interfaces
- Note the lack of inheritance, though!
reify lets you create a partial implementation of a protocol (likely for testing) that will work without implementing all of the protocol methods
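The pieces of this chapter fit together roughly as follows. This is only a sketch: the Describable protocol, the Book record and their methods are hypothetical names, not the book's own example.

```clojure
(defprotocol Describable
  (describe [this] "Return a short description.")
  (retitle [this new-title] "Return a copy with a new title."))

(defrecord Book [title author]
  Describable
  (describe [this] (str title " by " author))
  (retitle [this new-title] (assoc this :title new-title)))

;; The two generated factory functions.
(def emma      (->Book "Emma" "Austen"))
(def also-emma (map->Book {:title "Emma" :author "Austen"}))

(= emma also-emma)                        ; => true (structural equality)
(describe (retitle emma "Persuasion"))    ; => "Persuasion by Austen"

;; Adding a key keeps the record; removing a declared field does not.
(record? (assoc emma :year 1815))         ; => true
(record? (dissoc emma :author))           ; => false

;; Implement (part of) the protocol on an existing type.
(extend-protocol Describable
  String
  (describe [this] (str "the string " this)))

(describe "hello")                        ; => "the string hello"
```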

14: Tests

Use clojure.test/run-tests to run all tests in a namespace from the REPL
Require org.clojure/test.check (external dependency) for property-based testing
Generators are cool
Use clojure.test/are for parametrised testing
At least one test per namespace isn’t a bad idea, just to ensure it actually exists and there are no egregious errors in it
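A minimal clojure.test sketch showing is, are and run-tests together; addition-test is a made-up test:

```clojure
(require '[clojure.test :refer [deftest is are run-tests]])

(deftest addition-test
  ;; A single assertion.
  (is (= 4 (+ 2 2)))
  ;; Parametrised assertions: each row fills in x, y and total.
  (are [x y total] (= total (+ x y))
    1  1 2
    2  3 5
    -1 1 0))

;; Runs every test in the current namespace and prints a summary.
(run-tests)
```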

15: Spec

Conceptually, you can think of clojure.spec as doing for Clojure data what regular expressions do for strings
spec/or requires an even number of forms - keywords are required to help provide coherent failure messages
Use :req for required keys in a map and :opt for optional
Add the -un suffix to either if the keys in the map to validate are non-namespace-qualified (i.e., just regular maps with keywords for keys)
- Without -un, for a map to be valid, the keys listed would need to be fully qualified
- The list of keys provided to spec/def need to be fully qualified when using :req-un because the resulting spec will be associated with whatever namespace from which spec/def is called
Don’t use fspec/instrument in production - they will slow things down a lot
clojure.spec.test.alpha/check provides a powerful way to do generative testing
- You can use it just during the testing phase while leaving instrumentation off in production
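A small sketch of the spec features noted above, assuming Clojure 1.9 or later (where clojure.spec.alpha ships with the language). The ::title, ::year, ::book and ::id specs are hypothetical:

```clojure
(require '[clojure.spec.alpha :as s])

(s/def ::title string?)
(s/def ::year int?)

;; :req-un / :opt-un: the specs are namespace-qualified,
;; but the map being checked uses plain :title and :year keys.
(s/def ::book (s/keys :req-un [::title] :opt-un [::year]))

(s/valid? ::book {:title "Emma"})               ; => true
(s/valid? ::book {:title "Emma" :year "1815"})  ; => false (year is a string)
(s/valid? ::book {:year 1815})                  ; => false (title missing)

;; s/or pairs each alternative with a keyword naming it.
(s/def ::id (s/or :numeric int? :named string?))
(s/conform ::id 42)                             ; => [:numeric 42]
```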

16: Interoperating with Java

Java packages are analogous to Clojure namespaces
(ClassName. arg1 arg2) to call a class’ constructor
(.getWhatever my-object) to invoke Java methods on Java objects
(.-fieldName my-object) to access an object’s field directly
There is no need to import classes from java.lang - they are imported automatically in every namespace
Use Clojure-y syntax to call static Java methods: (File/createTempFile "my-file" ".txt")
Methods and static methods can be invoked like functions in Clojure code, but they are not functions
- Things you cannot do with Java methods:
  - Bind an arbitrary symbol to one
  - Pass one in as an argument to a function
- They are special forms
Java objects are, unless clearly stated otherwise, mutable
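The interop forms above look like this at the REPL. The file name is made up, and constructing a File object does not create anything on disk:

```clojure
(import 'java.io.File)

;; Constructor call.
(def notes-file (File. "notes.txt"))

;; Instance method call.
(.getName notes-file)                 ; => "notes.txt"

;; Static method call (java.lang.Math needs no import).
(Math/abs -3)                         ; => 3

;; A method is not a function, so wrap it in one to pass it around.
(map #(.toUpperCase %) ["a" "b"])     ; => ("A" "B")
```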

17: Threads, Promises and Futures

By default on the JVM, the main thread runs your code
Java’s Thread class accepts anything that implements the Runnable interface (Clojure functions do)
Threads are independent engines of computation - we don’t know how long one will take relative to another
Race conditions can occur any time you have two or more threads making changes to a shared resource
Threads keep dynamic var bindings in thread-local storage, so even though dynamic vars are mutable, the value set within a binding (within one thread) is not accessible from another thread
Spinning up a new thread with Java interop is a fine way to get some work done in the background, but if you want to actually get the result of an operation carried out on a separate thread, use a promise
Dereferencing a promise that has not yet been delivered will cause the calling thread (e.g., the main thread) to wait until there is a result
Think of a future as a promise that brings along its own thread
There is no need to deliver a result to a future, Clojure does this for you in a separate thread
Dereferencing a future, like a promise, also blocks the calling thread until the result is ready
If you want to get some background processing done, use a future, not a promise - futures take care of spinning up and down new threads themselves
It’s generally better to use a higher level abstraction like future when explicitly using separate threads for work, but if you need something lower level, make use of thread pools in java.util.concurrent.Executors
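A minimal sketch contrasting a promise (delivered by hand from a thread we start ourselves) with a future (which runs its body on a thread of its own). The var names are hypothetical:

```clojure
;; A promise: someone else has to deliver the value.
(def delivered-answer (promise))

(.start (Thread. (fn [] (deliver delivered-answer 42))))

@delivered-answer       ; blocks until delivered, => 42

;; A future: Clojure runs the body on another thread and
;; delivers the result for us.
(def background-total (future (reduce + (range 1000))))

@background-total       ; blocks until finished, => 499500
```

In a script (as opposed to the REPL), calling (shutdown-agents) at the end lets the JVM exit promptly, since futures run on a pooled thread that otherwise lingers for a while.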

18: State

Use atoms to manage mutable state
swap! is completely thread-safe!
swap!
- Gets the current value of the atom
- Calls the passed-in function on the value
- Checks that the value in the atom is the same as when it started the process
- If it is the same, store and return the updated value; if it is not the same, start the process again with the latest value of the atom
swap! executes entirely on the thread that calls it
refs are the grownup version of atoms
If you need to manage the state of multiple somewhat related values, a ref might be a good option
We can use dosync to wrap updates (using alter) to a ref in a database-like transaction
Atom-updating functions are a bad place to add side-effecting code - depending on what other threads are trying to do, the function might be called multiple times before an update succeeds (the same applies to refs)
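A short sketch of atoms and refs; visit-count, checking and savings are hypothetical names and the amounts are arbitrary:

```clojure
;; An atom holds one independent piece of state.
(def visit-count (atom 0))

(swap! visit-count inc)       ; => 1
(swap! visit-count + 10)      ; => 11 (extra args are passed to the fn)
@visit-count                  ; => 11

;; Refs coordinate changes to several related values.
(def checking (ref 100))
(def savings  (ref 0))

;; Both alters happen together or not at all.
(dosync
  (alter checking - 30)
  (alter savings + 30))

[@checking @savings]          ; => [70 30]
```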
Updates to an agent happen asynchronously:
- send will return immediately
- The function used to update the agent will be queued along with any other functions that are trying to update it
- Your function will actually update the agent sometime in the future
- Agent-updating functions are called exactly once
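A minimal agent sketch; event-log is a hypothetical name. await is used here only so that the result can be inspected straight away:

```clojure
(def event-log (agent []))

;; send returns immediately; the updates are queued.
(send event-log conj :started)
(send event-log conj :finished)

;; Block until the actions sent so far from this thread have run.
(await event-log)

@event-log       ; => [:started :finished]
```

As with futures, a script should call (shutdown-agents) when it is done so the JVM can exit promptly.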
It is also possible to wrap multiple sends in a dosync to create a transaction
When in doubt, reach for an atom over the other two options
If an update to an atom or ref fails, an exception is thrown; if an update to an agent fails, nothing happens on your current thread - the exception is thrown by the thread that takes care of updating the agent asynchronously, and will only be seen on the main thread the next time you try to update the agent

19: Read and Eval

Code is data
Clojure code is represented by Clojure data structures - this property is homoiconicity
read turns characters into data structures, eval takes data structures and evaluates them as code
Use meta to find a function’s metadata
There is probably no need to ever use eval in the wild
eval does not know about local let bindings
Only use read if you trust the data; use clojure.edn/read for dodgy data
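The read and eval steps can be tried separately. This sketch uses read-string and clojure.edn/read-string, the string-based relatives of read and clojure.edn/read, so that it is self-contained; the strings are made-up examples:

```clojure
(require '[clojure.edn :as edn])

;; Reading turns text into ordinary data: here, a list.
(def form (read-string "(+ 1 2)"))

(list? form)        ; => true
(first form)        ; => + (a symbol, not yet a function)

;; eval treats that data as code and runs it.
(eval form)         ; => 3

;; For untrusted input, the edn reader only ever produces data.
(edn/read-string "{:title \"Emma\"}")   ; => {:title "Emma"}
```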

20: Macros

Remember that the first thing a Clojure function does is evaluate its arguments (this can include side effects!)
Clojure code is just Clojure data
Macros are just functions that are part of the compilation process
Macros are applied to the code, not the runtime data
Macros make themselves felt at two distinct times:
- When the macro is being expanded - i.e., macro is writing some code
- When the code that the macro wrote is executed
Do not expect to see any mention of a macro’s name in stack traces - just an auto-generated function name that the macro will have expanded to
Use # as a suffix for symbols if worried about name collisions (this will generate a unique name at expand-time)
Use macroexpand-1 to check what a macro plus its arguments will expand to
Do not try to use macros as ordinary functions (or when an ordinary function would do)!
Using an ordinary function should be your default choice!
- Use a macro only after repeating yourself a lot; or
- When bumping up against Clojure’s evaluation rules (à la the arithmetic-if example)
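As a sketch in the spirit of the arithmetic-if example mentioned above (this version is written for these notes and may differ from the book's), a macro can choose which branch to evaluate, something an ordinary function cannot do because its arguments are all evaluated first:

```clojure
(defmacro arithmetic-if
  "Evaluates exactly one of pos, zero or neg, depending on the sign of n."
  [n pos zero neg]
  ;; value# becomes a unique symbol at expansion time,
  ;; so it cannot collide with names in the caller's code.
  `(let [value# ~n]
     (cond
       (pos? value#)  ~pos
       (zero? value#) ~zero
       :else          ~neg)))

(arithmetic-if 5 :positive :zero :negative)     ; => :positive

;; Only the chosen branch runs, so the throws below never fire.
(arithmetic-if 0
  (throw (Exception. "not evaluated"))
  :zero
  (throw (Exception. "not evaluated")))          ; => :zero

;; Inspect the code the macro writes.
(macroexpand-1 '(arithmetic-if 5 :positive :zero :negative))
```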

✎ Blair's Notes
