Notes from Russ Olsen’s Getting Clojure.

1: Hello, Clojure

  • Convention: use ;; for comments that take up a whole line, use ; for comments on the same line as code
  • Prefix/Polish notation - i.e., (verb argument argument argument …)
  • Division (/) can produce either:
    • java.lang.Long (integer value returned); or
    • clojure.lang.Ratio (precise, non-integer value returned)
  • Use quot to get Java-like (truncated) integer division
  • Numeric type promotions: adding an integer (java.lang.Long) and floating point number (java.lang.Double) returns a floating point number
  • Compilation is single-pass, top to bottom of a namespace
    • The declare macro gets around the top-to-bottom characteristic of compilation but it is considered unconventional
  • def and defn draw from the same well of names (defn uses def under the hood) - i.e., constants and functions cannot share the same symbol
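
A minimal sketch of the arithmetic and compilation points above. The names exact, inexact, truncated, promoted, entry-point and helper are made up for illustration and are not from the book.

```clojure
;; Dividing two integers gives a Long when the result is exact...
(def exact (/ 8 2))         ; 4
;; ...and a Ratio when it is not, so no precision is lost.
(def inexact (/ 8 3))       ; 8/3
;; quot truncates, like integer division in Java.
(def truncated (quot 8 3))  ; 2
;; Mixing a Long with a Double promotes the result to a Double.
(def promoted (+ 1 2.5))    ; 3.5

;; declare lets entry-point refer to helper before helper is defined.
(declare helper)
(defn entry-point [x] (helper x))
(defn helper [x] (* 2 x))
```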

2: Vectors and Lists

  • Collections can be type-heterogeneous
  • It is possible to call vectors like functions to perform an index lookup (e.g., (my-vector 3)) - this is analogous to Python’s square bracket syntax (e.g., my_vector[3])
  • As data structures are immutable, functions like rest never modify the original collection - they always return a new value (rest returns a sequence rather than a vector)
  • conj
    • Short for conjoin
    • Adds to the end of a vector, adds to the beginning of a list
    • Returns the same type of collection that it was called with (a vector for a vector, a list for a list)
  • cons
    • Short for construction
    • Always adds to the beginning of a collection
    • Always returns a sequence
  • Empty lists do not need a ' prefix (there is no ambiguity with a function call)
  • Vectors are analogous to arrays, lists are analogous to (and implemented as) linked lists
  • Vectors are, generally speaking, faster when getting an item at an index
  • Lists are, generally speaking, faster and less memory-disruptive when adding to
  • Lists are implemented as three-memory-slot linked lists:
    • The value of the item
    • A pointer to the next item
    • A count of the number of items in the list - this enables count to execute quickly rather than traversing the entire list
  • Under the hood Clojure chunks vectors and stores them in memory as shallow trees - this means that adding to a vector doesn’t involve copying everything each time
  • Immutable = persistent in Clojure parlance when speaking about data structures (“persistent” here has nothing to do with database storage)
  • Vectors are more common than lists in the wild
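
The difference between conj and cons is easiest to see side by side. This is only a sketch; the var names are illustrative.

```clojure
(def v [1 2 3])
(def l '(1 2 3))

(def v-conj (conj v 0))   ; [1 2 3 0] - appended, still a vector
(def l-conj (conj l 0))   ; (0 1 2 3) - prepended, still a list
(def v-cons (cons 0 v))   ; (0 1 2 3) - always prepends, always a sequence

(def third-item (v 2))    ; 3 - calling the vector performs an index lookup
(def tail (rest v))       ; (2 3) - a sequence; v itself is unchanged
```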

3: Maps, Keywords, and Sets

  • Like vectors, maps can be called as functions to perform lookups, with the desired key passed as an argument
  • Keywords are interned strings - i.e., only one copy of each distinct string value is stored in memory
  • Vectors can be thought of as maps where each key is just an index
    • Some functions that are usually thought of as relating to maps can also be used on vectors (e.g., assoc but not dissoc)
    • assoc can only be used to replace a value at an index of a vector or append to a vector by passing in the value of the last element’s index plus one; you cannot assoc at an arbitrary index
    • Generally speaking, if you are using assoc for a vector you should be using a different data structure, though
  • sorted-map can be used to guarantee the order of a map’s keys - creates a PersistentTreeMap under the hood
  • Calling a set as a function to test for set membership is absolutely fine, but this is not advisable for maps - a nil returned might indicate the absence of the key, OR it might be the value stored under the key you just looked up
  • Standard sequence functions (e.g., first, rest, count) also work on maps and treat the map as a sequence of two-element [key value] vectors
    • E.g., calling first on a map would return [:key value]
    • But bear in mind that most maps are unordered!
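
A short sketch of the lookup behaviour described above, including the nil ambiguity and the limits of assoc on vectors. The book and genres data are invented for the example; contains? and the three-argument get are standard clojure.core functions that sidestep the ambiguity.

```clojure
(def book {:title "Emma" :subtitle nil})

(def title (book :title))                   ; "Emma"
(def subtitle (book :subtitle))             ; nil - key present, value is nil
(def author (book :author))                 ; nil - key absent
(def has-subtitle (contains? book :subtitle)) ; true
(def has-author (contains? book :author))     ; false
(def author-or-default (get book :author :unknown)) ; :unknown

(def genres #{:sci-fi :romance})
(def found (genres :romance))               ; :romance
(def missing (genres :horror))              ; nil

;; assoc on a vector: replace at an index, or append at index = count.
(def replaced (assoc [:a :b :c] 1 :z))      ; [:a :z :c]
(def appended (assoc [:a :b :c] 3 :d))      ; [:a :b :c :d]
;; (assoc [:a :b :c] 5 :x) throws IndexOutOfBoundsException

;; A sorted map gives a predictable first entry.
(def first-entry (first (sorted-map :b 2 :a 1))) ; [:a 1]
```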

4: Logic

  • The = function is built on the idea of structural equality - two things are equal if their values are equal
  • and and or do short-circuit evaluation
  • Only false and nil are falsy, everything else is truthy
  • do and when are useful for bundling side-effecting expressions together
  • Using :else or :default as the final “predicate” in a cond expression isn’t special syntax, it is just idiomatic: keywords are truthy, so this will always evaluate to true and execute the last expression
  • case is a more opinionated version of cond, analogous to Java’s switch-case statement
  • catch forms within a try can return a value but finally will not
  • and does not always return a boolean!
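
The truthiness rules above can be checked directly. This is a small sketch; describe-number is a hypothetical function name.

```clojure
;; and returns the last value if all are truthy, else the first falsy one.
(def all-truthy (and 1 2 3))            ; 3
(def short-circuited (and 1 nil 3))     ; nil
;; or returns the first truthy value.
(def fallback (or nil false :fallback)) ; :fallback

;; Only false and nil are falsy, so 0 and "" count as truthy.
(def zero-is-truthy (if 0 :truthy :falsy)) ; :truthy

;; :else is just a truthy keyword in the final predicate position.
(defn describe-number [n]
  (cond
    (neg? n)  "negative"
    (zero? n) "zero"
    :else     "positive"))

;; The catch form supplies the value of the try; finally runs but its
;; value is discarded.
(def try-result
  (try
    (/ 1 0)
    (catch ArithmeticException _ :recovered)
    (finally (println "cleanup runs, value ignored"))))
```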

5: More Capable Functions

  • Arguments bound to the symbol following & in a function’s arguments will be in a collection
  • We can create a defmethod with :default to handle all other cases - note that unlike the last predicate in a cond expression where the use of :default (or :else) is just convention (when any truthy value would do), here :default is a special keyword that is required
  • We can use Clojure’s loop-recur construct to write explicitly tail-recursive functions - this means that we can use recursion to process arbitrarily long collections without worrying about stack overflows
    • This is a Clojure construct - the JVM does not perform tail-call optimisation
    • This can simply be thought of as a loop - each stack frame returns and is passed to the next (and can then be discarded from stack memory)
  • Use :pre and :post to raise a runtime error if the arguments to a function or its return value are not what you expect - the checking expression(s) can be anything; any falsy result will cause a java.lang.AssertionError
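
The four features in this chapter can be sketched in a few lines. All function names here (count-extras, area, sum-to, safe-sqrt) are hypothetical and chosen only for illustration.

```clojure
;; Everything after & is collected into a sequence.
(defn count-extras [required & others]
  (count others))

;; A multimethod dispatching on :shape, with a :default catch-all.
(defmulti area :shape)
(defmethod area :square [{:keys [side]}] (* side side))
(defmethod area :rectangle [{:keys [width height]}] (* width height))
(defmethod area :default [_] :unknown-shape)

;; loop/recur reuses the same stack frame, so large inputs are safe.
(defn sum-to [n]
  (loop [i n
         acc 0]
    (if (zero? i)
      acc
      (recur (dec i) (+ acc i)))))

;; :pre checks the arguments, :post checks the return value (bound to %).
(defn safe-sqrt [x]
  {:pre  [(number? x) (not (neg? x))]
   :post [(not (neg? %))]}
  (Math/sqrt x))
```

Calling (safe-sqrt -1) fails the second :pre condition and throws a java.lang.AssertionError before Math/sqrt is ever called.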

6: Functional Things

  • Functions are values that can be passed around like strings or longs
  • A function (anonymous or otherwise) captures the local bindings that were in scope where it was defined - such a function is known as a closure
  • apply unwraps a sequence and applies the provided function to the arguments as if they were provided as individual arguments
    • (apply + [1 2 3]) expands to (+ 1 2 3)
  • partial takes a function and some “default” arguments to that function and returns a function
    • The returned function can then be called with the missing argument(s)
  • A lambda (i.e., an anonymous function) can be represented two ways in Clojure:
    • A function literal - i.e., #()
    • Using fn
  • The functional programmer’s Prime Directive: try to write functions that don’t care about the context in which they are called
  • Life is a lot easier without side effects - try to write pure functions when you can
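
A brief sketch of apply, partial, the two lambda forms and a closure. The names add-ten, double-fn, double-lit and make-adder are invented for the example.

```clojure
;; apply spreads the collection out as individual arguments.
(def total (apply + [1 2 3]))        ; 6, same as (+ 1 2 3)

;; partial pre-fills the leading argument(s) and returns a new function.
(def add-ten (partial + 10))

;; Two equivalent ways to write an anonymous function.
(def double-fn  (fn [x] (* 2 x)))
(def double-lit #(* 2 %))

;; The returned function closes over n and remembers it after
;; make-adder has returned.
(defn make-adder [n]
  (fn [x] (+ x n)))
(def add-five (make-adder 5))
```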

7: Let

  • let and fn can be used together to build function-returning functions that come with values pre-set - the scope of the let form continues even after the (anonymous) function has been returned
  • when-let binds the result of an expression to a name and evaluates its body if that result is truthy; otherwise it returns nil
  • A let form is an example of a local binding in the context of lexical scope

8: Def, Symbols and Vars

  • def binds a symbol to a value
  • defn is just def plus fn
  • In some sense, symbols are just values in Clojure - very much like keywords except keywords always evaluate to themselves, whereas a symbol generally evaluates to some other value
    • Except when you use a single quote to prevent evaluation and get the symbol itself - e.g., (str 'my-symbol) returns "my-symbol"
  • A var is created when using def; it represents the binding between symbol and value
    • Access the var itself (rather than its value) using #' - e.g., (str #'my-symbol) returns "#'user/my-symbol" in the user namespace
    • Vars come fully qualified with the namespace - e.g., #'my-namespace/my-symbol
  • Generally speaking, leave your vars alone once they have been defed
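  • binding can be used like let to temporarily change the value of a ^:dynamic var - the new value is visible to everything called within the binding form on the current thread (dynamic rather than lexical scope), and the old value is restored on exit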
    • Convention states that dynamic vars should begin and end with * (“earmuffs”) - e.g., (def ^:dynamic *my-dynamic-var* "initial value")
  • let does not actually create vars, even within its scope
  • def your constants and defn your functions and then LEAVE THEM ALONE
  • Use set! to update a dynamic var within a binding
  • Inbuilt dynamic vars for the REPL:
    • *1 - gets the result of the last expression
    • *2 - gets the one before that, etc.
    • *e - gets the last exception
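
The relationship between symbols, vars and dynamic binding can be sketched as follows. answer, *log-prefix* and log-line are made-up names.

```clojure
(def answer 42)

(def symbol-name (str 'answer))      ; "answer" - the quoted symbol itself
(def is-var (var? #'answer))         ; true - #' yields the var, not 42
(def var-value (deref #'answer))     ; 42

(def ^:dynamic *log-prefix* "INFO")
(defn log-line [msg] (str *log-prefix* ": " msg))

(def normal (log-line "started"))    ; "INFO: started"

;; log-line sees the rebound value even though it is defined elsewhere.
(def rebound
  (binding [*log-prefix* "DEBUG"]
    (log-line "details")))           ; "DEBUG: details"

;; set! only works on a dynamic var while it is inside a binding.
(def updated
  (binding [*log-prefix* "WARN"]
    (set! *log-prefix* "ERROR")
    (log-line "boom")))              ; "ERROR: boom"

(def restored (log-line "done"))     ; "INFO: done"
```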

9: Namespaces

  • Conceptually, a namespace can be thought of as a lookup table of vars, indexed by their symbols
  • In the REPL, Clojure boots us into the default namespace, user
  • *ns* is the inbuilt dynamic var that keeps track of the current namespace
    • E.g., *ns* => #object[clojure.lang.Namespace 0x504aa3c9 "user"]
  • (ns …) in the REPL will create a new namespace and set the name passed as the current namespace
    • If you pass the name of an existing namespace to ns, it will take you into that namespace, rather than creating a new one
  • When you use refer for accessing code from another namespace, you are essentially importing all of the vars from that namespace into yours - proceed with caution
  • Use an extra colon to create namespace-qualified keywords (good for avoiding collisions)
  • Any environment that is running Clojure code will essentially require everything from the clojure.core namespace on boot - this is why we don’t have to do anything to access fundamental Clojure functions
  • As far as Clojure is concerned, namespace names are arbitrary
    • E.g., clojure.core.data has no relationship to and does not belong to clojure.core
  • Use :reload within require when REPLing to avoid having to go back to your editor to reload a namespace
    • defonce defines a var exactly once, regardless of reload (useful for side effects)

10: Sequences

  • Under the hood sequences are implemented using an adapter pattern
  • All Clojure collections can be turned into a sequence via seq; sequences implement the ISeq interface, which lets the same sequence functions work on any collection while each collection supplies its own implementation under the hood
  • (seq a-collection) is an idiomatic way to determine if a collection is empty - seq returns nil (falsy) if a-collection is empty
  • rest, next and cons always return sequences
  • Seqable describes anything that can be turned into a sequence
  • Think of reduce as a way of getting a single value out of a collection - it doesn’t have to be used to combine the elements in a standard way
  • While sequence functions are incredibly useful and powerful, sometimes you might want to retain the original type of collection that you started with
    • conj is a good example of a collection function that returns the same type that you pass in

11: Lazy Sequences

  • The three chief virtues of a programmer are laziness, impatience and hubris. - Larry Wall

  • Sequences are an abstraction in Clojure; we should make no assumptions about the underlying data type, only that the functions you use will do what they say they will do
  • A lazy sequence is one that waits to be asked before it generates its elements
  • An unbounded sequence is a lazy sequence that, in theory, can go on for ever
  • All unbounded sequences are lazy, but not all lazy sequences are unbounded
  • doall and doseq get rid of laziness and force results to be computed now - i.e., they are eager
  • count and sort are eager (they have to see every element), whereas filter, like map, returns a lazy sequence
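
Laziness can be observed by counting how often the mapping function is called. This is a sketch with invented names; it uses iterate as an unbounded source.

```clojure
(def calls (atom 0))

;; An unbounded lazy sequence of squares; defining it computes nothing.
(def squares
  (map (fn [n] (swap! calls inc) (* n n))
       (iterate inc 1)))

(def calls-before @calls)                    ; 0

;; doall forces the three elements that take asks for.
(def first-three (doall (take 3 squares)))   ; (1 4 9)

;; filter is lazy too, so it can sit on top of an unbounded sequence.
(def first-evens (doall (take 3 (filter even? (iterate inc 0))))) ; (0 2 4)
```

Calling count or sort on squares would never return, because both need to see every element.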

12: Destructuring

  • _ is just convention for saying “I don’t really care about this value”
  • Sequential destructuring can be applied to anything that is seqable, even strings
  • :keys is special syntax for associative destructuring that lets the compiler know that we are going to use the names of the map’s keywords for our local bindings
  • Destructuring is purely for function parameter lists and let, you cannot use it directly after a def
  • Use :as to retain access to the original map passed into a function
  • Use :or to set sensible defaults
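
The destructuring options above combine naturally in one parameter list. The function summarize and its data are hypothetical.

```clojure
;; :keys pulls out values by keyword name, :or supplies a default for a
;; missing key, and :as keeps a handle on the whole map.
(defn summarize [{:keys [title year] :or {year :unknown} :as book}]
  {:title title
   :year year
   :field-count (count book)})

;; Sequential destructuring works on anything seqable, including strings;
;; _ marks a position we do not care about.
(def first-and-third
  (let [[a _ c] "xyz"]
    [a c]))                                   ; [\x \z]

;; & collects whatever is left over.
(def head-and-tail
  (let [[head & tail] [1 2 3]]
    {:head head :tail tail}))                 ; {:head 1, :tail (2 3)}
```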

13: Records and Protocols

  • Records can be thought of as opinionated maps
  • Records come with two factory functions for constructing instances:
    • ->RecordName - takes a list of arguments
    • map->RecordName - takes a map of arguments (with the correct keywords)
  • You can use any map functions on records, but be careful - some operations (e.g., dissoc on one of the record’s fields) will result in changing your record into a plain map (e.g., a clojure.lang.PersistentArrayMap)
  • Records are faster to lookup than maps
  • Two records are equal if the values of their fields are equal (structural equality)
  • this is used à la Java to represent the current instance (it is a naming convention, not a keyword)
  • this needs to be the first parameter in protocol methods (for methods that take more than one parameter)
  • Use extend-protocol to implement a new protocol on any existing type… if you want
  • Analogies with OOP:
    • Records: classes and objects
    • Protocols: interfaces
    • Note the lack of inheritance, though!
  • reify lets you create a partial implementation of a protocol (likely for testing) that will work without implementing all of the protocol methods

14: Tests

  • Use clojure.test/run-tests to run all tests in a namespace from the REPL
  • Require org.clojure/test.check (external dependency) for property-based testing
  • Generators are cool
  • Use clojure.test/are for parametrised testing
  • At least one test per namespace isn’t a bad idea, just to ensure it actually exists and there are no egregious errors in it
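
A minimal clojure.test sketch showing is, are and run-tests together. The function add and the test name are hypothetical.

```clojure
(require '[clojure.test :refer [deftest is are run-tests]])

(defn add [a b] (+ a b))

(deftest add-test
  ;; A single assertion.
  (is (= 4 (add 2 2)))
  ;; Parametrised assertions: each row fills in a, b and expected.
  (are [a b expected] (= expected (add a b))
    1  1 2
    0  5 5
    -1 1 0))

;; With no arguments, run-tests runs the tests in the current namespace
;; and returns a summary map with keys such as :pass, :fail and :error.
(def results (run-tests))
```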

15: Spec

  • Conceptually, you can think of clojure.spec as a convenience wrapper around regex
  • spec/or requires an even number of forms - keywords are required to help provide coherent failure messages
  • Use :req for required keys in a map and :opt for optional
  • Add the -un suffix to either if the keys in the map to validate are non-namespace-qualified (i.e., just regular maps with keywords for keys)
    • Without -un, for a map to be valid, the keys listed would need to be fully qualified
    • The list of keys provided to spec/def need to be fully qualified when using :req-un because the resulting spec will be associated with whatever namespace from which spec/def is called
  • Don’t use fspec/instrument in production - they will slow things down a lot
  • clojure.spec.test.alpha/check provides a powerful way to do generative testing
    • You can use it just during the testing phase while leaving instrumentation off in production

16: Interoperating with Java

  • Java packages are analogous to Clojure namespaces
  • (ClassName. arg1 arg2) to call a class’ constructor
  • (.getWhatever my-object) to invoke Java methods on Java objects
  • (.-fieldName my-object) to access an object’s field directly
  • There is no need to import java.lang - its classes are imported automatically into every namespace
  • Use Clojure-y syntax to call static Java functions: (File/createTempFile "my-file" ".txt")
  • Methods and static methods can be invoked like functions in Clojure code, but they are not functions
    • Things you cannot do with Java methods:
      • Bind an arbitrary symbol to one
      • Pass one in as an argument to a function
    • They are special forms
  • Java objects are, unless clearly stated otherwise, mutable

17: Threads, Promises and Futures

  • By default on the JVM, the main thread runs your code
  • Java’s Thread class accepts anything that implements the Runnable interface (Clojure functions do)
  • Threads are independent engines of computation - we don’t know how long one will take relative to another
  • Race conditions can occur any time you have two or more threads making changes to a shared resource
  • Threads keep dynamic vars in thread-local storage, so even though they are mutable, the value set within a binding (within a thread) is not accessible from another thread
  • Spinning up a new thread with Java interop is a fine way to get some work done in the background, but if you want to actually get the result of an operation carried out on a separate thread, use a promise
  • derefing a promise that has not yet been delivered to will cause the calling thread (often the main thread) to wait until there is a result
  • Think of a future as a promise that brings along its own thread
  • There is no need to deliver a result to a future, Clojure does this for you in a separate thread
  • Deref-ing a future, like a promise, also blocks the calling thread until the result is ready
  • If you want to get some background processing done, use a future, not a promise - futures take care of spinning up and down new threads themselves
  • It’s generally better to use a higher level abstraction like future when explicitly using separate threads for work, but if you need something lower level, make use of thread pools in java.util.concurrent.Executors

18: State

  • Use atoms to manage mutable state
  • swap! is completely thread-safe!
  • swap!
    • Gets the current value of the atom
    • Calls the passed-in function on the value
    • Checks that the value in the atom is the same as when it started the process
    • If it is the same, store and return the updated value; if it is not the same, start the process again with the latest value of the atom
  • swap! executes entirely on the thread that calls it
  • refs are the grownup version of atoms
  • If you need to manage the state of multiple somewhat related values, a ref might be a good option
  • We can use dosync to wrap updates (using alter) to a ref in a database-like transaction
  • Atom-updating functions are a bad place to add side-effecting code - depending on what other threads are trying to do, the function might be called multiple times before an update succeeds (the same applies to refs)
  • Updates to an agent happen asynchronously:
    • send will return immediately
    • The function used to update the agent will be queued along with any other functions that are trying to update it
    • Your function will actually update the agent sometime in the future
    • Agent-updating functions are called exactly once
  • It is also possible to wrap multiple sends in a dosync to create a transaction
  • When in doubt, reach for an atom over the other two options
  • If an update to an atom or ref fails, an exception is thrown; if an update to an agent fails, nothing happens on your current thread - the exception is thrown by the thread that takes care of updating the agent asynchronously, and will only be seen on the main thread the next time you try to update the agent

19: Read and Eval

  • Code is data
  • Clojure code is represented by Clojure data structures - this property is homoiconicity
  • read turns characters into data structures, eval takes data structures and evaluates them as code
  • Use meta to find a function’s metadata
  • There is probably no need to ever use eval in the wild
  • eval does not know about local let bindings
  • Only use read if you trust the data; use clojure.edn/read for dodgy data

20: Macros

  • Remember that the first thing a Clojure function does is evaluate its arguments (this can include side effects!)
  • Clojure code is just Clojure data
  • Macros are just functions that are part of the compilation process
  • Macros are applied to the code, not the runtime data
  • Macros make themselves felt at two distinct times:
    • When the macro is being expanded - i.e., the macro is writing some code
    • When the code that the macro wrote is executed
  • Do not expect to see any mention of a macro’s name in stack traces - just an auto-generated function name that the macro will have expanded to
  • Use # as a suffix for symbols inside a syntax-quoted (`) form if worried about name collisions (this will generate a unique name at expand-time)
  • Use macroexpand-1 to check what a macro plus its arguments will expand to
  • Do not try to use macros as ordinary functions (or when an ordinary function would do)!
  • Using an ordinary function should be your default choice!
    • Use a macro only after repeating yourself a lot; or
    • When bumping up against Clojure’s evaluation rules (à la the arithmetic-if example)
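
A sketch of two small macros. These are not the book's arithmetic-if example; unless and timed are hypothetical names chosen for illustration.

```clojure
;; A function would evaluate body before being called; a macro receives
;; the unevaluated code and decides where it goes.
(defmacro unless [test & body]
  `(if ~test nil (do ~@body)))

;; macroexpand-1 shows the code the macro writes.
(def expansion (macroexpand-1 '(unless false :ran))) ; (if false nil (do :ran))

(def side-effects (atom 0))
(unless true (swap! side-effects inc))    ; body is never evaluated
(def ran (unless false :ran))             ; :ran

;; start# and result# become unique generated symbols, so they cannot
;; clash with names used in the caller's expression.
(defmacro timed [expr]
  `(let [start#  (System/nanoTime)
         result# ~expr]
     {:result result#
      :nanos  (- (System/nanoTime) start#)}))

(def timing (timed (+ 1 2)))
```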