Elements of Clojure—Zach Tellman

Book notes from Elements of Clojure by Zach Tellman.

I created a GitHub project for the corresponding “code notes”.

Review

It’s hard to imagine a book like this being written about another language.

Chapter 1: Names

“Names should be narrow and consistent.”

Narrow means that the name cannot represent anything else
Consistent means that the name is congruent with the surrounding code and should not be misunderstood by someone familiar with the codebase
The textual representation of a name is its sign
The thing a name refers to is its referent
How a name is used is its sense
Narrowness does not equal specificity
Describe the purpose of the function, not its implementation
Consider a name’s sense when thinking about referential transparency
The only way to achieve true consistency is to have a one-to-one relationship between signs and senses
Favour synthetic names over natural names to avoid ambiguity
Synthetic names allow experts to communicate without ambiguity
Novices are forced to learn the lexicon if they want to participate—a monad has no sense to a layperson
Natural names allow everyone to reason by analogy: great for quickly grokking a codebase, bad for reducing ambiguity
Choose accordingly!

Naming Data

The relationship between our code and the outside world can be adversarial—we should make invariant checks at the periphery of our code
vars provide indirection by hiding the underlying value; function parameters provide indirection by hiding the implementation of the invoking function
We don’t need to name every intermediate result when transforming data
Consistent code means fewer deep dives to understand a codebase’s core concepts
Being able to skim and quickly understand Clojure code is a function of the language’s syntax and use of immutable data structures (as well as an individual’s experience)

“If a function’s name is more self-explanatory than any name you can think of, it should be an anonymous function.”

Idiomatic Clojure names

Could be anything: x
A sequence of anything: xs
Arbitrary function: f
Sequence of arbitrary functions: fs
Arbitrary map: m
Sequence of arbitrary maps: ms
Self-reference: this
Arguments of the same datatype: [a b c & rst]
Arbitrary expression: form
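
To make the list concrete, here is a minimal sketch of the short names in use. The helper functions (`call-each`, `merge-all`, `largest`) are hypothetical and exist only to show the conventions.

```clojure
;; Hypothetical helpers, invented to show the short names in use.

(defn call-each
  "Calls every function in fs on the arbitrary value x."
  [fs x]
  (mapv (fn [f] (f x)) fs))

(defn merge-all
  "Merges a sequence of arbitrary maps, left to right."
  [ms]
  (reduce merge {} ms))

(defn largest
  "Returns the largest of one or more numbers of the same type."
  [a & rst]
  (reduce max a rst))
```

Because nothing about these functions depends on what the values mean, the generic names are as narrow as the code allows.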

Narrowing

Maps of more narrowly named data, e.g.: class->students, department->classes->student
Tuples of more narrowly named data, e.g.: tutor+student
A sequence of tutor-student tuples could be tutor+students, but this could be confused with a tuple of a tutor and a sequence of students; a synthetic name here can remove the ambiguity
Clearly document synthetic names!
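
A small sketch of the `key->value` and `left+right` conventions, assuming some made-up school data. Every name and value here is hypothetical.

```clojure
;; Hypothetical data: every name and value is invented for illustration.

;; A map from a class to its students, named key->value.
(def class->students
  {"algebra" ["ana" "ben"]
   "biology" ["cal"]})

;; A tuple of a tutor and a student, named left+right.
(def tutor+student ["dee" "ana"])

(defn students-in
  "Looks up the students for class-name, defaulting to an empty vector."
  [class->students class-name]
  (get class->students class-name []))

(defn tutor-of
  "Destructures the tuple; the name tells the reader which slot is which."
  [tutor+student]
  (let [[tutor _student] tutor+student]
    tutor))
```

The names carry the shape of the data, so a reader can destructure a value without looking up where it was built.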

Naming Functions

Our data scope at runtime is any data accessible by our thread
Functions can do three things: pull new data into scope, transform data, push data into a different scope
One function in every process needs to do all three, but most functions should do only one

“Shared mutable state creates asymmetric scopes.”

Functions that cross scope boundaries should have a verb in the name
Functions that pull data from another scope should have the returned type in their name
Functions that push data into another scope should communicate their side effect

If a function only transforms data, we should avoid verbs wherever possible.
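
One way these naming rules might look in practice. The atoms `inbox` and `outbox` stand in for "another scope", and all function names are assumptions made for this sketch.

```clojure
(require '[clojure.string :as str])

;; Hypothetical scopes: atoms stand in for a queue, database, or socket.
(def inbox (atom [{:id 1 :body "hello"}]))
(def outbox (atom []))

;; Pulls data into scope: a verb plus the type of data returned.
(defn get-messages []
  @inbox)

;; Only transforms data: no verb, named for what it returns.
(defn uppercased [message]
  (update message :body str/upper-case))

;; Pushes data into another scope: the verb communicates the side effect.
(defn save-message [message]
  (swap! outbox conj message)
  nil)

;; One function in the process does all three.
(defn run []
  (run! save-message (map uppercased (get-messages))))
```

Only `run` crosses both boundaries; the transformation in the middle can be tested without any state at all.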

Naming Macros

“There are two kinds of macros: those that we understand syntactically, and those that we understand semantically.”

If we are required to understand a macro syntactically, this is a poor form of indirection
Macros that include with, def or let in their name should have predictable macroexpanded forms
It is difficult for a macro to be self-evident—the macro-expanded form and semantics matter more than the name

Chapter 2: Idioms

Inequalities

Favour < and <= for
Infix, prefix
Left or right associative?

Defaults for accumulating functions

Offer every arity if a function accumulates
Lots of Clojure functions come with sensible defaults for their niladic arities, e.g., concat, conj
The niladic/zero-arity variant of a function returns the identity value
Combining the identity value with any other argument leaves the argument unchanged—e.g., 0 for +, 1 for *
Generally only the niladic and dyadic versions of a function will be interesting
Monoids are sets that have an associative dyadic function as well as an identity value (the result of the niladic function)
Monoids are useful for passing to reduce
Do not implement an identity value for a function if an obvious one does not exist—an exception is better than unexpected behaviour
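
A sketch of an accumulating function that offers every arity. `merge-counts` is a hypothetical function written for these notes; the core functions shown first are real.

```clojure
;; The niladic arity of these core functions returns the identity value.
(+)        ; 0
(*)        ; 1
(concat)   ; ()
(conj)     ; []

;; Hypothetical accumulating function: merges maps of counts.
(defn merge-counts
  ([] {})                                ; identity: merging with {} changes nothing
  ([a] a)
  ([a b] (merge-with + a b))
  ([a b & more] (reduce merge-counts (merge-counts a b) more)))

;; Because the niladic arity exists, reduce copes with an empty collection.
(reduce merge-counts [])                     ; {}
(reduce merge-counts [{:x 1} {:x 2 :y 1}])   ; {:x 3, :y 1}
```

With no initial value and an empty collection, `reduce` calls the function with no arguments, which is why the identity arity matters.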

Option maps over named parameters

Using positional parameters and giving them default values means implicitly defining a hierarchy in terms of which parameters are most likely to change
Using an option map means that invocations of a function do not need to change even though the potential arguments have
Non-positional keyword arguments carry the performance overhead of having to construct a hash map each time the function is called (but option maps still carry a performance overhead relative to positional parameters)

In performance-sensitive contexts, we should only use positional parameters.

On the spectrum of positional arguments to non-positional keyword arguments to option maps, choose an option map unless you have a good reason not to (e.g., performance)
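
A minimal comparison of the two styles. The functions `fetch-positional` and `fetch`, and the options `timeout-ms` and `retries`, are hypothetical; neither function does any I/O.

```clojure
;; Hypothetical functions and options, invented for illustration.

;; Positional parameters with defaults: the arities imply a ranking of
;; which parameter callers are most likely to change.
(defn fetch-positional
  ([url] (fetch-positional url 1000))
  ([url timeout-ms] (fetch-positional url timeout-ms 3))
  ([url timeout-ms retries]
   {:url url :timeout-ms timeout-ms :retries retries}))

;; Option map: callers name only what they override, in any order.
(defn fetch
  ([url] (fetch url {}))
  ([url {:keys [timeout-ms retries]
         :or   {timeout-ms 1000 retries 3}}]
   {:url url :timeout-ms timeout-ms :retries retries}))

;; Overriding retries alone needs no placeholder for timeout-ms.
(fetch "https://example.com" {:retries 5})
```

With the positional version, changing only `retries` forces the caller to restate the timeout; with the option map, adding a new option later leaves existing call sites untouched.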

Bindings

No one should have to know you’ve used a binding
Bindings and dynamic vars break referential transparency (as do side effects)
Laziness, generally speaking, relies on referential transparency
Higher-order functions assume referential transparency - where and when a function that is passed in as an argument is called is an implementation detail
Large enough chunked sequences could have all-but-the-first chunks realised outside of a binding - this could lead to some hard-to-diagnose bugs
Consider with-redefs if you need to use a dynamic var
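
The interaction between laziness and bindings can be sketched as follows. The dynamic var `*multiplier*` and the function `scaled` are hypothetical.

```clojure
;; Hypothetical dynamic var and function, invented for illustration.
(def ^:dynamic *multiplier* 1)

(defn scaled [xs]
  (map #(* *multiplier* %) xs))          ; lazy: nothing is computed yet

;; The lazy seq escapes the binding and is realised afterwards,
;; so it sees the root value of *multiplier*.
(def escaped
  (binding [*multiplier* 10]
    (scaled [1 2 3])))

;; Forcing realisation inside the binding gives the intended result.
(def forced
  (binding [*multiplier* 10]
    (doall (scaled [1 2 3]))))

escaped   ; (1 2 3)
forced    ; (10 20 30)
```

The caller of `scaled` has to know about the binding to use it correctly, which is exactly the leak the first bullet warns against.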

Favour atoms for mutable state

A state container’s utilisation is a measure of how often it is in the process of being updated
- Rule of thumb: you will see update retries increase dramatically when utilisation approaches 60%
Clojure’s software transactional memory (STM) implementation (i.e., refs) offers better throughput in pathological cases
- But you are highly unlikely to need this throughput lift, and should favour the simplicity and reliability of atoms
Lazy realisation of updates via alter, commute or ref-set can lead to hard-to-find bugs
Having state defined by multiple refs makes it difficult to get a consistent snapshot of our state at any time - because updates to one ref can happen while we are reading another, we need to wrap our reading of the entire state in its own transaction
STM is useful in write-heavy workloads that can’t be offloaded to a database
You need to have a good reason to not represent mutable state as a single atom
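
A sketch of keeping all mutable state in one atom. The state shape and the function `register-user` are assumptions made for this example.

```clojure
;; Hypothetical application state held in a single atom.
(def app-state (atom {:users {} :request-count 0}))

(defn register-user
  "Adds a user and bumps the counter in one atomic update."
  [state id user-name]
  (swap! state
         (fn [s]
           (-> s
               (assoc-in [:users id] user-name)
               (update :request-count inc)))))

(register-user app-state 1 "ana")

;; One deref yields a consistent snapshot of everything.
(let [{:keys [users request-count]} @app-state]
  (println users request-count))
```

Both fields change together or not at all, and a reader never observes one updated without the other, so no transaction is needed to read the state.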

Communicate side effects consistently

Valid idioms for implying a side effect is occurring:
- Redundant do block
- Leaving whitespace around the side-effecting function
- Binding the return value of the side-effecting function to _ within a let block
All are fine, just use one consistently within a codebase
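
The three idioms might look like this. `log-total` is a hypothetical side-effecting helper, and the exact layout of each idiom is this sketch's interpretation of the bullets above.

```clojure
;; Hypothetical helper with a side effect (printing).
(defn log-total [total]
  (println "total:" total))

;; 1. Redundant do block around the side effect.
(defn sum-with-do [xs]
  (let [total (reduce + xs)]
    (do
      (log-total total))
    total))

;; 2. Blank lines around the side-effecting call.
(defn sum-with-whitespace [xs]
  (let [total (reduce + xs)]

    (log-total total)

    total))

;; 3. Binding the ignored return value to _ inside the let.
(defn sum-with-underscore [xs]
  (let [total (reduce + xs)
        _     (log-total total)]
    total))
```

All three behave identically; the only difference is the visual signal given to the reader.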

Use data structure-specific functions where possible

Just because two ways of doing something are functionally equivalent doesn’t mean they are equal to the reader
If it matters that the data structure is a vector or map or set, use the narrowest possible accessor for said data structure
- Do this hand-in-hand with intentional naming (e.g., for a map: key->value)
- E.g., for a map k->v, (keys k->v) is much better than (map key k->v) - the former tells the reader that its argument must be a map
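
A short sketch contrasting generic and narrow accessors. The data is made up; all functions used are from `clojure.core`.

```clojure
;; Hypothetical data, invented for illustration.
(def k->v {:a 1 :b 2})

;; Equivalent results, different signals to the reader.
(map key k->v)     ; generic sequence code; k->v could be any seq of entries
(keys k->v)        ; map-specific; states that k->v is a map

(map val k->v)     ; generic
(vals k->v)        ; map-specific

;; The same idea for a set of allowed values.
(def allowed #{:read :write})
(some #(= :read %) allowed)   ; generic scan, works on any collection
(contains? allowed :read)     ; set membership
```

The narrow forms let the reader infer the data structure from the call alone, without tracing where the value came from.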

Don’t bother with `letfn`

If you need a let-bound function, just split let and fn for readability and consistency with other special forms that use bindings
Being able to have functions refer to one another within the bindings of letfn (i.e., mutual recursion) is one (and maybe the only) advantage of using letfn
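
A sketch of both forms. The function names are hypothetical, and the mutually recursive pair is a deliberately naive illustration rather than a sensible way to test parity.

```clojure
;; let + fn: reads like any other binding form.
(defn doubled-evens [xs]
  (let [twice (fn [x] (* 2 x))]
    (map twice (filter even? xs))))

;; letfn earns its place when local functions refer to each other.
(defn even-steps? [n]
  (letfn [(my-even? [n] (if (zero? n) true (my-odd? (dec n))))
          (my-odd?  [n] (if (zero? n) false (my-even? (dec n))))]
    (my-even? n)))
```

Inside a plain `let`, `my-even?` could not see `my-odd?` because bindings are only visible to the ones that follow them.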

Don’t obfuscate Java interop

Clojure data structures are understood by their internals
Java data structures are understood by their names
Stay away from anything (e.g., the .. macro) that can make it non-obvious that a Java method is being used on a Java object rather than a Clojure function
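
The same two `java.lang.String` method calls written both ways. The function names are hypothetical.

```clojure
;; Obscured: with the .. macro, trim and toUpperCase look like ordinary symbols.
(defn shouted-obscured [^String s]
  (.. s trim toUpperCase))

;; Explicit: the leading dot marks each step as a method call on a Java object.
(defn shouted-explicit [^String s]
  (.toUpperCase (.trim s)))
```

Both compile to the same calls; the explicit form just keeps the interop visible at every step.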

You don’t always need a transducer; use `for` for cartesian products

Think carefully about sacrificing readability for performance when introducing transducers
for’s best quality is to provide a readable way (without any :when or :let clauses) to generate all the possible combinations of lists - i.e., the cartesian product
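
A sketch of both points, using made-up data. The first pair shows the same pipeline with and without a transducer; the second shows `for` producing a cartesian product.

```clojure
;; Same result; the transducer version avoids intermediate sequences
;; at some cost to readability.
(->> [1 2 3 4] (filter even?) (map inc))              ; (3 5)
(into [] (comp (filter even?) (map inc)) [1 2 3 4])   ; [3 5]

;; Hypothetical data for the cartesian product.
(def ranks [:ace :king])
(def suits [:hearts :spades])

;; The rightmost binding varies fastest.
(def cards
  (for [rank ranks
        suit suits]
    [rank suit]))
;; ([:ace :hearts] [:ace :spades] [:king :hearts] [:king :spades])
```

Expressing the product with nested `mapcat` and `map` calls is possible but hides the shape that `for` makes obvious.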

Nil

Ambiguity makes our code more concise, but unbounded ambiguity makes it impossible to reason about
We must interpret nil at regular intervals throughout our code - a synthetically named keyword might be better at communicating absence than the generic nil
Don’t wrap in when!
- If the result can be coerced to an empty collection (or something else sensible), do that
- Or else throw an error
- Don’t pass back a generic nil to the caller
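
One way to apply these options. The data and the functions `orders-for`, `first-order` and `tutor-for` are hypothetical.

```clojure
;; Hypothetical data and lookups, invented for illustration.
(def customer->orders {"ana" [:order-1 :order-2]})

;; Absence coerces sensibly to an empty collection.
(defn orders-for [customer]
  (get customer->orders customer []))

;; No sensible default exists, so fail loudly rather than return nil.
(defn first-order [customer]
  (or (first (orders-for customer))
      (throw (ex-info "Customer has no orders" {:customer customer}))))

;; A synthetically named keyword says what kind of absence this is.
(defn tutor-for [student->tutor student]
  (get student->tutor student ::no-tutor))
```

In each case the caller receives something with a single interpretation, instead of a `nil` that has to be interpreted again further up the stack.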

Chapter 3: Indirection

Indirection provides separation between what and how
It allows readers of our code to stop reading at some point and be OK with not dissecting the underlying implementation of a function or macro
Two fundamental components of indirection:
- References (each paired with its referent)
  - A name is a lexical reference that is dereferenced at compile time
  - A pointer is a memory reference that is dereferenced at runtime
- Conditionals
Generally speaking, dereferencing happens implicitly - e.g., in passing the symbol (reference) for a sequence to map, the referent is implicitly accessed so that a function can be applied to each element
- This is not the case for Clojure’s stateful constructs (e.g., atoms)
Conditionals are used for deciding functionality based on some input (e.g., input type, input value, input value relative to something else (index))
- This can mean grouping types together to make use of a generic function
- Or splitting behaviour based on type (or otherwise)
References are open - we can change the behaviour of a program by changing a referent
Conditionals are closed - we need to change the underlying implementation of the code to change the program’s behaviour
- Just varying the arguments to a conditional isn’t enough - conditionals are ordered
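
A rough sketch of the distinctions above, using a made-up shipping-cost example. All names and numbers are assumptions.

```clojure
;; Implicit dereference: the symbol xs is resolved for us.
(def xs [1 2 3])
(map inc xs)

;; Stateful constructs must be dereferenced explicitly.
(def counter (atom 0))
(inc @counter)

;; Conditional (closed): a new weight band means editing this function,
;; and swapping the first two clauses would change the result.
(defn shipping-closed [weight-kg]
  (cond
    (< weight-kg 1)  5
    (< weight-kg 10) 12
    :else            30))

;; Reference (open): behaviour changes by passing a different referent.
(defn shipping-open [weight->cost weight-kg]
  (weight->cost weight-kg))

(shipping-open shipping-closed 5)      ; 12
(shipping-open (constantly 0) 5)       ; 0, e.g. a free-shipping promotion
```

The caller of `shipping-open` can change the pricing without touching its implementation, whereas `shipping-closed` can only be changed by rewriting it.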
