functional thinking
applying the philosophy of functional programming to system design & architecture
Jed Wesley-Smith @jedws
please, ask questions
functional programming has many benefits: better program reasonability, composition, refactorability and performance
yet, the dominant models & paradigms for software architecture and building software systems today remain rooted in mutation and side-effects
many of the ideas and principles of functional programming have been applied to solve design problems including security, concurrency, auditing and robustness
it is possible and desirable to apply them to all of the systems we build, and gain practical advantage from doing so
Q. is the universe mutable?
what is change? what about the past? what is now?
x = x + 1 the fundamental absurdity at the heart of programming
x = x + 1
is a low-level execution plan for specific hardware, not a fundamentally important strategy for how we should write programs
it only makes any sense as an execution strategy when associated with the most common Von Neumann style hardware architectures
it makes less sense at any higher level of abstraction, but we learn fairly early on that this is how programming works!
it makes even less sense as a way we should design software in the large
x = x + 1
the biggest problem is that we can only know one value for x: what it is or what it was
the value of x is ephemeral, we forget what it was!
x = x + 1
we all know global mutable variables are to be avoided
unfortunately, many of our common storage systems use exactly the same paradigm
UPDATE person SET phone_no = '+61 2 9876 5432' WHERE person_id = 123
same goes for writing to a file
same goes for most REST interfaces…
pure functional programming, in the large, rejects this approach to programming!
what is functional programming?
programming, with functions!
a function
f : A -> B
relates each value from its domain: A
to exactly one value from its codomain: B
always the same (or equivalent) value
and nothing else!
this is also known as a pure function, because programming defines impure ones too
programming with values!
values
immutable: values cannot change
easily shareable: without concern for concurrent modification
referentially transparent: expressions can be replaced with their computed value
the state of a thing in an instant in time is a value
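a minimal Java sketch of a value, assuming a hypothetical `Person` record and made-up phone numbers (none of these names come from the talk): "changing" a field builds a new value and leaves the old one intact, so both can be shared freely

```java
// Hypothetical illustration of an immutable value; Person and its fields are assumptions.
final class Values {
    // Record components are final: a Person cannot change after construction.
    record Person(String name, String phoneNo) {
        // "Changing" the phone number returns a new value; this one is untouched.
        Person withPhoneNo(String newPhoneNo) {
            return new Person(name, newPhoneNo);
        }
    }

    public static void main(String[] args) {
        Person before = new Person("Ada", "+61 2 9876 5432");
        Person after = before.withPhoneNo("+61 2 5550 0000");
        // Both values remain available; neither was modified in place.
        System.out.println(before);
        System.out.println(after);
    }
}
```

because records compare by content, an expression that builds a `Person` can be replaced by any equal `Person` without changing the program's meaning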
“we invented mutable values, we must uninvent them” –Rich Hickey, Simple Made Easy
what about things ?
identity
what we think of as the things around us; you, me, the plants and animals, rivers and mountains, are identities
identities are things we name
we are used to thinking of the world in terms of identities, they are the objects in our world
the philosophy of functional programming
“since the time of Plato and Aristotle, philosophers have posited true reality as timeless, based on permanent substances, while processes are denied or subordinated to timeless substances
if Socrates changes, becoming sick, Socrates is still the same, and change (his sickness) only glides over his substance: change is accidental, whereas the substance is essential.”
http://en.wikipedia.org/wiki/Process_philosophy
no one ever steps in the same river twice, for it's not the same river and it's not the same person – Heraclitus
an identity is a series of values over time
reifying time
time
requesting the current time is not a function, it always gives a different answer!
as we are functional programmers, we recognise now is a side-effect
we usually model side-effects as explicit things, commonly via a type such as IO
now :: IO Time
(java) public IO<Time> now()
IO is a value describing how to perform a side-effect which we can run later
now is a pure function as it returns a value
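a deliberately tiny sketch of such an IO type in Java, to make the "describe now, run later" idea concrete; the class `IoSketch`, the method `unsafeRun` and the use of `java.time.Instant` in place of `Time` are all assumptions for illustration, not the API of any particular library

```java
import java.time.Instant;
import java.util.function.Function;
import java.util.function.Supplier;

// Illustrative only: a minimal IO type. Real IO libraries are much richer.
final class IoSketch {
    static final class IO<A> {
        private final Supplier<A> effect;

        private IO(Supplier<A> effect) { this.effect = effect; }

        // Describe an effect without running it.
        static <A> IO<A> of(Supplier<A> effect) { return new IO<>(effect); }

        // Transform the eventual result; still nothing runs.
        <B> IO<B> map(Function<A, B> f) { return new IO<>(() -> f.apply(effect.get())); }

        // The only place where the side-effect actually happens.
        A unsafeRun() { return effect.get(); }
    }

    // Pure: every call returns an equivalent description of "read the clock".
    static IO<Instant> now() { return IO.of(Instant::now); }

    public static void main(String[] args) {
        IO<String> describe = now().map(t -> "it is " + t);
        // The clock is read only here, once per run.
        System.out.println(describe.unsafeRun());
        System.out.println(describe.unsafeRun());
    }
}
```

the design choice is that `now()` itself never touches the clock, so it can be passed around, composed and tested like any other value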
f : A -> B
f : A -> B f : A -> T -> B
a -> t1 -> X
a -> t1 -> X a -> t2 -> X'
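one way to read `f : A -> T -> B` in code, sketched in Java under the assumption that an identity is stored as an append-only series of timestamped values; the `Timeline` class and its method names are hypothetical

```java
import java.time.Instant;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

// Hypothetical sketch: an identity modelled as a series of values over time.
final class Timeline<V> {
    private final TreeMap<Instant, V> history = new TreeMap<>();

    // Record the value the identity takes from time t onwards.
    // putIfAbsent means an already recorded instant is never overwritten.
    void record(Instant t, V value) { history.putIfAbsent(t, value); }

    // The extra T argument: the value in force at time t, if any was recorded by then.
    Optional<V> at(Instant t) {
        Map.Entry<Instant, V> entry = history.floorEntry(t);
        return Optional.ofNullable(entry).map(Map.Entry::getValue);
    }

    public static void main(String[] args) {
        Instant t1 = Instant.parse("2015-01-01T00:00:00Z");
        Instant t2 = Instant.parse("2016-01-01T00:00:00Z");
        Timeline<String> phone = new Timeline<>();
        phone.record(t1, "X");
        phone.record(t2, "X'");
        System.out.println(phone.at(t1)); // the value as of t1
        System.out.println(phone.at(t2)); // the value as of t2
    }
}
```

asking for the same identity at the same time always gives the same answer, which is what makes the lookup a function again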
change
X + Δ = X'
X' - X = Δ
X' - Δ = X
we can store entire versions, or we can store deltas, or patches*
they are equivalent
being in possession of any two allows us to traverse time
* http://liamoc.net/posts/2015-11-10-patch-theory.html
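the three equations can be sketched in Java for one simple kind of state, assuming a version is a `Map<String, String>` without null values and a delta records the before and after value of each changed key; `Patches`, `Change`, `diff`, `apply` and `revert` are illustrative names, and this is far simpler than the patch theory linked above

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

// Illustrative sketch: versions are maps, a delta is the set of changed keys.
final class Patches {
    // One changed key: its value before and after (null means "absent").
    record Change(String before, String after) {}

    // X' - X = Δ
    static Map<String, Change> diff(Map<String, String> x, Map<String, String> xPrime) {
        Set<String> keys = new HashSet<>(x.keySet());
        keys.addAll(xPrime.keySet());
        Map<String, Change> delta = new HashMap<>();
        for (String k : keys) {
            String before = x.get(k);
            String after = xPrime.get(k);
            if (!Objects.equals(before, after)) delta.put(k, new Change(before, after));
        }
        return delta;
    }

    // X + Δ = X'
    static Map<String, String> apply(Map<String, String> x, Map<String, Change> delta) {
        Map<String, String> out = new HashMap<>(x); // copy, the input is not modified
        delta.forEach((k, c) -> {
            if (c.after() == null) out.remove(k); else out.put(k, c.after());
        });
        return out;
    }

    // X' - Δ = X
    static Map<String, String> revert(Map<String, String> xPrime, Map<String, Change> delta) {
        Map<String, String> out = new HashMap<>(xPrime); // copy, the input is not modified
        delta.forEach((k, c) -> {
            if (c.before() == null) out.remove(k); else out.put(k, c.before());
        });
        return out;
    }
}
```

holding any two of X, X' and Δ is enough to compute the third, so the choice between storing versions and storing deltas becomes a question of efficiency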
architecture in the Real World™
architecture in the New World
new world
it wasn’t that long ago that computation was expensive, disk storage was expensive, DRAM was expensive, but coordination with latches/locks was cheap
now, all these have changed: computation is cheap (with many-core), as are commodity disks, DRAM and SSD
coordination with latches/locks gets harder because latch latency loses lots of instruction opportunities; with branch prediction and deep CPU pipelines, accessing main memory is now relatively much more expensive
increasingly, applications are distributed, often globally, however we still use paradigms invented in the old world with old-world assumptions
new world
the new world is increasingly distributed
distribution brings enormous problems, including increased latency and unreliability
conventional consensus techniques (locks and transactions) impose intolerable constraints:
• locks do not compose
• distributed transaction protocols are bespoke and not widely supported
• latency costs are enormous, huge performance hit
• distributed transactions have fundamental problems
new world, latency https://people.eecs.berkeley.edu/~rcs/research/interactive_latency.html
“the unavoidable price of reliability is simplicity.” – C. A. R. Hoare
accountants do not use erasers
how do we change without mutating?
if an identity is a series of values over time, can we model things so we keep old and new values?
this is known as append-only computing
double-entry bookkeeping
the first functional architecture, codified in 1494 by Luca Pacioli
central to bookkeeping, then and now, are its three books: the memorandum, the journal and the ledger
the memorandum records all transactions as they happen
the journal records the detailed transcription of these transactions, involving debiting and crediting specific accounts
the ledger is generated by posting the journal entries to the individual accounts
balance is fundamental and is key to the correctness protocol
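a small Java sketch of the journal-to-ledger step, assuming a simplified entry that debits one account and credits another by the same amount, with debits counted as positive and credits as negative; the account names and the `Bookkeeping` class are made up for illustration

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch: the ledger is derived from the journal, never edited directly.
final class Bookkeeping {
    // One journal entry: the same amount debits one account and credits another.
    record Entry(String debit, String credit, long cents) {}

    // Posting: fold the journal entries into per-account balances.
    static Map<String, Long> post(List<Entry> journal) {
        Map<String, Long> ledger = new TreeMap<>();
        for (Entry e : journal) {
            ledger.merge(e.debit(), e.cents(), Long::sum);   // debit: positive
            ledger.merge(e.credit(), -e.cents(), Long::sum); // credit: negative
        }
        return ledger;
    }

    // The correctness check: debits and credits across all accounts cancel out.
    static boolean balanced(Map<String, Long> ledger) {
        return ledger.values().stream().mapToLong(Long::longValue).sum() == 0;
    }

    public static void main(String[] args) {
        List<Entry> journal = List.of(
            new Entry("cash", "sales", 5000),
            new Entry("inventory", "cash", 2000));
        Map<String, Long> ledger = post(journal);
        System.out.println(ledger + " balanced=" + balanced(ledger));
    }
}
```

a ledger posted this way balances by construction; the check earns its keep when a ledger arrives from somewhere else and has to be verified against the journal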
event sourcing
event sourcing is a name given to the practice of storing a journal, or stream, of changes
the changes can be deltas or full versions, depending on efficiency and other factors
it is possible to reconstruct the state of entities in the stream at any point in time
serves as a complete history or audit log of changes
a single event stream usually serves as a unit of consistency, or shard, and may have one or more entities contained within it
event sourcing
given a source of events and a function to fold (or reduce) over them we can produce a “current” version of a value, or construct a view at any previous time
given the ability to save event values, we can continuously feed new events into our fold function, using the old persisted value as the seed and produce our new “mutated” value, which we can also persist
this persistence strategy is now decoupled from the source of the data (the events)
we can have multiple views of our data, each of which can be tailored for specific purposes, such as query optimised storage
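the fold described here can be sketched in Java, assuming a hypothetical event that carries a sequence number and a signed change to a balance; `EventFold`, `Event`, `Snapshot` and `asOf` are illustrative names rather than any event store's API

```java
import java.util.List;
import java.util.function.BiFunction;

// Illustrative sketch of folding an event stream into a view.
final class EventFold {
    // Hypothetical event: a signed change to a balance, ordered by seq.
    record Event(long seq, long deltaCents) {}

    // A persisted view: the folded value plus the last event it has seen.
    record Snapshot(long lastSeq, long balanceCents) {}

    static final Snapshot EMPTY = new Snapshot(0, 0);

    // The step function: old value + one event -> new value, with no mutation.
    static Snapshot step(Snapshot s, Event e) {
        return new Snapshot(e.seq(), s.balanceCents() + e.deltaCents());
    }

    // A generic left fold over events, starting from a seed.
    static <S, E> S fold(S seed, List<E> events, BiFunction<S, E, S> step) {
        S acc = seed;
        for (E e : events) acc = step.apply(acc, e);
        return acc;
    }

    // A view at any previous point: fold only the events up to seq.
    static Snapshot asOf(List<Event> events, long seq) {
        List<Event> upTo = events.stream().filter(e -> e.seq() <= seq).toList();
        return fold(EMPTY, upTo, EventFold::step);
    }
}
```

folding new events from a persisted `Snapshot` gives the same result as folding the whole stream from `EMPTY`, which is what lets the view be stored and updated separately from the events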
event sourcing
we can store events that patch, or “mutate”, the previous value
additionally, we can store derived facts, such as the complete value after a number of updates, sometimes known as an epoch
event sourcing fits very well with Command Query Responsibility Segregation (CQRS) as an architectural practice, allowing separate deployment of specialised query services, and tailored load-balancing strategies for different access patterns
an event stream is easily distributable, with various strategies available for consensus on write, depending on requirements
content-addressable storage
files are stored at an address computed from their content: a content hash
names are associated with a hash
retrieval looks up the current hash for a name, then accesses the content stored at that address
update adds new content, then a new (name, hash) pair
caches only cache content at a hash, not at a name, avoiding concurrency issues
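a minimal in-memory sketch of these operations in Java, assuming SHA-256 as the content hash and plain maps as storage; `ContentStore` and its method names are hypothetical, and a real store would persist both maps

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.HexFormat;
import java.util.Map;
import java.util.Optional;

// Illustrative sketch: content lives at its hash, names point at hashes.
final class ContentStore {
    private final Map<String, byte[]> contentByHash = new HashMap<>(); // only ever added to
    private final Map<String, String> hashByName = new HashMap<>();    // the one movable pointer

    // The address of a piece of content is computed from the content itself.
    static String hash(byte[] content) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(content);
            return HexFormat.of().formatHex(digest);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 unavailable", e);
        }
    }

    // Update: add the new content, then associate the name with its hash.
    String put(String name, byte[] content) {
        String h = hash(content);
        contentByHash.putIfAbsent(h, content.clone());
        hashByName.put(name, h);
        return h;
    }

    // Content at a hash never changes, so this result is safe to cache forever.
    Optional<byte[]> getByHash(String h) {
        return Optional.ofNullable(contentByHash.get(h)).map(bytes -> bytes.clone());
    }

    // Retrieval: look up the current hash for the name, then the content at that hash.
    Optional<byte[]> getByName(String name) {
        return Optional.ofNullable(hashByName.get(name)).flatMap(this::getByHash);
    }
}
```

old content stays reachable by its hash after a name moves on, so earlier versions are kept without any extra bookkeeping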
git: version control system
non-linear development, branching/merging
distributed development, changes must be shareable between repositories that are not necessarily connected
cryptographic authentication of history, the ability to uniquely identify the complete development history of any change to the resources in a repository