Friday, December 09, 2016

Build Yourself a Robo-Advisor in F#. Part I : Domain Modelling


This article is part of the F# Advent Calendar 2016

It is based on the first part of the ‘Build yourself a Robo-Advisor in F#’ workshop, as presented at the 2016 Progressive F# Tutorials conference in London.

Introduction


Most of us aren’t saving enough for our future. This isn’t something that’s going to go away - we’re living longer and many of us are underestimating how much money we need for a comfortable retirement. How can you solve this problem? By harnessing the power of F# to deliver clear and easy-to-understand advice and recommendations that take you on a journey from confused to confident.

Over the last couple of years, robo-advisors have emerged as a platform for automating this advice as one part of the Fintech revolution. In this short series of posts we will build a web-based robo advisor that will show you how much money you might have in retirement.

Along the way you will discover how to:

  • Understand a domain and model it with the F# type system
  • Leverage F#’s data capabilities and make a first program that can chart our projected savings
  • Take it to the next level by building a web-based robo-advisor using one of F#’s fantastic open-source web app frameworks

Part 1 - Domain modelling


A good first step towards building a new piece of software is to model the domain: what exactly is it we’re getting the software to do? What are the real-life equivalents of the code we want to write?

What is our domain?


We boil our domain down to one example question:

How much money will I have when I retire?

More generally, we want to take someone and calculate how much their pension pot will be worth when they retire.

The key parts of that sentence (who, how much, and when) give us our basic domain:

  • People
  • Money
  • Time


How can F# help us model it?


Algebraic Data Types


Discriminated Unions allow us to express real-world concepts almost directly in code.

Let’s take a really simple one with only two options, the boolean. How do we write this in F#?

type Bool = True | False

What if we want to say that something can either have a value or not?

type Option<'a> = 
| Some of 'a
| None

What about if something can fail?

type Result<'TSuccess, 'TFailure> = 
| Success of 'TSuccess
| Failure of 'TFailure

And what if we want to distinguish one type of string from another?

type EmailAddress = EmailAddress of string

That’s all pretty useful! We can combine them with Records. Immutable records have a concise syntax and structural equality, which make them really powerful for modelling. Here’s an example:

type Car = {
  Engine : Engine
  Wheels : Wheel list
  Seats : ... }
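To make the point about immutability and structural equality concrete, here is a minimal sketch. The Wheel and Bike types are hypothetical stand-ins invented for this illustration, not part of the workshop's domain.

```fsharp
// Hypothetical types, for illustration only.
type Wheel = { DiameterInches : int }

type Bike = {
    Frame : string
    Wheels : Wheel list }

let wheel = { DiameterInches = 26 }
let bikeA = { Frame = "Steel"; Wheels = [ wheel; wheel ] }
let bikeB = { Frame = "Steel"; Wheels = [ wheel; wheel ] }

// Structural equality: two separately constructed records
// with the same contents compare as equal.
let sameBike = (bikeA = bikeB)

// Records are immutable: 'with' builds a modified copy
// and leaves the original untouched.
let bikeC = { bikeA with Frame = "Carbon" }
```

Here sameBike is true, and bikeA still has a steel frame after bikeC is created.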


Pattern matching


Once we have our unions, we have a type-safe switch alternative in the form of pattern matching. This gives us a one-step match & decompose workflow, and as an extra benefit the compiler catches non-exhaustive matches, so we are warned if we’ve added a case and not properly handled it.

Here’s an example using a Result:

let handleResult = function
| Success result -> "Woop! It all worked, the answer is: " + result
| Failure error -> "Uh-oh. Something bad happened: " + error
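As a self-contained sketch of how this might be used, the snippet below repeats the Result type from earlier and calls handleResult on each case. The sample inputs ("42" and "disk full") are made up for illustration.

```fsharp
type Result<'TSuccess, 'TFailure> =
    | Success of 'TSuccess
    | Failure of 'TFailure

// Match and decompose in one step: each branch binds the payload of its case.
let handleResult = function
    | Success result -> "Woop! It all worked, the answer is: " + result
    | Failure error -> "Uh-oh. Something bad happened: " + error

// Illustrative inputs only.
let good = handleResult (Success "42")
let bad = handleResult (Failure "disk full")
```

If you delete the Failure branch, the compiler should flag the match as incomplete (warning FS0025), which is the safety net described above.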


Units of measure


The question we want to answer here is:

How do you represent a currency in code?

One way to do so is with Units of Measure. With these we can make sure something has the right unit, as well as create derived measures (e.g. kg m / s^2).

We can also reduce errors in conversions & calculations. How bad can this really be? Take a look here!
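A small sketch of currencies as units of measure follows. The amounts and the exchange rate are invented for the example; only the mechanism matters.

```fsharp
[<Measure>] type GBP
[<Measure>] type USD

let salary = 30000m<GBP>
let bonus = 2500m<GBP>

// Adding like to like is fine; the result is still decimal<GBP>.
let total = salary + bonus

// Mixing currencies is rejected at compile time. Uncommenting the next
// line gives a unit-of-measure mismatch error:
// let oops = salary + 100m<USD>

// A conversion rate is a derived measure (USD per GBP).
// The rate itself is a made-up number for illustration.
let usdPerGbp = 1.25m<USD/GBP>

// GBP cancels out, leaving decimal<USD>.
let totalInUsd = total * usdPerGbp
```

The conversion has to go through an explicitly typed rate, so a forgotten or inverted conversion shows up as a type error rather than a wrong number.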


What did we come up with?


Using what we’ve just discussed, here’s a minimal domain model for figuring out how much money someone is likely to have in retirement.

module Domain =
  open System 

  type Gender = Male | Female

  type EmailAddress = EmailAddress of string

  type Person = { 
      Name : string
      Gender : Gender
      DateOfBirth : DateTime
      EmailAddress : EmailAddress option }

  type ContributionFrequency = Monthly | Annual

  type ContributionSchedule<[<Measure>] 'ccy> = {
      Amount : decimal<'ccy>
      Frequency : ContributionFrequency }

  type PensionPot<[<Measure>] 'ccy> = { 
      Value : decimal<'ccy>; 
      ValuationDate : DateTime; 
      ContributionSchedule: ContributionSchedule<'ccy> }

  [<Measure>] type GBP
  [<Measure>] type USD
  [<Measure>] type EUR

  let valuePensionPot<[<Measure>] 'ccy> (pensionPot : PensionPot<'ccy>) (valuationDate: DateTime) = 
     match pensionPot.ContributionSchedule.Frequency with 
     | Monthly ->
        let monthsInFuture = (valuationDate.Year - pensionPot.ValuationDate.Year) * 12 + valuationDate.Month - pensionPot.ValuationDate.Month
        pensionPot.Value + decimal monthsInFuture * pensionPot.ContributionSchedule.Amount
     | Annual ->
        let yearsInFuture = valuationDate.Year - pensionPot.ValuationDate.Year
        pensionPot.Value + decimal yearsInFuture * pensionPot.ContributionSchedule.Amount

I think this code is pretty short and easy to understand given the complexity of the domain. We’ve used a wide variety of features that helped us along the way.
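To see the model in action, here is a trimmed, self-contained copy of the pension types together with a worked valuation. The pot size, contribution amount and dates are made-up sample values, and note that this simple model adds contributions only, with no investment growth.

```fsharp
open System

[<Measure>] type GBP

type ContributionFrequency = Monthly | Annual

type ContributionSchedule<[<Measure>] 'ccy> = {
    Amount : decimal<'ccy>
    Frequency : ContributionFrequency }

type PensionPot<[<Measure>] 'ccy> = {
    Value : decimal<'ccy>
    ValuationDate : DateTime
    ContributionSchedule : ContributionSchedule<'ccy> }

// Same logic as the Domain module above: current value plus future contributions.
let valuePensionPot<[<Measure>] 'ccy> (pensionPot : PensionPot<'ccy>) (valuationDate : DateTime) =
    let schedule = pensionPot.ContributionSchedule
    match schedule.Frequency with
    | Monthly ->
        let monthsInFuture = (valuationDate.Year - pensionPot.ValuationDate.Year) * 12 + valuationDate.Month - pensionPot.ValuationDate.Month
        pensionPot.Value + decimal monthsInFuture * schedule.Amount
    | Annual ->
        let yearsInFuture = valuationDate.Year - pensionPot.ValuationDate.Year
        pensionPot.Value + decimal yearsInFuture * schedule.Amount

// Sample data (invented): a 10,000 GBP pot with 200 GBP added each month.
let pot = {
    Value = 10000m<GBP>
    ValuationDate = DateTime(2016, 12, 1)
    ContributionSchedule = { Amount = 200m<GBP>; Frequency = Monthly } }

// 15 months later: 10,000 + 15 * 200 = 13,000 GBP.
let projected = valuePensionPot pot (DateTime(2018, 3, 1))
```

The result keeps its GBP unit, so it cannot accidentally be added to a USD amount later on.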



Talks


Scott Wlaschin NDC: https://vimeo.com/97507575

Tomas Petricek Channel 9: https://www.youtube.com/watch?v=Sa6bntNslvY

Mark Seemann NDC: https://vimeo.com/131631483


Blogs


http://fsharpforfunandprofit.com/ddd/

http://tomasp.net/blog/type-first-development.aspx/

http://gorodinski.com/blog/2013/02/17/domain-driven-design-with-fsharp-and-eventstore/


Wrapping up


In this part you’ve seen how we take a real-life domain and come up with a concise and accurate model for it using F#.

We’ve got some great language features to model domains in F#! If you are thinking about how to represent things in code, the ideas in this post are a great starting point.


Next time


Charts and data! You’ll see how to take the domain we’ve just created and chart it in a Desktop app with FSharp.Charting. We’ll also look at accessing data via type providers.

Wednesday, October 26, 2016

Back to the Start


This week, I’ve had the dubious pleasure of doing a code review of my first ever production F# code. The application behaviour needed to be changed, and so another team member has had to look at my F# code and change it.

As we all know, looking at your old code can often make you shudder, so when it’s some of the first real code you wrote in a particular language that feeling gets even stronger!

I thus went about reviewing the changes with three primary questions in mind:

  • Do the changes to the code follow the right ‘architecture’? (i.e. what one normally looks for in a code review)
  • Is there anything in the original version that should definitely be changed?
  • How do I feel now, reading the original version?

What did I learn?


Point-free problems


In my haste to be idio(ma)tic, I made a number of my main API functions point-free.

As the functions are called from C# projects, this has the unfortunate effect of making them compile down to instances of the FSharpFunc class, meaning that if we have a function f with a tacit parameter x, it must be called in C# using f.Invoke(x). Yuck!

Instead, if we explicitly write the parameter (so let f = (...) becomes let f x = (...) x), the function compiles to a static method and can be called using f(x). Much better!
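The two styles look like this side by side. The trim and toUpper helpers are hypothetical, chosen only to give the composition something to do; from F# both versions behave identically, and the difference described above only shows up for C# callers.

```fsharp
// Hypothetical helpers for the example.
let trim (s : string) = s.Trim()
let toUpper (s : string) = s.ToUpperInvariant()

// Point-free: 'normalisePointFree' is a value of function type,
// which is what C# sees as an FSharpFunc.
let normalisePointFree = trim >> toUpper

// Explicit parameter: the same logic, written as an ordinary function.
let normalise input = (trim >> toUpper) input

let a = normalisePointFree "  hello "
let b = normalise "  hello "
```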

This kind of hidden, gnarly situation is one that should have made me think during the original implementation, but I was clearly blindsided by my newly-acquired functional programming knowledge and keen to apply unnecessary techniques.


Option and Result aren’t obvious


My absolute favourite thing about F# is its ability to do domain modelling — I challenge you to watch this video and not fall a little bit in love with the ML type system.

Once you get the hang of it, things come very naturally: when things might fail, use Result rather than throwing exceptions. When they might be missing, use Option rather than null. When you want something that is a string but not all strings are valid, use a single-case DU.
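A minimal sketch of all three ideas together is below. The validation rules in createEmail are deliberately naive and purely illustrative, and the function names are my own, not from the production code.

```fsharp
open System

type Result<'TSuccess, 'TFailure> =
    | Success of 'TSuccess
    | Failure of 'TFailure

// Single-case DU: a string that has been checked is a different type
// from a string that has not.
type EmailAddress = EmailAddress of string

// Might fail, so return a Result instead of throwing.
// The rules here are illustrative only.
let createEmail (raw : string) =
    if String.IsNullOrWhiteSpace raw then Failure "Email must not be empty"
    elif not (raw.Contains("@")) then Failure "Email must contain an @"
    else Success (EmailAddress raw)

// Might be missing, so take an Option instead of a nullable string.
let describeContact (email : EmailAddress option) =
    match email with
    | Some (EmailAddress address) -> "Contact via " + address
    | None -> "No email on file"
```

A caller is then forced by the types to deal with the failure and the missing case, rather than discovering them at runtime.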

If you haven’t seen this stuff before, then exceptions, null, and plain old strings will still be your go-to constructs.

This made me realise the importance of sharing around my knowledge, specifically when it comes to F# domain modelling. A large part of my upcoming F# workshop will focus on how to model complex domains with the F# type system, and if I can share this with the community I can definitely share it with my team-mates!


Custom operators confuse


Just by looking, do you know what the operators =, & and | do?

Boolean logic, right?

What about >>=, >=> and <*>?

Er, what?

If you’re a keen functional programmer you might know that the latter three are used in monadic composition and represent bind, Kleisli composition and apply. You might even be able to look at your function signature and know exactly which of the three you need to glue your functions together.

Or, you might not.

It’s a bit of a crime to assume too much theoretical knowledge on the part of your collaborators (present and potential future). If you understand something only through a lot of reading and research, don’t kid yourself that the next person to read your code will be impassioned enough by the new concepts to bone up on them for hours on end.

Keep it simple! At the very least, for every custom operator you need to expose the underlying function that it represents.
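One way to follow that advice, sketched here with the built-in Option type: define the operator as a thin alias over a named function, so readers can always fall back to the name. The tryParseInt and tryReciprocal helpers are hypothetical examples.

```fsharp
// The operator is just an alias for the named function Option.bind.
let (>>=) m f = Option.bind f m

// Hypothetical helpers that may or may not produce a value.
let tryParseInt (s : string) =
    match System.Int32.TryParse(s) with
    | true, n -> Some n
    | _ -> None

let tryReciprocal n = if n = 0 then None else Some (1.0 / float n)

// The same pipeline, written with the operator and with the named function.
let withOperator = tryParseInt "4" >>= tryReciprocal
let withFunction = tryParseInt "4" |> Option.bind tryReciprocal
```

Both expressions evaluate to Some 0.25; the second needs no knowledge of what >>= means.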

Being more prudent, you shouldn’t be using these kinds of operators in code that’s (by some measure) likely to be often modified. Sure, if it’s a core library with some well-known functions then use custom operators sparingly, but if your business logic is peppered with (>=>) you’re doing it wrong.


Help your Discriminated Unions


Some features of F# need a little work to carry them over to C#. The main one I’m thinking about is the discriminated union (among which I count Option<T>).

Typically I end up writing a few helper functions to deal with using DUs in C#:

  • ToString and FromString functions (basically these) that allow you to use your custom DUs from C# code by writing a simple wrapper for your type.
  • Specific C# helpers for the Option type that wrap the F# IsSome and IsNone properties, as well as something to create a new FSharpOption.
  • An OptionConverter based on this one that enables (de)serialisation using Json.NET.

What became clear was that it’s not obvious that these even exist to another developer, let alone how and when to use them.

The solution: curate these methods, put them in a separate project and share them using a NuGet package.
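As an idea of what such a shared library might contain, here is a sketch of reflection-based toString and fromString helpers for unions whose cases carry no data. It is my reconstruction of a commonly used approach, not necessarily the exact helpers from the original project, and the ContributionFrequency type is only a sample.

```fsharp
open Microsoft.FSharp.Reflection

// Name of the union case that a value belongs to.
let toString (x : 'a) =
    match FSharpValue.GetUnionFields(x, typeof<'a>) with
    | case, _ -> case.Name

// Look a case up by name; only works for cases without fields.
let fromString<'a> (s : string) =
    match FSharpType.GetUnionCases typeof<'a> |> Array.filter (fun case -> case.Name = s) with
    | [| case |] -> Some (FSharpValue.MakeUnion(case, [||]) :?> 'a)
    | _ -> None

// Sample union for the example.
type ContributionFrequency = Monthly | Annual

let name = toString Annual
let parsed = fromString<ContributionFrequency> "Monthly"
let unknown = fromString<ContributionFrequency> "Weekly"
```

Returning an option from fromString means an unrecognised name (such as "Weekly" here) gives None rather than an exception.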

Alternatively, I could use something like F# extras, which no doubt does what I want (but also does a lot, lot more!)


Guard your patterns


When I looked over the code, I realised that I didn’t ever use a guard in one of my match expressions.

There were definitely times when I should have done — take a look at this line:

if input.AnalysisDate = DateTime.MinValue || input.AnalysisDate = (DateTime.FromOADate 0.0) then 
    Failure "A valid analysis date must be supplied"
else Success input

In my view, this can be rehashed as:

match input.AnalysisDate with
| x when x = DateTime.MinValue || x = (DateTime.FromOADate 0.0) 
    -> Failure "A valid analysis date must be supplied"
| _ -> Success input

At the time of writing there are 16 pattern types of which I have probably used six.

Advanced pattern matching features are very powerful, and I don’t think I’ve used them well enough in my F# work so far.
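As one example of a more advanced feature, the same date check could be packaged as an active pattern so the rule gets a name and can be reused in any match. This is a sketch of an alternative, not what the reviewed code does; the pattern names and the describe function are invented.

```fsharp
open System

// Active pattern: classifies a date as missing (one of the two
// 'default' values) or genuinely supplied.
let (|MissingDate|SuppliedDate|) (date : DateTime) =
    if date = DateTime.MinValue || date = DateTime.FromOADate 0.0 then MissingDate
    else SuppliedDate date

// Hypothetical consumer of the pattern.
let describe date =
    match date with
    | MissingDate -> "A valid analysis date must be supplied"
    | SuppliedDate d -> sprintf "Analysing as of %d" d.Year
```

The match in describe is exhaustive, so the compiler can still check that both outcomes are handled.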


Get piping!


One of the elements of F# that took a while to come naturally was function pipelining.

This is pretty strange — I have a C# background and have used these techniques extensively under the guise of IEnumerable<T>.

For whatever reason, when writing F# code it was more natural to put g (f (x)) than x |> f |> g. If I’m honest the temptation to write the former still lingers, but when reading real code this gets ugly, quickly.

One point I’m still mulling over is using the double and triple pipe operators. If anything I think they clear up the code, but for many they might serve to confuse.

Over to you

Here’s a real function from my code that gets data from a web service asynchronously, written three ways:

No pipelining:

let getSchemeAssetAllocation handler schemeName analysisDate = 
     Async.RunSynchronously (getAssetAllocationAsync handler schemeName analysisDate)

Single pipe only:

let getSchemeAssetAllocation handler schemeName analysisDate = 
    getAssetAllocationAsync handler schemeName analysisDate |> Async.RunSynchronously

Allowing the triple pipe:

let getSchemeAssetAllocation handler schemeName analysisDate = 
    (handler, schemeName, analysisDate)
    |||> getAssetAllocationAsync 
    |> Async.RunSynchronously

Which version would you pick: 1, 2 or 3?

I’m still not sure what I prefer.


Brackets aren’t banned


I like not writing curly braces. I shouldn’t have extended this to parentheses.

I remember reading in Real-World Functional Programming that the preferred F# style is to include brackets for single-argument functions (so writing f(x) rather than f x).

I can definitely see why now: looking at code written using the latter style is a little disconcerting, and simple lines like let validate = validateLinks handler look much more confusing than they actually are. For whatever reason, writing let validate = validateLinks(handler) makes it much clearer what’s going on: we are calling the validateLinks method and passing handler as a parameter.


Conclusions


I’ve learnt a huge amount from returning to old code, especially as it was written with very little real-world F# knowledge.

There are definitely some specific things I’d like to act on — writing a small common library to help working with Discriminated Unions, and sharing my thoughts on domain modelling with F# are two clear actions.

Other than that, my main hope is that when writing production F# code in the future, I do so in a manner that makes it clearer for others (and myself) to read and modify.

Thursday, October 06, 2016

Bottom-up Akka.NET using F#


Ever tried to take an existing application and convert it to use the Actor Model? Not easy, right!

It seems like there is no easy way to make ‘a bit’ of your app use Actors — but in this post I’m going to show you how.

We’ll begin with a quick overview of the Actor Model, focus in on Akka and Akka.NET, then think about how we design actor systems.

I’ll then show you a slightly different way to work with Actors, one that leverages the Akka.NET F# API and helps you build Actor systems from the bottom-up.

Introduction to the Actor Model

To give a whirlwind overview, the Actor model is a way to do concurrency that treats ‘Actors’ as the unit of computation. The main alternative in the .NET world is multithreading, compared to which Actors are meant to be extremely lightweight and thus support more concurrent computations.


Akka and Akka.NET

One framework for creating Actor systems in .NET is Akka.NET, a port of the Akka framework that runs on the JVM and is written in Scala. Though Akka.NET is written in C#, there is also an F#-specific API that allows you to write more idiomatic functional code compared to calling the C# methods directly from F#.

For a more hands-on introduction, I recommend the Akka.NET bootcamp (there is a C# version as well as one using F#). Written by the makers of Akka.NET, it is easy to follow and gives you an idea about how to create Actor systems.


Designing actor systems

As with all broad topics, there is no one true method for designing Actor systems, and indeed there is no one type of application that you would build using the Actor model. Overall, there are probably just two principles that we must adhere to:

  • Actors do not expose any state
  • Actors communicate through immutable messages
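These two principles can be illustrated without any Akka.NET dependency. The sketch below uses F#'s built-in MailboxProcessor (a different, simpler agent implementation, swapped in here so the example is self-contained); the counter and its messages are invented for the illustration.

```fsharp
// Immutable messages: the only way to talk to the actor.
type CounterMessage =
    | Increment of int
    | GetTotal of AsyncReplyChannel<int>

// The running total lives only in the argument of 'loop';
// nothing outside the agent can read or change it directly.
let counter =
    MailboxProcessor.Start(fun inbox ->
        let rec loop total =
            async {
                let! message = inbox.Receive()
                match message with
                | Increment n -> return! loop (total + n)
                | GetTotal reply ->
                    reply.Reply total
                    return! loop total
            }
        loop 0)

counter.Post (Increment 2)
counter.Post (Increment 3)

// Messages are processed in the order they were posted,
// so both increments are applied before the reply.
let total = counter.PostAndReply(GetTotal)
```

The Akka.NET actors later in this post follow the same shape: a recursive loop that receives a message, reacts, and continues with its new state.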

However, when we see examples using the Actor model, they most often take the form of whole-app, top-down systems where ‘everything is an actor’. Case in point: this post from the creators of Akka.NET.

This is a good way to design systems, but what if you don’t have the luxury of re-wiring your whole application yet still want to try and use actor-based concurrency?

In this post, I’ll show you how to start using the actor model in just part of an existing application.


F# API and libraries

F# API

The actor computation expression


Using Actors, Bottom-up

It’s time to see some code!

I’ve based this project loosely on the content in Visualizing Stock Prices Using F# Charts. I will take a system that gets stock data from Yahoo finance and charts it on Windows Forms, and use Akka.NET actors to concurrently retrieve the data.

The idea behind that is that, were you to build this system for real, going to Yahoo for every bit of data would soon get pretty slow. Rather than use Threading and Tasks to process the data requests in an asynchronous and potentially parallel manner, let’s see if we can harness the speed and low footprint of Akka.NET actors instead.


Data Retrieval

open FSharp.Data
open System

type Stocks = CsvProvider< AssumeMissingValues=true, IgnoreErrors=true, Sample="Date (Date),Open (float),High (float),Low (float),Close (float),Volume (int64),Adj Close (float)" >

let url = "http://ichart.finance.yahoo.com/table.csv?s="
let startDateString (date : DateTime) = sprintf "&a=%i&b=%i&c=%i" (date.Month - 1) date.Day date.Year
let endDateString (date : DateTime) = sprintf "&d=%i&e=%i&f=%i" (date.Month - 1) date.Day date.Year

let getStockPrices stock startDate endDate = 
    let fullUrl = url + stock + startDateString startDate + endDateString endDate
    Stocks.Load(fullUrl).Rows
    |> Seq.toList
    |> List.rev

Charting

open FSharp.Charting
open FSharp.Charting.ChartTypes
open System
open System.Windows.Forms

let defaultChart = createChart (new DateTime(2014, 01, 01)) (new DateTime(2015, 01, 01))

let getCharts (tickerPanel : Panel) mapfunction (list : string []) = 
    let sw = new System.Diagnostics.Stopwatch()
    sw.Start()
    let charts = mapfunction defaultChart list
    let chartControl = new ChartControl(Chart.Combine(charts).WithLegend(), Dock = DockStyle.Fill, Name = "Tickers")
    if tickerPanel.Controls.ContainsKey("Tickers") then tickerPanel.Controls.RemoveByKey("Tickers")
    tickerPanel.Controls.Add chartControl
    sw.Stop()
    MessageBox.Show(sprintf "Retrieved data in %d ms" sw.ElapsedMilliseconds) |> ignore

Running these either in sequence or using Task-based parallelism is then very simple:

let getChartsSync (tickerPanel : Panel) = getCharts tickerPanel Array.map
let getChartsTasks (tickerPanel : Panel) = getCharts tickerPanel Array.Parallel.map
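The trick here is that the mapping strategy is passed in as a function, so switching between sequential and parallel execution is a one-word change. A stripped-down sketch of the same idea, with a hypothetical fakeFetch standing in for the real network call:

```fsharp
// Stand-in for the real data retrieval (hypothetical, no network access).
let fakeFetch (ticker : string) = ticker.Length

// The caller decides how the work is mapped over the tickers.
let fetchAll mapFunction (tickers : string []) : int [] =
    mapFunction fakeFetch tickers

let tickers = [| "MSFT"; "AAPL"; "GOOGL" |]

let oneByOne = fetchAll Array.map tickers
let inParallel = fetchAll Array.Parallel.map tickers
```

Both calls return the results in the same order as the input array, so the rest of the program does not need to know which strategy was used.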

Converting to actors

The first thing we do is define the messages that will be passed around our actor system. A DataMessage is one that will be passed around the top-level actor responsible for collecting data: it will either say ‘get me the data for these tickers’ when coming in from our application, or ‘I have data for this ticker’ when going back out. The DrawChart message will tell an actor to get the data from Yahoo. We have also implemented a very basic caching strategy, which means we need a way to clear the cache; here, that is just a simple message!


type DrawChart = 
    | GetDataBetweenDates of StartDate : DateTime * EndDate : DateTime
    | ClearCache 

type DataMessage = 
    | StockData of string * Stocks.Row list
    | GetData of string []

We next define the actor responsible for getting a single ticker’s data from Yahoo. This is the tickerActor. It is implemented as two mutually recursive actor computation expressions that correspond to a mini FSM implementation. It starts in the doesNotHaveData state; when it receives a message to get data it does so, passes the data back to the message sender, and moves to the hasData state. In this state, further requests for the same data can be serviced instantly, and a request to clear the cache puts us back in doesNotHaveData. You can also see how easy it would be to remove this caching feature: the commented-out line where the actor kills itself after getting the data is all it would take!

let tickerActor (ticker : string) (mailbox : Actor<_>) = 
    let rec doesNotHaveData() = 
        actor { 
            let! message = mailbox.Receive()
            match message with
            | GetDataBetweenDates(startDate, endDate) -> 
                let stockData = StockData((ticker, getStockPrices ticker startDate endDate))
                mailbox.Sender() <! stockData
                //mailbox.Self <! (PoisonPill.Instance)
                return! hasData (stockData)
            | ClearCache -> return! doesNotHaveData()
        }

    and hasData (stockData : DataMessage) = 
        actor { 
            let! message = mailbox.Receive()
            match message with
            | GetDataBetweenDates(_) -> 
                mailbox.Sender() <! stockData
                return! hasData (stockData)
            | ClearCache -> return! doesNotHaveData()
        }

    doesNotHaveData()

Next, we define the actor that will take multiple ticker requests, dispatch each to a tickerActor, and wait for them all to come back. This is the gatheringActor. Again this uses mutual recursion: we start in the waiting state until the application asks for tickers. We then get the address of the tickerActor instances responsible for getting that data (the ActorRef will just be the ticker name), creating a new actor if we don’t already have one for that ticker. The gatheringActor then changes state to gettingData, which starts off knowing how many sets of ticker data it is awaiting. Every time it gets some it decreases this value; when it is waiting for no more, it draws the ticker data onto the WinForms chart.

let gatheringActor (tickerPanel : Panel) (sw : Stopwatch) (system : ActorSystem) (mailbox : Actor<_>) = 
     let rec waiting (existingActorRefs : IActorRef Set) = 
           actor { 
               let! message = mailbox.Receive()
               match message with
               | GetData d -> 
                   sw.Restart()
                   let existingNames = existingActorRefs |> Set.map (fun (x : IActorRef) -> x.Path.Name)
                   let newActors = existingNames |> Set.difference (Set.ofArray d)

                   let newActorRefs = 
                       [ for item in newActors do
                             yield spawn system (item.ToString()) (tickerActor (item.ToString())) ]

                   let combinedActorRefs = existingActorRefs |> Set.union (Set.ofList newActorRefs)
                   let tell = fun dataActorRef -> dataActorRef <! (GetDataBetweenDates(new DateTime(2014, 01, 01), new DateTime(2015, 01, 01)))
                   Set.map tell combinedActorRefs |> ignore
                   return! gettingData (Set.count combinedActorRefs) combinedActorRefs []
               | _ -> return! waiting (existingActorRefs)
           }

     and gettingData (numberOfResultsToSee : int) (existingActorRefs : IActorRef Set) (soFar : (string * Stocks.Row list) list) = 
           actor { 
               let! message = mailbox.Receive()
               match message with
               | StockData(tickerName, data) when numberOfResultsToSee = 1 -> 
                   let finalData = ((tickerName, data) :: soFar)
                   createCharts tickerPanel finalData
                   sw.Stop()
                   MessageBox.Show(sprintf "Retrieved data in %d ms" sw.ElapsedMilliseconds) |> ignore
                   return! waiting existingActorRefs
               | StockData(tickerName, data) -> return! gettingData (numberOfResultsToSee - 1) existingActorRefs ((tickerName, data) :: soFar)
               | _ -> return! waiting existingActorRefs
           }

     waiting (Set.empty)

Finally, we create our actor system when initialising our Windows Form. Note that we need a bit of hocon that makes things play nicely with the UI thread.

let sw = new System.Diagnostics.Stopwatch()
let gatheringActor = spawn system "counters" (MyActors.gatheringActor tickerPanel sw system)
<hocon>
  <![CDATA[
      akka {
        actor{
          deployment{
            /counters{
              dispatcher = akka.actor.synchronized-dispatcher
            }
          }
        }
      }
  ]]>
</hocon>

Was it faster?

A little bit.

The sequential access was slowest of all, as expected, and took an average of 550 ms to retrieve 10 tickers.

The task-based method took an average of 185 ms.

The first call to retrieve data from actors took 162ms, with subsequent requests around 30ms due to the caching implementation.

This isn’t a huge performance bonus, but with only 10 requests anything should work ok!


Summary

Well, that wasn’t so bad (?!).

I think it’s clear that there’s a lot more code involved in setting up an actor system, compared to using task-based parallelism.

Is it worth it? Hard to say. For a small application such as this, probably not. When building something that might scale to the point where it needs distributing over multiple machines, it’ll be a different story.

The main point to take away is that we didn’t have to start the process with actors in mind. We took an existing app, found the bit that could potentially benefit from the actor concurrency model, and converted only that section to use actors.


Tuesday, September 20, 2016

Prog F# Sneak Preview


It’s official! I’ve been given the opportunity to present a workshop at the upcoming Progressive F# Tutorials. This is a fantastic event and something which seems almost unique to F# as a language; I’ve certainly not heard of the like in the C# world!

It’s taking place on 5th-6th December at the excellent CodeNode facility of Skills Matter. As well as being a super-cool place, it has the added bonus of being just around the corner from my office.

To get you in the mood (early bird tickets are still available!) I thought I’d give you a sneak peek into the event by answering a few questions that the hosts have put to me:

I am excited to be joining Prog F# where I will be sharing my thoughts on…

FinTech in F#!

This is an area that I work with every day at Redington, where we’re trying to make 100 million people financially secure, and I think it has the potential to improve people’s lives — if applied properly!

The problem in a nutshell?

Most of us aren’t saving enough for our future.

This isn’t something that’s going to go away - we’re living longer and many of us are underestimating how much money we need for a comfortable retirement.

How can we help solve this problem?

By harnessing the power of F# to deliver clear and easy-to-understand advice and recommendations that take you on a journey from confused to confident.

Over the last couple of years, robo-advisors have emerged as a platform for automating this advice as one part of the Fintech revolution.

If you come along to my workshop, you’ll get to build a fully functioning web-based robo advisor that will tell you if you’re on track to hit your savings goals, and give you recommendations if you aren’t quite there yet.

The best thing about being part of the F# community is…

The sheer number of incredibly useful open-source projects that people dedicate their free time to. You’ll see them used in any F# talk or workshop, whoever the speaker, and I plan to be no exception!

Ones that spring to mind are Ionide, Paket, Fake and Suave. Each one solves a real problem that F# developers have and has hugely improved the experience of F# devs, new and old!

At Progressive F# Tutorials, I most look forward to learning about…

How F# can be used for Web development. I’ll admit to having used the language primarily for back-end work - domain modelling, business logic, financial calculations, etc.

What I’d love to get out of the upcoming tutorials is a bit more understanding of how to get F# playing nicely with the internet. I’m told it’s the future, after all.

My talk will be enjoyed by…

Anyone who wants to see the power and flexibility of F# as a language.

Ever wondered how to really model your domain as it is in real-life? I’ll show you how, using F#’s algebraic data types.

Still wrangling with csv files to get your data into your app and creating fiddly charts? F# can help by way of a type provider and a brilliant charting library.

Unsure how to get your content onto the web without using a big heavy framework? Enter Suave and WebSharper!

I think the most anticipated/exciting development for the F# community over the next 12 months will be…

The evolution of the community-driven projects that I mentioned previously, into the ‘dominant’ tool or framework in their area. I think they are tantalisingly close to replacing the out-of-the-box tools that you get from something like Visual Studio, and am looking forward to the day that they do so!


Wrap-up

I’m pretty excited about giving this workshop — I get to show off the power of F#, and use it to build something that I think could really help people!

Thursday, September 01, 2016

The United States of Monad


A couple of months ago I was chatting with approximately half of my readership about Monads, and more specifically this post which asked: Why do we have monads?

Two of the reasons I gave were exception handling and state. I have already touched on the former, and again refer you to Railway Oriented Programming for a brilliant explanation of how to handle exceptions in a purely functional manner.

What you’ll learn today

  • Why we need to handle state functionally in the first place
  • How to convert imperative for and while loops to functional ones that embed the local state
  • How this can be formalised using the State monad and F# computation expressions
  • How to extend this to keep track of a global state without resorting to a mutable variable

As ever, the code is on GitHub.


Handling state functionally

State is a funny old thing. It seems pretty harmless in small doses, but trying to control state globally and across threads and processes is a nightmare.

If we handle state using purely functional techniques we can control this — we will see how it is threaded through our computations rather than simply ambient, which allows us to make stronger statements about the concurrency and parallelism of our program.

Before we see that, let’s go back to basics and look at handling state in the smallest scope possible, a loop.


Keeping it in the loop

Let’s consider a very simple example. We have an action, say printing to the console, that we want to do a set number of times. Each time we do it, we want to also print how many times we’ve done it, i.e. the state of the computation. Keeping it even simpler, let’s fix the number of iterations:

let startVal = 1
let endVal = 10

We can achieve our goal in a number of ways, the first being a standard for loop. This doesn’t look like it’s impure, but we have a variable i below that will be mutated on each iteration of the loop body:

let forLoop act = 
    for i = startVal to endVal do
        act (i)

The first step towards a functional approach is to notice that a for loop is the same as a while loop that has a mutable integer variable. We thus rewrite:

let whileLoop act = 
    let mutable i = startVal
    while i <= endVal do
        act (i)
        i <- i + 1

We have seen how to eliminate this kind of variable before, using an inner recursive function. In the function below we are threading the state through the computation as the parameter i of our aux function:

let recLoop act = 
    let rec aux i = 
        match i > endVal with
        | true -> ()
        | _ -> 
            act (i)
            aux (i + 1)
    aux startVal

We now have state being treated in a pure, functional manner. No variables are being mutated.

Let’s run the code:

let action i = printf "%d " i

forLoop action
printf "%s" System.Environment.NewLine
whileLoop action
printf "%s" System.Environment.NewLine
recLoop action
printf "%s" System.Environment.NewLine

Thankfully, the output is exactly the same in each case!

What have we learnt?

We can handle state in a functional manner by recursive function calls that take the state as an input.


Monads, monads everywhere

What on earth does the above have to do with monads?

Let’s take our example a step further and say that we want to do something really complicated, like summing a list of integers. In this instance, all of the ‘temporary’ state (i.e. the ‘sum so far’) is contained entirely within the loop construct, totally hidden from the consumer of the function. We can do this once more using for, while or rec:

let forSum list = 
    let mutable sum = 0
    for elt in list do
        sum <- sum + elt
    sum

let whileSum list = 
    let mutable sum = 0
    let mutable i = 0
    while i < List.length list do
        sum <- sum + list.[i]
        i <- i + 1
    sum

let recSum list = 
    let rec aux remainderOfList sumSoFar = 
        match remainderOfList with
        | [] -> sumSoFar
        | head :: tail -> aux tail (sumSoFar + head)
    aux list 0

The key thing to note above is that in the recursive example, nothing is marked as mutable. This means we’ve now seen a technique that helps us:

  1. Run loops without introducing a mutable variable
  2. Keep track of temporary state whilst keeping functional purity

This quickly breaks down when doing anything non-local — imagine how you might update a global variable from inside a pure function that itself returns a value. To do this, we combine what we’ve seen before and can write the function such that it takes in the global state as an input, and returns the (updated) global state as well as whatever other value the function returns:

let sumListAndUpdateState list state = 
    (recSum list, state)

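The above shows the shape of the idea: the function returns a tuple of its result plus the state. Here the state is handed back unchanged, but the function is free to return an updated value in its place, which is how we ‘update’ global state without mutating anything.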
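To make the update visible, here is a minimal sketch in which the state really does change. The function name sumAndCount and the choice of state (a count of how many sums have been computed) are assumptions made purely for illustration:

```fsharp
// Hypothetical example: the "global" state counts how many sums have been computed
let sumAndCount (list : int list) (callCount : int) =
    let total = List.sum list
    // Return the result together with the *new* state
    (total, callCount + 1)

// Thread the state through two calls by hand
let total1, count1 = sumAndCount [ 1; 2; 3 ] 0
let total2, count2 = sumAndCount [ 4; 5 ] count1

printfn "%d %d %d" total1 total2 count2   // 6 9 2
```

Every caller has to remember to pass the latest state along, which is exactly the bookkeeping we would like to hide.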

How do we generalise this so that we don’t have to explicitly pass around the state on every function call? The answer is the state monad. There are many explanations of it out there, but I will walk you through the F# one myself. It all looks pretty complicated, but I’m hoping you get the gist rather than get bogged down by the details — we’re creating a way to thread state silently through any computation.

Being precise, the state monad is based on a type: one that encodes a function taking a state of type 'a and returning a result-state tuple, where the result is of type 'b:

type StateMonad<'a,'b> = 'a -> ('b * 'a)

To be classed as a monad, we need to be able to do some stuff with it. First, we need to be able to create an instance of it from a plain value, leaving the state untouched:

let returnS a = (fun s -> a, s)

We also need to be able to feed the result of one computation into a function that produces the next one, threading the state along the way:

let (>>=) x f = 
    (fun s0 -> 
        let a, s = x s0
        f a s)

Finally, we want to combine two instances of it:

let (>=>) m1 m2 = 
    (fun s0 ->
        let _, s = m1 s0
        m2 s)

Done as an F# computation expression, this looks like:

type StateBuilder() = 
    member m.Bind(x, f) = x >>= f
    member m.Return a = returnS a
    member m.ReturnFrom(x) = x

let state = new StateBuilder()

We can then define some helper functions that we can use in our state computation expressions: the first gets the current state; the second sets it; the third executes our computation and returns the final state.

let getState = (fun s -> s, s)
let setState s = (fun _ -> (), s)
let Execute m s = m s |> snd
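As a quick check that these pieces fit together, here is a small sketch that uses them directly. The definitions are repeated so that the snippet runs on its own; tick and twoTicks are hypothetical names that do not appear elsewhere in this post:

```fsharp
// Definitions repeated from above so the snippet is self-contained
let returnS a = (fun s -> a, s)

let (>>=) x f =
    (fun s0 ->
        let a, s = x s0
        f a s)

type StateBuilder() =
    member m.Bind(x, f) = x >>= f
    member m.Return a = returnS a
    member m.ReturnFrom(x) = x

let state = StateBuilder()

let getState = (fun s -> s, s)
let setState s = (fun _ -> (), s)
let Execute m s = m s |> snd

// Hypothetical helper: increments an integer state and returns the value it had before
let tick =
    state {
        let! n = getState
        do! setState (n + 1)
        return n
    }

// Run tick twice; the state is never mentioned here
let twoTicks =
    state {
        let! first = tick
        let! second = tick
        return (first, second)
    }

let result, finalState = twoTicks 10
printfn "%A %d" result finalState   // (10, 11) 12
```

Note that twoTicks is just a function waiting for an initial state: nothing runs until we apply it to 10, and Execute twoTicks 10 would give back only the final state, 12.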

Now that we have the concept of threading state through a function encoded in a generic type, let’s put it to use!

A contrived example is our previous one summing a list — it’s overkill for a monad as it only ever used local state, but here goes:

let stateSum list = 
    let rec aux t = 
        state { 
            match t with
            | head :: tail -> 
                let! s = getState
                do! setState (s + head)
                return! aux tail
            | [] -> return ()
        }
    Execute (aux list) 0

Let’s focus on the aux computation expression. The most obvious thing is that the state isn’t an input! This is because aux actually returns a function: one that takes in the initial state and returns the final state plus the function return value (which in this case is unit but could be anything).

Behind the scenes it is going to call Bind to combine the recursive calls to aux into one big computation, each stage of which will take the sum of the list so far plus the tail of the list as inputs. When we have an empty list, we return the state plus the function ‘return value’. Here, the state is the value we want so we can just return unit.

Of course, the state doesn’t have to be an integer. We can write a generic fold function using our state computation expression as follows:

let fold aggregator initialValue list = 
    let rec aux t = 
        state { 
            match t with
            | head :: tail -> 
                let! s = getState
                do! setState (aggregator s head)
                return! aux tail
            | [] -> return ()
        }
    Execute (aux list) initialValue

Now we are passing in everything we need to control the aggregation, and we are left with a function that takes in an initial state, a list, and a way to update the state from the list, returning the final state. Complicated, but beautiful to reason about.
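To convince ourselves that this fold behaves like the built-in one, here is a self-contained sketch that repeats the definitions above and tries it with two different state types. The example inputs are my own, chosen only for illustration:

```fsharp
// Definitions repeated from above so the snippet is self-contained
let returnS a = (fun s -> a, s)

let (>>=) x f =
    (fun s0 ->
        let a, s = x s0
        f a s)

type StateBuilder() =
    member m.Bind(x, f) = x >>= f
    member m.Return a = returnS a
    member m.ReturnFrom(x) = x

let state = StateBuilder()

let getState = (fun s -> s, s)
let setState s = (fun _ -> (), s)
let Execute m s = m s |> snd

let fold aggregator initialValue list =
    let rec aux t =
        state {
            match t with
            | head :: tail ->
                let! s = getState
                do! setState (aggregator s head)
                return! aux tail
            | [] -> return ()
        }
    Execute (aux list) initialValue

// Integer state: same answer as List.fold
let total = fold (+) 0 [ 1; 2; 3; 4 ]
let expected = List.fold (+) 0 [ 1; 2; 3; 4 ]

// String state: the state type is whatever the aggregator returns
let joined = fold (fun (acc : string) (x : int) -> acc + string x) "" [ 1; 2; 3 ]

printfn "%d %d %s" total expected joined   // 10 10 123
```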

Next, let’s see how we can handle global state.


Globalisation

So far, we have shown a number of ways in which local state can be handled. This is a fundamentally different (and easier) problem than that of global state.

Let’s use the example of a function that will only run an action five times. On the sixth run it will instead perform a failure action.

Here’s how it looks normally: we create a mutable variable and read that from inside the (impure, yuck) function:

let mutable globalCounter = 0

let canOnlyRunFiveTimes passAction failAction = 
    match globalCounter < 5 with
    | true ->            
        passAction globalCounter
        globalCounter <- globalCounter + 1
    | false -> 
        failAction globalCounter
        globalCounter <- -1

Using our state monad, we can do exactly the same thing but we have controlled the impurity:

let monadicCounter = 0

let canOnlyRunFiveTimesWithStateMonad passAction failAction = 
    state { 
        let! s = getState
        match s < 5 with
        | true -> 
            passAction s
            do! setState (s + 1)
        | false -> 
            failAction s
            do! setState (-1)
    }

The best way to see this in action is by running some tests.

First, let’s run our function five times using the version that reads from a global mutable variable. It should return 5! Then run it a sixth time; it should return -1 to signify an error.

[<Test>]
let ``Global mutable state``() = 
  for _ in 1 .. 5 do canOnlyRunFiveTimes (printf "Run %d \r\n") (printf "Not run %d \r\n")
  globalCounter =! 5
  canOnlyRunFiveTimes (printf "Run %d \r\n") (printf "Not run %d \r\n")
  globalCounter =! -1

Here’s how it looks using global monadic state (no mutable keyword, whoop!!). We use our >=> operator from before that glues together two monad thingies to run the computation more than once:

[<Test>]
let ``Global monadic state``() =
  let monadicCounter = 0
  let m = canOnlyRunFiveTimesWithStateMonad (printf "Run %d \r\n") (printf "Not run %d \r\n") 
  let composedFiveTimes =  m >=> m >=> m >=> m >=> m
  let composedSixTimes =  composedFiveTimes >=> m
  Execute composedFiveTimes monadicCounter =! 5
  Execute composedSixTimes monadicCounter =! -1

The results are exactly the same!


Recap

Let’s look at what I promised you:

What you’ll learn today

  • Why we need to handle state functionally in the first place
  • How to convert imperative for and while loops to functional ones that embed the local state
  • How this can be formalised using the State monad and F# computation expressions
  • How to extend this to keep track of a global state without resorting to a mutable variable

I hope that you feel satisfied that you know these things now — if you don’t, let me know which bits are a struggle and how I can improve my explanation!

Sunday, June 19, 2016

Functional Creational Patterns


Much is made in the object-oriented world of applying Design Patterns. The majority of the time, the pattern in question is one from the canonical ‘Gang of Four’ book of the same name. I feel that I use these patterns in the manner intended — reusing a solution that has worked in the past to solve a new problem.

When learning a new paradigm, in my case functional programming with F#, I naturally had no previous solutions to fall back on, and thus no set of design patterns stored in my brain, patiently waiting their turn. This got me thinking about how well-known C# patterns can be applied to F# code, so I did a bit of searching and came up with a related talk by Scott Wlaschin. Whilst helpful, it was more about taking yourself out of an OO mindset as opposed to trying to reimagine existing patterns. I was still curious to answer this, which brought me to the question I’m about to answer in this post:

What do C# design patterns look like in F#?

I will do this in three installments to mirror those in the book: first Creational Patterns, then Structural Patterns and finally Behavioural Patterns.

The code can be found on GitHub.


Creational Patterns


A creational pattern abstracts the process of instantiating objects, decoupling a system from how its objects are represented and created.

There are five such patterns listed in the Gang of Four book. For each, I will:

  • briefly state the goal of the pattern (that is, what problem it is trying to solve);
  • show some minimal C# code using the pattern;
  • present some F# code solving the same problem; and
  • discuss the differences.


Abstract Factory


The goal:

This allows us to create families of related objects without specifying their concrete classes.

C# code:

using System;
namespace DesignPatterns
{
    public class UseAbstractFactoryPattern
    {
        public void Run ()
        {
            var printer = new ComponentInformationPrinter (new Type1Factory ());
            printer.PrintInformation ();
        }
    }

    public class ComponentInformationPrinter
    {
        public ComponentInformationPrinter (ComponentFactory componentFactory)
        {
            this.ComponentFactory = componentFactory;
        }

        private ComponentFactory ComponentFactory { get; }

        public void PrintInformation ()
        {
            var component = this.ComponentFactory.CreateComponent ();
            Console.WriteLine (component.Type ());
            Console.WriteLine (component.Category ());
        }
    }

    public abstract class ComponentFactory
    {
        public abstract Component CreateComponent ();
    }

    public abstract class Component
    {
        public abstract int Type ();
        public abstract string Category ();
    }

    public class Type1Component : Component
    {
        public override int Type () => 1;
        public override string Category () => "Category 1";
    }

    public class Type1Factory : ComponentFactory
    {
        public override Component CreateComponent () => new Type1Component ();
    }

    public class Type2Component : Component
    {
        public override int Type () => 2;
        public override string Category () => "Category 2";
    }

    public class Type2Factory : ComponentFactory
    {
        public override Component CreateComponent () => new Type2Component ();
    }
}

F# code:

module AbstractFactory

type Component = 
    | Type1Component
    | Type2Component

type ComponentFactory = 
    { Component : Component }

let getComponent (componentFactory : ComponentFactory) = componentFactory.Component

let getType = 
    function 
    | Type1Component -> 1
    | Type2Component -> 2

let getCategory = 
    function 
    | Type1Component -> "Category 1"
    | Type2Component -> "Category 2"

let printInformation componentFactory = 
    let componentInst = getComponent componentFactory
    printfn "%d" (getType componentInst)
    printfn "%s" (getCategory componentInst)

do printInformation { Component = Type1Component }

Discussion:

The code is more easily read if we distill it into the essence of what it is trying to achieve:

We want to specify the type of component we want, and in return be given a way to print information about the component.

In the C# version, this is done using inheritance and constructor injection: we create a ComponentInformationPrinter that accepts a ComponentFactory in its constructor, and then delegates to that factory to create the components which it then queries for their information. In turn, the components inherit from an abstract base class and provide their own implementation of the information-giving functions.

Whilst I could have made the F# code very similar (it does have classes, after all), the point of this post is to figure out what such patterns look like in a functional paradigm. Instead, I decided to use pattern matching functions to handle different types of components, which are defined using a discriminated union.

This has the effect that, were we to add another type of component, we would need to update all of these functions to handle that particular case (or provide a degenerate catch-all case in the match function). Note that nothing on the factory side of things need be changed; in fact, the factory is simply a record type that wraps the underlying Component.

This is in stark contrast to the C# approach where we instead define a new class that inherits from Component, implement any required abstract functions, and then write a new factory class that inherits from ComponentFactory in order to get the new type back.

So which is better?

I think it really depends on the use case. The functional approach has the advantage that the factory definition is almost implicit, and we can define new factories on-the-fly. However, it has the disadvantage that we need to modify existing code to add a new type of Component.

This is unsuitable for (e.g.) distributing to third parties as a library, who will not then have the capability to extend the functionality. One way around this would be to define an interface with two methods, GetType and GetCategory, which clients could implement themselves using object expressions.
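A rough sketch of that interface-based alternative might look like the following. The interface name IComponent is an assumption, and I have called the methods ComponentType and Category rather than GetType and GetCategory, purely to avoid any confusion with the GetType method that every .NET object already has:

```fsharp
// Hypothetical interface; the names are illustrative only
type IComponent =
    abstract member ComponentType : unit -> int
    abstract member Category : unit -> string

// Library code depends only on the interface
let describe (c : IComponent) =
    sprintf "%d, %s" (c.ComponentType()) (c.Category())

// A client adds a third kind of component without touching the library,
// using an object expression rather than a named class
let type3Component =
    { new IComponent with
        member this.ComponentType() = 3
        member this.Category() = "Category 3" }

printfn "%s" (describe type3Component)   // 3, Category 3
```

The trade-off flips compared with the discriminated union: adding a new component is now easy, but adding a new operation means changing the interface and every implementation of it.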

On the whole, the patterns don’t look too dissimilar in a functional language, but I’m not sure the F# implementation really solves a real-world requirement in the same way that it does in C# — we could dispense with the factory altogether in F# and it wouldn’t really change anything from a caller’s perspective. I think the C# version is thus more of a ‘complete’ pattern.


Builder


The goal:

To make constructing a complex object independent of its representation, meaning that we can construct different representations via the same process.

C# code:

using System;
namespace DesignPatterns
{
    public class UseBuilderPattern
    {
        public void Run ()
        {
            var director = new ItemDirector ();

            var builder = new YellowItemBuilder ();

            var builtItem = director.Construct (builder);

            builtItem.ShowParts ();
        }
    }

    public class ItemDirector
    {
        public Item Construct (ItemBuilder builder)
        {
            builder.BuildFirstPart ();
            builder.BuildSecondPart ();
            return builder.GetItem ();
        }
    }

    public abstract class ItemBuilder
    {
        public abstract void BuildFirstPart ();
        public abstract void BuildSecondPart ();
        public abstract Item GetItem ();
    }

    public class YellowItemBuilder : ItemBuilder
    {
        private readonly Item item = new Item ();

        public override void BuildFirstPart ()
        {
            item.FirstPartName = "YellowFirstPart";
        }

        public override void BuildSecondPart ()
        {
            item.SecondPartName = "YellowSecondPart";
        }

        public override Item GetItem () => item;
    }

    public class GreenItemBuilder : ItemBuilder
    {
        private readonly Item item = new Item ();

        public override void BuildFirstPart ()
        {
            item.FirstPartName = "GreenFirstPart";
        }

        public override void BuildSecondPart ()
        {
            item.SecondPartName = "GreenSecondPart";
        }

        public override Item GetItem () => item;
    }

    public class Item
    {
        public string FirstPartName;
        public string SecondPartName;

        public void ShowParts () => Console.WriteLine ($"{FirstPartName}, {SecondPartName}.");
    }
}

F# code:

module Builder

type Item = {mutable FirstPart : string; mutable SecondPart : string} 
    with member this.ShowParts() = sprintf "%s, %s." this.FirstPart this.SecondPart

type ItemBuilder =
    abstract member BuildFirstPart : unit->unit
    abstract member BuildSecondPart : unit->unit
    abstract member GetItem : unit -> Item
    abstract member item : Item

let yellowItemBuilder =
    // Create the item once; a property body would build a fresh record on every access,
    // so any mutation would be lost
    let item = {FirstPart = ""; SecondPart = ""}
    { new ItemBuilder with
        member this.item = item
        member this.BuildFirstPart() = item.FirstPart <- "YellowFirstPart"
        member this.BuildSecondPart() = item.SecondPart <- "YellowSecondPart"
        member this.GetItem() = item }

let construct(builder:ItemBuilder) =
    builder.BuildFirstPart()
    builder.BuildSecondPart()
    builder.GetItem()

let run = 
    let item = construct yellowItemBuilder
    item.ShowParts()

Discussion:

The F# code took me an unreasonable amount of time to write, with lots of failed attempts to ‘make it more functional’ along the way. Diving into the literature, one suggested way to implement the Builder pattern in F# is to pass in a list of creation/processing functions to the director.

However, to me this is at odds with one of the main points of the pattern, that the director gets to choose the ordering of steps. If it takes a list of functions in, then the caller gets to choose the order.

Overall, I struggled to see how to solve this particular problem in a functional way. After all, the Builder pattern relies fundamentally on mutating things, which doesn’t sit nicely in the functional world.
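For what it’s worth, here is one possible sketch that avoids mutation while still letting the director choose the order of the steps. It is not taken from the literature mentioned above; it simply assumes that each build step can be a pure function from Item to Item, with the builder being a record of such steps:

```fsharp
type Item = { FirstPart : string; SecondPart : string }

// Hypothetical builder: a record of pure steps, each returning an updated copy
type ItemBuilder =
    { BuildFirstPart : Item -> Item
      BuildSecondPart : Item -> Item }

let yellowItemBuilder =
    { BuildFirstPart = fun item -> { item with FirstPart = "YellowFirstPart" }
      BuildSecondPart = fun item -> { item with SecondPart = "YellowSecondPart" } }

// The director still decides which steps run, and in which order
let construct (builder : ItemBuilder) =
    { FirstPart = ""; SecondPart = "" }
    |> builder.BuildFirstPart
    |> builder.BuildSecondPart

let item = construct yellowItemBuilder
printfn "%s, %s." item.FirstPart item.SecondPart   // YellowFirstPart, YellowSecondPart.
```

Whether this still counts as the Builder pattern is debatable, since the ‘object under construction’ is now just a value being passed along a pipeline.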


Factory Method


The goal:

To allow subclasses to decide what type of class to instantiate.

This is a similar but subtly distinct goal from that of the Abstract Factory pattern above — to see the precise differences I am reusing the code from that example and showing only those sections that have changed.

C# code:

public class UseFactoryMethod
{
    public void Run ()
    {
        var printer = new ComponentInformationPrinter ();
        printer.PrintInformation ();
    }
}

public class ComponentInformationPrinter
{
    public virtual Component CreateComponent () =>  null;
    public void PrintInformation ()
    {
        var component = CreateComponent ();
        Console.WriteLine (component.Type ());
        Console.WriteLine (component.Category ());
    }
}

public class Type1Printer : ComponentInformationPrinter
{
    public override Component CreateComponent () => new Type1Component ();
}

public class Type2Printer : ComponentInformationPrinter
{
    public override Component CreateComponent () => new Type2Component ();
}

F# code:

let printInformation createComponent = 
    let componentInst = createComponent()
    printfn "%d" (getType componentInst)
    printfn "%s" (getCategory componentInst)

do printInformation (fun () -> Type1Component)

Discussion:

It should be clear from the code that this is really a subtle mutation of the Abstract Factory case which essentially gets rid of the factory class and moves the ‘hook’ into the ComponentInformationPrinter itself (i.e. to CreateComponent).

As such, the conclusions are much the same too — the patterns do broadly the same thing in both C# and F#.


Prototype

The goal:

To be able to create new objects by copying a prototypical instance.

C# code:

using System;
namespace DesignPatterns.Prototype
{
    public class UsePrototypes
    {
        public void Run ()
        {
            Component prototype = new Type1Component();
            var printer = new ComponentInformationPrinter (prototype);
            printer.PrintInformation ();
        }
    }

    public class ComponentInformationPrinter
    {
        public ComponentInformationPrinter (Component prototype)
        {
            this.prototype = prototype;
        }

        private Component prototype { get; }

        public void PrintInformation ()
        {
            var component = this.prototype.Clone ();
            Console.WriteLine (component.Type ());
            Console.WriteLine (component.Category ());
        }
    }

    public abstract class Component
    {
        public abstract int Type ();
        public abstract string Category ();
        public abstract Component Clone ();
    }

    public class Type1Component : Component
    {
        public override int Type () => 1;
        public override string Category () => "Category 1";
        public override Component Clone () => new Type1Component ();
    }

    public class Type2Component : Component
    {
        public override int Type () => 2;
        public override string Category () => "Category 2";
        public override Component Clone () => new Type2Component ();
    }
}

F# code:

let printInformation createComponent = 
    let componentInst = createComponent()
    printfn "%d" (getType componentInst)
    printfn "%s" (getCategory componentInst)

do printInformation (fun () -> Type1Component)

Discussion:

We can actually implement this pattern as another variation on the Abstract Factory pattern, this time injecting any required prototypes into the constructor of ComponentInformationPrinter and augmenting the Component class with a Clone() method.

In more complex examples of classes with data as well as behaviour, Clone() would have to decide exactly how much information to copy over (deep vs. shallow etc.), but here we simply have some baked-in behaviour that will transfer over.

The eagle-eyed amongst you will have noticed that the F# code is identical to that used for the Factory Method pattern. The createComponent function was not only a factory method but also a prototype: because we were modelling Component as a discriminated union, whatever we created was an immutable instance of that union.


Singleton


The goal:

Ensure that only one instance of a particular class can exist.

C# code:

public sealed class Singleton
{
    private static readonly Singleton instance = new Singleton ();

    static Singleton (){ }

    private Singleton () { }

    public static Singleton Instance => instance;
}

F# code:

type Singleton private() =
    static let instance = lazy(Singleton())
    static member Instance with get() = instance.Value

Discussion:

The C# version looks deceptively simple. However, there are so many other versions of trying the same thing that run into problems with thread-safety, double locks, etc. that the pattern becomes slightly dangerous to use, especially for the less experienced developer.

I find it unlikely that someone who hasn’t written lots of singletons before will remember that they require a static constructor so that the C# compiler does not mark the class as beforefieldinit, and that deadlocks and performance problems are real possibilities, so to me this pattern doesn’t look ideal in C#.

On the other hand, the F# code is supposed to be less idiomatic, but actually looks better in my eyes thanks to static let bindings. It’s shorter and much clearer: we have a ‘thing’ (instance) and a way to get the one version of it that can ever exist, and that’s it!

Overall, this very common pattern is short & sweet in F#.


Conclusion & Next Time


Object creation looks quite different in C# and F#, and this flows through to the use of creational design patterns. As F# is an impure functional language, it has often felt like I have been using the less idiomatic parts of the language to write these object-oriented constructs. I suppose this is reassuring in a way — after all, the patterns were designed with OO in mind!

Next time we will look at Structural Patterns, and I expect there to be much bigger differences.

Monday, May 30, 2016

What's in a Monad?


Over the last few months I’ve progressed from the basics of functional programming to writing real-life, production code in F#. Inevitably, the concept of monads has come up.

Trying to learn what these slippery things are has been tough. The tutorials out there range from a lovely pictorial representation of Haskell monads to an example-driven series showing how common C# types are in fact monads to focusing on the actual functions that monads have in F#.

It’s fair to say that no single article or even series of posts made me sit up and think ‘ah, so that’s what a monad is!’, which seems to tally with other people’s experience. For me, what made things click was asking the right question:

Why do we have monads?


Back to the start


It sounds a little strange when you put it like this, but so much of what I initially read on the topic focused on more esoteric questions that, to me, didn’t get to the crux of the matter, things like:

  • What is a monad?
  • What are the monad laws?
  • What does [insert monad name here] look like in Haskell/F#/…

I wanted to know why. Something must have driven computer scientists to look at category theory and think ‘maybe these will help us write programs!’

To understand this reasoning, I went back to the start. This meant reading the original paper that introduced monads back in 1996.

If this sounds scary to you, I don’t blame you, but bear with me. The paper is so nicely written and well explained that it made a huge difference to my understanding, and finally got my brain to connect the dots.

It did this by starting with a problem, and ending with the solution. So when I sat down to write my first production F# code, and I immediately came across a very similar problem, I already knew what the answer was.

In this post, I’m going to walk you through not only how I used monads to solve a real-world programming problem, but also say why they are such a good fit for the solution technique.

This is going to be in three sections:

  • What is the problem we want to solve?
  • How do we normally solve it in an object-oriented or imperative fashion?
  • How can monads help us solve it more functionally?

A Common Problem


I write financial applications, and wanted a way for someone to select some asset data in Excel and send it to a central database. In more detail, this meant:

  • getting the input from a user through an ExcelDNA function;
  • validating that the input was correctly formed;
  • checking the existing database for similar records;
  • checking that links to other applications were valid; and
  • posting the data to the database.

I realised quickly that this looked pretty similar to an example on Scott Wlaschin’s site, in which he uses the Either Monad to do something pretty similar, using the concept of a railway to explain things.

First though, I’ll outline how we would typically solve this in an imperative fashion, using C# as an example.


The Imperative Approach


In C#, this kind of thing might typically be done like this:

    [ExcelFunction (IsMacroType = true)]
    public static object StoreAssetData ([ExcelArgument] object [,] inputAllocation)
    {
        bool isValid = ValidateInput (inputAllocation);

        if (!isValid) 
        {
            return ExcelError.ExcelErrorValue;
        }

        bool similarDataAlreadyExists;
        try 
        {
            similarDataAlreadyExists = CheckExistingDatabase (inputAllocation);
        } 
        catch 
        {
            return ExcelError.ExcelErrorValue;
        }


        if(similarDataAlreadyExists) 
        {
            return ExcelError.ExcelErrorValue;
        }

        bool linksToOtherAppsAreValid;
        try 
        {
            linksToOtherAppsAreValid = CheckOtherServices (inputAllocation);
        } 
        catch 
        {
            return ExcelError.ExcelErrorValue;
        }

        if (!linksToOtherAppsAreValid) 
        {
            return ExcelError.ExcelErrorValue;
        }

        try
        { 
            PostDataToDatabase(inputAllocation);
        } 
        catch
        {
            return ExcelError.ExcelErrorValue;
        }

        return "Done!";
    }

To me, there are a few problematic areas here. First, the control flow is hard to follow, and we keep getting forced to return early from the method if one bit fails. Second, operations that might throw exceptions need to be wrapped in fairly generic try-catch blocks, with the result of those operations defined outside of the block to be accessible elsewhere. Third, your focus is lost amongst the outline of the curly braces and you can’t easily figure out what the method wants to do.

Specifics aside, the imperative style looks at the core operations as then rather than and. That is, we are saying ‘validate this, then check that, then do this’, which has the problem that we need to check what state we’re in after each operation.

From a logical standpoint, we want to be able to say ‘validate and check and…’, and once it’s all done, tell me if it’s worked or not.

To do this, we need composition.


How Do Monads Help?


As scary as they sound, monads are exactly about this kind of thing: composition, chaining, pipelining, and all these good functional things.

To help with our problem we need to be able to do a couple of very specific things:

  • Parallel composition: in this example we’ve got an array of entries to validate. Ideally we want to check every column of every row and tell the user about all of their mistakes, not just the first one.
  • Sequential composition: here we’ve got several steps that check for inconsistencies with existing data. However, if the data input to the function is malformed then there’s no point in doing this so we want some way of ‘skipping’ certain steps if previous ones failed (without exiting early from the routine and proliferating our curlies)
  • Result extraction: we want to get right to the end of the method and be able to say ‘this is the result’. Moreover if it’s not good news, we want to have as much information to give to the user about what went wrong as we can.

Fortunately, it turns out that we can do all of this with monads.


Briefly Introducing Result


type Result<'TSuccess, 'TFailure> = 
    | Success of 'TSuccess
    | Failure of 'TFailure

Here’s the type that we will be using to ease all of our pains. The idea is simple: the result of doing something is either Success or Failure, and in each case we include some data — the success case carries the input data that we want to feed to the next operation, and the failure case carries error messages.

This is beautiful in its simplicity and actually gives us everything we wanted. We have a way to compose operations that are successful by passing whatever we want in the success case, and a way to signal prior failure by passing through error messages in our failure wrapper.

All we need to do now is see exactly how this composition works.


Parallel Composition: apply


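// Combine two results: successes are merged with addSuccess,
// failures are merged with addFailure.
let plusResult addSuccess addFailure result1 result2 = 
    match result1, result2 with
    | Success x1, Success x2 -> Success(addSuccess x1 x2)
    | Failure y1, Success _ -> Failure y1
    | Success _, Failure y2 -> Failure y2
    | Failure y1, Failure y2 -> Failure(addFailure y1 y2)

// 'id f x' is just 'f x', so two successes apply the wrapped function
// to the wrapped value; two failures have their error lists concatenated.
let apply f result = plusResult id (@) f result

// Defined with explicit arguments so that the operator stays generic
// and can be used with columns of different types.
let (<*>) f result = apply f result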

Composing things in parallel is done with apply. In English, we take the results of two things (result1 and result2, each of which will either be Success or Failure), and say what to do in each of the possible cases:

  • If we’ve got two successes, apply the function carried by the first to the value carried by the second. This is what passing the identity function id achieves, because id f x is simply f x.
  • If we’ve got one failure and one success, return the Failure along with its error data.
  • If we’ve got two failures, return a new Failure with the error messages from each concatenated using infix list concatenation (@).

The infix version <*> is handy for neatening up the syntax, so we can write f1 <*> f2 <*> f3 rather than apply (apply f1 f2) f3.
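
To see the error accumulation in action, here is a small self-contained sketch. The validators nonEmpty and positive and the makeHolding function are hypothetical names that I have made up for this illustration; the operator is written with explicit arguments so that it can be reused at different types.

```fsharp
type Result<'TSuccess, 'TFailure> =
    | Success of 'TSuccess
    | Failure of 'TFailure

let plusResult addSuccess addFailure result1 result2 =
    match result1, result2 with
    | Success x1, Success x2 -> Success(addSuccess x1 x2)
    | Failure y1, Success _ -> Failure y1
    | Success _, Failure y2 -> Failure y2
    | Failure y1, Failure y2 -> Failure(addFailure y1 y2)

let apply f result = plusResult id (@) f result
let (<*>) f result = apply f result

// Hypothetical validators, each returning a list of messages on failure.
let nonEmpty fieldName (s : string) =
    if System.String.IsNullOrWhiteSpace s then Failure [ fieldName + " was empty" ]
    else Success s

let positive fieldName (x : float) =
    if x > 0.0 then Success x
    else Failure [ fieldName + " was not positive" ]

// Hypothetical constructor that is lifted into Result and fed one argument at a time.
let makeHolding (name : string) (value : float) = (name, value)

let good = Success makeHolding <*> nonEmpty "Name" "Gilt" <*> positive "Value" 100.0
let bad = Success makeHolding <*> nonEmpty "Name" "" <*> positive "Value" (-1.0)
```

In this sketch good is Success ("Gilt", 100.0), whereas bad is a Failure holding both messages, "Name was empty" and "Value was not positive", in that order.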


Sequential Composition: bind


let either successFunc failureFunc = 
    function 
    | Success x -> successFunc x
    | Failure y -> failureFunc y

let bind f = either f Failure

let (>>=) x f = bind f x

This is even easier: we pass something to a function. If the thing is Success, we apply the function to the data within; if it’s Failure, we return the existing error messages.
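
A short sketch of the 'skipping' behaviour, assuming two hypothetical steps of my own invention (parseInt and checkAge): the second step only runs if the first one succeeded.

```fsharp
type Result<'TSuccess, 'TFailure> =
    | Success of 'TSuccess
    | Failure of 'TFailure

let either successFunc failureFunc =
    function
    | Success x -> successFunc x
    | Failure y -> failureFunc y

let bind f = either f Failure
let (>>=) x f = bind f x

// Hypothetical first step: is the input well formed?
let parseInt (s : string) =
    match System.Int32.TryParse(s) with
    | true, n -> Success n
    | false, _ -> Failure [ sprintf "'%s' is not an integer" s ]

// Hypothetical second step: only meaningful for well-formed input.
let checkAge (age : int) =
    if age >= 0 && age <= 120 then Success age
    else Failure [ sprintf "%d is not a plausible age" age ]

let validateAge input = parseInt input >>= checkAge
```

With these definitions, validateAge "42" succeeds, validateAge "abc" fails in the first step and never reaches checkAge, and validateAge "200" fails in the second step.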


Result Extraction: either


Getting the result out at the end of our computation is simply a more general case of sequencing things together — we do what we did for bind but allow the user to pass in whatever success or failure functions they want.

This has other benefits, for example testing. Let’s say that in production we want a popup displaying either the error message or a happy success one. When we test, we’ve already abstracted away the nasty UI/popup concerns and can inject our own test implementations.
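
Here is a rough sketch of that idea. The handler functions toUserMessage and succeeded are hypothetical stand-ins that I have invented: the first plays the role of the production popup text, the second is the kind of thing a test might inject.

```fsharp
type Result<'TSuccess, 'TFailure> =
    | Success of 'TSuccess
    | Failure of 'TFailure

let either successFunc failureFunc =
    function
    | Success x -> successFunc x
    | Failure y -> failureFunc y

// Hypothetical "production" handlers: turn the outcome into text for the user.
let toUserMessage result =
    either
        (fun (count : int) -> sprintf "Loaded %d assets" count)
        (fun (errors : string list) -> "Could not load: " + String.concat "; " errors)
        result

// Hypothetical "test" handlers: only care whether the computation succeeded.
let succeeded result = either (fun _ -> true) (fun _ -> false) result

let happy = toUserMessage (Success 3)
let sad = toUserMessage (Failure [ "Row 1 is bad"; "Row 2 is bad" ])
```

The computation that produces the Result does not change between the two; only the pair of functions passed to either does.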


Putting It Into Practice


It looks like we’ve got the pieces that we need — and I haven’t even had to mention the monad laws! It’s deceptively simple stuff once you boil it down to exactly why you need each individual part of a monad.

And so to an actual solution. This is how we read our raw input rows containing data:

let getRows (rows : obj [,]) = 
    seq { 
        for i in 0..(rows.GetLength(0) - 1) do
            let asset = readRow rows.[i, *]
            match asset with
            | Success x -> yield Success x
            | Failure failureMessages -> 
                yield Failure(sprintf "Row %d has problems: " i + String.concat "; " failureMessages)
    }

For each row, we read it. If the data is good, we yield it. If it’s not, we aggregate all the failure messages that we got back from reading it and yield a new one with some more contextual information.

Here’s how we read a single row:

let readRow (row : obj []) = 
    let createAsset displayName portfolioName marketValue collateral fund strategy = 
        { DisplayName = displayName
          PortfolioName = portfolioName
          MarketValue = marketValue
          CollateralClassification = collateral
          FundName = fund
          StrategyName = strategy }
    Success createAsset 
    <*> getString [ "DisplayName was missing or empty" ] row.[DisplayNameColumn] 
    <*> getStringOption row.[PortfolioNameColumn] 
    <*> getDouble [ (sprintf "MarketValue (%O) was not a valid number" row.[MarketValueColumn]) ] row.[MarketValueColumn] 
    <*> getStringOption row.[CollateralColumn] 
    <*> getStringOption row.[FundColumn]
    <*> getStringOption row.[StrategyColumn]

We use the infix version of apply to read each column in turn and aggregate any problems we may have together. The helper functions getString, getStringOption and getDouble do pretty much what you would expect: they take an individual element of the input array and, if it is of the desired type, return Success; otherwise they return Failure with the injected error message.

We combine these with our result creation using bind:

let readAssetData inputAssetData = 
    inputAssetData.InputAssets
    |> getRows
    |> foldResults
    >>= createAssetData

where the createAssetData function is just a factory method to format the result, and foldResults simply combines the sequence of results yielded by getRows into a single result.
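
The body of foldResults isn’t shown in this post, so the following is only a guess at its shape rather than the real implementation. It assumes that each row result is either a value or a single message, and it collects all the values if every row succeeded, or all the messages otherwise.

```fsharp
type Result<'TSuccess, 'TFailure> =
    | Success of 'TSuccess
    | Failure of 'TFailure

// Assumed behaviour: keep every success while nothing has failed;
// once anything fails, keep only the failure messages.
let foldResults results =
    let folder acc next =
        match acc, next with
        | Success xs, Success x -> Success(x :: xs)
        | Success _, Failure msg -> Failure [ msg ]
        | Failure msgs, Success _ -> Failure msgs
        | Failure msgs, Failure msg -> Failure(msg :: msgs)
    // Items are prepended while folding, so restore the original order at the end.
    match Seq.fold folder (Success []) results with
    | Success xs -> Success(List.rev xs)
    | Failure msgs -> Failure(List.rev msgs)

let allGood : Result<int, string> list = [ Success 1; Success 2 ]
let someBad : Result<int, string> list = [ Success 1; Failure "Row 1 is bad"; Failure "Row 2 is bad" ]
```

Under these assumptions, folding allGood gives one Success holding both values, and folding someBad gives one Failure holding both messages.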

To spice things up a little, extracting the result can be done in C# by calling into the either function (in F#) we saw earlier:

public static class ResultExtensions
{
    internal static TResult Result<TSuccess, TFailure, TResult>(
        this Result<TSuccess, TFailure> twoTrackInput,
        Func<TSuccess, TResult> successFunc,
        Func<TFailure, TResult> failureFunc)
    {
        var successConverter = new Converter<TSuccess, TResult>(successFunc);
        var failureConverter = new Converter<TFailure, TResult>(failureFunc);

        var onSuccess = FSharpFunc<TSuccess, TResult>.FromConverter(successConverter);
        var onFailure = FSharpFunc<TFailure, TResult>.FromConverter(failureConverter);

        var result = either(onSuccess, onFailure, twoTrackInput);

        return result;
    }
}

This is a little verbose but shows how the language interop works, and means that we can use higher-order functions in C# (yay!) to do things like logging and error handling.

Adding logging

Let’s say we want to add logging to our application. We could use something like aspect-oriented programming and bring in loads of heavyweight third-party frameworks, or we could use monads:

    public static readonly Func<string, object> LogFailure = errorMessage =>
    {
        Logger.Log(errorMessage);
        return ExcelError.ExcelErrorValue;
    };

This function is passed into Result as the failureFunc parameter, and magically logs all of our failure messages!
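
For comparison, the same trick can be sketched directly in F#. The in-memory log and the names logFailure and valueOrLog are assumptions of mine for the sake of a runnable example; a real logger would write somewhere more useful.

```fsharp
type Result<'TSuccess, 'TFailure> =
    | Success of 'TSuccess
    | Failure of 'TFailure

let either successFunc failureFunc =
    function
    | Success x -> successFunc x
    | Failure y -> failureFunc y

// Hypothetical logger that records messages in memory.
let logged = ResizeArray<string>()

// Log the failure message, then hand back a fallback value.
let logFailure fallback (errorMessage : string) =
    logged.Add errorMessage
    fallback

// Successes pass straight through; failures are logged and replaced by the fallback.
let valueOrLog fallback result = either id (logFailure fallback) result

let fine = valueOrLog 0.0 (Success 42.0)
let broken = valueOrLog 0.0 (Failure "Row 3 has problems")
```

Only the failing call adds an entry to the log, and none of the validation code needs to know that logging exists.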


Wrapping Up


In this post we’ve seen how a common imperative solution to a standard business problem can be transformed into something much neater and more scalable using some functional programming concepts.

The fact that these concepts have their roots in semi-obscure mathematical theory should not put you off using them — to see this, begin with the why.

Ask yourself why monads are useful, and with a bit of reading you will figure it out. Then next time you see a problem like this one, you’ll know exactly what to do.

Sunday, May 22, 2016

SOLID - F# vs. C#


Do you want to write code that’s more testable, maintainable, extensible? Of course you do!

However much you think you already code in such a fashion, how many of you have worked on a shared codebase where:

  • a) there are so many thousands of classes, layers of indirection or abstraction that you just don’t know what does what?
  • b) a crisp, clean piece of architecture that you originally constructed has been slowly mangled by less experienced team members?
  • c) everything important goes through one massive ‘God class’ and you just know that when a change needs making, that class is the one to look at first?

I’m going to guess that most of you would put your hand up.

What if I told you there was a way to stop any of these things happening in your codebases? The traditional answer might be SOLID. Mine is functions. By the end of this post, I hope to have convinced you!


Background

Last week, I watched Mark Seemann’s excellent Pluralsight course on Encapsulation and SOLID.

In it, he talks about there being some duality between applying Uncle Bob’s SOLID principles of object-oriented design, and using functional programming. He, and others, have also blogged in similar veins:

http://blog.ploeh.dk/2014/03/10/solid-the-next-step-is-functional/
http://gorodinski.com/blog/2013/09/18/oop-patterns-from-a-functional-perspective/
https://neildanson.wordpress.com/2014/03/04/it-just-works/

These posts have been pretty good in giving some idea as to why the two concepts are similar, but I want to approach the issue from another angle: learning.


The Woes of the Junior Developer

As someone that’s recently had to learn object-oriented programming (self-directed, from scratch), I can safely say I haven’t seen a ‘beginners’ guide to SOLID that has any chance of teaching an inexperienced developer how to write code that adheres to these principles.

It’s hard enough trying to write code in an existing codebase that matches whatever million-tiered architecture the original authors came up with, and to learn the right time to create an abstraction, how to use polymorphism effectively, ensure you are encapsulating things properly, etc.

Then, someone tells you that objects in a program should be replaceable with instances of their subtypes without altering the correctness of that program, and you think that’s all well and good, but what on earth does that mean in terms of real code?

The same applies in lesser or greater measure to the remaining four principles. They clearly work and result in maintainable, flexible, testable code, but when you’re staring down the barrel of a couple of million lines of someone else’s handiwork, it’s nigh-on impossible to figure out how to ‘solidify’ what you’re looking at.

On the other hand, I said before that there is a duality between SOLID and functional programming. So how easy is it to teach the relevant concepts in a functional language? The rest of this post will show you that the answer is: very!


Why does this matter?

I’ve just told you that you can get the same results (flexibility, testability, extensibility, maintainability, …) that applying the SOLID principles gives, but do so in a way that’s


What’s up next?

I will now:

  • Give a very quick introduction to each of the SOLID principles.
  • Show an example of violating the principle in C#.
  • See how we can fix the problems using object-oriented code.
  • Present an alternative, functional approach in F#.
  • Explain why the functional approach is easier to learn and teach.

All of the code for the examples below can be found at https://github.com/douglasbruce88/SOLID


Single Responsibility Principle

What is it?

A class or module should only do one thing (and hopefully do it well!)

Give me an example of breaking it

The class below both counts the number of commas in a given string, and logs that number. It thus does two things.

public class DoesTwoThings
{
    public int NumberOfCommas (string message)
    {
        int count = 0;
        foreach (char c in message)
        {
            if (c == ',') count++;
        }
        return count;
    }

    public void Log (string message)
    {
        Console.WriteLine ($"The string you gave me has { NumberOfCommas(message) } commas");
    }
}

How can I fix it in C#?

Assuming that we actually wanted the class to do the comma counting, we use dependency injection to stick in a logger and delegate the logging functionality to that. This involves creating an interface, a non-default constructor, and a class field.

public class DoesOneThing
{
    readonly ILogger Logger;

    public DoesOneThing (ILogger logger)
    {
        this.Logger = logger;
    }

    public int NumberOfCommas (string message)
    {
        int count = 0;
        foreach (char c in message) {
            if (c == ',') count++;
        }
        return count;
    }

    public void Log (string message)
    {
        this.Logger.Log($"The string you gave me has { NumberOfCommas (message) } commas");
    }
}

public interface ILogger
{
    void Log (string message);
}

What about F#?

I will admit that the code to count the number of commas is not strictly in functional style, but I wanted to keep it as similar to the C# variant as I could. The more functional approach using recursion can be found with the rest of the blog code.

All we do to solve the SRP issue is use higher-order functions, i.e. the logger. This logger can be any function with the required signature (in this case string -> unit). It doesn’t need to implement an interface (which can be tough if the code doesn’t adhere to the ISP, more on which to follow), and we don’t need a constructor or a field in the module.

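module DoesOneThing = 

    // Count the commas in the string (kept imperative to mirror the C# version).
    let numberOfCommas (s : string) = 
        let mutable count = 0
        for c in s do if c = ',' then count <- count + 1
        count

    // 'logger' is the injected function, typically of type string -> unit.
    let log logger (message : string) = 
        logger (sprintf "The string you gave me has %i commas" (numberOfCommas message))
        |> ignore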

Why is this easier?

Contrast the OO approach:

Oh, you didn’t know about dependency injection? Here, let me explain using about twelve open-source frameworks and a 400-page book. Once you’re finished with that (not on company time, mind!), you can easily solve the problem we had.

… and by the way, the ILogger interface actually has eight methods that you need to implement, not just the one. Enjoy!

… with the functional approach

Use a higher-order function for the logger. As long as the signature matches, you’re good to go!
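
To make that concrete, here is a small sketch with two interchangeable loggers. Both logger functions are hypothetical, and the comma count uses a pipeline instead of a loop, which is my own variation and not necessarily the version in the repository.

```fsharp
module DoesOneThing =
    // Expression-based comma count (an assumed alternative to the mutable loop).
    let numberOfCommas (s : string) =
        s |> Seq.filter (fun c -> c = ',') |> Seq.length

    let log (logger : string -> unit) (message : string) =
        logger (sprintf "The string you gave me has %i commas" (numberOfCommas message))

// Any string -> unit function will do: the console...
let consoleLogger (text : string) = printfn "%s" text

// ...or an in-memory collector, which is handy in tests.
let captured = ResizeArray<string>()
let capturingLogger (text : string) = captured.Add text

DoesOneThing.log consoleLogger "a,b,c"
DoesOneThing.log capturingLogger "a,b,c"
```

Neither logger knows anything about the module, and the module knows nothing about them beyond the signature.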

FP: one; OO: nil.


Open/Closed Principle

What is it?

A class or module should be open for extension, but closed for modification.

Give me an example of breaking it

Let’s say we are playing Monopoly. The rules say that if we roll a double, we get to roll again, so we write the code below. Looking more closely at the rules, three doubles in a row sends us to jail.

To make the code adhere to this rule, we have to modify the CanRollAgain method, so the class isn’t closed for modification.

public class OpenForModification
{
    public bool CanRollAgain (int firstDieScore, int secondDieScore)
    {
        return firstDieScore == secondDieScore;
    }
}

How can I fix it in C#?

One way is to make the class and implementation of the rules abstract, and then allow someone to extend the functionality by inheriting from the class.

public abstract class ClosedForModification
{
    public bool CanRollAgain (int firstDieScore, int secondDieScore)
    {
        return CanRollAgainImpl (firstDieScore, secondDieScore);
    }

    public abstract bool CanRollAgainImpl (int firstDieScore, int secondDieScore);
}

public class ThreeDoubles : ClosedForModification
{
    readonly bool LastTwoRollsAreDoubles;
    public ThreeDoubles (bool lastTwoRollsAreDoubles)
    {
        this.LastTwoRollsAreDoubles = lastTwoRollsAreDoubles;
    }

    public override bool CanRollAgainImpl (int firstDieScore, int secondDieScore)
    {
        return !LastTwoRollsAreDoubles && (firstDieScore == secondDieScore);
    }
}

What about F#?

Use a higher-order function. We allow the caller to inject an arbitrary rule function that takes in two integers and returns a bool.

let canRollAgain (firstRoll : int) (secondRoll : int) ruleSet : bool = 
    ruleSet firstRoll secondRoll

Why is this easier?

Contrast the OO approach:

So you’ll need to be able to figure out whether it’s best to use an abstract base class with an abstract implementation method, or a non-abstract base class with a virtual method and some default behaviour. Maybe go and learn about inheritance first, and then ask yourself why we didn’t use polymorphism instead.

… with the functional approach

Use a higher-order function for the rules. As long as the signature matches, you’re good to go!
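
As a quick sketch, here are two rule functions that can be swapped in without touching canRollAgain. The names doublesRule and threeDoublesRule are mine; the second mirrors the ThreeDoubles class from the C# fix by closing over a flag.

```fsharp
let canRollAgain (firstRoll : int) (secondRoll : int) ruleSet : bool =
    ruleSet firstRoll secondRoll

// Original rule: a double earns another roll.
let doublesRule (first : int) (second : int) = first = second

// Extended rule: a double earns another roll unless the last two rolls were doubles too.
let threeDoublesRule (lastTwoRollsAreDoubles : bool) (first : int) (second : int) =
    not lastTwoRollsAreDoubles && first = second

let simple = canRollAgain 3 3 doublesRule
let thirdDouble = canRollAgain 3 3 (threeDoublesRule true)
let firstDouble = canRollAgain 3 3 (threeDoublesRule false)
```

Partially applying threeDoublesRule to the flag produces a function with exactly the signature that canRollAgain expects, so the new rule is an extension and not a modification.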

FP: two; OO: nil.


Liskov Substitution Principle

What is it?

You should be able to swap an interface with any implementation of it without changing the correctness of the program.

Give me an example of breaking it

Let’s use the ILogger interface that we introduced in the Single Responsibility Principle section. Clearly the two implementations do quite different things.

public class ConsoleLogger : ILogger
{
    public void Log (string message)
    {
        Console.WriteLine (message);
    } 
}

public class NewLogger : ILogger
{
    public void Log (string message)
    {
        throw new NotImplementedException ();
    }
}

How can I fix it in C#?

In a way, you can’t. Whilst you can do something more sensible in NewLogger, it doesn’t stop someone coming along a few months later and writing

public class AnotherLogger : ILogger
{
    public void Log (string message)
    {
        throw new NotImplementedException ();
    }
}

What about F#?

Going back to the F# version, we had this injected function logger with a signature of string -> unit that we had to match.

It gets a little woolly here — there is nothing stopping you from writing

let logger : string -> unit = fun _ -> raise (NotImplementedException())

In a purer functional language such as Haskell, you wouldn’t be able to implicitly throw such an exception. However, in F#, the real solution is to use something like Railway-Oriented Programming, which uses the Either monad (also known as the Exception monad) to ensure that you always return something from your functions.

This relies on the developer adhering to the design principle of not throwing exceptions. Whilst this is certainly easier to do in a functional language, it’s still something else to remember.
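
One possible way of following that principle is sketched below, under my own assumptions: a hypothetical adapter, toSafeLogger, runs a logger at the boundary and reports the outcome as a Success or Failure value, so that callers see the failure in the return type and no exception escapes.

```fsharp
open System

type Result<'TSuccess, 'TFailure> =
    | Success of 'TSuccess
    | Failure of 'TFailure

// A logger that breaks its contract by throwing when it is called.
let brokenLogger : string -> unit = fun _ -> raise (NotImplementedException())

// Hypothetical adapter: run a logger and return the outcome as a value.
let toSafeLogger (logger : string -> unit) : string -> Result<unit, string list> =
    fun message ->
        try
            logger message
            Success()
        with ex -> Failure [ ex.Message ]

let safeConsole = toSafeLogger (printfn "%s")
let safeBroken = toSafeLogger brokenLogger
```

Both safeConsole and safeBroken now have the same honest signature, so swapping one for the other changes the value that comes back but cannot crash the caller.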

Why is this easier?

It’s a little harder to justify here, but to me the solution is much firmer in F# — the concept of not throwing exceptions is more natural in a language where we can create constructs (such as the Either monad) to show us another way of handling exceptional situations.


Interface Segregation Principle

What is it?

A client of an interface should not be made to depend on methods it doesn’t use.

Give me an example of breaking it

Using the logging example again (and thinking about NUnit’s version of ILogger), if we have an interface as below and want to do some kind of tracing on the method, we have no need for the LogError method on ILogger.

interface ILogger
{
    void LogError (string message);

    void LogDebug (string message);
}

public class ClientWithLogging
{
    readonly ILogger Logger;

    public ClientWithLogging (ILogger logger)
    {
        this.Logger = logger;
    }

    public int NumberOfCommasWithLogging (string message)
    {
        this.Logger.LogDebug ($"Counting the number of commas in {message}");
        int count = 0;
        foreach (char c in message) {
            if (c == ',') count++;
        }

        this.Logger.LogDebug ($"{message} has {count} commas.");
        return count;
    }
}

How can I fix it in C#?

Prefer a Role Interface over a Header Interface. That is, define two interfaces doing one job each.

public interface IErrorLogger
{
    void LogError (string message);
}

public interface IDebugLogger
{
    void LogDebug (string message);
}

public class ClientWithLogging
{
    readonly IDebugLogger Logger;

    public ClientWithLogging (IDebugLogger logger)
    {
        this.Logger = logger;
    }

    public int NumberOfCommasWithLogging (string message)
    {
        this.Logger.LogDebug ($"Counting the number of commas in {message}");
        int count = 0;
        foreach (char c in message) {
            if (c == ',') count++;
        }

        this.Logger.LogDebug ($"{message} has {count} commas.");
        return count;
    }
}

There’s a slight twist to this one, though. These kinds of single-method interfaces are essentially just delegates, and delegates are just encapsulations of function signatures.

The delegate form looks like this:

public delegate void DebugLogger (string message);

public class ClientWithLogging
{
    readonly DebugLogger LogDebug;

    public ClientWithLogging (DebugLogger debugLogger)
    {
        this.LogDebug = debugLogger;
    }

    public int NumberOfCommasWithLogging (string message)
    {
        this.LogDebug ($"Counting the number of commas in {message}");
        int count = 0;
        foreach (char c in message) {
            if (c == ',') count++;
        }

        this.LogDebug ($"{message} has {count} commas.");
        return count;
    }
}

The function form looks like this:

public class ClientWithLogging
{
    readonly Action<string> LogDebug;

    public ClientWithLogging (Action<string> debugLogger)
    {
        this.LogDebug = debugLogger;
    }

    public int NumberOfCommasWithLogging (string message)
    {
        this.LogDebug ($"Counting the number of commas in {message}");
        int count = 0;
        foreach (char c in message) {
            if (c == ',') count++;
        }

        this.LogDebug ($"{message} has {count} commas.");
        return count;
    }
}

What about F#?

We don’t have to change any of our code! The higher-order logger function with signature string -> unit looks suspiciously like the Action<string> delegate that we introduced in C#.
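
For completeness, here is how the tracing client might look in F#. This is a sketch of my own and not code from the repository: the function asks only for the debug logger it actually uses, so there is no unused LogError to depend on.

```fsharp
// The client takes just the one function it needs: a debug logger.
let numberOfCommasWithLogging (logDebug : string -> unit) (message : string) : int =
    logDebug (sprintf "Counting the number of commas in %s" message)
    let count = message |> Seq.filter (fun c -> c = ',') |> Seq.length
    logDebug (sprintf "%s has %i commas." message count)
    count

// Hypothetical in-memory debug log used to exercise the client.
let debugLines = ResizeArray<string>()
let commaCount = numberOfCommasWithLogging (fun line -> debugLines.Add line) "a,b,c"
```

An error logger, if one is ever needed, would simply be a second function parameter on whichever function needs it.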

Why is this easier?

OO:

First, you need to read a few articles about interfaces, like those by Martin Fowler to which I linked earlier. Then, you need to know that single-method interfaces are the same as delegates. Then go and research delegates in C#.

Then realise that they are just a function encapsulation so you can use one of them instead.

Oh, but your method returns void, so you can’t use the flexible Func, you need to use the specific Action.

FP:

Higher-order functions, again. You already know about them, right?

Another win for FP!


Dependency Inversion Principle

What is it?

  • High-level modules should not depend on low-level modules. Both should depend on abstractions.
  • Abstractions should not depend on details. Details should depend on abstractions.

Give me an example of breaking it

You’ve seen this already in the very first code of the article — as well as doing two things, the high-level class below depends on the low-level behaviour of logging to the console.

public class DoesTwoThings
{
    public int NumberOfCommas (string message)
    {
        int count = 0;
        foreach (char c in message)
        {
            if (c == ',') count++;
        }
        return count;
    }

    public void Log (string message)
    {
        Console.WriteLine ($"The string you gave me has { NumberOfCommas(message) } commas");
    }
}

How can I fix it in C#?

You’ve seen this too! We use Inversion of Control to make sure that the logging implementation is abstracted:

public class DoesOneThing
{
    readonly ILogger Logger;

    public DoesOneThing (ILogger logger)
    {
        this.Logger = logger;
    }

    public int NumberOfCommas (string message)
    {
        int count = 0;
        foreach (char c in message) {
            if (c == ',') count++;
        }
        return count;
    }

    public void Log (string message)
    {
        this.Logger.Log($"The string you gave me has { NumberOfCommas (message) } commas");
    }
}

public interface ILogger
{
    void Log (string message);
}

What about F#?

Well, this wasn’t a problem to begin with — we passed in a logger function that abstracted the logging bit.

Why is this easier?

I think I’ve covered most of this in the discussion of the Single Responsibility Principle. In essence, to satisfy the Dependency Inversion Principle you need to learn about several object-oriented principles, some of which are quite advanced such as Dependency Injection (yes, the book is 584 pages long).

In contrast, we only need to learn about one thing (higher-order functions) to solve the issues in F#.

Another win for FP — might just be a whitewash?


Conclusion

I hope I’ve shown in this post that the SOLID principles of object-oriented design can be satisfied in a functional programming language using nothing but functions.

This is a huge contrast to an object-oriented language where a developer needs to understand a huge range of topics to implement the principles properly.

Now ask yourself which set of techniques you want to teach your next Junior Developer, and how much time, effort and money you could save by switching to a functional approach.