Saturday, April 16, 2016

Excel calculations in F# - The Problem

I’ve been playing around with the more esoteric/academic features of functional programming for a couple of months now, so I think it’s about time to get some hands-on experience.

To do this, I set myself the task of choosing a mini-project to complete in F# — one that might have some real-world use!

I ended up settling on implementing a simple Excel expression parser. In this post I’ll introduce the problem that I want it to solve.


The Problem

I’m guessing you’re familiar with writing expressions in Excel. The program comes with a little formula bar that allows you to write expressions of varying complexity, precede it with the = key, and have it evaluated for you. Excel allows you to do all sorts of things here, but I’m going to focus on a bit of a gap in it’s functionality: vectors.

We begin with a simple question: how do we add two vectors of numbers in Excel? The concept of array formulas already exists, and indeed they get us to a solution pretty quickly. To add {1;2;3} to {4;5;6} we can do this:

Simple vector addition

So far this seems very straightforward. Referring to each vector in its entirety means we can do basic arithmetic on the vectors — addition, subtraction, multiplication, even exponentiation.

Let’s push the program a bit at this stage. Excel is used for a lot of financial modelling, and spreadsheets get very complicated very quickly in these applications. What if, instead of creating each vector explicitly and referring to it, we create it in a single cell as a one-dimensional constant.

If this sounds contrived, imagine that these are lists of 10,000 stock prices that we’re trying to analyse and that we’re getting them from a third-party source. Excel is incredibly resource-hungry — if we take these 10,000 numbers and paste them into 10,000 cells the resulting memory allocation from all the cell formatting etc. is huge compared to putting them all in one cell. Extend this over hundreds of rows in dozens of tabs and we might have a problem.

So, we want to store a vector in a single cell, another vector in another single cell, and add them together. Let’s see what Computer says:

Using a one-dimensional constant

No

This surprised me when I first saw it. After all, if you read the link above on one-dimensional constants, there are plenty of examples where they add and multiply these kind of things together. Even more strange is that we can do exactly what we want by putting both vectors directly into the third expression. To see what I mean, look at the picture below:

With formula expressions

I hope this makes it clear what problem I want to solve.

I want a generic way of being able to refer to array constants, do arithmetic on them, and have the result be another array constant that I can use.

This, I hope, will strike a chord with the hordes of finance professionals writing huge, slow Excel spreadsheets and wishing they could make them faster and more compact.


The Solution

Before I outline a proposed solution, if anyone reading this knows of a way to do what I'm trying to do already, please say so!

Not being an Excel developer, I’m going to stop short of implementing anything that messes with current Excel syntax. That is, I’m not going to make the exact formula you see in the picture above work.

Instead, I’m going to write a new function, let’s call it Calculate, that will do the job for us. More specifically, if we write =Calculate(A1 * A2 ^ A3 + A4) where A1, A2, A3 and A4 are all vectors of numbers, the result will be what we expect from following standard.


Next Time

I’ll introduce the tools that I’ve decided to use to write my Calculate function,.

No comments:

Post a Comment