Thursday, August 25, 2016

Fitting Data in Excel and OpenOffice Calc

I've stopped installing Microsoft Office onto my home computers. Instead, I installed OpenOffice on my latest two computers. It does everything I need it to do. Or so I thought.

One day I decided I wanted OpenOffice Calc to perform a polynomial fit to some measurements I'd been taking. So I did what I always do in Excel: create a Scatterplot of the data, add a Trendline, choose Polynomial and its order, and then click the checkboxes to display both the equation and the R2 value on the chart. Except there was no polynomial Trendline!

According to the OO help forums, there really is no built-in polynomial trendline in Calc. But there is an extension called CorelPolyGUI that does the trick.

Unfortunately, I wasn't too thrilled with this extension. My measurements were accumulating every week. With Excel, I'd be able to add new data and watch the scatterplot update the polynomial. But CorelPolyGUI doesn't work that way. The result from CorelPolyGUI essentially is a snapshot that's disconnected from the input data. To get it to update, I found that I had to delete the result and then re-invoke the extension with new data ranges. And that was a big problem – I wasn't sure over what extent of my data I wanted to fit. Perhaps I could get a better fit if I omitted some measurements. So I needed to alter the input range a few times and compare the results.

There is another method, that if you're clever, you can program yourself. It's based on the following set of equations (for a second-degree poly fit):



To get the values of A, B and C, you simply matrix multiply the vector on the left side of the last equation with the inverted 3 x 3 matrix on the right hand side. Once you program this into your spreadsheet, you can alter the data and instantly get the polynomial function to update.

But it turns out there's an even easier way. I found it accidentally in the results from my search for how to fit a polynomial in OpenOffice. It involves using LINEST to operate on array data. The web page^3 describes using LINEST in Excel to perform non-linear curve fitting. It turns out that the OpenOffice Calc LINEST function works the same way, albeit with slightly different syntax. And the syntax is different because of how you define an array in Calc, separating the elements with semi-colons instead of commas.

In retrospect, I see now the same solution in the OO help forums. The user Villeroy attached a spreadsheet with an example that uses LINEST. But instead of raising the range of independent values to {1;2}, he added the X2 column and referenced it in LINEST. Same thing, really, but perhaps a bit less elegant.