Apache OpenOffice (AOO) Bugzilla – Issue 123824
Exponential curve fit y=a*exp(b*x) ignores data points with zero y-values
Last modified: 2017-05-20 10:04:01 UTC
Created attachment 82063 [details] spreadsheet with charts to demonstrate y=exp(x) problem When choosing an exponential trend line y=a*exp(b*x) through some actual data I realized that a number of the data point that had zero y-values were being ignored. This is different from a reported bug that described a curve fit not going through a data point at the origin. Although no exponential regression of the form y=a*exp(b*x) can intersect the x-axis, there should be no reason why a minimization of least square values between data and candidate regressions should have a problem. The attached spread sheet demonstrates the problem. I did not try to demonstrate what a data point within the x-range of data with a nonzero y-value would do, but the example shown is enough to demonstrate a problem. Note the example was created and showed the problem before it was saved in .xls format for the first time for whatever that is worth.
With the current algorithm it is not possible to use such data pairs, because the fitting is not done on the equation y=a*exp(b*x) but on the equation ln(y)=ln(a)+ b*x. Excel has disabled exponential trend lines in those cases.
Quite right, and if Excel is disabled with any y=0 data points, that is the safe things to do. It appears Chart just ignores any such data points, does not alert the user and merrily goes on to calculate an incorrect curve. This could be addressed with an alert or at least stated somewhere that the user needs to translate his/her own data so all x and y values are positive for valid results. Not knowing the philosophy behind OpenOffice, it is not clear if it is supposed to produce the same answers as Excel, or do things better when appropriate. Anyway, I fired up Systat's SigmaPlot and got done what I needed. I believe it just picks initial value for coefficients and uses something like a method of steepest ascent to minimize least square differences, but that is just a guess.