
> Yes, it does.

I'm trying to follow your explanation but help me understand better: which model assumptions would it violate?

Typical assumptions for time-series forecasting are stationarity (differencing is used to achieve this if not already the case), residuals are homoskedastic (constant variance) and normally distributed, etc. In fact, the primary use of the Box-Cox is to stabilize the variance and make the data more normal. Box-Cox itself doesn't violate any model assumptions. Forecasting on Box-Cox transformed data shouldn't either -- if anything, Box-Cox attempts to better satisfy the assumptions of time-series forecast models (well, as best it can -- some data just don't want to be normal).

Now the question is whether the inverse Box-Cox (or the entire process) violates any model assumptions. Intuitively, I don't believe it does (and Rob Hyndman, author of a textbook on forecasting, agrees), but I'm not certain how to demonstrate this rigorously.



The assumption that your target is normally distributed about the expected value. Say you take the log of your target and run a regression minimizing RMSE. You are now assuming that log(Y) is normally distributed. For a given sample, your model outputs E[log(Y)|X]. However, exp(E[log(Y)|X]) is not the same as E[exp(log(Y))|X] = E[Y|X], which is what you really want. It can be shown mathematically that to get E[Y|X] you need to multiply by an offset factor, which in this case is exp(Var(log(Y) - E[log(Y)|X])/2), i.e. exp of half the variance of the log-scale residuals -- the usual way to convert the mean of a normal distribution to the mean of the corresponding lognormal distribution.
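To make the bias concrete, here's a quick simulation sketch (the parameters mu and sigma are illustrative assumptions, not anything from the thread): if log(Y) is normal, exponentiating the mean of the logs systematically underestimates E[Y], and multiplying by the lognormal offset exp(var/2) recovers it.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: log(Y) ~ Normal(mu, sigma^2), so Y is lognormal.
mu, sigma = 1.0, 0.8
log_y = rng.normal(mu, sigma, size=1_000_000)
y = np.exp(log_y)

# Naive back-transform: exponentiate the mean of the logs, roughly exp(mu).
naive = np.exp(log_y.mean())

# What we actually want: E[Y], which for a lognormal is exp(mu + sigma^2/2).
true_mean = y.mean()

# Corrected estimate: multiply by the offset exp(Var(log Y)/2).
corrected = naive * np.exp(log_y.var() / 2)

print(naive, true_mean, corrected)
```

The naive estimate comes out around exp(1) ≈ 2.72 while the sample mean of Y is near exp(1.32) ≈ 3.74; the corrected estimate closes the gap.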

Here is a paper which addresses the issue in the introduction. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4024993/


Let me see if I understood you correctly. Let Y represent a random variable representing stationary time series data. Let g(.) be the forecast function.

1) Log-transformation: log(Y)

2) Forecast results of transformed data: E[g(log(Y))]

3) Inversion of forecast results: exp(E[g(log(Y))])

4) However, what we really want is the expectation of the transformed-forecasted-inversed results, which is E[exp(g(log(Y)))].

Jensen's inequality states that for a convex function φ,

φ(E[X]) <= E[φ(X)]

And equality is attained only if X is constant or φ is affine. Since neither is (generally) the case here,

exp(E[g(log(Y))]) < E[exp(g(log(Y)))]

An offset factor ε is needed to correct (3) into (4):

E[exp(g(log(Y)))] = exp(E[g(log(Y))]) * ε

Did I get that right? (Offset changed to multiplicative factor)
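A quick numerical check of the inequality and the offset (a sketch: I'm standing in for g(log(Y)) with an arbitrary non-constant normal variable X, which is an assumption, not anything specific from the thread):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical stand-in for g(log(Y)): any non-constant random variable X.
x = rng.normal(0.0, 0.5, size=1_000_000)

lhs = np.exp(x.mean())   # exp(E[X])  -- step (3)
rhs = np.exp(x).mean()   # E[exp(X)]  -- step (4)

# Jensen: exp is convex and X is not constant, so lhs < rhs.
print(lhs < rhs)

# The multiplicative offset; for X ~ Normal it equals exp(Var(X)/2).
eps = rhs / lhs
print(eps, np.exp(x.var() / 2))
```

With X normal the simulated ε matches the closed form exp(Var(X)/2), consistent with the lognormal correction mentioned upthread.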


Yes, exactly. However, in this case, the offset is multiplicative.


Understood. To add one more point: I notice the reason the above works the way it does is that most forecast algorithms output an expected value rather than a random variable, hence the results are E[g(log(Y))] rather than just g(log(Y)).

It strikes me that if you package the entire thing as a random variable:

Z = exp(G(log(Y)))

and use a different kind of forecast function G : Y -> Y', where Y, Y' ~ Normal, then we don't need the multiplicative factor -- which can be difficult to calculate for an arbitrary transformation. We can just take the expected value of Z, i.e. E[Z] = E[exp(G(log(Y)))]. This is not done in the article, but in theory it could be.
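A sketch of that idea by simulation (the specific forecast-distribution parameters mu_hat and sigma_hat are hypothetical assumptions): if G returns a full forecast distribution for log(Y) rather than a point estimate, E[Z] can be estimated directly by Monte Carlo, with no analytic offset factor.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical distributional forecast G: instead of a point estimate
# E[g(log(Y))], suppose G returns the parameters of a normal forecast
# distribution for log(Y). These values are illustrative only.
mu_hat, sigma_hat = 1.0, 0.8

# Treat Z = exp(G(log(Y))) as a random variable and estimate E[Z] by
# sampling from the forecast distribution and exponentiating each draw.
draws = rng.normal(mu_hat, sigma_hat, size=1_000_000)
e_z = np.exp(draws).mean()

# For a normal forecast distribution the closed form is also known,
# which lets us sanity-check the simulation:
print(e_z, np.exp(mu_hat + sigma_hat**2 / 2))
```

The simulation approach generalizes to transformations where the closed-form offset is hard to derive, at the cost of needing a distributional (not point) forecast.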



