
> Yes, it does.

I'm trying to follow your explanation but help me understand better: which model assumptions would it violate?

Typical assumptions for time-series forecasting are stationarity (differencing is used to achieve this if not already the case), residuals are homoskedastic (constant variance) and normally distributed, etc. In fact, the primary use of the Box-Cox is to stabilize the variance and make the data more normal. Box-Cox itself doesn't violate any model assumptions. Forecasting on Box-Cox transformed data shouldn't either -- if anything, Box-Cox attempts to better satisfy the assumptions of time-series forecast models (well, as best it can -- some data just don't want to be normal).

Now the question is whether the inverse Box-Cox (or the entire process) violates any model assumptions. Intuitively, I don't believe it does (and Rob Hyndman, author of a textbook on forecasting, agrees), but I'm not certain how to demonstrate this rigorously.



The assumption that your target is normally distributed about the expected value. Say you take the log of your target and run a regression minimizing RMSE. You are now assuming that log(Y) is normally distributed. For a given sample, your model outputs E[log(Y)|X]. However, exp(E[log(Y)|X]) is not the same as E[exp(log(Y))|X] = E[Y|X], which is what you really want. It can be shown mathematically that to get E[Y|X] you need to multiply by an offset factor, which in this case is exp(Var(log(Y) - E[log(Y)|X])/2), i.e. exp of half the variance of the log-scale residuals -- the usual way to convert the mean of a normal distribution to the mean of the corresponding lognormal distribution.
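To make the bias concrete, here's a quick simulation sketch (the parameters mu and sigma are illustrative assumptions, not anything from the thread): if log(Y) is normal, exponentiating the mean of the logs systematically underestimates E[Y], and multiplying by the lognormal offset exp(var/2) recovers it.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: log(Y) ~ Normal(mu, sigma^2), so Y is lognormal.
mu, sigma = 1.0, 0.8
log_y = rng.normal(mu, sigma, size=1_000_000)
y = np.exp(log_y)

# Naive back-transform: exponentiate the mean of the logs, roughly exp(mu).
naive = np.exp(log_y.mean())

# What we actually want: E[Y], which for a lognormal is exp(mu + sigma^2/2).
true_mean = y.mean()

# Corrected estimate: multiply by the offset exp(Var(log Y)/2).
corrected = naive * np.exp(log_y.var() / 2)

print(naive, true_mean, corrected)
```

The naive estimate comes out around exp(1) ≈ 2.72 while the sample mean of Y is near exp(1.32) ≈ 3.74; the corrected estimate closes the gap.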

Here is a paper which addresses the issue in the introduction. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4024993/


Let me see if I understood you correctly. Let Y represent a random variable representing stationary time series data. Let g(.) be the forecast function.

1) Log-transformation: log(Y)

2) Forecast results of transformed data: E[g(log(Y))]

3) Inversion of forecast results: exp(E[g(log(Y))])

4) However, what we really want is the expectation of the transformed-forecasted-inversed results, which is E[exp(g(log(Y)))].

Jensen's inequality states that for a convex function φ,

φ(E[X]) <= E[φ(X)]

And equality is attained only if X is constant or φ is affine. Since neither is (generally) the case here,

exp(E[g(log(Y))]) < E[exp(g(log(Y)))]

An offset factor ε is needed to correct (3) into (4):

E[exp(g(log(Y)))] = exp(E[g(log(Y))]) * ε

Did I get that right? (Offset changed to multiplicative factor)
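A quick numerical check of the inequality and the offset (a sketch: I'm standing in for g(log(Y)) with an arbitrary non-constant normal variable X, which is an assumption, not anything specific from the thread):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical stand-in for g(log(Y)): any non-constant random variable X.
x = rng.normal(0.0, 0.5, size=1_000_000)

lhs = np.exp(x.mean())   # exp(E[X])  -- step (3)
rhs = np.exp(x).mean()   # E[exp(X)]  -- step (4)

# Jensen: exp is convex and X is not constant, so lhs < rhs.
print(lhs < rhs)

# The multiplicative offset; for X ~ Normal it equals exp(Var(X)/2).
eps = rhs / lhs
print(eps, np.exp(x.var() / 2))
```

With X normal the simulated ε matches the closed form exp(Var(X)/2), consistent with the lognormal correction mentioned upthread.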


Yes, exactly. However, in this case, the offset is multiplicative.


Understood. To add one more point: I notice the reason the above works the way it does is that most forecast algorithms output an expected value rather than a random variable, hence the results are E[g(log(Y))] rather than just g(log(Y)).

It strikes me that if you package the entire thing as a random variable:

Z = exp(G(log(Y)))

and use a different kind of forecast function G : Y -> Y', where Y, Y' ~ Normal, then we don't need the multiplicative factor -- which can be difficult to calculate for an arbitrary transformation. We can just take the expected value of Z, i.e. E[Z] = E[exp(G(log(Y)))]. This is not done in the article, but in theory it could be.
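A sketch of that idea by simulation (the specific forecast-distribution parameters mu_hat and sigma_hat are hypothetical assumptions): if G returns a full forecast distribution for log(Y) rather than a point estimate, E[Z] can be estimated directly by Monte Carlo, with no analytic offset factor.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical distributional forecast G: instead of a point estimate
# E[g(log(Y))], suppose G returns the parameters of a normal forecast
# distribution for log(Y). These values are illustrative only.
mu_hat, sigma_hat = 1.0, 0.8

# Treat Z = exp(G(log(Y))) as a random variable and estimate E[Z] by
# sampling from the forecast distribution and exponentiating each draw.
draws = rng.normal(mu_hat, sigma_hat, size=1_000_000)
e_z = np.exp(draws).mean()

# For a normal forecast distribution the closed form is also known,
# which lets us sanity-check the simulation:
print(e_z, np.exp(mu_hat + sigma_hat**2 / 2))
```

The simulation approach generalizes to transformations where the closed-form offset is hard to derive, at the cost of needing a distributional (not point) forecast.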



