Monday, December 1, 2008

Why do we flog?

I just posted an example of using flogs to compare two years of placement scores.  But some of you may be confused about why we take flogs in the first place.  Here are the main points.

1.  We want to compare two proportions, say p1 and p2.  There are two problems with a direct comparison like p1/p2.

a)  This measure depends on whether we work with the proportions p1, p2 or their complements 1-p1, 1-p2.  That choice is arbitrary, so it shouldn't change our comparison.

b)  Small proportions tend to have smaller variation than proportions close to 0.5.  So a ratio p1/p2 = 1.5 is more meaningful (significant) when the p's are close to zero than when the p's are close to 0.5.
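
Point (a) is easy to check with a quick calculation.  The sketch below uses two made-up proportions (0.15 and 0.10 are hypothetical values, not the placement data) and computes the ratio both ways.

```python
# Hypothetical proportions, chosen only for illustration.
p1, p2 = 0.15, 0.10

# Ratio of the proportions themselves.
ratio_p = p1 / p2

# Ratio of the complementary proportions 1-p1 and 1-p2.
ratio_q = (1 - p1) / (1 - p2)

print("p1/p2         =", round(ratio_p, 3))
print("(1-p1)/(1-p2) =", round(ratio_q, 3))
```

The first ratio is well above 1 and the second is only slightly below 1, so the same pair of groups looks very different or nearly identical depending on which way we happened to define the proportion.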

So we want to transform a proportion p so that 

-- it doesn't matter if we consider p or 1-p
-- the variance of the transformed p will be roughly the same for p near 0 and p near 0.5.

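By using the flog reexpression log(p/(1-p)) we achieve these two goals.  For the first goal, note that replacing p by 1-p only changes the sign of the flog, so a comparison based on flogs has the same size either way.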
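
Here is a small numerical check of the first goal.  It is only a sketch: the function name flog, the use of the natural log, and the proportions 0.15 and 0.10 are my assumptions for illustration, and the formula is the one written in this post.

```python
import math

def flog(p):
    """Flog as written in this post: log(p / (1 - p)), natural log assumed."""
    return math.log(p / (1 - p))

# Hypothetical proportions, chosen only for illustration.
p1, p2 = 0.15, 0.10

# Compare the two groups on the flog scale using p ...
diff_p = flog(p1) - flog(p2)

# ... and again using the complements 1-p.
diff_q = flog(1 - p1) - flog(1 - p2)

print("difference using p   =", round(diff_p, 4))
print("difference using 1-p =", round(diff_q, 4))
```

The two differences have the same magnitude and opposite signs, so on the flog scale the comparison no longer depends on whether we chose p or 1-p.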

By the way, the flog reexpression is the basis for logistic regression models.  You will likely be using logistic regression in your statistical life; here we are just trying to motivate why the flog reexpression is used.

