Derivation of the F-Distribution
We derive the F-distribution PDF from scratch — turns out it's basically next of kin to the t-distribution — then wrap up with how to actually read an F-table.
This time it’s the F-distribution — the last one in our little parade of distributions.
Why study the F-distribution right after the t-distribution? Well, here’s the cute reason: if a statistic $T$ follows the t-distribution, then $T^2$ follows the F-distribution. So they’re basically next of kin.
Buuut that’s not to say the F-distribution is only the distribution that $T^2$ follows —
let’s go more general than that.
First we need to define the random variable $F$,
and honestly, let me just toss the definition out there first and we’ll talk about it.

where $U$ follows a chi-squared distribution with degrees of freedom

and $V$ follows a chi-squared distribution with degrees of freedom

Since we had

it makes total sense that $T^2$ would follow the F-distribution — I’ll come back and hit this point one more time at the very end.
And while we’re at it, let me throw out the probability density function of the F-distribution too:

Of course, this whole post is about deriving this F-distribution PDF. So let’s get into it.
So first up,

I’m going to construct the joint density function of $U$ and $V$.

Pulling out the chi-squared PDF as our reference,
I’ll call the joint density of $U$ and $V$ just $h(u,v)$.
(The principle for building this joint density is the exact same one we used in the t-distribution derivation in the previous post.)

OK, and after that — for the cumulative distribution…
I want to write it as $F(\sim)$, but that’s gonna get super confusing because we already have $F$ doing other work here.
So let me use a different symbol for the CDF in this post!!!

That’ll do, I think.

And from here on, it’s by hand;;;;
Typing this out is way too much~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


OK so the derivation is done.
Let me grab a few basic properties of the F-distribution and wrap this thing up!!!!

Just like before, there’s an F-distribution table for this guy too,
and let me quickly show how to read it —
the F-distribution table expresses the $(1-\alpha)$ quantile as

and apparently it means this:

Let me run a quick example with $\alpha = 0.05$~

If you go look it up, you get $5.19$,
and what does that number actually mean?
If we draw it as a picture,

it’d look something like this!
Also also also also also — another property of the F-distribution:
if you swap the numerator and denominator in the definition, you get an F-distribution with the degrees of freedom flipped. So:


So for instance,
if you want to find

but there’s no table for $\alpha = 0.95$,
even with just the $\alpha = 0.05$ table, you can still get the answer like this!

So that’d be this kind of picture:

Also, summarizing a bit more about the F-distribution,

The two random samples have to be independent too!!!!
Then

that’s the content,
and the principle is simple.

Also, waaay back~~~~~~ at the very beginning,
I said that if $T$ is a statistic that follows the t-distribution,
then $T^2$ follows the F-distribution —
let’s actually verify the degrees of freedom too (of course, everyone’s already finished this calc in their heads by now… (crying))

And that’s just the F-distribution in a nutshell.
Now let’s talk a bit about testing..
r1 = [1, 2, 5, 10, 100]
r2 = [1, 1, 2, 1, 100]
r = list(zip(r1, r2))
for i, j in r: # i = n
x = np.linspace(0, 5, 1000)
y = sc.f(i, j).pdf(x)
plt.plot(x, y, linewidth=2.0, label = r'$r_1$=%s $r_2$=%s' % (i, j))
plt.grid(True)
plt.legend()
plt.ylabel('p(x)')
plt.xlabel('x')
plt.ylim(0, 2.5)
plt.title('F Distribution')
plt.savefig('8.F Distribution.jpeg')

Originally written in Korean on my Naver blog (2017-11). Translated to English for gdpark.blog.
Comments
Discussion happens via GitHub Discussions. You'll need a GitHub account to comment.