Derivation of the Poisson Distribution [Basic Statistics I Studied #2]

OK so — let’s finally march into distributions!!!!

So I’m actually a physics major, and the field I want to specialize in is statistical physics, which means I’ve been grinding statistical physics pretty hard.

And in stat mech… well, you start from Gaussian,

then it’s Fermi-Dirac, Bose-Einstein, all that good stuff~~~ so I figured I knew distributions a little bit…….

But in statistics………. I started running into distributions I’d never heard of, never seen before………

I was, uh, not a little shook……. So…….

My original plan was to only cover the distributions I didn’t know in the basics post (the Γ-distribution, $\chi^{2}$-distribution, t-distribution, F-distribution),

but honestly I think it’ll be way better to build up step by step from some more basic distributions first.

First up: the Poisson Distribution!!!!

Funny enough, I’ve actually covered the Poisson distribution before.

This is really really really really really really really really really really really really really really really a distribution for events that barely ever happen,

and apparently it’s the go-to for stuff like deaths in the military, or the number of traffic accidents in a specific region.

Originally, this Poisson distribution was supposedly developed to find an approximation of the binomial distribution.

Like — say the probability that one person dies from some blood disease in a year is 0.00001,

and we want the probability that out of 200,000 people, 5 or more die from that blood disease in a year —

with $n = 200000$ and $p = 0.00001$, trying to crunch this directly as a binomial gives you a filthy calculation. That’s the issue.

So, fixing $np = \lambda$ in the binomial distribution $B(n, p)$,

let’s look at what the binomial approximates to as we crank n up to infinity —

the probability mass function of the binomial:

(The modeling logic for this PMF is super natural if you remember combinations/permutations from high school,

so I’ll skip the derivation of that and jump straight in.)
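For reference, here is the standard binomial PMF with $p = \lambda/n$ plugged in, where $x$ is the number of events. The regrouping in the second line is just one convenient way to split it up. The first fraction is the red, green, and blue bit.

$$P(X = x) = \binom{n}{x} p^{x} (1-p)^{n-x} = \frac{n!}{x!\,(n-x)!} \left(\frac{\lambda}{n}\right)^{x} \left(1-\frac{\lambda}{n}\right)^{n-x}$$

$$= \frac{n!}{(n-x)!\; n^{x}} \cdot \frac{\lambda^{x}}{x!} \cdot \left(1-\frac{\lambda}{n}\right)^{n} \left(1-\frac{\lambda}{n}\right)^{-x}$$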

I’m gonna pick at this part first!!!!

Red is $n(n-1)(n-2)$ ~~~~~

Green is $(n-x)(n-x-1)(n-x-2)$ ~~~~

Blue is $n$ to the power of $x$.

Now — when n is heading off to the limit, do the $-1$, $-2$ in $n(n-1)(n-2)$ ~~~ actually matter?

It’s like scooping a paper cup of water out of the Han River and then calling the Han River Management Office in a panic going “THE HAN RIVER IS DRYING UP, THE HAN RIVER IS DISAPPEARING!!!”

OK so what order is the red term???? What power of $n$ does the red term work out to?

Right. To make counting easy, let me play a little trick on the expression: $(n-0)(n-1)(n-2)$~~~~$(n-(n-1))$

From 1 to $n-1$ that’s $n-1$ terms, plus the 0, so $n$ terms total.

That is, for rough power counting, $n$ to the power of $n$.

(How to count terms. http://gdpresent.blog.me/221113380202)


OK — what about the order of $(n-x)(n-x-1)(n-x-2)$ ~~~~???? Gotta count terms here too.

It runs all the way to $(n-x)(n-x-1)(n-x-2)$~~~~$(n-x-(n-x-1))$,

so let’s count carefully.

$(n-x-(0))(n-x-(1))(n-x-(2))$~~~~$(n-x-(n-x-1))$

Same trick, count the terms,

from 1 to $n-x-1$ that’s $n-x-1$ terms,

and tack on the 0 to get $n-x$ terms total.

So — $n$ to the power of $n-x$.

Now let’s tidy it all up:

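If you just stare at the colored bits, the powers cancel: $n^{n}$ on top, $n^{n-x} \cdot n^{x}$ on the bottom, so the whole colored chunk goes to 1. (Strictly speaking, $n!$ doesn’t really behave like $n^{n}$. The honest version: $n!/(n-x)!$ is just the $x$ factors $n(n-1)\cdots(n-x+1)$, and each of those divided by $n$ goes to 1. Same answer.)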

Alright ~ one thing handled.
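A quick numerical sanity check of the colored chunk, $n!/((n-x)!\,n^{x})$. This is only a minimal sketch: the function name `colored_ratio` is made up for illustration, and $x = 5$ is an arbitrary choice.

```python
from math import perm

def colored_ratio(n: int, x: int) -> float:
    """n! / ((n - x)! * n**x), computed as n(n-1)...(n-x+1) / n**x."""
    # perm(n, x) is exactly the product of the x factors n(n-1)...(n-x+1)
    return perm(n, x) / n**x

# With x held fixed, the ratio should creep up toward 1 as n grows
for n in (10, 100, 10_000, 1_000_000):
    print(n, colored_ratio(n, 5))
```

The values start well below 1 for small $n$ and get closer to 1 with each step, which is the "paper cup out of the Han River" argument in numbers.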

Now if we look at the other terms with $n$ going to infinity,

It turns into this.
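Written out, assuming the leftover factors are the ones you get from the binomial PMF with $p = \lambda/n$:

$$\frac{\lambda^{x}}{x!} \left(1-\frac{\lambda}{n}\right)^{n} \left(1-\frac{\lambda}{n}\right)^{-x} \;\xrightarrow{\;n\to\infty\;}\; \frac{\lambda^{x}}{x!} \cdot e^{-\lambda} \cdot 1$$

The last factor goes to 1 because $x$ stays fixed while $\lambda/n \to 0$. The middle factor is where $e$ shows up.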

Euler’s number $e$… that thing….. …..

Don’t need to walk through it, right??????

Nah, you know what, I’ll do it once anyway.

It’s easy enough, so —

Right, the definition is just this,

and so
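Spelled out, starting from the usual limit definition of $e$ (the substitution variable $m$ is just a helper introduced here):

$$e = \lim_{m\to\infty} \left(1+\frac{1}{m}\right)^{m}$$

Set $m = -n/\lambda$, so that $-\lambda/n = 1/m$ and $n = -\lambda m$. Then

$$\left(1-\frac{\lambda}{n}\right)^{n} = \left[\left(1+\frac{1}{m}\right)^{m}\right]^{-\lambda} \;\longrightarrow\; e^{-\lambda}$$

(Here $m$ runs off to $-\infty$ rather than $+\infty$, but the limit of $(1+1/m)^{m}$ is still $e$.)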

Derivation: complete!!!!
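Putting the pieces together, the limit is the familiar Poisson PMF:

$$P(X = x) = \lim_{n\to\infty} \binom{n}{x} \left(\frac{\lambda}{n}\right)^{x} \left(1-\frac{\lambda}{n}\right)^{n-x} = \frac{\lambda^{x} e^{-\lambda}}{x!}, \qquad x = 0, 1, 2, \dots$$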

“Why’d you bother deriving it?” — you ask?????

The point is, we can now see exactly where and how the Poisson distribution comes from —

and that’s what the derivation just bought us!!!!!
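Going back to the blood-disease example ($n = 200000$, $p = 0.00001$, so $\lambda = np = 2$), here is a small sketch comparing the exact binomial answer with the Poisson approximation. It assumes SciPy is available. The variable names are mine, and `sf(4)` is SciPy's survival function, i.e. $P(X > 4) = P(X \ge 5)$.

```python
from scipy import stats

n, p = 200_000, 0.00001
lam = n * p  # lambda = np = 2

# P(X >= 5) = P(X > 4), which is what the survival function sf(4) returns
exact = stats.binom(n, p).sf(4)     # binomial, no approximation
approx = stats.poisson(lam).sf(4)   # Poisson with the same mean

print("binomial:", exact)
print("poisson: ", approx)
```

By hand, the Poisson side is $1 - e^{-2}\,(1 + 2 + 2 + \tfrac{4}{3} + \tfrac{2}{3}) = 1 - 7e^{-2} \approx 0.0527$, and the binomial value should agree with it to several decimal places.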

Since the probability mass function looks like that,

if you wanted to write down the cumulative distribution function,

Obviously if it were continuous you’d use an integral, right?
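But $x$ only takes integer values here, so the integral becomes a plain sum. In the usual notation (the symbol $F$ is my choice):

$$F(x) = P(X \le x) = \sum_{k=0}^{\lfloor x \rfloor} \frac{\lambda^{k} e^{-\lambda}}{k!}$$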

The problem-solving practice for the Poisson distribution

lives over here!!!! ( http://gdpresent.blog.me/220582073367)

You only need Prob 3.3! >_<


↓ One-page summary, written nice and neat on the iPad

import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

# Event counts k = 0, 1, ..., 39 (start at 0 so that P(X = 0) is plotted too,
# and pass k as the x-axis so each probability sits above its own k)
k = np.arange(0, 40)

# One PMF curve per rate parameter lambda
for lam in (1, 4, 10):
    pmf = stats.poisson(lam).pmf(k)
    plt.plot(k, pmf, linewidth=2.0, label=rf'$\lambda = {lam}$')

plt.grid(True)
plt.legend()
plt.ylabel('Probability')
plt.xlabel('Number of events')
plt.title('Poisson Distribution')
plt.savefig('1.Poisson Distribution.jpeg')

Originally written in Korean on my Naver blog (2017-11). Translated to English for gdpark.blog.
