
Normal distribution PDF derivation

GitHub repository

In this Jupyter notebook, we derive the probability density function (pdf) of the normal distribution using the dart thought experiment.

PDF:

$$f(x)=\frac{1}{\sigma \sqrt{2\pi}}\cdot e^{-\frac{1}{2} \left(\frac{x-\mu}{\sigma}\right)^2}$$

Dart thought experiment

The dart thought experiment is a conceptual way to understand the derivation of the probability density function (PDF) of the normal distribution, also known as the Gaussian distribution. The setup is as follows:

  1. Dartboard Analogy: Imagine a dartboard where darts are thrown randomly. Assume that the darts are more likely to hit near the center of the board and less likely to hit as you move away from the center. This setup is analogous to a random variable with a normal distribution, where values near the mean are more likely than values far from the mean.

  2. Two-Dimensional Distribution: Consider the dartboard as a two-dimensional space with the center representing the mean of the distribution. The x and y coordinates of where the dart hits can be thought of as two independent normally distributed random variables, each with its own mean and standard deviation.

  3. Radial Symmetry and Distance: The probability of a dart landing at a particular point should only depend on the distance of that point from the center, not the direction. This radial symmetry suggests that the probability density at any point depends only on the distance from the mean, not the specific x and y values.
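The radial-symmetry claim in step 3 can be checked numerically. The following is a minimal sketch, assuming both coordinates are independent normal variables with mean 0 and a common standard deviation; the value `sigma = 2.0` and the helper name `joint_density` are illustrative choices, not part of the original notebook.

```python
import numpy as np
from scipy.stats import norm

sigma = 2.0  # assumed common standard deviation for both coordinates


def joint_density(x, y, sigma=sigma):
    """Joint density of two independent N(0, sigma^2) coordinates."""
    return norm.pdf(x, loc=0, scale=sigma) * norm.pdf(y, loc=0, scale=sigma)


# Points on a circle of radius r, taken at several different angles.
r = 1.5
angles = np.linspace(0, 2 * np.pi, 8, endpoint=False)
densities = joint_density(r * np.cos(angles), r * np.sin(angles))

# The densities agree up to floating-point error: only r matters, not the angle.
print(np.allclose(densities, densities[0]))
```

Changing `r` changes the common density value, while changing the angles does not.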


Simulation of dart shots

import numpy as np
from scipy.integrate import quad
import matplotlib.pyplot as plt
import matplotlib.patches as patches
np.random.seed(2609)
shots=np.random.normal(0,2,(100,2))

plt.axhline(y=0, color='black', linestyle='--',alpha=0.7)
plt.axvline(x=0, color='black', linestyle='--',alpha=0.7)

plt.scatter(shots[:,0],shots[:,1],s=5,label='Dart shots')
plt.scatter(0,0,marker='*', color='r',s=70,label='Target')
plt.plot([0,2.15], [0, 0.88], linestyle=':', color='red',label='radial distance')


square = patches.Rectangle((2.10, 0.83), 0.3, 0.3, fill=False, color='red') # A small square

plt.gca().add_patch(square)

plt.annotate('dx',xy=(2.07, 0.4),xytext=(2.07, 0.4),fontsize=7)
plt.annotate('dy',xy=(2.5, 0.85),xytext=(2.5, 0.85),fontsize=7)
plt.annotate("'r'",xy=(2.5, 0.85),xytext=(0.4, 0.35),color='r',fontsize=10)


plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.title('Dart Shots')

plt.axis('equal')
plt.legend(loc='upper left')
plt.grid()
plt.show()

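[Figure: scatter plot titled "Dart Shots" showing the 100 simulated shots around the target at the origin, with the radial distance r and a small square of sides dx and dy marked.]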


Derivation of probability density function

Main: Part 1

Consider a function $\phi$ that takes the $x$ and $y$ coordinates of a point and returns the probability density there, so that $\phi(x,y)\cdot dA$ is the probability that the dart lands in the small area $dA=dx\cdot dy$ around that point.

By radial symmetry, $\phi$ depends on $(x,y)$ only through $r$: $\phi: (x,y) \rightarrow [0,\infty) \equiv \phi: r \rightarrow [0,\infty)$, where $r$ is the radial (polar) coordinate of $(x,y)$. Since the dart must land somewhere on the plane $S$,

$$\int_{S}\phi(r)\cdot dA = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \phi(r)\cdot dx \cdot dy = 1$$

Since $x$ and $y$ are independent,

$$\phi(r)=f_X(x)\cdot f_Y(y),$$

where $f_X(x)$ and $f_Y(y)$ are the marginal pdfs of $X$ and $Y$.

$r$ is given by $\sqrt{x^2 + y^2}$, hence

$$\phi(r)=\phi(\sqrt{x^2 + y^2})=f_X(x)\cdot f_Y(y).$$

Let $y=0$ and $f_Y(0)=\lambda$. Then, taking $x \ge 0$ so that $\sqrt{x^2}=x$,

$$\begin{align*} &\phi(\sqrt{x^2 + 0^2})= \phi(x) =f_X(x)\cdot f_Y(0) = f_X(x)\cdot \lambda \\ &\implies \phi(x) = \lambda \cdot f_X(x)\\ &\implies \phi(\sqrt{x^2 + y^2}) = \lambda \cdot f_X(\sqrt{x^2 + y^2})\\ &\implies \lambda \cdot f_X(\sqrt{x^2 + y^2}) = f_X(x)\cdot f_Y(y) \end{align*}$$

Dividing the last equation by $\lambda^2$, we get

$$\frac{f_X(\sqrt{x^2 + y^2})}{\lambda} = \frac{f_X(x)}{\lambda}\cdot \frac{f_Y(y)}{\lambda}.$$

Assume that both random variables, $X$ and $Y$, have the same mean and standard deviation, so that $f_X(\cdot)=f_Y(\cdot)=f(\cdot)$. The above equation can then be written as

$$\frac{f(\sqrt{x^2 + y^2})}{\lambda} = \frac{f(x)}{\lambda}\cdot \frac{f(y)}{\lambda}.$$

Let $g(x)=\frac{f(x)}{\lambda}$. Then

$$g(\sqrt{x^2 + y^2}) = g(x)\cdot g(y).$$

Aside: Algebra

Consider the following observations:

  • $n^x \cdot n^y = n^{(x+y)}$
  • $n^{x^2} \cdot n^{y^2} = n^{(x^2+y^2)}$
  • Let $g(x)=e^{kx^2}$
  $\implies g(x)\cdot g(y)=e^{kx^2}\cdot e^{ky^2}=e^{k(x^2+y^2)}=g(\sqrt{x^2+y^2})$
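The third observation can be checked numerically. This is a minimal sketch, assuming an arbitrary constant `k = -0.7`; the comparison function `h` is a hypothetical counterexample added for contrast and is not part of the original derivation.

```python
import numpy as np

k = -0.7  # arbitrary constant chosen for illustration


def g(x):
    """Candidate solution g(x) = exp(k * x^2)."""
    return np.exp(k * x**2)


def h(x):
    """A function that does not satisfy the equation: exp(k * |x|)."""
    return np.exp(k * np.abs(x))


# Random test points in the square [-3, 3] x [-3, 3].
rng = np.random.default_rng(0)
x, y = rng.uniform(-3, 3, size=(2, 1000))
r = np.sqrt(x**2 + y**2)

print(np.allclose(g(r), g(x) * g(y)))  # the Gaussian form satisfies the equation
print(np.allclose(h(r), h(x) * h(y)))  # the exp(k|x|) form does not
```

The check works because the exponent of `g` is quadratic, so exponents add exactly as $x^2 + y^2 = r^2$; for `h` the exponents add as $|x| + |y|$, which differs from $r$ in general.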

Main: Part 2

Now we have,

$$\begin{align*} &g(x)=e^{kx^2} \text{ and also } g(x) = \frac{f(x)}{\lambda}\\ &\implies f(x) = \lambda \cdot g(x) = \lambda \cdot e^{kx^2} \end{align*}$$

Note: $k$ has to be negative; otherwise $f(x)$ would increase with $|x|$, contradicting our assumption that darts are more likely to hit near the center of the board. To ensure that $k$ is negative, we set $k=-m^2$ for some real $m > 0$.

$$\implies f(x)=\lambda e^{-m^2x^2}.$$

Since $f(x)$ is a pdf,

$$\int_{-\infty}^{\infty}f(x) \cdot dx = \int_{-\infty}^{\infty}\lambda e^{-m^2x^2} \cdot dx=1$$

Let $u=mx$ $\implies du=m\,dx$,

$$\int_{-\infty}^{\infty}\lambda e^{-m^2x^2} \cdot dx = \frac{\lambda}{m}\int_{-\infty}^{\infty} e^{-u^2} \cdot du =1$$
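The substitution $u=mx$ can be sanity-checked numerically. A minimal sketch, assuming arbitrary positive values for `lam` and `m` (both chosen purely for illustration, so the integral is not yet normalised to 1):

```python
import numpy as np
from scipy.integrate import quad

lam, m = 0.8, 1.3  # arbitrary positive values chosen for illustration

# Left-hand side: integrate lambda * exp(-m^2 x^2) over the real line.
lhs, _ = quad(lambda x: lam * np.exp(-(m**2) * x**2), -np.inf, np.inf)

# Right-hand side: (lambda / m) times the integral of exp(-u^2).
gauss, _ = quad(lambda u: np.exp(-(u**2)), -np.inf, np.inf)
rhs = lam / m * gauss

print(np.isclose(lhs, rhs))
```

Both sides agree for any positive `lam` and `m`; requiring the result to equal 1 is what later ties `m` to `lam`.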

Aside: Tricky integral

Integrate $\int_{-\infty}^{\infty} e^{-u^2} \cdot du$.

We will calculate the above integral in two ways: analytically and numerically.

Analytical

Consider

$$I=\int_{-\infty}^{\infty} e^{-u^2} \cdot du,$$

then,

$$I^2=\left(\int_{-\infty}^{\infty} e^{-x^2} \cdot dx\right)\cdot\left(\int_{-\infty}^{\infty} e^{-y^2} \cdot dy\right)$$

In terms of $x$ and $y$, this can be expressed as a double integral over the entire plane:

$$I^2=\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{-(x^2+y^2)} \cdot dx \cdot dy$$

Switch from Cartesian coordinates $(x,y)$ to polar coordinates $(r,\theta)$. In polar coordinates, $x^2+y^2=r^2$ and $dx\,dy=r\,dr\,d\theta$, where the factor $r$ is the Jacobian determinant of the change of variables.
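The area element $dx\,dy=r\,dr\,d\theta$ can be justified with the Jacobian determinant of the map $(r,\theta)\mapsto(x,y)$. A short worked sketch, using the standard parametrisation $x=r\cos\theta$, $y=r\sin\theta$:

$$\begin{align*} \frac{\partial(x,y)}{\partial(r,\theta)} &= \begin{vmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{vmatrix} = r\cos^2\theta + r\sin^2\theta = r \\ &\implies dx\,dy = \left|\frac{\partial(x,y)}{\partial(r,\theta)}\right| dr\,d\theta = r\,dr\,d\theta \end{align*}$$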

The limits for $r$ will be from $0$ to $\infty$, and for $\theta$, from $0$ to $2\pi$.

$$\implies I^2=\int_{0}^{2\pi} \int_{0}^{\infty} e^{-r^2} \cdot r \cdot dr \cdot d\theta$$

  • Step 1: Calculate $\int_{0}^{\infty} e^{-r^2} \cdot r \cdot dr$. Substituting $r^2=u$, we get $du=2r\,dr$ and:
$$\frac{1}{2}\int_{0}^{\infty} e^{-u} \cdot du =\frac{1}{2}\left[-e^{-u}\right]_{0}^{\infty}=\frac{1}{2}[0-(-1)]=\frac{1}{2}$$
  • Step 2: Calculate $I^2=\int_{0}^{2\pi} \frac{1}{2} \cdot d\theta$:
$$I^2 = \frac{1}{2}\theta\Bigr|_{0}^{2\pi}=\pi \implies I=\sqrt{\pi}$$ (taking the positive root, since the integrand is positive).

Numerical

inf = float('inf')

def f(u):
    # Integrand e^(-u^2)
    return np.exp(-(u**2))

# quad returns (value, estimated error); keep only the value
quad(f, -inf, inf)[0]

1.772453850905516

The above value is equal to $\sqrt{\pi}$.

(np.pi)**(0.5)

1.7724538509055159
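The intermediate steps of the analytical route can be checked the same way. A minimal sketch, again using `scipy.integrate.quad` (the variable names `inner` and `I_squared` are illustrative): it evaluates the radial integral from Step 1 and multiplies by $2\pi$ as in Step 2.

```python
import numpy as np
from scipy.integrate import quad

# Step 1: radial integral of exp(-r^2) * r, where the extra r is the Jacobian factor.
inner, _ = quad(lambda r: np.exp(-(r**2)) * r, 0, np.inf)

# Step 2: the integrand does not depend on theta, so integrating over
# [0, 2*pi] multiplies the radial result by 2*pi.
I_squared = 2 * np.pi * inner

print(np.isclose(inner, 0.5), np.isclose(I_squared, np.pi))
```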

Main: Part 3

We had

$$\begin{align*} \int_{-\infty}^{\infty}f(x) \cdot dx&=\int_{-\infty}^{\infty}\lambda e^{-m^2x^2} \cdot dx = \frac{\lambda}{m}\int_{-\infty}^{\infty} e^{-u^2} \cdot du =\frac{\lambda \sqrt{\pi}}{m} = 1\\ &\implies m=\lambda \sqrt{\pi}\\ &\implies m^2=\lambda^2 \cdot \pi\\ &\implies k = -\lambda^2 \cdot \pi \end{align*}$$

Hence,

$$f(x)=\lambda e^{-\lambda^2 \pi x^2}.$$

Now let's talk about the variance of $X$:

$$\mathrm{Var}(X)=\sigma^2=\mathbb{E}[(X-\mu_X)^2]=\int_{-\infty}^{\infty}(x-\mu_X)^2f(x) \cdot dx=\int_{-\infty}^{\infty}(x-\mu_X)^2 \lambda e^{-\lambda^2 \pi x^2} \cdot dx$$

We have assumed $\mu_X=0$ from the beginning, since the darts are aimed at the origin. Hence,

$$\sigma^2=\int_{-\infty}^{\infty}x^2 \lambda e^{-\lambda^2 \pi x^2} \cdot dx=\dfrac{1}{2\pi\lambda^2}$$

This integral is not complicated but has many steps, so I recommend that you use this solution.
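One way to carry out the variance integral is integration by parts. The following is a sketch, not necessarily the route taken in the linked solution; the shorthand $a=\pi\lambda^2$ is introduced here for brevity, and the Gaussian integral $\int_{-\infty}^{\infty}e^{-ax^2}\,dx=\sqrt{\pi/a}$ follows from $I=\sqrt{\pi}$ with the substitution $u=\sqrt{a}\,x$.

$$\begin{align*} \int_{-\infty}^{\infty}x^2 e^{-ax^2}\,dx &= \int_{-\infty}^{\infty}x\cdot\left(x e^{-ax^2}\right)dx \\ &= \left[-\frac{x}{2a}e^{-ax^2}\right]_{-\infty}^{\infty} + \frac{1}{2a}\int_{-\infty}^{\infty}e^{-ax^2}\,dx \\ &= 0 + \frac{1}{2a}\sqrt{\frac{\pi}{a}} \\ \implies \sigma^2 &= \lambda\cdot\frac{1}{2\pi\lambda^2}\cdot\sqrt{\frac{\pi}{\pi\lambda^2}} = \lambda\cdot\frac{1}{2\pi\lambda^2}\cdot\frac{1}{\lambda} = \frac{1}{2\pi\lambda^2} \end{align*}$$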

From the above expression, we get:

$$\lambda=\frac{1}{\sigma\sqrt{2\pi}} \implies f(x)=\frac{1}{\sigma\sqrt{2\pi}}\cdot e^{-\frac{1}{2}\left(\frac{x}{\sigma}\right)^2}$$

If $\mu_X$ is different from zero, then every $x$ in our derivation is replaced by $x-\mu$, and we get the following:

$$f(x)=\frac{1}{\sigma\sqrt{2\pi}}\cdot e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} \qquad \qquad \blacksquare$$
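As a final sanity check, the derived formula can be compared against SciPy's reference implementation. A minimal sketch, assuming illustrative parameters `mu = 1.0` and `sigma = 2.0`; the function name `derived_pdf` is a hypothetical helper, not part of the original notebook.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm


def derived_pdf(x, mu=0.0, sigma=1.0):
    """Normal pdf in the closed form obtained at the end of the derivation."""
    lam = 1.0 / (sigma * np.sqrt(2 * np.pi))
    return lam * np.exp(-0.5 * ((x - mu) / sigma) ** 2)


mu, sigma = 1.0, 2.0  # illustrative parameters
xs = np.linspace(-10, 10, 201)

# Pointwise agreement with scipy.stats.norm.pdf.
print(np.allclose(derived_pdf(xs, mu, sigma), norm.pdf(xs, loc=mu, scale=sigma)))

# Total probability and variance, by numerical integration.
total, _ = quad(derived_pdf, -np.inf, np.inf, args=(mu, sigma))
var, _ = quad(lambda x: (x - mu) ** 2 * derived_pdf(x, mu, sigma), -np.inf, np.inf)
print(np.isclose(total, 1.0), np.isclose(var, sigma**2))
```

The integration checks confirm the two constraints used in the derivation: the density integrates to 1 and its variance equals $\sigma^2$.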

References

https://youtu.be/N-bI-Dsm-rw?si=HtiUuOghxs_X1SLM