Empirical Proof of the Central Limit Theorem in MATLAB

The Central Limit Theorem (CLT) is a fundamental theorem in probability and statistics. It tells us that the sampling distribution of the mean is asymptotically Gaussian as the sample size grows, no matter what distribution the population follows, provided the population has a finite variance. The sampling distribution of the mean has a mean equal to the population mean (μ) and a variance of σ²/N, where σ² is the population variance and N is the sample size. As a rule of thumb, a sample is considered sufficiently large when N ≥ 30, although strongly skewed populations may need more. Put differently, the variance of the sampling distribution of the mean is smaller than the population variance by a factor of N, so the sample mean clusters more tightly around μ as the sample size increases.
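For reference, the statement above can be written compactly. This is a sketch of the classical form of the theorem, assuming independent, identically distributed draws X_1, ..., X_N with finite variance; the symbol \bar{X}_N for the sample mean is notation introduced here, not taken from the original post.

```latex
\bar{X}_N = \frac{1}{N}\sum_{i=1}^{N} X_i,
\qquad
\mathrm{E}\!\left[\bar{X}_N\right] = \mu,
\qquad
\mathrm{Var}\!\left(\bar{X}_N\right) = \frac{\sigma^2}{N},
\qquad
\frac{\bar{X}_N - \mu}{\sigma/\sqrt{N}} \;\xrightarrow{\;d\;}\; \mathcal{N}(0,1)
\quad \text{as } N \to \infty .
```

The first two moments hold exactly for any N; only the Gaussian shape is an asymptotic result.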

The ab initio proof of the CLT is rather complicated and requires a strong knowledge of the underpinnings of probability theory¹. However, the CLT can also be explored and understood empirically, through observations. Here is the MATLAB code I wrote to explore the CLT in a graduate class I am teaching on Data Analysis for the Earth Sciences.

MATLAB code for exploring the CLT

Population distribution (Rayleigh)
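As a rough illustration of how such a population could be generated, the sketch below draws Rayleigh-distributed values by inverse-transform sampling, so it needs no toolbox. It is not the code from the original post: the scale parameter `b`, the population size `nPop`, the seed, and the number of histogram bins are all assumptions chosen for illustration.

```matlab
% Sketch: build a Rayleigh "population" with base MATLAB only.
% b, nPop and the seed are illustrative assumptions, not values from the post.
rng(0);                                  % fix the seed so the run is repeatable
b    = 1;                                % Rayleigh scale parameter (assumed)
nPop = 1e6;                              % population size (assumed)

% Inverse-transform sampling: if U ~ Uniform(0,1), then
% b*sqrt(-2*log(U)) follows a Rayleigh distribution with scale b.
population = b * sqrt(-2 * log(rand(nPop, 1)));

% Theoretical moments of the Rayleigh distribution
muTheory  = b * sqrt(pi / 2);            % population mean
varTheory = (4 - pi) / 2 * b^2;          % population variance

fprintf('mean: %.4f (theory %.4f)\n', mean(population), muTheory);
fprintf('var : %.4f (theory %.4f)\n', var(population),  varTheory);

% Histogram normalised to a density, with the exact Rayleigh PDF on top
histogram(population, 100, 'Normalization', 'pdf');
hold on;
x = linspace(0, max(population), 500);
plot(x, (x / b^2) .* exp(-x.^2 / (2 * b^2)), 'LineWidth', 1.5);
hold off;
xlabel('x'); ylabel('Probability density');
title('Population distribution (Rayleigh)');
```

The Rayleigh distribution is a convenient test case because it is clearly skewed, so any Gaussian shape in the sample means cannot be inherited from the population.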

Sampling distribution of the mean for various sample sizes. Population distribution is Rayleigh.
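A minimal sketch of this experiment follows. It repeatedly draws samples of size N from a Rayleigh population, records each sample mean, and compares the histogram of those means with a Gaussian of mean μ and variance σ²/N. The sample sizes, the number of trials, the scale `b`, and the seed are assumptions made here; the values used for the original figure are not known.

```matlab
% Sketch: empirical sampling distribution of the mean for a Rayleigh population.
% sampleSizes, nTrials, b and the seed are illustrative assumptions.
rng(1);                                   % repeatable run
b           = 1;                          % Rayleigh scale parameter (assumed)
mu          = b * sqrt(pi / 2);           % population mean
sigma2      = (4 - pi) / 2 * b^2;         % population variance
sampleSizes = [2 5 30 100];               % sample sizes N to compare (assumed)
nTrials     = 1e4;                        % number of samples drawn per N (assumed)

meanOfMeans = zeros(size(sampleSizes));
varOfMeans  = zeros(size(sampleSizes));

figure;
for k = 1:numel(sampleSizes)
    N = sampleSizes(k);

    % Each column is one sample of size N (inverse-transform Rayleigh draws)
    samples = b * sqrt(-2 * log(rand(N, nTrials)));
    xbar    = mean(samples, 1);           % one sample mean per column

    meanOfMeans(k) = mean(xbar);
    varOfMeans(k)  = var(xbar);

    % Histogram of the sample means against the Gaussian predicted by the CLT
    subplot(2, 2, k);
    histogram(xbar, 50, 'Normalization', 'pdf');
    hold on;
    s = sqrt(sigma2 / N);                 % predicted standard deviation of the mean
    x = linspace(min(xbar), max(xbar), 200);
    plot(x, exp(-(x - mu).^2 / (2 * s^2)) / (s * sqrt(2 * pi)), 'LineWidth', 1.5);
    hold off;
    title(sprintf('N = %d', N));
    xlabel('Sample mean'); ylabel('Probability density');

    fprintf('N = %3d: mean %.4f (mu %.4f), var %.5f (sigma^2/N %.5f)\n', ...
            N, meanOfMeans(k), mu, varOfMeans(k), sigma2 / N);
end
```

With settings like these, the histograms for small N should still show some of the skew of the Rayleigh population, while those for larger N should sit close to the Gaussian curve and narrow roughly as 1/√N.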

¹ Stark & Woods (2001), Probability and Random Processes with Applications to Signal Processing (3rd Edition)

About WQ

I received my PhD (2013) in Remote Sensing, Earth and Space Science at the Dept. of Aerospace Engineering Sciences, University of Colorado, Boulder, USA, under a Fulbright fellowship. Currently, I'm an Assistant Professor in the Dept. of Space Science at Institute of Space Technology (IST), Islamabad, Pakistan, where I have been a founding member of the Geospatial Research & Education Lab (GREL). My general expertise is in Remote Sensing where I have worked with various remote sensing datasets through my career, while for my PhD thesis I specifically worked on Remote Sensing using SAR (Synthetic Aperture Radar) and Oceanography, working extensively on development of techniques to measure ocean surface currents from space-borne SAR intensity images and interferometric data. My research interests are: Remote sensing, Synthetic Aperture Radar (SAR) imagery and interferometric data processing & analysis, Visible/Infrared/High-resolution satellite image processing & analysis, Oceanography, Earth system study and modelling, LIDAR data processing and analysis, Scientific programming. I am a reviewer for IEEE Transactions on Geoscience & Remote Sensing, Forest Ecosystems, GIScience & Remote Sensing, Journal of African Earth Sciences, and Italian Journal of Agronomy. I am an alumnus of Pakistan National Physics Talent Contest (NPTC), an alumnus of the Lindau Nobel Laureate Meetings, a Fulbright alumnus, and the Pakistan National Point of Contact for Space Generation Advisory Council (SGAC). I was an invited speaker at the TEDxIslamabad event held in Nov., 2014. I've served as mentor in the NASA International Space App Challenge Islamabad events in April 2015 and April 2016.