Central Limit Theorem
In probability theory, the central limit theorem (CLT) states that, under appropriate conditions, the distribution of a normalized version of the sample mean converges to a standard normal distribution. This holds even if the original variables themselves are not normally distributed. There are several versions of the CLT, each applying in the context of different conditions.
Some important versions are:
Variant | Assumptions | Formula Type | Use Case |
---|---|---|---|
Classical / Lindeberg–Lévy | i.i.d., finite variance | Standardized sample mean | General inference |
Lyapunov | Independent, not identically dist. | Normalized sum | Heterogeneous data |
Lindeberg–Feller | Independent, with Lindeberg condition | General normalized sum | Most general CLT |
Multivariate CLT | i.i.d. vectors | Normalized vector mean | Multivariate stats, finance |
Functional CLT | i.i.d., viewed as stochastic process | Partial-sum process | Time series, Brownian motion modeling |
Dependent Variable CLT | Weakly dependent vars | Normalized sum | Time series, Markov chains |
Classical Central Limit Theorem
If $X_1, X_2, \ldots, X_n$ are i.i.d. random variables with

- Mean $\mu = E[X_i]$,
- Variance $\sigma^2 = \operatorname{Var}(X_i) < \infty$,

Then the standardized sum:

$$Z_n = \frac{\bar{X}_n - \mu}{\sigma / \sqrt{n}} \xrightarrow{d} N(0, 1)$$

as $n \to \infty$, where $\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i$ is the sample mean.
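The convergence of the standardized sample mean to $N(0, 1)$ can be checked empirically. The sketch below (plain-stdlib Python; the Uniform(0, 1) population is an arbitrary, clearly non-normal choice) standardizes the means of many uniform samples and checks that the results behave like draws from a standard normal:

```python
import math
import random
import statistics

random.seed(0)

def standardized_mean(n, mu=0.5, sigma=math.sqrt(1 / 12)):
    """Draw n Uniform(0, 1) values and standardize their sample mean.

    mu and sigma are the exact mean and standard deviation of Uniform(0, 1).
    """
    xbar = sum(random.random() for _ in range(n)) / n
    return (xbar - mu) / (sigma / math.sqrt(n))

# 10,000 standardized means, each from a sample of size 50
zs = [standardized_mean(50) for _ in range(10_000)]

# Despite the non-normal population, Z_n looks like N(0, 1):
print(round(statistics.mean(zs), 2))   # near 0
print(round(statistics.stdev(zs), 2))  # near 1
```

Replacing the uniform draws with any other finite-variance distribution gives the same behavior, which is exactly the point of the theorem.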
Application 1 - Using the Standard Deviation (σ)
CLT says:
The sampling distribution of the sample mean ($\bar{X}$) will be approximately normal with:

- Mean: $\mu_{\bar{X}} = \mu$ (same as the population mean)
- Standard Deviation (of the sample mean): $\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}}$

Where:

- $\sigma_{\bar{X}}$ = standard deviation of the sampling distribution (also called the standard error)
- $\sigma$ = population standard deviation
- $n$ = sample size
How σ Works with CLT
1. Reduces Variability with Bigger Samples
As the sample size $n$ increases, the standard error $\sigma / \sqrt{n}$ decreases.
This means: your sample mean becomes more stable and reliable as you take larger samples.
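A tiny sketch of this shrinkage (the population σ = 10 below is purely illustrative):

```python
import math

sigma = 10.0  # assumed population standard deviation (illustrative)

# Standard error sigma / sqrt(n) shrinks as the sample grows
for n in (25, 100, 400, 1600):
    se = sigma / math.sqrt(n)
    print(n, se)  # 2.0, 1.0, 0.5, 0.25

# Note the rate: quadrupling n only halves the standard error,
# so precision gains get more expensive as samples grow.
```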
2. Allows Confidence Interval Estimation
In business, we often construct a confidence interval for the mean:

$$\bar{X} \pm z \cdot \frac{\sigma}{\sqrt{n}}$$

Where:

- $\bar{X}$ = sample mean
- $z$ = z-score (e.g., 1.96 for 95% confidence)
- $\dfrac{\sigma}{\sqrt{n}}$ = standard error
You need σ to determine how “wide” or “narrow” your estimate is.
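As a sketch, here is that interval computed for made-up numbers (the sample mean of 50.1 mm, σ = 2 mm, and n = 100 below are hypothetical):

```python
import math

# Hypothetical inputs: sample mean, known population sigma, sample size
xbar, sigma, n = 50.1, 2.0, 100
z = 1.96  # z-score for 95% confidence

se = sigma / math.sqrt(n)       # standard error = 0.2
margin = z * se                 # margin of error = 0.392
lo, hi = xbar - margin, xbar + margin

print(f"95% CI: ({lo:.3f}, {hi:.3f})")  # 95% CI: (49.708, 50.492)
```

A larger σ (or smaller n) widens the interval; a smaller σ narrows it, which is what "wide" and "narrow" mean above.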
3. Enables Hypothesis Testing
When comparing two groups (A/B testing, quality control, etc.), you calculate:

$$Z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

- A smaller σ → more precise Z-scores → clearer decisions.
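A minimal sketch of such a two-group comparison, using a hypothetical A/B test where both group standard deviations are treated as known (all numbers below are invented for illustration):

```python
import math

# Hypothetical A/B test: group means, known sigmas, group sizes
xbar_a, sigma_a, n_a = 12.1, 2.0, 200
xbar_b, sigma_b, n_b = 11.6, 2.2, 180

# Two-sample Z statistic: difference in means over its standard error
se = math.sqrt(sigma_a**2 / n_a + sigma_b**2 / n_b)
z = (xbar_a - xbar_b) / se

print(round(z, 2))  # about 2.31, beyond the 1.96 cutoff for 5% significance
```

With smaller sigmas the denominator shrinks, the same mean difference yields a larger Z, and the decision becomes clearer.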
Example
Let’s say a factory produces nails with a known population standard deviation of $\sigma = 2$ mm, and you measure a sample of $n = 100$ nails.
Standard error: $\sigma_{\bar{X}} = \dfrac{2}{\sqrt{100}} = 0.2$ mm
So your sampling distribution of the mean length will have:
- Mean = population mean (say, 50 mm)
- Std dev = 0.2 mm, not 2 mm!
Your estimate of the average length from the sample is much more precise than any one nail.
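This can be checked by simulation. The sketch below assumes the individual nail lengths are normally distributed with mean 50 mm and σ = 2 mm (the CLT would give the same sample-mean behavior for other distributions too):

```python
import math
import random
import statistics

random.seed(1)

MU, SIGMA, N = 50.0, 2.0, 100  # nail lengths in mm, sample size

# Means of 2,000 samples of 100 nails each
means = [statistics.mean(random.gauss(MU, SIGMA) for _ in range(N))
         for _ in range(2_000)]

print(round(statistics.mean(means), 1))   # close to 50.0
print(round(statistics.stdev(means), 2))  # close to 2 / sqrt(100) = 0.2
```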
Summary
Concept | Role of σ (sigma) |
---|---|
Sampling distribution | Determines spread of sample means |
Standard error | $\sigma_{\bar{X}} = \sigma / \sqrt{n}$ |
Confidence intervals | Used to calculate margin of error |
Hypothesis testing (Z-tests) | Critical in test statistic formula |
If the population σ is unknown, we use the sample standard deviation $s$ in its place (and, for small samples, the t-distribution instead of the normal).