Intro to Business Statistics

The Central Limit Theorem for proportions is a game-changer in statistics. It tells us that when we take big enough samples, the distribution of sample proportions becomes approximately normal, even though each individual observation is just a yes-or-no outcome.

This theorem is super useful for making educated guesses about population proportions. It helps us create confidence intervals and do hypothesis tests, which are key tools for drawing conclusions from sample data about larger populations.

The Central Limit Theorem for Proportions

Central Limit Theorem for proportions

  • States sampling distribution of sample proportion $\hat{p}$ is approximately normal when sample size is large enough; the $n \geq 30$ rule of thumb is for sample means, and for proportions "large enough" is judged by the expected numbers of successes and failures
  • Requires both $n \cdot p \geq 10$ and $n \cdot (1-p) \geq 10$, where $n$ is sample size and $p$ is population proportion
  • Mean of sampling distribution of $\hat{p}$ equals population proportion $p$
  • Standard deviation of sampling distribution of $\hat{p}$ is $\sqrt{\frac{p(1-p)}{n}}$
  • Sampling distribution of $\hat{p}$ becomes more nearly normal as sample size increases; separately, the Law of Large Numbers says $\hat{p}$ itself gets closer to $p$ as $n$ grows
  • Applies because $\hat{p} = X/n$, where the count of successes $X$ follows a binomial distribution (number of successes in a fixed number of independent trials)
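
The properties listed above can be checked with a small simulation. This is a minimal sketch using only Python's standard library; the values $p = 0.3$ and $n = 200$, the seed, and the function name `simulate_sample_proportions` are illustrative assumptions, not part of the course material.

```python
import math
import random
import statistics

def simulate_sample_proportions(p, n, num_samples, seed=0):
    """Draw num_samples samples of size n; return each sample's proportion of successes."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        sum(rng.random() < p for _ in range(n)) / n
        for _ in range(num_samples)
    ]

# Illustrative (assumed) values, chosen so both success/failure counts are at least 10
p, n = 0.3, 200
assert n * p >= 10 and n * (1 - p) >= 10

p_hats = simulate_sample_proportions(p, n, num_samples=5000)

sim_mean = statistics.fmean(p_hats)     # should be close to p
sim_sd = statistics.pstdev(p_hats)      # should be close to sqrt(p(1-p)/n)
theory_sd = math.sqrt(p * (1 - p) / n)

print(f"simulated mean = {sim_mean:.4f}, theoretical mean = {p:.4f}")
print(f"simulated SD   = {sim_sd:.4f}, theoretical SD   = {theory_sd:.4f}")
```

With these settings the simulated mean and standard deviation should land close to $p$ and $\sqrt{p(1-p)/n}$, and a histogram of `p_hats` would look roughly bell-shaped.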

Mean and standard deviation of sampling distributions

  • Mean of sampling distribution of $\hat{p}$: $\mu_{\hat{p}} = p$
  • Standard deviation of sampling distribution of $\hat{p}$: $\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$
    • Requires population proportion $p$ and sample size $n$
  • When population proportion $p$ is unknown, estimate using sample proportion $\hat{p}$
    • Estimated standard deviation: $\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
  • Larger sample sizes result in smaller standard deviations and more precise estimates of population proportion
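
To see how the standard deviation shrinks with sample size, here is a short sketch of the formula $\sqrt{p(1-p)/n}$. The helper name `standard_error` and the value $p = 0.5$ are assumptions chosen for illustration.

```python
import math

def standard_error(p, n):
    """Standard deviation of the sampling distribution of p-hat: sqrt(p(1-p)/n)."""
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")
    if n <= 0:
        raise ValueError("n must be positive")
    return math.sqrt(p * (1 - p) / n)

# Quadrupling the sample size halves the standard deviation
for n in (100, 400, 1600):
    print(n, round(standard_error(0.5, n), 4))
# prints:
# 100 0.05
# 400 0.025
# 1600 0.0125
```

Because $n$ sits under a square root, cutting the standard deviation in half takes four times as many observations. When $p$ is unknown, the same function can be called with $\hat{p}$ to get the estimated standard deviation.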

Confidence intervals for population proportions

  • Formula: $\hat{p} \pm z_{\alpha/2} \cdot \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
    • $\hat{p}$ is sample proportion
    • $z_{\alpha/2}$ is critical value from standard normal distribution for desired confidence level ($z_{0.025} = 1.96$ for 95% confidence)
    • $n$ is sample size
  • Provides range of plausible values for population proportion $p$ based on sample data
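  • Confidence level (90%, 95%, 99%) is the long-run share of intervals built this way, across repeated samples, that contain true population proportion; it is not the probability that one particular computed interval contains it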
  • Assumptions:
    1. Random sample from population
    2. Large enough sample size: $n \cdot \hat{p} \geq 10$ and $n \cdot (1-\hat{p}) \geq 10$
    3. Independent observations in sample
  • Example: Survey of 500 customers finds 60% satisfaction. 95% confidence interval: $0.60 \pm 1.96 \cdot \sqrt{\frac{0.60(1-0.60)}{500}} = 0.60 \pm 0.043 = (0.557, 0.643)$
  • The margin of error is represented by the term after the ± symbol in the confidence interval formula
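
The confidence interval formula can be sketched in a few lines of Python. This version uses `statistics.NormalDist` from the standard library to look up the critical value; the function name `proportion_confidence_interval` is an assumption for illustration, and the inputs are the survey numbers from the example ($n = 500$, $\hat{p} = 0.60$).

```python
import math
from statistics import NormalDist

def proportion_confidence_interval(p_hat, n, confidence=0.95):
    """Return (lower, upper) for p-hat +/- z * sqrt(p-hat(1 - p-hat)/n)."""
    # Large-sample condition from the assumptions list
    if n * p_hat < 10 or n * (1 - p_hat) < 10:
        raise ValueError("need n*p_hat >= 10 and n*(1 - p_hat) >= 10")
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for 95% confidence
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)  # margin of error
    return p_hat - margin, p_hat + margin

low, high = proportion_confidence_interval(0.60, 500)
print(round(low, 3), round(high, 3))  # 0.557 0.643
```

Raising `confidence` to 0.99 widens the interval, and increasing `n` narrows it, since the margin of error is the only part that changes.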

Statistical Inference and Hypothesis Testing

  • Statistical inference uses sample data to draw conclusions about population parameters
  • Hypothesis testing involves making decisions about population parameters based on sample data
  • Both rely on probability theory to quantify uncertainty in estimates and decisions
  • The Central Limit Theorem for proportions is crucial for these methods when working with categorical data
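
As one illustration of how the theorem supports hypothesis testing, here is a hedged sketch of a two-sided one-proportion z-test. The function name `one_proportion_z_test`, the hypothesized value $p_0 = 0.55$, and the counts (300 successes out of 500) are made-up assumptions for the example, not results from the text.

```python
import math
from statistics import NormalDist

def one_proportion_z_test(successes, n, p0):
    """Two-sided z-test of H0: p = p0, using the normal approximation for p-hat."""
    # Large-sample condition, checked under the null hypothesis
    if n * p0 < 10 or n * (1 - p0) < 10:
        raise ValueError("need n*p0 >= 10 and n*(1 - p0) >= 10")
    p_hat = successes / n
    se0 = math.sqrt(p0 * (1 - p0) / n)  # standard deviation of p-hat if H0 is true
    z = (p_hat - p0) / se0
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical data: 300 satisfied customers out of 500, testing H0: p = 0.55
z, p_value = one_proportion_z_test(successes=300, n=500, p0=0.55)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

The test statistic uses $p_0$ rather than $\hat{p}$ in the standard deviation because the sampling distribution is evaluated under the assumption that the null hypothesis is true.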