Probabilistic Decision-Making

Probabilistic Decision-Making Unit 6 – Hypothesis Testing: Single & Dual Populations

Hypothesis testing is a statistical method for making decisions about populations based on sample data. It involves formulating null and alternative hypotheses, calculating a test statistic, and then either comparing that statistic to a critical value or comparing its p-value to a chosen significance level. Single population tests examine claims about one parameter, while dual population tests compare parameters between two groups. These methods are widely used in research, quality control, and decision-making across many fields to judge whether an observed difference or effect is larger than chance alone would explain.

Key Concepts and Definitions

  • Hypothesis testing is a statistical method used to make decisions or draw conclusions about a population based on sample data
  • Null hypothesis ($H_0$) represents the default or status quo assumption, typically stating no significant difference or effect
  • Alternative hypothesis ($H_A$ or $H_1$) represents the claim or research question, suggesting a significant difference or effect
  • Type I error (false positive) occurs when rejecting a true null hypothesis; its probability is denoted by $\alpha$ (the significance level)
  • Type II error (false negative) occurs when failing to reject a false null hypothesis; its probability is denoted by $\beta$
    • Power of a test ($1 - \beta$) represents the probability of correctly rejecting a false null hypothesis
  • Test statistic is a value calculated from the sample that is compared with a critical value, or converted to a p-value, to make a decision about the null hypothesis
  • Critical value is a threshold determined by the significance level and the distribution of the test statistic under the null hypothesis
  • P-value is the probability of observing a test statistic at least as extreme as the one calculated, assuming the null hypothesis is true
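
The relationship between $\alpha$, $\beta$, and power can be made concrete with a small calculation. The following is a minimal sketch, assuming an upper-tailed Z-test with known $\sigma$; the function name `z_test_power` and all numbers are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(mu0, mu_true, sigma, n, alpha=0.05):
    """Power of an upper-tailed Z-test of H0: mu = mu0 against HA: mu > mu0."""
    std_normal = NormalDist()
    # Rejection threshold on the Z scale for the chosen significance level.
    z_crit = std_normal.inv_cdf(1 - alpha)
    # Distance between the true mean and mu0, measured in standard errors.
    shift = (mu_true - mu0) / (sigma / sqrt(n))
    # beta = P(fail to reject H0 | the true mean is mu_true), the Type II error rate.
    beta = std_normal.cdf(z_crit - shift)
    return 1 - beta

# Hypothetical scenario: H0 says mu = 100, the true mean is 105, sigma = 15, n = 36.
power = z_test_power(mu0=100, mu_true=105, sigma=15, n=36)
print(f"power = {power:.3f}")

# When H0 is actually true, the rejection probability equals alpha.
print(f"rejection rate under H0 = {z_test_power(100, 100, 15, 36):.3f}")
```

Raising `n` or moving `mu_true` further from `mu0` increases the power, while lowering `alpha` reduces it.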

Foundations of Hypothesis Testing

  • Hypothesis testing relies on the principles of probability and sampling distributions
  • The central limit theorem states that the sampling distribution of the mean of independent observations approaches a normal distribution as the sample size increases, regardless of the shape of the population distribution, provided the population variance is finite
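  • Hypothesis tests assume random sampling and independence of observations; tests that rely on a normal approximation also need a large enough sample size (a common rule of thumb is $n \geq 30$) unless the population itself is approximately normal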
  • The significance level ($\alpha$) is determined before conducting the test and represents the maximum acceptable probability of making a Type I error
  • The choice of $\alpha$ depends on the consequences of making a Type I error and the desired power of the test
  • Hypothesis tests can be one-tailed (directional) or two-tailed (non-directional), depending on the alternative hypothesis
  • The null and alternative hypotheses are mutually exclusive and exhaustive, covering all possible outcomes
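
The central limit theorem can be checked informally by simulation. The sketch below assumes a skewed exponential population with mean 1 and standard deviation 1; the function name `simulate_sample_means`, the seed, and the sample sizes are illustrative choices, not part of the original material.

```python
import random
from math import sqrt
from statistics import mean, stdev

def simulate_sample_means(n, reps, seed=0):
    """Draw `reps` samples of size `n` from an Exponential(1) population
    and return the list of their sample means."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [mean([rng.expovariate(1.0) for _ in range(n)]) for _ in range(reps)]

n = 30
sample_means = simulate_sample_means(n=n, reps=2000)

# Theory: the sample means centre on the population mean (1.0) and have
# standard deviation sigma / sqrt(n), here 1 / sqrt(30).
print(f"mean of sample means: {mean(sample_means):.3f} (theory: 1.000)")
print(f"sd of sample means:   {stdev(sample_means):.3f} (theory: {1 / sqrt(n):.3f})")
```

Plotting a histogram of `sample_means` would show a roughly bell-shaped curve even though the underlying population is strongly right-skewed.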

Types of Hypotheses

  • One-sample hypotheses test a claim about a single population parameter (mean, proportion, or variance)
    • Example: Testing if the average weight of a product differs from a specified value
  • Two-sample hypotheses compare parameters between two independent populations
    • Example: Comparing the mean scores of two different teaching methods
  • Paired-sample hypotheses test the difference between paired observations or repeated measures
    • Example: Comparing the effectiveness of a drug before and after treatment on the same individuals
  • Analysis of Variance (ANOVA) tests the equality of means among three or more populations
    • Example: Comparing the average yield of multiple fertilizer treatments
  • Chi-square tests assess the association between two categorical variables
    • Example: Testing the relationship between gender and preference for a product
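
As one illustration of the last item, a chi-square test of association on a 2x2 table can be sketched as follows. The counts are made up, the function name `chi_square_2x2` is hypothetical, and the p-value shortcut relies on the fact that a chi-square variable with 1 degree of freedom is the square of a standard normal variable, so it applies only to 2x2 tables.

```python
from math import sqrt
from statistics import NormalDist

def chi_square_2x2(table):
    """Chi-square statistic and p-value for a 2x2 contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    # With df = 1, P(chi2 > stat) = 2 * (1 - Phi(sqrt(stat))).
    p_value = 2 * (1 - NormalDist().cdf(sqrt(stat)))
    return stat, p_value

# Hypothetical counts: rows are two groups, columns are prefer / do not prefer.
stat, p_value = chi_square_2x2([[30, 20], [20, 30]])
print(f"chi-square = {stat:.2f}, p-value = {p_value:.4f}")
```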

Single Population Tests

  • Z-test for a population mean ($\mu$) when the population standard deviation ($\sigma$) is known
    • Test statistic: $Z = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}}$
  • T-test for a population mean ($\mu$) when the population standard deviation ($\sigma$) is unknown
    • Test statistic: $t = \frac{\bar{X} - \mu_0}{s / \sqrt{n}}$, where $s$ is the sample standard deviation
  • Z-test for a population proportion ($p$)
    • Test statistic: $Z = \frac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0) / n}}$, where $\hat{p}$ is the sample proportion
  • Chi-square test for a population variance ($\sigma^2$)
    • Test statistic: $\chi^2 = \frac{(n - 1)s^2}{\sigma_0^2}$
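
The single-population statistics above translate directly into code. This is a minimal sketch using only the Python standard library; the function names and the numbers are hypothetical, and the p-values shown are two-sided. The one-sample t statistic is returned with its degrees of freedom ($n - 1$, a standard result not stated above), since the standard library has no t-distribution for computing its p-value.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def two_sided_p(z):
    """Two-sided p-value for a statistic that is standard normal under H0."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

def z_test_mean(x_bar, mu0, sigma, n):
    """Z-test for a mean with known population standard deviation."""
    z = (x_bar - mu0) / (sigma / sqrt(n))
    return z, two_sided_p(z)

def z_test_proportion(successes, n, p0):
    """Z-test for a proportion; the standard error uses the hypothesized p0."""
    p_hat = successes / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    return z, two_sided_p(z)

def t_statistic(sample, mu0):
    """One-sample t statistic and its degrees of freedom (n - 1)."""
    n = len(sample)
    t = (mean(sample) - mu0) / (stdev(sample) / sqrt(n))
    return t, n - 1

# Hypothetical inputs, chosen only to show the calls.
print(z_test_mean(x_bar=52.0, mu0=50.0, sigma=6.0, n=36))
print(z_test_proportion(successes=60, n=100, p0=0.5))
print(t_statistic([5.1, 4.9, 5.3, 5.2, 5.0], mu0=5.0))
```

For a one-tailed alternative, the p-value would instead be the area in the single tail named by $H_A$.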

Dual Population Tests

  • Two-sample Z-test for comparing means ($\mu_1$ and $\mu_2$) when population variances are known
    • Test statistic: $Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)_0}{\sqrt{\sigma_1^2 / n_1 + \sigma_2^2 / n_2}}$
  • Two-sample T-test for comparing means ($\mu_1$ and $\mu_2$) when population variances are unknown but assumed equal
    • Test statistic: $t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)_0}{s_p \sqrt{1/n_1 + 1/n_2}}$, where $s_p$ is the pooled standard deviation
  • Welch's T-test for comparing means ($\mu_1$ and $\mu_2$) when population variances are unknown and unequal
    • Test statistic: $t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)_0}{\sqrt{s_1^2 / n_1 + s_2^2 / n_2}}$
  • Two-proportion Z-test for comparing proportions ($p_1$ and $p_2$)
    • Test statistic: $Z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)_0}{\sqrt{\hat{p}(1 - \hat{p})(1/n_1 + 1/n_2)}}$, where $\hat{p}$ is the pooled sample proportion
  • Paired T-test for comparing means of paired observations or repeated measures
    • Test statistic: $t = \frac{\bar{D} - \mu_{D_0}}{s_D / \sqrt{n}}$, where $\bar{D}$ is the mean difference and $s_D$ is the standard deviation of the differences
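
Three of the two-population statistics above can be sketched from summary statistics as follows. The function names and input values are hypothetical. The pooled standard deviation formula and the Welch-Satterthwaite degrees of freedom are standard results that the list above does not spell out, so treat them as additions rather than part of the original material.

```python
from math import sqrt

def pooled_t(mean1, s1, n1, mean2, s2, n2, delta0=0.0):
    """Equal-variance two-sample t statistic and degrees of freedom."""
    # Pooled standard deviation: weighted average of the two sample variances.
    sp = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    t = (mean1 - mean2 - delta0) / (sp * sqrt(1 / n1 + 1 / n2))
    return t, n1 + n2 - 2

def welch_t(mean1, s1, n1, mean2, s2, n2, delta0=0.0):
    """Welch's t statistic with Welch-Satterthwaite degrees of freedom."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    t = (mean1 - mean2 - delta0) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df

def two_proportion_z(x1, n1, x2, n2):
    """Two-proportion Z statistic for H0: p1 - p2 = 0, using the pooled proportion."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_pooled = (x1 + x2) / (n1 + n2)
    return (p1_hat - p2_hat) / sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))

# Hypothetical summary statistics for two groups of 25.
print(pooled_t(82, 6, 25, 78, 8, 25))
print(welch_t(82, 6, 25, 78, 8, 25))
print(two_proportion_z(45, 100, 30, 100))
```

With equal sample sizes the pooled and Welch statistics coincide and only the degrees of freedom differ; with unequal sizes and unequal variances the two can diverge noticeably.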

Test Statistics and Critical Values

  • Test statistics are calculated from sample data and used to make decisions about the null hypothesis
  • The distribution of the test statistic under the null hypothesis determines the critical value(s) or p-value
  • For Z-tests, the test statistic follows a standard normal distribution (Z-distribution)
  • For T-tests, the test statistic follows a T-distribution with degrees of freedom ($df$) based on the sample size(s)
  • For Chi-square tests, the test statistic follows a Chi-square distribution with degrees of freedom ($df$) based on the sample size and number of parameters estimated
  • Critical values are determined by the significance level ($\alpha$) and the type of test (one-tailed or two-tailed)
    • For a two-tailed test, the critical values are located at the $\alpha/2$ and $1 - \alpha/2$ quantiles of the distribution
    • For a one-tailed test, the critical value is located at the $\alpha$ quantile (lower-tailed) or the $1 - \alpha$ quantile (upper-tailed), depending on the direction of the alternative hypothesis
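
For Z-tests, these critical values can be looked up with the inverse CDF of the standard normal distribution. The sketch below uses `statistics.NormalDist` from the Python standard library; the function name `z_critical_values` and its `tail` argument are hypothetical.

```python
from statistics import NormalDist

def z_critical_values(alpha, tail="two-sided"):
    """Critical value(s) of the standard normal distribution for a given alpha."""
    std_normal = NormalDist()
    if tail == "two-sided":
        # Split alpha evenly between the two tails.
        upper = std_normal.inv_cdf(1 - alpha / 2)
        return (-upper, upper)
    if tail == "upper":
        return (std_normal.inv_cdf(1 - alpha),)
    if tail == "lower":
        return (std_normal.inv_cdf(alpha),)
    raise ValueError("tail must be 'two-sided', 'upper', or 'lower'")

for tail in ("two-sided", "upper", "lower"):
    values = z_critical_values(0.05, tail)
    print(tail, [round(v, 3) for v in values])
```

At $\alpha = 0.05$ this gives the familiar values of about $\pm 1.96$ for a two-tailed test and about $1.645$ in absolute value for a one-tailed test.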

Interpreting Results and Decision Making

  • Compare the calculated test statistic with the critical value(s), or compare the p-value with the significance level ($\alpha$), to make a decision about the null hypothesis
  • If the test statistic falls in the rejection region (beyond the critical value) or the p-value is less than the significance level ($\alpha$), reject the null hypothesis in favor of the alternative hypothesis
  • If the test statistic falls in the non-rejection region (within the critical values) or the p-value is greater than the significance level ($\alpha$), fail to reject the null hypothesis
  • Rejecting the null hypothesis suggests that there is sufficient evidence to support the alternative hypothesis
  • Failing to reject the null hypothesis does not prove that the null hypothesis is true, but rather that there is insufficient evidence to support the alternative hypothesis
  • The decision to reject or fail to reject the null hypothesis should be interpreted in the context of the research question and the practical significance of the results
  • Confidence intervals can be constructed to estimate the range of plausible values for the population parameter(s) based on the sample data and the confidence level $1 - \alpha$, where $\alpha$ is the significance level
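
The decision rule and the matching confidence interval can be put side by side. This is a minimal sketch for a two-sided Z-test of a mean with known $\sigma$; the function name `z_test_decision` and the inputs are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_test_decision(x_bar, mu0, sigma, n, alpha=0.05):
    """Two-sided Z-test decision plus the (1 - alpha) confidence interval for mu."""
    std_normal = NormalDist()
    se = sigma / sqrt(n)
    z = (x_bar - mu0) / se
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    ci = (x_bar - z_crit * se, x_bar + z_crit * se)
    return {
        "z": z,
        "p_value": p_value,
        "ci": ci,
        "reject_h0": p_value < alpha,  # equivalent to abs(z) > z_crit
    }

# Hypothetical data: the first result rejects H0, the second does not.
for x_bar in (52.0, 51.0):
    result = z_test_decision(x_bar=x_bar, mu0=50.0, sigma=6.0, n=36)
    verdict = "reject H0" if result["reject_h0"] else "fail to reject H0"
    low, high = result["ci"]
    print(f"x_bar={x_bar}: p={result['p_value']:.4f}, CI=({low:.2f}, {high:.2f}), {verdict}")
```

In this setting the two views agree: the test rejects $H_0$ exactly when the hypothesized mean `mu0` falls outside the confidence interval.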

Real-World Applications

  • Quality control: testing the proportion of defective items in a production process to ensure it meets the desired specifications
  • Clinical trials: comparing the effectiveness of a new drug to a placebo or an existing treatment
  • Market research: testing the preference for a new product feature among different consumer segments
  • Educational research: comparing the performance of students under different teaching methods or curricula
  • Environmental studies: testing the difference in pollutant levels between two locations or time periods
  • Psychological research: comparing the mean scores of participants in different experimental conditions
  • Financial analysis: testing the difference in returns between two investment strategies
  • Social science research: testing the association between demographic variables and attitudes or behaviors


© 2024 Fiveable Inc. All rights reserved.
AP® and SAT® are trademarks registered by the College Board, which is not affiliated with, and does not endorse this website.
