Another way to check a statistical claim is to perform a significance test for the difference in two population proportions. As with any significance test, we have to write hypotheses, check our conditions and then calculate and conclude. 📲

Still lost? Let's do a refresher!

A significance test helps us decide whether an observed difference between two sample proportions is large enough to be statistically significant, or whether it could plausibly have occurred by chance alone.

To perform a significance test for the difference in two population proportions, you need to first write your null and alternative hypotheses. The null hypothesis states that there is no difference between the two population proportions, while the alternative hypothesis states that there is a difference.

Next, you need to check that the conditions for the test are met. These include random samples (or random assignment), independent observations, and sample sizes large enough for the sampling distribution to be approximately normal.

Once you have checked the conditions, you can calculate the test statistic and determine the p-value. The p-value is the probability of obtaining a test statistic at least as extreme as the one observed, given that the null hypothesis is true. If the p-value is less than the significance level (usually 0.05), you can reject the null hypothesis and conclude that the difference between the two population proportions is statistically significant. If the p-value is greater than the significance level, you fail to reject the null hypothesis: the data do not provide convincing evidence of a difference. 😄
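The decision rule described above can be sketched in a few lines of Python. This is a minimal illustration, not a full test: the function names and the example z-value are made up for this sketch, and it assumes a two-sided alternative and a test statistic that follows a standard normal distribution.

```python
import math


def two_sided_p_value(z: float) -> float:
    """Return P(|Z| >= |z|) for a standard normal Z (two-sided p-value)."""
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for the standard normal CDF Phi.
    return math.erfc(abs(z) / math.sqrt(2))


def decide(p_value: float, alpha: float = 0.05) -> str:
    """Compare the p-value to the significance level alpha."""
    if p_value < alpha:
        return "reject H0"
    return "fail to reject H0"


# Hypothetical test statistic, chosen only to show the workflow.
z_observed = 2.5
p = two_sided_p_value(z_observed)
print(f"p-value = {p:.4f}: {decide(p)}")
```

For a one-sided alternative, the p-value would be the area in just one tail, which is half of the two-sided value when the statistic falls in the direction of the alternative.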

Hypotheses and Parameters

The first thing we need to do when setting up a significance test for the difference in two population proportions is to write out our hypotheses. Our null hypothesis will always state that the two population proportions are equal, while our alternative hypothesis states that the first is greater than, less than or not equal to the second. 🏆
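Written in symbols, the hypotheses take the following general form. The labels \(p_1\) and \(p_2\) are generic placeholders for the two population proportions, and only one of the three alternatives is used in any given test:

\[
H_0: p_1 = p_2 \quad (\text{equivalently } p_1 - p_2 = 0)
\]

\[
H_a: p_1 > p_2 \quad \text{or} \quad p_1 < p_2 \quad \text{or} \quad p_1 \neq p_2
\]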

It is also important in this stage of setting up the test to identify what p1 and p2 represent. We have to define our parameters so the reader knows what we are truly comparing.

Conditions

We also must check our conditions for inference. The same three conditions apply as did for confidence intervals, with one small change in the normal check.

(1) Random

Probably the most important condition is that both of our samples are random samples from their populations. If we don't take a random sample, our findings suffer from sampling bias and we can't generalize them to the population. 😞

(2) Independence

To check independence, we need to make sure that each population is at least 10 times as large as its sample (the 10% condition). Also, if we are dealing with a randomized experiment, the random assignment of treatments lets us treat the two groups as independent. 🔟

(3) Normal

When dealing with proportions, we always check our normal condition by using the Large Counts Condition, which states that our expected successes and failures are each at least 10. With a 2 proportion z test, we combine the two samples to create a combined (pooled) p-hat, \(\hat{p}_c = \frac{x_1 + x_2}{n_1 + n_2}\), where \(x_1\) and \(x_2\) are the numbers of successes and \(n_1\) and \(n_2\) are the sample sizes. This is what we use to find our expected successes and failures. 🎩

Then we have to verify that each of the four expected counts, \(n_1\hat{p}_c\), \(n_1(1-\hat{p}_c)\), \(n_2\hat{p}_c\) and \(n_2(1-\hat{p}_c)\), is at least 10.

We pool because the null hypothesis says the two population proportions are equal, so the best estimate of that shared proportion comes from combining the two samples into a single "pooled" sample and calculating one proportion for it. The test statistic is then calculated from the difference between the two sample proportions, using the pooled proportion to find the standard error. 🏊
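As a reference sketch, the standard pooled two-proportion z statistic can be written as follows. The symbols are assumptions in common textbook notation: \(\hat{p}_1\) and \(\hat{p}_2\) are the two sample proportions, \(n_1\) and \(n_2\) are the sample sizes, and \(\hat{p}_c\) is the combined proportion (total successes divided by total sample size).

\[
z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}_c\,(1-\hat{p}_c)\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}
\]

The numerator is the observed difference, and the denominator is the standard error of that difference under the assumption that the null hypothesis is true.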

Example

Let's return to our MJ vs. LeBron problem from earlier... again. Recall that MJ made 836 of 1623 shots and LeBron made 622 of 1493 shots. Instead of testing this claim with a confidence interval, let's test it using a 2 Prop Z Test to verify our results. 🏀

Hypotheses and Parameters

Another good habit when writing our hypotheses is to use meaningful subscripts, such as MJ and L, that make clear which proportion belongs to which population.
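With those subscripts, the setup might look like the sketch below. The null hypothesis follows directly from the rule that the two proportions are equal. The two-sided alternative is an assumption made only for illustration; use whichever direction matches the claim being tested.

\[
H_0: p_{MJ} = p_{L} \qquad H_a: p_{MJ} \neq p_{L} \quad (\text{direction assumed})
\]

Here \(p_{MJ}\) is the true proportion of shots MJ makes and \(p_{L}\) is the true proportion of shots LeBron makes.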

Conditions

Random: Even though the problem never stated that the samples were random (and we discussed the problems with this in Unit 6.9), we are going to assume they are.
Independent: It is reasonable to believe (and obviously true) that MJ took at least 16,230 shots in his career and LeBron took at least 14,930 shots in his career, so each population is at least 10 times its sample and the independence condition is met.
Normal: This is the one that will be a bit different. First, we have to calculate our pooled p-hat: (836 + 622)/(1623 + 1493) = 1458/3116 ≈ 0.468.

Next, we have to check our large counts condition using this pooled p-hat.

1623(0.468) ≈ 759.6 ≥ 10 ✔️
1623(0.532) ≈ 863.4 ≥ 10 ✔️
1493(0.468) ≈ 698.7 ≥ 10 ✔️
1493(0.532) ≈ 794.3 ≥ 10 ✔️
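The same check can be sketched in Python using the shot counts from the problem. The function and variable names are made up for this illustration, and the code uses the unrounded pooled proportion, so the expected counts may differ slightly in the first decimal place from hand calculations based on 0.468.

```python
def pooled_proportion(x1: int, n1: int, x2: int, n2: int) -> float:
    """Combined proportion of successes across both samples."""
    return (x1 + x2) / (n1 + n2)


def expected_counts(n1: int, n2: int, p_pooled: float) -> list[float]:
    """Expected successes and failures for each sample under H0."""
    return [
        n1 * p_pooled,        # expected successes, sample 1
        n1 * (1 - p_pooled),  # expected failures, sample 1
        n2 * p_pooled,        # expected successes, sample 2
        n2 * (1 - p_pooled),  # expected failures, sample 2
    ]


# Data from the example: shots made and shots taken.
made_mj, shots_mj = 836, 1623
made_l, shots_l = 622, 1493

p_c = pooled_proportion(made_mj, shots_mj, made_l, shots_l)
counts = expected_counts(shots_mj, shots_l, p_c)
large_counts_ok = all(c >= 10 for c in counts)

print(f"pooled p-hat = {p_c:.3f}")
print("expected counts:", [round(c, 1) for c in counts])
print("Large Counts Condition met:", large_counts_ok)
```

All four expected counts are in the hundreds, so the Large Counts Condition is comfortably satisfied here.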

Now that we have checked conditions, we are ready to calculate and test our claim. 🧪

🎥 Watch: AP Stats - Inference: Hypothesis tests for Proportions

Key Terms to Review (13)

2 Proportion Z Test: The 2 Proportion Z Test is a statistical method used to determine if there is a significant difference between the proportions of two independent populations. This test compares the success rates of two different groups and calculates whether any observed differences in these proportions could be attributed to random chance. It utilizes a Z-score to assess the likelihood of observing the data if the null hypothesis were true, making it an essential tool for hypothesis testing in various fields such as social science, healthcare, and marketing.

Alternative Hypothesis: The alternative hypothesis is a statement that contradicts the null hypothesis, suggesting that there is an effect, a difference, or a relationship present in the data. This hypothesis is what researchers aim to support through their analysis and testing, as it represents the possibility that something interesting is happening beyond random chance.

Confidence Interval: A confidence interval is a range of values derived from sample statistics that is likely to contain the true value of an unknown population parameter, with a specified level of confidence. This concept connects statistical inference to the estimation of parameters, allowing researchers to make informed claims about populations based on sample data.

Independent Sample: Independent samples are sets of data collected from two different populations in such a way that the individuals in one sample do not influence those in the other. This concept is crucial in hypothesis testing, especially when comparing the proportions between two groups, ensuring that the results are not affected by overlapping subjects or common influences. By using independent samples, researchers can obtain unbiased estimates of the difference in population proportions.

Large Counts Condition: The Large Counts Condition states that for the sampling distribution of sample proportions to be approximately normal, the counts of successes and failures in a sample must both be large enough, typically at least 10. This condition ensures that the sampling distribution behaves in a predictable manner, making it easier to construct confidence intervals and perform hypothesis tests.

P-value: A P-value is a measure used in hypothesis testing to determine the strength of evidence against the null hypothesis. It quantifies the probability of observing test results at least as extreme as the ones obtained, assuming that the null hypothesis is true. A smaller P-value indicates stronger evidence against the null hypothesis, which is crucial for decision-making in various statistical tests.

Pooled Sample: A pooled sample refers to the combination of data from two or more different populations to estimate a common parameter, such as a proportion or mean. This approach is used when testing the difference between two population proportions: because the null hypothesis assumes the proportions are equal, combining the samples gives a single, more precise estimate of that shared proportion.

Random Sample: A random sample is a subset of individuals chosen from a larger population, where each individual has an equal chance of being selected. This method helps ensure that the sample accurately represents the population, minimizing bias and allowing for more reliable statistical inferences.

Sampling Bias: Sampling bias occurs when certain individuals or groups within a population are more likely to be selected for a sample than others, leading to an unrepresentative sample. This can distort the results of statistical analyses, affecting conclusions drawn about the entire population and leading to incorrect generalizations.

Sample Size: Sample size refers to the number of observations or data points collected from a population for the purpose of statistical analysis. It plays a critical role in determining the reliability and validity of the results, impacting the precision of estimates and the power of hypothesis tests.

Significance Test: A significance test is a statistical method used to determine if there is enough evidence to reject a null hypothesis in favor of an alternative hypothesis. This process involves comparing sample data to what is expected under the null hypothesis and calculating a p-value to evaluate the strength of the evidence against the null hypothesis. It plays a crucial role in comparing population means and proportions, interpreting p-values, and drawing conclusions from data.

Significance Level: The significance level, often denoted as alpha (\(\alpha\)), is the threshold used to determine whether to reject the null hypothesis in statistical hypothesis testing. It represents the probability of making a Type I error, which occurs when a true null hypothesis is incorrectly rejected. Understanding the significance level is crucial for interpreting results and making informed decisions based on statistical tests.

Test Statistic: A test statistic is a standardized value that is calculated from sample data during a hypothesis test. It helps determine how far the observed data deviates from what is expected under the null hypothesis, allowing researchers to make decisions about the validity of that hypothesis.
