Understanding common probability distributions is key in biostatistics and probabilistic methods. These distributions help model real-world phenomena, from normal data patterns to rare events, guiding decision-making and analysis in various fields, including health and research.
-
Normal (Gaussian) Distribution
- Symmetrical, bell-shaped curve characterized by its mean (μ) and standard deviation (σ).
- Approximately 68% of data falls within one standard deviation of the mean, 95% within two, and 99.7% within three (Empirical Rule).
- Central to many statistical methods because of the Central Limit Theorem: the suitably standardized sum (or mean) of a large number of independent, identically distributed random variables with finite variance is approximately normally distributed.
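The Empirical Rule percentages can be checked directly from the normal cumulative distribution function. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper name `prob_within` is made up for this illustration.

```python
from statistics import NormalDist


def prob_within(k, mu=0.0, sigma=1.0):
    """Return P(mu - k*sigma <= X <= mu + k*sigma) for X ~ Normal(mu, sigma)."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)


# Probability mass within 1, 2 and 3 standard deviations of the mean.
for k in (1, 2, 3):
    print(f"within {k} sd: {prob_within(k):.4f}")
```

The result does not depend on the particular mean and standard deviation, only on how many standard deviations wide the interval is.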
-
Binomial Distribution
- Models the number of successes in a fixed number of independent Bernoulli trials (e.g., coin flips).
- Defined by two parameters: the number of trials (n) and the probability of success (p).
- Useful for calculating probabilities of discrete outcomes, such as the likelihood of getting a certain number of heads in a series of coin tosses.
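As a small worked example, the probability of exactly k successes in n trials is C(n, k) p^k (1 - p)^(n - k). The sketch below evaluates it for coin tosses; `binom_pmf` is a name chosen here for illustration, not a library function.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Probability of exactly 5 heads in 10 tosses of a fair coin: 252 / 1024.
p_five_heads = binom_pmf(5, 10, 0.5)
print(f"P(5 heads in 10 tosses) = {p_five_heads:.4f}")

# The probabilities over all possible counts 0..n sum to 1.
total = sum(binom_pmf(k, 10, 0.5) for k in range(11))
print(f"sum over k = {total:.4f}")
```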
-
Poisson Distribution
- Describes the number of events occurring in a fixed interval of time or space, given a known average rate (λ) and independence of events.
- Particularly useful for modeling rare events, such as the number of phone calls received at a call center in an hour.
- The mean and variance of a Poisson distribution are both equal to λ.
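The equality of mean and variance can be verified numerically from the probability mass function P(X = k) = e^(-λ) λ^k / k!. In the sketch below, the rate of 4 calls per hour is an arbitrary illustrative value, and the infinite sum is truncated at k = 59, where the remaining probability is negligible for this rate.

```python
from math import exp, factorial


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)


lam = 4.0            # assumed average rate, e.g. 4 calls per hour
support = range(60)  # truncation point; the tail beyond it is negligible here

mean = sum(k * poisson_pmf(k, lam) for k in support)
var = sum((k - mean) ** 2 * poisson_pmf(k, lam) for k in support)

print(f"mean = {mean:.6f}, variance = {var:.6f}")
```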
-
Exponential Distribution
- Models the time between events in a Poisson process, characterized by the rate parameter (λ).
- Memoryless property: the probability of waiting at least a further time t does not depend on how long one has already waited, i.e. P(T > s + t | T > s) = P(T > t).
- Commonly used in survival analysis and reliability engineering to model lifetimes of objects or time until an event occurs.
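The memoryless property follows from the survival function P(T > t) = e^(-rate × t): dividing P(T > s + t) by P(T > s) cancels the elapsed time s. A minimal numeric check, with an arbitrary rate and arbitrary times chosen for illustration:

```python
from math import exp


def survival(t, rate):
    """P(T > t) for T ~ Exponential(rate)."""
    return exp(-rate * t)


rate = 0.5        # assumed rate, events per unit time
s, t = 3.0, 2.0   # time already elapsed, and additional waiting time

# P(T > s + t | T > s) = P(T > s + t) / P(T > s)
conditional = survival(s + t, rate) / survival(s, rate)
unconditional = survival(t, rate)

print(f"P(T > s+t | T > s) = {conditional:.6f}")
print(f"P(T > t)           = {unconditional:.6f}")
```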
-
Chi-Square Distribution
- A distribution of the sum of the squares of k independent standard normal random variables, used primarily in hypothesis testing and confidence interval estimation.
- Commonly applied in tests of independence and goodness-of-fit tests in categorical data analysis.
- The shape of the distribution depends on the degrees of freedom (df), with more degrees of freedom resulting in a distribution that approaches normality.
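The definition as a sum of squared standard normals can be simulated directly. The sketch below draws such sums with a seeded generator and compares the sample mean and variance with the known values k and 2k for k degrees of freedom. The seed, k = 5 and the number of draws are arbitrary choices, and the simulated figures are only approximate.

```python
import random
from statistics import fmean, variance


def chi_square_draw(k, rng):
    """One draw: the sum of squares of k independent standard normal values."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))


rng = random.Random(0)  # fixed seed so the run is reproducible
k = 5                   # degrees of freedom
draws = [chi_square_draw(k, rng) for _ in range(20000)]

sample_mean = fmean(draws)
sample_var = variance(draws)

print(f"sample mean     = {sample_mean:.2f} (theory: {k})")
print(f"sample variance = {sample_var:.2f} (theory: {2 * k})")
```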
-
Student's t-Distribution
- Similar to the normal distribution but with heavier tails, making it more suitable for small sample sizes.
- Defined by its degrees of freedom, which control the shape; as the degrees of freedom increase (for example, with larger samples), it approaches the standard normal distribution.
- Used primarily in hypothesis testing and constructing confidence intervals for means when the population standard deviation is unknown.
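The heavier tails can be illustrated by simulation. A t-distributed value with df degrees of freedom can be built as Z / sqrt(V / df), where Z is standard normal and V is an independent chi-square variable with df degrees of freedom. The sketch below estimates how often |T| exceeds 2 and compares it with the same tail probability under the normal distribution. The seed, df = 3 and the threshold 2 are arbitrary illustrative choices.

```python
import random
from statistics import NormalDist


def t_draw(df, rng):
    """One draw from Student's t with df degrees of freedom."""
    z = rng.gauss(0.0, 1.0)
    # Chi-square with df degrees of freedom: sum of df squared standard normals.
    v = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))
    return z / (v / df) ** 0.5


rng = random.Random(1)  # fixed seed so the run is reproducible
df = 3
n_draws = 20000

tail_t = sum(abs(t_draw(df, rng)) > 2.0 for _ in range(n_draws)) / n_draws
tail_normal = 2 * (1 - NormalDist().cdf(2.0))

print(f"P(|T| > 2), df = {df} (simulated): {tail_t:.3f}")
print(f"P(|Z| > 2), standard normal:      {tail_normal:.3f}")
```

With so few degrees of freedom, the simulated t tail probability should come out noticeably larger than the normal one, which is why t-based intervals are wider for small samples.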
-
Uniform Distribution
- All outcomes are equally likely within a specified range, characterized by minimum (a) and maximum (b) values.
- Can be discrete (e.g., rolling a fair die) or continuous (e.g., selecting a random number between 0 and 1).
- Useful in simulations and scenarios where each outcome has the same probability of occurring.
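To make the discrete and continuous cases concrete, the sketch below computes the exact mean and variance of a fair six-sided die and evaluates the standard closed forms for a continuous uniform distribution on [a, b], namely mean (a + b) / 2 and variance (b - a)^2 / 12. The function name is illustrative only.

```python
from fractions import Fraction

# Discrete case: a fair die, each face with probability 1/6.
faces = range(1, 7)
p = Fraction(1, 6)
die_mean = sum(p * x for x in faces)
die_var = sum(p * (x - die_mean) ** 2 for x in faces)
print(f"die: mean = {die_mean}, variance = {die_var}")


def continuous_uniform_moments(a, b):
    """Mean and variance of a continuous uniform distribution on [a, b]."""
    return (a + b) / 2, (b - a) ** 2 / 12


# Continuous case: a random number between 0 and 1.
unit_mean, unit_var = continuous_uniform_moments(0.0, 1.0)
print(f"uniform(0, 1): mean = {unit_mean}, variance = {unit_var:.4f}")
```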
-
Bernoulli Distribution
- A special case of the binomial distribution with a single trial, representing two possible outcomes: success (1) or failure (0).
- Defined by a single parameter, the probability of success (p).
- Fundamental in probability theory and serves as the building block for more complex distributions.
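One way to see the "building block" role is to add up independent Bernoulli draws: the total number of successes in n trials follows a binomial distribution with mean n × p. The simulation below is a sketch with an arbitrary seed and arbitrary values of p and n, so the printed average is only approximate.

```python
import random


def bernoulli(p, rng):
    """Return 1 (success) with probability p, otherwise 0 (failure)."""
    return 1 if rng.random() < p else 0


rng = random.Random(42)  # fixed seed so the run is reproducible
p, n = 0.3, 20

# Each entry is the number of successes in n Bernoulli trials.
counts = [sum(bernoulli(p, rng) for _ in range(n)) for _ in range(10000)]
average = sum(counts) / len(counts)

print(f"average successes = {average:.2f} (binomial mean n*p = {n * p:.1f})")
```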
-
Beta Distribution
- A continuous distribution defined on the interval [0, 1], characterized by two shape parameters (α and β).
- Flexible in modeling random variables that represent proportions or probabilities.
- Commonly used in Bayesian statistics and for modeling random variables that are constrained to a finite range.
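A typical Bayesian use is as a prior for an unknown success probability. With a Beta(α, β) prior and binomial data, the posterior is again a beta distribution, Beta(α + successes, β + failures), and the mean of a beta distribution is α / (α + β). The sketch below applies this update; the flat Beta(1, 1) prior and the counts of 7 successes and 3 failures are made-up illustrative values.

```python
def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


def update(alpha, beta, successes, failures):
    """Posterior parameters after observing binomial data under a beta prior."""
    return alpha + successes, beta + failures


prior = (1.0, 1.0)  # flat prior over [0, 1]
posterior = update(*prior, successes=7, failures=3)

print(f"prior mean     = {beta_mean(*prior):.3f}")
print(f"posterior      = Beta{posterior}")
print(f"posterior mean = {beta_mean(*posterior):.3f}")
```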
-
Gamma Distribution
- A continuous distribution defined by a shape parameter (k) and a scale parameter (θ), often used to model waiting times.
- Generalizes the exponential distribution: k = 1 gives an exponential distribution with mean θ, and for a positive integer k it is the distribution of the sum of k independent exponential random variables that share the same mean θ.
- Useful in various fields, including queuing models and reliability analysis.
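The link to the exponential distribution can be checked by simulation: summing k independent exponential waiting times, each with mean equal to the scale parameter, should behave like a direct gamma draw, with mean k × scale and variance k × scale². The sketch below uses the standard-library `random.expovariate` (which takes a rate, so 1 / scale is passed) and `random.gammavariate` (shape, then scale). The seed and parameter values are arbitrary, and the simulated figures are only approximate.

```python
import random
from statistics import fmean, variance

rng = random.Random(7)  # fixed seed so the run is reproducible
k, theta = 3, 2.0       # shape and scale
n_draws = 20000

# Sum of k independent exponential waiting times, each with mean theta.
sums = [sum(rng.expovariate(1.0 / theta) for _ in range(k)) for _ in range(n_draws)]

# Direct draws from the gamma distribution with shape k and scale theta.
direct = [rng.gammavariate(k, theta) for _ in range(n_draws)]

print(f"sum of exponentials: mean = {fmean(sums):.2f}, variance = {variance(sums):.2f}")
print(f"direct gamma draws:  mean = {fmean(direct):.2f}, variance = {variance(direct):.2f}")
print(f"theory:              mean = {k * theta:.2f}, variance = {k * theta**2:.2f}")
```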