1.4 Introduction to random variables

3 min read · July 19, 2024

Random variables are key to understanding probability in engineering. They assign numerical values to outcomes, with discrete variables taking countable values and continuous variables taking any value in a range.

Probability distributions describe the likelihood of values. For discrete variables, we use probability mass functions. For continuous variables, we use probability density functions. Both help calculate probabilities and expected values.

Random Variables

Random variables: discrete vs continuous

  • Random variable assigns numerical value to each outcome in sample space
    • Denoted by capital letter ($X$, $Y$, $Z$)
  • Discrete random variable takes on countable number of distinct values
    • Number of defective items in a batch
    • Number of customers in a queue (bank, supermarket)
  • Continuous random variable takes on any value within specified range or interval
    • Time until light bulb fails (hours, days)
    • Weight of randomly selected product (grams, ounces)
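The distinction can be made concrete with a short simulation. A minimal Python sketch (the page itself shows no code, so the language and the model parameters below are our own choices): the defective-item count is discrete, while the bulb lifetime, modeled here as exponential with an assumed mean of 1000 hours, is continuous.

```python
import random

random.seed(0)

# Discrete: number of defective items in a batch of 20,
# each independently defective with probability 0.05.
defects = sum(1 for _ in range(20) if random.random() < 0.05)

# Continuous: time (in hours) until a light bulb fails,
# modeled as exponential with an assumed mean lifetime of 1000 hours.
lifetime = random.expovariate(1 / 1000)

print(defects)   # an integer in {0, 1, ..., 20}
print(lifetime)  # a real number in (0, infinity)
```

Note that `defects` can only land on one of 21 countable values, while `lifetime` can take any positive real value.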

Probability distributions for variables

  • Probability distribution describes likelihood of random variable taking on specific value or falling within particular range
  • Probability mass function (PMF) gives probability of discrete random variable taking on specific value
    • Denoted by $P(X = x)$, where $X$ is random variable and $x$ is specific value
    • Properties:
      • $0 \leq P(X = x) \leq 1$ for all $x$
      • $\sum_x P(X = x) = 1$ over all possible values of $x$
  • Probability density function (PDF) describes relative likelihood of continuous random variable taking on value within particular range
    • Denoted by $f(x)$, where $x$ is value within range of random variable
    • Properties:
      • $f(x) \geq 0$ for all $x$
      • $\int_{-\infty}^{\infty} f(x)\,dx = 1$
  • Cumulative distribution function (CDF) gives probability of random variable being less than or equal to specific value
    • Denoted by $F(x) = P(X \leq x)$
    • For discrete random variables: $F(x) = \sum_{y \leq x} P(X = y)$
    • For continuous random variables: $F(x) = \int_{-\infty}^{x} f(t)\,dt$
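The PMF properties and the discrete CDF formula can be checked directly for a concrete distribution. A small Python sketch (our own illustration, using a fair six-sided die as the example distribution):

```python
from fractions import Fraction

# PMF of a fair six-sided die: P(X = x) = 1/6 for x in 1..6.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

# Property checks: each probability lies in [0, 1], and the total is 1.
assert all(0 <= p <= 1 for p in pmf.values())
assert sum(pmf.values()) == 1

# CDF for a discrete variable: F(x) = sum of P(X = y) over y <= x.
def cdf(x):
    return sum(p for y, p in pmf.items() if y <= x)

print(cdf(3))  # Fraction(1, 2): P(X <= 3) = 3/6
```

Using `Fraction` keeps the arithmetic exact, so the "probabilities sum to 1" property holds without floating-point tolerance.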

Probability calculations with variables

  • Probability of event is likelihood of specific outcome or set of outcomes occurring
  • For discrete random variables:
    • $P(X = x)$: probability of random variable $X$ taking on value $x$
    • $P(a \leq X \leq b) = \sum_{a \leq x \leq b} P(X = x)$
  • For continuous random variables:
    • $P(a \leq X \leq b) = \int_{a}^{b} f(x)\,dx$
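For continuous variables the integral can be approximated numerically when no closed form is handy. A Python sketch (our own example; the uniform distribution on $[0, 10]$ and the midpoint-sum approach are illustrative choices):

```python
# P(a <= X <= b) for a continuous variable is the integral of the
# PDF over [a, b]; here approximated with a midpoint Riemann sum.
def prob_between(pdf, a, b, n=100_000):
    width = (b - a) / n
    return sum(pdf(a + (i + 0.5) * width) for i in range(n)) * width

# Example: X uniform on [0, 10], so f(x) = 1/10 on that interval.
uniform_pdf = lambda x: 0.1 if 0 <= x <= 10 else 0.0

print(prob_between(uniform_pdf, 2, 5))  # ~0.3
```

The exact answer here is $\int_2^5 \tfrac{1}{10}\,dx = 0.3$, which the numeric sum matches to within floating-point error.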

Expected value and variance

  • Expected value (mean) is average value of random variable over large number of trials
    • Denoted by $E(X)$ or $\mu$
    • For discrete random variables: $E(X) = \sum_x x\,P(X = x)$
    • For continuous random variables: $E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$
  • Variance measures dispersion of random variable about its expected value
    • Denoted by $\mathrm{Var}(X)$ or $\sigma^2$
    • $\mathrm{Var}(X) = E[(X - \mu)^2]$
    • For discrete random variables: $\mathrm{Var}(X) = \sum_x (x - \mu)^2 P(X = x)$
    • For continuous random variables: $\mathrm{Var}(X) = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx$
  • Standard deviation is square root of variance
    • Denoted by $\sigma$
    • $\sigma = \sqrt{\mathrm{Var}(X)}$
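The discrete formulas above translate line for line into code. A Python sketch (our own worked example, using the number of heads in two fair coin flips, where $E(X) = 1$ and $\mathrm{Var}(X) = 0.5$):

```python
import math

# PMF for the number of heads in two fair coin flips.
pmf = {0: 0.25, 1: 0.5, 2: 0.25}

mean = sum(x * p for x, p in pmf.items())                    # E(X)
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())  # Var(X)
std_dev = math.sqrt(variance)                                # sigma

print(mean)      # 1.0
print(variance)  # 0.5
print(std_dev)   # ~0.707
```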

Key Terms to Review (18)

Central Limit Theorem: The Central Limit Theorem (CLT) states that the distribution of the sum (or average) of a large number of independent and identically distributed random variables approaches a normal distribution, regardless of the original distribution of the variables. This key concept bridges many areas in statistics and probability, establishing that many statistical methods can be applied when sample sizes are sufficiently large.
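The CLT is easy to observe by simulation. A Python sketch (our own illustration; the choice of uniform draws and the sample sizes are arbitrary): averages of many uniform(0, 1) draws cluster around the true mean 0.5 with spread close to $\sigma/\sqrt{n} = \sqrt{1/12}/10 \approx 0.0289$.

```python
import random
import statistics

random.seed(42)

# Average of n uniform(0, 1) draws; by the CLT the distribution of
# this average is approximately normal for large n.
def sample_mean(n):
    return sum(random.random() for _ in range(n)) / n

means = [sample_mean(100) for _ in range(2000)]

print(statistics.mean(means))   # close to 0.5
print(statistics.stdev(means))  # close to 0.0289
```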
Continuous Random Variable: A continuous random variable is a variable that can take on an infinite number of values within a given range, often represented by real numbers. These variables are characterized by a probability density function (PDF), which describes the likelihood of the variable falling within a particular interval. Understanding continuous random variables is essential for analyzing distributions and relationships between multiple random variables.
Convolution: Convolution is a mathematical operation that combines two functions to produce a third function, expressing how the shape of one function is modified by another. In the context of random variables, it is particularly important for determining the probability distribution of the sum of independent random variables. Understanding convolution helps in analyzing the behavior of functions of random variables and is closely linked to characteristic functions, which are useful in deriving properties and applications in probability theory.
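For discrete variables, the convolution in this definition reduces to a double sum over the two PMFs. A Python sketch (our own example, using the classic sum of two fair dice):

```python
# PMF of a fair six-sided die.
die = {x: 1 / 6 for x in range(1, 7)}

# Convolution of two PMFs gives the PMF of the sum of two
# independent discrete random variables.
def convolve(pmf_a, pmf_b):
    out = {}
    for x, px in pmf_a.items():
        for y, py in pmf_b.items():
            out[x + y] = out.get(x + y, 0) + px * py
    return out

two_dice = convolve(die, die)
print(two_dice[7])  # 6/36 ~ 0.1667, the most likely total
```

The resulting PMF runs over totals 2 through 12 and still sums to 1, as any valid PMF must.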
Cumulative Distribution Function: The cumulative distribution function (CDF) is a statistical tool that describes the probability that a random variable takes on a value less than or equal to a specific value. This function provides a complete characterization of the distribution of the random variable, allowing for the analysis of both discrete and continuous scenarios. It connects various concepts like random variables, probability mass functions, and density functions, serving as a foundation for understanding different distributions and their properties.
Discrete Random Variable: A discrete random variable is a type of variable that can take on a countable number of distinct values, often representing outcomes of a random process. These variables are crucial in defining probability distributions, allowing us to understand and calculate probabilities associated with different outcomes. They play a central role in constructing probability mass functions and are also fundamental in exploring marginal and conditional distributions in statistical analysis.
Expected Value: Expected value is a fundamental concept in probability that quantifies the average outcome of a random variable over numerous trials. It serves as a way to anticipate the long-term results of random processes and is crucial for decision-making in uncertain environments. This concept is deeply connected to randomness, random variables, and probability distributions, allowing us to calculate meaningful metrics such as averages, risks, and expected gains or losses.
Identically Distributed: Identically distributed refers to a condition where two or more random variables share the same probability distribution. This means that they exhibit the same statistical properties, such as mean, variance, and shape of the distribution. Recognizing when random variables are identically distributed is crucial in various scenarios, including understanding the behavior of sample averages and applying statistical methods such as the central limit theorem.
Independence: Independence refers to the condition where two events or random variables do not influence each other, meaning the occurrence of one event does not affect the probability of the other. This concept is crucial for understanding relationships between variables, how probabilities are computed, and how certain statistical methods are applied in various scenarios.
Law of Large Numbers: The law of large numbers is a fundamental statistical theorem that states as the number of trials in a random experiment increases, the sample mean will converge to the expected value (population mean). This principle highlights the relationship between probability and actual outcomes, ensuring that over time, averages stabilize, making it a crucial concept in understanding randomness and variability.
Mean: The mean, often referred to as the average, is a measure of central tendency that quantifies the expected value of a random variable. It represents the balancing point of a probability distribution, providing insight into the typical outcome one can expect from a set of data or a probability distribution. The concept of the mean is essential in understanding various statistical properties and distributions, as it lays the foundation for further analysis and interpretation.
Probability Density Function: A probability density function (PDF) describes the likelihood of a continuous random variable taking on a specific value. Unlike discrete probabilities, which can be summed, a PDF must be integrated over an interval to determine the probability of the variable falling within that range, highlighting its continuous nature.
Probability Distribution: A probability distribution is a mathematical function that describes the likelihood of different outcomes in a random experiment. It provides a comprehensive way to understand how probabilities are distributed across the possible values of a random variable, which is essential for making informed predictions and decisions. In contexts involving randomness and uncertainty, probability distributions help to define the behavior of random variables, guide estimation methods, and support simulations.
Probability Mass Function: A probability mass function (PMF) is a function that gives the probability of a discrete random variable taking on a specific value. It assigns probabilities to each possible value in the sample space, ensuring that the sum of these probabilities equals one. The PMF helps in understanding how likely each outcome is, which is crucial when working with discrete random variables.
Random variable: A random variable is a numerical outcome of a random process, which can take on different values based on the result of a random event. This concept is fundamental in probability and statistics, as it allows us to quantify uncertainty and analyze various scenarios. Random variables can be classified into discrete and continuous types, helping us to connect probability distributions with real-world applications and stochastic processes.
Reliability Analysis: Reliability analysis is a statistical method used to assess the consistency and dependability of a system or component over time. It focuses on determining the probability that a system will perform its intended function without failure during a specified period under stated conditions. This concept is deeply interconnected with random variables and their distributions, as understanding the behavior of these variables is crucial for modeling the reliability of systems and processes.
Signal Processing: Signal processing involves the analysis, interpretation, and manipulation of signals, which can be any physical quantity that varies over time or space. This field is crucial for extracting meaningful information from raw data, enabling the effective transformation and representation of random variables, understanding correlations, and analyzing processes that change over time.
Standard Deviation: Standard deviation is a statistical measure that quantifies the amount of variation or dispersion in a set of values. It helps to understand how spread out the numbers are in a dataset, indicating whether they are close to the mean or widely scattered. In probability and randomness, it is crucial for assessing risk, variability in random variables, and is essential in evaluating distributions such as normal and hypergeometric distributions.
Transformation techniques: Transformation techniques refer to mathematical methods used to derive the probability distribution of a new random variable from an existing one. These techniques are crucial for understanding how the behavior of a random variable changes when it undergoes a transformation, such as scaling or shifting. By applying transformation techniques, one can analyze complex problems in probability and statistics by leveraging known distributions.