Queueing theory is a branch of stochastic processes that studies waiting lines. It analyzes customer arrivals, service times, and system performance to optimize resource allocation and improve efficiency in various real-world scenarios.
From simple single-server models to complex networks, queueing theory provides tools to evaluate and enhance systems. It's used in call centers, hospitals, manufacturing, and more, helping organizations make data-driven decisions to balance customer satisfaction and operational costs.
Queueing theory studies the behavior of waiting lines or queues and involves the mathematical analysis of queue lengths, waiting times, and system performance
A queueing system consists of customers arriving for service, waiting in a queue if the service is not immediately available, and leaving the system after being served
Customers can refer to people, objects, or tasks that require service from a resource or server
Servers are the resources that provide the required service to the customers and can be represented by human operators, machines, or communication channels
Arrival process describes the pattern in which customers enter the queueing system, often characterized by the inter-arrival times between consecutive customers
Service process represents the pattern in which customers are served by the servers, often described by the service times required by each customer
Queue discipline specifies the order in which customers are selected from the queue for service (FIFO, LIFO, priority)
Queueing Models and Notation
Queueing models are mathematical representations of queueing systems used to analyze system performance and make operational decisions
Kendall's notation is a standard way to describe queueing models and has the form A/S/c/K/N/D
A represents the arrival process (M for Markovian, G for general, D for deterministic)
S represents the service process (M for Markovian, G for general, D for deterministic)
c represents the number of servers
K represents the maximum number of customers allowed in the system (default: infinite)
N represents the size of the population from which customers arrive (default: infinite)
D represents the queue discipline (default: FIFO)
Common queueing models include M/M/1 (single server with Poisson arrivals and exponential service times), M/M/c (multiple servers with Poisson arrivals and exponential service times), and M/G/1 (single server with Poisson arrivals and general service times)
Birth-death processes are often used to model queueing systems, where births represent arrivals and deaths represent departures
Arrival and Service Processes
The arrival process is often modeled using a Poisson process, where the inter-arrival times between consecutive customers are independent and exponentially distributed with rate λ
The service process is frequently modeled using an exponential distribution with rate μ, implying that the service times are independent and identically distributed
The traffic intensity or utilization factor, denoted by ρ, is defined as the ratio of the arrival rate to the service rate multiplied by the number of servers: ρ = λ / (cμ)
For a stable system, ρ must be less than 1 to ensure that the queue does not grow indefinitely
General arrival and service processes can be modeled using phase-type distributions, which are constructed by combining exponential distributions
Batch arrivals and batch service can be incorporated into queueing models to represent situations where customers arrive or are served in groups
Performance Measures
Performance measures are used to evaluate the efficiency and effectiveness of a queueing system and to make decisions about resource allocation and system design
The average number of customers in the system (L) and the average number of customers in the queue (Lq) provide insights into the congestion level of the system
The average time a customer spends in the system (W) and the average time a customer spends waiting in the queue (Wq) are important measures of the system's responsiveness
The probability of having n customers in the system (Pn) and the probability of the system being empty (P0) are used to analyze the distribution of the number of customers in the system
The server utilization (ρ) indicates the proportion of time the servers are busy and is a key factor in determining the system's efficiency
The probability of a customer waiting in the queue (Pw) and the probability of a customer being blocked or lost (Pb) are relevant in systems with finite waiting rooms or impatient customers
Little's Law and Its Applications
Little's Law is a fundamental result in queueing theory that relates the average number of customers in a system to the average arrival rate and the average time spent in the system
The law states that L = λW, where L is the average number of customers in the system, λ is the average arrival rate, and W is the average time a customer spends in the system
Little's Law holds for any stable queueing system, regardless of the arrival process, service process, or queue discipline, making it a powerful tool for analyzing queueing systems
The law can be applied to subsystems, such as the queue alone, yielding Lq = λWq, where Lq is the average number of customers in the queue and Wq is the average waiting time in the queue
Little's Law is used to estimate performance measures, plan capacity, and optimize resource allocation in various applications, such as manufacturing, healthcare, and telecommunications
Markovian Queues (M/M/1, M/M/c)
Markovian queues are characterized by Poisson arrival processes (M) and exponential service times (M), resulting in a Markov chain representation of the system
The M/M/1 queue is the simplest Markovian queue with a single server, unlimited queue capacity, and FIFO discipline
The steady-state probability of having n customers in the system is given by Pn = (1 - ρ)ρ^n, where ρ = λ/μ
The average number of customers in the system is L = ρ / (1 - ρ), and the average waiting time in the system is W = 1 / (μ - λ)
The M/M/c queue is an extension of the M/M/1 queue with multiple servers (c) working in parallel
The steady-state probabilities and performance measures can be derived using the birth-death process and the balance equations
The Erlang C formula gives the probability of waiting in the queue, while the Erlang B formula provides the probability of blocking in a system with no waiting room
Markovian queues have the memoryless property, which simplifies the analysis and allows for closed-form solutions of performance measures
Queueing Networks
Queueing networks are systems composed of multiple interconnected queues, where customers may visit several queues before leaving the system
Open queueing networks have external arrivals and departures, while closed queueing networks have a fixed number of customers circulating within the system
Jackson networks are a class of open queueing networks where the arrival process at each node is Poisson, the service times are exponentially distributed, and the routing of customers between nodes is probabilistic
The steady-state distribution of a Jackson network can be expressed as the product of the individual queue distributions, known as the product-form solution
Closed queueing networks are often used to model systems with a fixed population, such as computer systems or manufacturing cells
The Gordon-Newell theorem provides the steady-state distribution for closed queueing networks with exponential service times and a fixed number of customers
Queueing network models are used to analyze the performance of complex systems, optimize resource allocation, and evaluate the impact of system changes or upgrades
Applications and Real-World Examples
Call centers use queueing theory to determine staffing levels, minimize customer waiting times, and improve service quality
The Erlang C formula is often used to calculate the probability of waiting and the average waiting time in a call center with multiple agents
Hospital emergency departments apply queueing models to manage patient flow, reduce waiting times, and optimize resource utilization
Priority queueing models can be used to triage patients based on the severity of their condition
Manufacturing systems employ queueing theory to balance production lines, minimize work-in-process inventory, and maximize throughput
Jackson networks can model the flow of parts through a manufacturing system with multiple workstations
Computer and telecommunication networks use queueing theory to analyze the performance of servers, routers, and switches
The M/M/1/K queue can model a router with a finite buffer size (K) to determine the probability of packet loss
Transportation systems, such as airports and traffic intersections, apply queueing models to optimize flow, reduce congestion, and minimize delays
The M/G/1 queue can be used to model a single-lane traffic intersection with general service times representing the time required for vehicles to cross the intersection