Linear algebra forms the backbone of control theory, providing tools to represent and analyze linear systems. Matrices, vectors, and linear transformations allow engineers to model system dynamics and design controllers efficiently.
Key concepts like eigenvalues, inner products, and least squares approximation enable stability analysis, optimization, and system identification. These fundamentals are essential for understanding and applying control theory principles.
Fundamentals of linear algebra
- Linear algebra is a branch of mathematics that deals with linear equations, matrices, and vector spaces, providing a foundation for many areas of science and engineering, including control theory
- Understanding the basics of linear algebra is essential for analyzing and designing control systems, as it allows for the representation and manipulation of linear systems in a concise and efficient manner
Scalars, vectors, and matrices
- Scalars are single numbers, while vectors are ordered lists of numbers (typically represented as columns) that can represent quantities with both magnitude and direction
- Matrices are rectangular arrays of numbers arranged in rows and columns, used to represent linear transformations and systems of linear equations
- Scalar multiplication involves multiplying a vector or matrix by a scalar, while vector and matrix addition involves adding corresponding elements
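As a minimal sketch (assuming NumPy is available, with illustrative values), the snippet below builds a scalar, a column vector, and a matrix and applies scalar multiplication, element-wise addition, and a matrix-vector product:

```python
import numpy as np

a = 3.0                                # a scalar
v = np.array([[1.0], [2.0]])           # a 2x1 column vector
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])             # a 2x2 matrix

print(a * v)                           # scalar multiplication scales every entry
print(A + np.eye(2))                   # matrix addition is element-wise
print(A @ v)                           # matrix-vector product applies the linear map A to v
```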
Linear equations and systems
- A linear equation is an equation in which the variables appear with a power of one and are not multiplied together, such as $ax + by = c$
- A system of linear equations consists of two or more linear equations with the same variables, and the solution is the set of values that satisfies all equations simultaneously
- Linear systems can be used to model various phenomena in control theory, such as the relationship between inputs and outputs of a linear system
- Row reduction is a systematic process of applying elementary row operations (row switching, scalar multiplication, and row addition) to a matrix to simplify it into a more manageable form
- Row echelon form (REF) requires that all non-zero rows lie above any rows of zeros and that the leading entry (first non-zero entry from the left) of each row lies strictly to the right of the leading entry of the row above it
- Reduced row echelon form (RREF) additionally requires every leading entry to be 1 and to be the only non-zero entry in its column; the RREF of a matrix is unique
- Row reduction is used to solve systems of linear equations, find the rank of a matrix, and compute the inverse of a matrix
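A short sketch of these ideas in code, assuming NumPy and SymPy are available (the specific system is illustrative): `np.linalg.solve` handles an invertible square system directly, while SymPy's `rref` shows the reduced row echelon form of the augmented matrix.

```python
import numpy as np
from sympy import Matrix   # SymPy assumed available for exact row reduction

# System: x + 2y = 5, 3x + 4y = 6, written as Ax = b
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
b = np.array([5.0, 6.0])

x = np.linalg.solve(A, b)             # unique solution since A is invertible
print(x)                              # -> [-4.   4.5]

# Row reduction of the augmented matrix [A | b] to reduced row echelon form
rref, pivots = Matrix([[1, 2, 5], [3, 4, 6]]).rref()
print(rref)                           # leading 1s, zeros elsewhere in pivot columns
print(np.linalg.matrix_rank(A))       # rank of A is 2
```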
Vector spaces and subspaces
- Vector spaces are fundamental structures in linear algebra that consist of a set of vectors and two operations (addition and scalar multiplication) that satisfy certain axioms
- Understanding vector spaces is crucial for analyzing the properties and behavior of linear systems in control theory, as the state space and input/output spaces are typically modeled as vector spaces
Vector space axioms and properties
- A vector space $V$ over a field $F$ must satisfy the following axioms for vector addition and scalar multiplication:
- Closure under addition: For any $u, v \in V$, $u + v \in V$
- Associativity of addition: For any $u, v, w \in V$, $(u + v) + w = u + (v + w)$
- Commutativity of addition: For any $u, v \in V$, $u + v = v + u$
- Additive identity: There exists a unique zero vector $0 \in V$ such that $v + 0 = v$ for all $v \in V$
- Additive inverses: For every $v \in V$, there exists a unique vector $-v \in V$ such that $v + (-v) = 0$
- Closure under scalar multiplication: For any $a \in F$ and $v \in V$, $av \in V$
- Distributivity of scalar multiplication over vector addition: For any $a \in F$ and $u, v \in V$, $a(u + v) = au + av$
- Distributivity of scalar multiplication over field addition: For any $a, b \in F$ and $v \in V$, $(a + b)v = av + bv$
- Compatibility of scalar multiplication: For any $a, b \in F$ and $v \in V$, $(ab)v = a(bv)$
- Scalar multiplicative identity: For any $v \in V$, $1v = v$, where $1$ is the multiplicative identity in $F$
- Common examples of vector spaces include $\mathbb{R}^n$ (the space of $n$-dimensional real vectors), $\mathbb{C}^n$ (the space of $n$-dimensional complex vectors), and the space of polynomials of degree at most $n$
Null space, column space, and row space
- The null space (or kernel) of a matrix $A$ is the set of all vectors $x$ such that $Ax = 0$, representing the solution space of the homogeneous linear system $Ax = 0$
- The column space (or range) of a matrix $A$ is the set of all linear combinations of the columns of $A$, representing the output space of the linear transformation defined by $A$
- The row space of a matrix $A$ is the set of all linear combinations of the rows of $A$, which is equal to the column space of the transpose of $A$ ($A^T$)
- The null space, column space, and row space are important subspaces associated with a matrix and provide insights into the properties of linear systems
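The sketch below (assuming NumPy and SciPy are available; the matrix is an illustrative rank-deficient example) computes the rank and an orthonormal basis of the null space, and checks the rank-nullity relationship numerically:

```python
import numpy as np
from scipy.linalg import null_space   # SciPy assumed available

# Rank-deficient 3x3 matrix: the third column equals the sum of the first two
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 9.0],
              [7.0, 8.0, 15.0]])

r = np.linalg.matrix_rank(A)
N = null_space(A)                     # orthonormal basis for the null space of A
print(r)                              # rank = dim(column space) = 2
print(N.shape)                        # (3, 1): nullity = 3 - rank
print(np.allclose(A @ N, 0))          # every null-space vector x satisfies Ax = 0
```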
Basis and dimension of vector spaces
- A basis of a vector space $V$ is a linearly independent set of vectors that spans $V$, meaning that every vector in $V$ can be uniquely expressed as a linear combination of the basis vectors
- The dimension of a vector space $V$ is the number of vectors in any basis of $V$, and it represents the "size" or "degrees of freedom" of the vector space
- A vector space is called finite-dimensional if it has a finite basis, and infinite-dimensional otherwise
- The standard basis for $\mathbb{R}^n$ consists of the vectors $e_1 = (1, 0, \ldots, 0)$, $e_2 = (0, 1, \ldots, 0)$, $\ldots$, $e_n = (0, 0, \ldots, 1)$, and the dimension of $\mathbb{R}^n$ is $n$
Linear transformations
- Linear transformations are functions between vector spaces that preserve the structure of the spaces, i.e., they respect vector addition and scalar multiplication
- Linear transformations are essential in control theory for modeling the behavior of linear systems, as they describe how the system's state and output change in response to inputs
- A linear transformation (or linear map) $T: V \to W$ between vector spaces $V$ and $W$ over the same field $F$ satisfies the following properties for all $u, v \in V$ and $a \in F$:
- Additivity: $T(u + v) = T(u) + T(v)$
- Homogeneity: $T(av) = aT(v)$
- Linear transformations preserve the zero vector, i.e., $T(0_V) = 0_W$, where $0_V$ and $0_W$ are the zero vectors in $V$ and $W$, respectively
- The composition of two linear transformations is also a linear transformation, i.e., if $T: V \to W$ and $S: W \to U$ are linear transformations, then $S \circ T: V \to U$ is also a linear transformation
- Examples of linear transformations include matrix multiplication, differentiation, and integration (under certain conditions)
- The kernel (or null space) of a linear transformation $T: V \to W$ is the set of all vectors $v \in V$ such that $T(v) = 0_W$, i.e., $\ker(T) = \{v \in V : T(v) = 0_W\}$
- The range (or image) of a linear transformation $T: V \to W$ is the set of all vectors $w \in W$ such that $w = T(v)$ for some $v \in V$, i.e., $\text{range}(T) = \{w \in W : w = T(v) \text{ for some } v \in V\}$
- The kernel and range are subspaces of the domain and codomain of the linear transformation, respectively
- The dimension of the kernel is called the nullity of the transformation, while the dimension of the range is called the rank of the transformation
- Every linear transformation $T: V \to W$ between finite-dimensional vector spaces can be represented by a matrix $A$ with respect to chosen bases of $V$ and $W$
- If $\{v_1, \ldots, v_n\}$ is a basis for $V$ and $\{w_1, \ldots, w_m\}$ is a basis for $W$, then the matrix $A$ of the linear transformation $T$ is the $m \times n$ matrix whose $j$-th column is the coordinate vector of $T(v_j)$ with respect to the basis $\{w_1, \ldots, w_m\}$
- Matrix multiplication corresponds to the composition of linear transformations, i.e., if $T: V \to W$ has matrix $A$ and $S: W \to U$ has matrix $B$, then the matrix of $S \circ T: V \to U$ is $BA$
- The rank of a matrix $A$ is equal to the rank of the linear transformation it represents, and the nullity of $A$ is equal to the nullity of the transformation
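A minimal sketch of these points, assuming NumPy and the standard bases of $\mathbb{R}^n$ (the specific matrices are illustrative): composition of transformations corresponds to the matrix product, and rank and nullity can be read off the representing matrix.

```python
import numpy as np

# T: R^3 -> R^2 represented (w.r.t. standard bases) by a 2x3 matrix A,
# and S: R^2 -> R^2 represented by a 2x2 matrix B
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
B = np.array([[2.0, 0.0],
              [0.0, 3.0]])

v = np.array([1.0, 2.0, 3.0])
print(np.allclose(B @ (A @ v), (B @ A) @ v))   # the composition S∘T has matrix BA

rank = np.linalg.matrix_rank(A)
nullity = A.shape[1] - rank                    # rank-nullity: rank + nullity = dim(V)
print(rank, nullity)                           # -> 2 1
```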
Eigenvalues and eigenvectors
- Eigenvalues and eigenvectors are crucial concepts in linear algebra that describe the behavior of a linear transformation or matrix when applied to certain vectors
- In control theory, eigenvalues and eigenvectors play a vital role in analyzing the stability and dynamics of linear systems, as they characterize the system's response to inputs and perturbations
Characteristic equation and eigenvalues
- An eigenvector of a square matrix $A$ is a non-zero vector $v$ such that $Av = \lambda v$ for some scalar $\lambda$, where $\lambda$ is called the eigenvalue corresponding to $v$
- The characteristic equation of a square matrix $A$ is $\det(A - \lambda I) = 0$, where $I$ is the identity matrix and $\det$ denotes the determinant
- The roots of the characteristic equation are the eigenvalues of the matrix $A$
- The eigenvalues of a matrix $A$ determine important properties of the linear transformation it represents, such as whether it is invertible, diagonalizable, or stable
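A small numerical check (assuming NumPy; the matrix is illustrative) that the eigenvalues returned by an eigenvalue solver coincide with the roots of the characteristic polynomial:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Eigenvalues computed directly
print(np.sort(np.linalg.eigvals(A)))       # -> [1. 3.]

# The same values are the roots of det(A - lambda*I) = 0
coeffs = np.poly(A)                        # coefficients of the characteristic polynomial
print(np.sort(np.roots(coeffs)))           # -> [1. 3.]
```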
Eigenvectors and eigenspaces
- For each eigenvalue $\lambda$ of a matrix $A$, the set of all eigenvectors corresponding to $\lambda$, together with the zero vector, forms a subspace called the eigenspace of $A$ associated with $\lambda$
- The dimension of an eigenspace is called the geometric multiplicity of the corresponding eigenvalue
- The algebraic multiplicity of an eigenvalue is its multiplicity as a root of the characteristic equation
- The geometric multiplicity of an eigenvalue is always less than or equal to its algebraic multiplicity
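As a sketch (assuming NumPy; the matrix is a standard illustrative example of a defective matrix), the geometric multiplicity of an eigenvalue $\lambda$ can be computed as the nullity of $A - \lambda I$ and compared with its algebraic multiplicity:

```python
import numpy as np

# Defective matrix: eigenvalue 2 has algebraic multiplicity 2
A = np.array([[2.0, 1.0],
              [0.0, 2.0]])

lam = 2.0
# Geometric multiplicity = dim of the eigenspace = nullity of (A - lam*I)
geom_mult = A.shape[0] - np.linalg.matrix_rank(A - lam * np.eye(2))
print(geom_mult)    # -> 1, strictly less than the algebraic multiplicity 2
```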
Diagonalization of matrices
- A square matrix $A$ is diagonalizable if it is similar to a diagonal matrix, i.e., there exists an invertible matrix $P$ such that $P^{-1}AP = D$, where $D$ is a diagonal matrix
- The diagonal entries of $D$ are the eigenvalues of $A$, and the columns of $P$ are the corresponding eigenvectors
- A matrix $A$ is diagonalizable if and only if it has a full set of linearly independent eigenvectors, i.e., the sum of the geometric multiplicities of its eigenvalues is equal to its size
- Diagonalization simplifies matrix powers and exponentials, which is useful for solving systems of linear differential equations in control theory
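The sketch below (assuming NumPy; the matrix is illustrative and has distinct eigenvalues, hence is diagonalizable) verifies $A = PDP^{-1}$ and uses the factorization to compute a matrix power cheaply:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

eigvals, P = np.linalg.eig(A)          # columns of P are eigenvectors
D = np.diag(eigvals)

print(np.allclose(P @ D @ np.linalg.inv(P), A))   # A = P D P^{-1}

# Matrix powers are cheap once A is diagonalized: A^k = P D^k P^{-1}
k = 5
A_pow = P @ np.diag(eigvals**k) @ np.linalg.inv(P)
print(np.allclose(A_pow, np.linalg.matrix_power(A, k)))
```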
Inner product spaces
- An inner product space is a vector space equipped with an inner product, which is a generalization of the dot product that allows for the measurement of lengths and angles between vectors
- Inner product spaces are fundamental in control theory for defining optimization problems, analyzing system stability, and designing optimal controllers
Dot product and inner product
- The dot product of two vectors $x = (x_1, \ldots, x_n)$ and $y = (y_1, \ldots, y_n)$ in $\mathbb{R}^n$ is defined as $x \cdot y = x_1y_1 + \ldots + x_ny_n$
- An inner product on a vector space $V$ over a field $F$ (either $\mathbb{R}$ or $\mathbb{C}$) is a function $\langle \cdot, \cdot \rangle: V \times V \to F$ that satisfies the following axioms for all $x, y, z \in V$ and $a \in F$:
- Conjugate symmetry: $\langle x, y \rangle = \overline{\langle y, x \rangle}$
- Linearity in the second argument: $\langle x, ay + z \rangle = a\langle x, y \rangle + \langle x, z \rangle$
- Positive definiteness: $\langle x, x \rangle \geq 0$ and $\langle x, x \rangle = 0$ if and only if $x = 0$
- The dot product is an example of an inner product on $\mathbb{R}^n$, and the standard inner product on $\mathbb{C}^n$ is defined as $\langle x, y \rangle = \overline{x}_1y_1 + \ldots + \overline{x}_ny_n$
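A brief sketch (assuming NumPy; the vectors are illustrative) of the real dot product and the standard complex inner product, where `np.vdot` conjugates its first argument, matching the convention above:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])
print(np.dot(x, y))                    # real dot product: 1*4 + 2*5 + 3*6 = 32

# Standard inner product on C^n: conjugate the first argument
u = np.array([1 + 2j, 3j])
v = np.array([2 - 1j, 1 + 1j])
print(np.vdot(u, v))                   # sum(conj(u) * v)
print(np.vdot(u, u).real >= 0)         # positive definiteness: <u, u> is real and >= 0
```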
Orthogonality and orthonormal bases
- Two vectors $x$ and $y$ in an inner product space are called orthogonal if their inner product is zero, i.e., $\langle x, y \rangle = 0$
- A set of vectors $\{v_1, \ldots, v_n\}$ is called orthogonal if any two distinct vectors in the set are orthogonal
- An orthogonal set of vectors is called orthonormal if each vector has unit length, i.e., $\langle v_i, v_i \rangle = 1$ for all $i$
- An orthonormal basis is a basis whose vectors are pairwise orthogonal and have unit length, and it provides a convenient coordinate system for representing vectors and linear transformations
Gram-Schmidt orthogonalization process
- The Gram-Schmidt orthogonalization process is an algorithm for constructing an orthonormal basis from a given set of linearly independent vectors
- The process works by sequentially subtracting the projection of each vector onto the previous orthogonal vectors, and then normalizing the resulting vector
- Given a set of linearly independent vectors $\{v_1, \ldots, v_n\}$, the Gram-Schmidt process produces an orthonormal set $\{e_1, \ldots, e_n\}$ as follows:
- Set $e_1 = \frac{v_1}{|v_1|}$
- For $i = 2, \ldots, n$:
  a. Set $u_i = v_i - \sum_{j=1}^{i-1} \langle e_j, v_i \rangle e_j$
b. Set $e_i = \frac{u_i}{|u_i|}$
- The Gram-Schmidt process is useful for solving least squares problems, computing QR decompositions, and designing orthogonal controllers
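A minimal implementation sketch of the procedure above for real vectors, assuming NumPy (the helper name `gram_schmidt` and the input vectors are illustrative):

```python
import numpy as np

def gram_schmidt(vectors):
    """Return an orthonormal set spanning the same space as `vectors`.

    `vectors` is a list of linearly independent 1-D NumPy arrays.
    """
    basis = []
    for v in vectors:
        # Subtract the projection of v onto each previously built basis vector
        u = v - sum(np.dot(e, v) * e for e in basis)
        basis.append(u / np.linalg.norm(u))   # normalize to unit length
    return basis

v1 = np.array([1.0, 1.0, 0.0])
v2 = np.array([1.0, 0.0, 1.0])
e1, e2 = gram_schmidt([v1, v2])
print(np.dot(e1, e2))                             # ~0: the vectors are orthogonal
print(np.linalg.norm(e1), np.linalg.norm(e2))     # both 1: the set is orthonormal
```

In practice the same orthonormal basis can be obtained from the Q factor of a QR decomposition (e.g., `np.linalg.qr`), which is numerically more robust than the classical procedure sketched here.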
Least squares approximation
- Least squares approximation is a method for finding the best-fitting curve or function to a given set of data points by minimizing the sum of the squares of the residuals (differences between the observed and predicted values)
- In control theory, least squares techniques are used for system identification, parameter estimation, and optimal control design
Orthogonal projections and least squares
- Given a vector $y$ and a subspace $W$ of an inner product space $V$, the orthogonal projection of $y$ onto $W$ is the unique vector $\hat{y} \in W$ that minimizes the distance $|y - \hat{y}|$
- The orthogonal projection $\hat{y}$ is characterized by the property that $y - \hat{y}$ is orthogonal to every vector in $W$
- The least squares problem can be formulated as finding the orthogonal projection of a vector $y$ onto the subspace spanned by the columns of a matrix $A$, i.e., minimizing $|Ax - y|$ over all vectors $x$
- The solution to the least squares problem is given by the normal equations $A^TAx = A^Ty$, where $A^T$ denotes the transpose of $A$
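A short sketch (assuming NumPy; the data values are illustrative) that fits a straight line by least squares and checks the orthogonality property of the residual:

```python
import numpy as np

# Fit y ≈ c0 + c1*t to noisy data by least squares
t = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.1, 1.9, 3.2, 3.9])

A = np.column_stack([np.ones_like(t), t])      # design matrix with columns [1, t]
x, residuals, rank, sv = np.linalg.lstsq(A, y, rcond=None)
print(x)                                       # [intercept, slope] minimizing |Ax - y|

# The fitted values A @ x are the orthogonal projection of y onto col(A)
r = y - A @ x
print(np.allclose(A.T @ r, 0))                 # residual is orthogonal to the column space
```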
Normal equations and pseudoinverse
- The normal equations $A^TAx = A^Ty$ arise from the orthogonality condition between the residual vector $y - Ax$ and the column space of $A$
- If the matrix $A$ has full column rank, then $A^TA$ is invertible, and the unique least squares solution is given by $x = (A^TA)^{-1}A^Ty$; in this case the matrix $(A^TA)^{-1}A^T$ is the (Moore-Penrose) pseudoinverse of $A$
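A closing sketch (assuming NumPy; the overdetermined system is illustrative) showing that the normal equations, the pseudoinverse, and a least squares solver all give the same solution when $A$ has full column rank:

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
y = np.array([1.0, 2.0, 2.0])

# Solve the normal equations A^T A x = A^T y (A has full column rank)
x_normal = np.linalg.solve(A.T @ A, A.T @ y)

# Equivalent solutions via the pseudoinverse and via a least squares solver
x_pinv = np.linalg.pinv(A) @ y
x_lstsq, *_ = np.linalg.lstsq(A, y, rcond=None)

print(np.allclose(x_normal, x_pinv) and np.allclose(x_normal, x_lstsq))  # True
```

Note that forming $A^TA$ explicitly squares the condition number of the problem, so QR- or SVD-based solvers such as `np.linalg.lstsq` are usually preferred numerically over solving the normal equations directly.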