➗ Linear Algebra for Data Science Unit 3 – Vector Spaces and Subspaces

Vector spaces and subspaces form the foundation of linear algebra, providing a framework for understanding multidimensional data structures. This unit explores the properties of vector spaces, including closure, associativity, and distributivity, while introducing key concepts like linear combinations, span, and linear independence.
The study of subspaces, basis vectors, and dimensionality offers powerful tools for data analysis and machine learning. These concepts enable efficient data representation, feature selection, and dimensionality reduction, crucial for tackling high-dimensional datasets and uncovering underlying patterns in complex systems.
Key Concepts and Definitions
Vector space consists of a set of vectors and two operations (vector addition and scalar multiplication) that satisfy certain properties
Subspace is a subset of a vector space that is closed under vector addition and scalar multiplication
Linear combination expresses a vector as a sum of scalar multiples of other vectors
Span refers to the set of all possible linear combinations of a given set of vectors
Geometrically represents the space "spanned" by the vectors
Linear independence means no vector in the set can be expressed as a linear combination of the others
Removing any vector from the set changes the span
Linear dependence occurs when one or more vectors can be expressed as linear combinations of the others
Basis is a linearly independent set of vectors that spans the entire vector space
Dimension equals the number of vectors in a basis for a vector space
Vector Space Fundamentals
Vector spaces are defined over a field (typically the real or complex numbers)
Elements of a vector space are called vectors and can be represented as arrays or lists of numbers
Vector addition combines two vectors element-wise: $\mathbf{u} + \mathbf{v} = [u_1 + v_1, u_2 + v_2, \ldots, u_n + v_n]$
Scalar multiplication scales each element of a vector by a constant: $c\mathbf{v} = [cv_1, cv_2, \ldots, cv_n]$
Zero vector $\mathbf{0}$ has all elements equal to zero and serves as the identity element for vector addition
Negative of a vector $-\mathbf{v}$ satisfies $\mathbf{v} + (-\mathbf{v}) = \mathbf{0}$
Standard basis vectors $\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n$ are defined so that $\mathbf{e}_i$ has a 1 in the $i$-th position and 0s elsewhere
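These element-wise operations map directly onto NumPy arrays; a minimal sketch, with illustrative values:

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])
c = 2.5

# Vector addition: element-wise sum
print(u + v)          # [5. 7. 9.]

# Scalar multiplication: scales every component
print(c * v)          # [10.  12.5 15. ]

# Zero vector and additive inverse
zero = np.zeros(3)
print(np.allclose(v + (-v), zero))   # True

# Standard basis vectors e_1, e_2, e_3 are the columns of the identity matrix
print(np.eye(3))
```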
Properties of Vector Spaces
Closure under vector addition: $\mathbf{u} + \mathbf{v}$ is in the vector space for any $\mathbf{u}$ and $\mathbf{v}$ in the space
Closure under scalar multiplication: $c\mathbf{v}$ is in the vector space for any scalar $c$ and vector $\mathbf{v}$ in the space
Associativity of vector addition: $(\mathbf{u} + \mathbf{v}) + \mathbf{w} = \mathbf{u} + (\mathbf{v} + \mathbf{w})$
Commutativity of vector addition: $\mathbf{u} + \mathbf{v} = \mathbf{v} + \mathbf{u}$
Identity element for vector addition: $\mathbf{v} + \mathbf{0} = \mathbf{v}$ for any vector $\mathbf{v}$
Inverse elements for vector addition: for any $\mathbf{v}$, there exists $-\mathbf{v}$ such that $\mathbf{v} + (-\mathbf{v}) = \mathbf{0}$
Distributivity of scalar multiplication over vector addition: $c(\mathbf{u} + \mathbf{v}) = c\mathbf{u} + c\mathbf{v}$
Distributivity of scalar multiplication over field addition: $(c + d)\mathbf{v} = c\mathbf{v} + d\mathbf{v}$
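For vectors in $\mathbb{R}^n$, each of these identities can be spot-checked numerically (using `np.allclose` because floating-point arithmetic is only approximate); a minimal sketch with arbitrary example values:

```python
import numpy as np

u, v, w = np.array([1., 2.]), np.array([3., -1.]), np.array([0.5, 4.])
c, d = 2.0, -3.0

# Associativity and commutativity of vector addition
assert np.allclose((u + v) + w, u + (v + w))
assert np.allclose(u + v, v + u)

# Identity and inverse elements for vector addition
assert np.allclose(v + np.zeros(2), v)
assert np.allclose(v + (-v), np.zeros(2))

# Both distributive laws
assert np.allclose(c * (u + v), c * u + c * v)
assert np.allclose((c + d) * v, c * v + d * v)

print("all identities hold for these example values")
```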
Subspaces: Definition and Examples
Subspace is a non-empty subset of a vector space that is closed under vector addition and scalar multiplication
Inherits the vector space properties from the parent space
Examples of subspaces include lines through the origin in $\mathbb{R}^2$, and lines and planes through the origin in $\mathbb{R}^3$
Set of all polynomials of degree at most $n$ forms a subspace of the vector space of all polynomials
Null space (kernel) of a matrix $A$ is a subspace of the domain, defined as $\{\mathbf{x} : A\mathbf{x} = \mathbf{0}\}$
Column space (range) of a matrix $A$ is a subspace of the codomain, spanned by the columns of $A$
Row space of a matrix $A$ is a subspace of the domain, spanned by the rows of $A$
Eigenspaces corresponding to eigenvalues of a matrix are subspaces
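The null space and column space of a concrete matrix can be computed directly; a minimal sketch using SciPy, with an illustrative rank-deficient matrix:

```python
import numpy as np
from scipy.linalg import null_space

# A deliberately rank-deficient 3x3 matrix: row 3 = row 1 + row 2
A = np.array([[1., 2., 3.],
              [4., 5., 6.],
              [5., 7., 9.]])

# Orthonormal basis for the null space {x : Ax = 0}
N = null_space(A)
print(N.shape)                       # (3, 1): the null space is a line
print(np.allclose(A @ N, 0))         # True

# rank(A) = dimension of the column space; rank-nullity: rank + nullity = n
rank = np.linalg.matrix_rank(A)
print(rank, N.shape[1], A.shape[1])  # 2 1 3
```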
Linear Combinations and Span
Linear combination of vectors $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k$ is $c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \ldots + c_k\mathbf{v}_k$ for scalars $c_1, c_2, \ldots, c_k$
Span of a set of vectors is the set of all possible linear combinations of those vectors
Denoted as $\text{span}(\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k)$
Spanning set for a vector space is a set of vectors whose span equals the entire space
Trivial subspace $\{\mathbf{0}\}$ is spanned by the empty set
Span of the standard basis vectors $\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n$ is the entire space $\mathbb{R}^n$
Span of a single non-zero vector is a line passing through the origin
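Checking whether a target vector lies in the span of a given set reduces to solving a linear system; a minimal sketch using least squares, with illustrative vectors:

```python
import numpy as np

# Columns of V are the spanning vectors v1 and v2
V = np.array([[1., 0.],
              [0., 1.],
              [1., 1.]])
b = np.array([2., 3., 5.])      # b = 2*v1 + 3*v2, so b lies in the span

# Solve V c = b in the least-squares sense; zero residual means b is in the span
c, residual, rank, _ = np.linalg.lstsq(V, b, rcond=None)
in_span = np.allclose(V @ c, b)
print(c, in_span)               # [2. 3.] True
```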
Linear Independence and Dependence
Set of vectors is linearly independent if no vector can be expressed as a linear combination of the others
Equivalently, the only solution to $c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \ldots + c_k\mathbf{v}_k = \mathbf{0}$ is $c_1 = c_2 = \ldots = c_k = 0$
Set of vectors is linearly dependent if at least one vector can be expressed as a linear combination of the others
Standard basis vectors $\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n$ are linearly independent
Any set containing the zero vector is linearly dependent
In $\mathbb{R}^n$, any set of more than $n$ vectors is linearly dependent (by the Steinitz exchange lemma)
A linearly independent set is a minimal spanning set for its span: removing any vector strictly shrinks the span
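Linear independence of a finite set in $\mathbb{R}^n$ can be tested by comparing the rank of the matrix whose columns are the vectors against the number of vectors; a minimal sketch:

```python
import numpy as np

def linearly_independent(vectors):
    """Vectors are independent iff the matrix with them as columns has full column rank."""
    M = np.column_stack(vectors)
    return np.linalg.matrix_rank(M) == M.shape[1]

e1, e2 = np.array([1., 0., 0.]), np.array([0., 1., 0.])
print(linearly_independent([e1, e2]))            # True
print(linearly_independent([e1, e2, e1 + e2]))   # False: third is a combination of the others
print(linearly_independent([e1, np.zeros(3)]))   # False: any set containing the zero vector
```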
Basis and Dimension
Basis for a vector space is a linearly independent spanning set
Minimal set of vectors that spans the entire space
Dimension of a vector space equals the number of vectors in any basis
All bases for a vector space have the same number of vectors
Standard basis $\{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n\}$ is a common choice of basis for $\mathbb{R}^n$
Dimension of the trivial subspace $\{\mathbf{0}\}$ is 0
Dimension of a line is 1, of a plane is 2, and of $\mathbb{R}^n$ is $n$
Rank of a matrix equals the dimension of its column space (or row space)
Maximum number of linearly independent columns (or rows)
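Rank, and hence the dimension of the column space, is a single NumPy call; a minimal sketch with an illustrative rank-deficient matrix:

```python
import numpy as np

A = np.array([[1., 2., 3.],
              [2., 4., 6.],
              [1., 1., 1.]])   # row 2 = 2 * row 1, so the rank is 2

# Rank = dimension of the column space = dimension of the row space
print(np.linalg.matrix_rank(A))     # 2
print(np.linalg.matrix_rank(A.T))   # 2: column rank always equals row rank
```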
Applications in Data Science
Vector spaces provide a foundation for representing and manipulating data in machine learning and data analysis
High-dimensional data can be represented as vectors in R n \mathbb{R}^n R n , where each feature corresponds to a dimension
Subspaces can model lower-dimensional structures or patterns in data (principal components, clusters, manifolds)
Linear independence is crucial for feature selection and dimensionality reduction techniques such as PCA, ICA, and SVD (see the sketch after this list)
Identifies non-redundant features that capture the essential information in data
Basis vectors serve as building blocks for representing data efficiently and compactly
Change of basis techniques enable data transformations and visualizations
Dimension measures the intrinsic complexity or degrees of freedom in a dataset
Curse of dimensionality: challenges arise as the number of features grows large relative to the sample size
Null space and column space of a data matrix reveal relationships between features and samples
Used in least squares regression, matrix factorization, and recommender systems
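As a concrete illustration of these ideas, the sketch below uses the SVD to find an orthonormal basis for the low-dimensional subspace that approximately contains a dataset, which is the core of PCA; the data here is synthetic and purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 200 samples in R^5 that lie near a 2-D subspace
basis = rng.standard_normal((5, 2))            # spans a 2-D subspace of R^5
coords = rng.standard_normal((200, 2))
X = coords @ basis.T + 0.01 * rng.standard_normal((200, 5))  # small noise

# Center the data, then take the SVD; the right singular vectors give an
# orthonormal basis ordered by how much variance each direction captures
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
print(np.round(s, 2))        # two large singular values, three near zero

# Project onto the top-2 basis vectors: a 5-D -> 2-D dimensionality reduction
X2 = Xc @ Vt[:2].T
print(X2.shape)              # (200, 2)
```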