Core Mathematical Concepts: Theorems, Matrices, and Vector Fields

The Cayley-Hamilton Theorem

Let $A$ be an $n \times n$ matrix, and let $P_A(\lambda)$ be its characteristic polynomial. The characteristic polynomial is defined as:

$$P_A(\lambda) = \det(\lambda I - A)$$

Expanding this determinant, we can write it as a monic polynomial in $\lambda$ (so $a_n = 1$):

$$P_A(\lambda) = a_n \lambda^n + a_{n-1} \lambda^{n-1} + \cdots + a_1 \lambda + a_0$$

The Cayley-Hamilton Theorem states that every square matrix $A$ satisfies its own characteristic equation. This means that if we substitute the matrix $A$ for $\lambda$ in the characteristic polynomial (and replace the constant term $a_0$ with $a_0 I$, where $I$ is the identity matrix), we obtain the matrix polynomial:

$$P_A(A) = a_n A^n + a_{n-1} A^{n-1} + \cdots + a_1 A + a_0 I$$

According to the theorem, this expression evaluates to the zero matrix:

$$P_A(A) = 0$$
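
As a quick sanity check, the sketch below (a minimal NumPy example; the $2 \times 2$ matrix is arbitrary) uses the fact that for a $2 \times 2$ matrix the characteristic polynomial is $\lambda^2 - \operatorname{tr}(A)\lambda + \det(A)$, substitutes $A$ for $\lambda$, and confirms the result is the zero matrix:

```python
import numpy as np

# Arbitrary 2x2 example matrix.
A = np.array([[2.0, 1.0],
              [3.0, 4.0]])

# For a 2x2 matrix: P_A(lambda) = lambda^2 - tr(A)*lambda + det(A).
trace_A = np.trace(A)
det_A = np.linalg.det(A)

# Substitute A for lambda; the constant term becomes det(A) * I.
P_of_A = A @ A - trace_A * A + det_A * np.eye(2)

print(np.allclose(P_of_A, 0))   # True: A satisfies its own characteristic equation
```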

Linear Algebra Fundamentals

Matrix Rank: Dimension of the Vector Space

The rank of a matrix is a fundamental concept in linear algebra that describes the “dimension” of the vector space spanned by its rows or columns. It represents the number of linearly independent rows or columns in the matrix.

A common way to find the rank of a matrix is to reduce it to its row echelon form. The rank is then equal to the number of non-zero rows in the echelon form.

For a square matrix of order $n \times n$:

  • If the determinant is non-zero, the rank is equal to $n$.
  • If the determinant is zero, the rank is less than $n$.

Example of Matrix Rank

Consider matrix $A$:

$$A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 8 & 12 \end{bmatrix}$$

The second row is 4 times the first row, so the rows are linearly dependent. The rank of this matrix is 1, because there is only one linearly independent row (or column).
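
This is easy to confirm numerically. The sketch below uses NumPy's matrix_rank, which counts the numerically non-zero singular values of the matrix:

```python
import numpy as np

A = np.array([[1, 2,  3],
              [4, 8, 12]])

# The rank equals the number of (numerically) non-zero singular values.
print(np.linalg.matrix_rank(A))   # 1
```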

Linear Dependence and Independence

Vectors are linearly dependent if one can be expressed as a linear combination of the others. Conversely, vectors are linearly independent if no vector in the set can be expressed this way.

Linearly Dependent Vectors

  • Definition: A set of vectors is linearly dependent if there is a nontrivial linear combination of the vectors that equals the zero vector.
  • Property: The determinant of a square matrix whose rows (or columns) are linearly dependent is zero.
  • Example: The vectors $(1, 2)$ and $(2, 4)$ are linearly dependent because $2 \cdot (1, 2) - 1 \cdot (2, 4) = (0, 0)$.

Linearly Independent Vectors

  • Definition: A set of vectors is linearly independent if the only linear combination of the vectors that equals the zero vector is the trivial combination (where all the scalars are zero).
  • Example: The vectors $(1, 0)$ and $(0, 1)$ are linearly independent because neither vector can be expressed as a scalar multiple of the other.
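
Both examples above can be confirmed with the determinant test from the previous subsection (a minimal NumPy sketch):

```python
import numpy as np

dependent = np.array([[1, 2],
                      [2, 4]])    # rows (1, 2) and (2, 4)
independent = np.array([[1, 0],
                        [0, 1]])  # rows (1, 0) and (0, 1)

# A zero determinant signals linear dependence (square matrices only).
print(np.linalg.det(dependent))    # 0.0 -> dependent
print(np.linalg.det(independent))  # 1.0 -> independent
```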

Eigenvalues and Eigenvectors

An eigenvector of a matrix is a non-zero vector that, when the matrix transformation is applied to it, changes only by a scalar factor, known as the eigenvalue (denoted by $\lambda$).

In simpler terms, an eigenvector represents a direction that doesn’t change its orientation when the matrix transformation is applied; it might just be stretched or shrunk. Eigenvalues and eigenvectors are essential for understanding the behavior of linear transformations.
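
The defining property $A\mathbf{v} = \lambda\mathbf{v}$ can be verified directly (a minimal NumPy sketch; the symmetric matrix is an arbitrary example):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eig returns the eigenvalues and a matrix whose columns are the eigenvectors.
eigenvalues, eigenvectors = np.linalg.eig(A)

for lam, v in zip(eigenvalues, eigenvectors.T):
    # Check that A only scales the eigenvector by lambda.
    print(lam, np.allclose(A @ v, lam * v))   # True for each pair
```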

Orthogonal Transformations

An orthogonal transformation is a linear transformation that preserves the inner product (dot product) between vectors. This means it preserves the lengths of vectors and the angles between them, effectively acting as a “rigid” motion.

  • Preservation of Inner Product: A transformation $T$ is orthogonal if $T(\mathbf{u}) \cdot T(\mathbf{v}) = \mathbf{u} \cdot \mathbf{v}$ for all vectors $\mathbf{u}$ and $\mathbf{v}$.
  • Representation: These transformations are represented by orthogonal matrices, where the rows and columns are orthonormal vectors (mutually orthogonal and of unit length).
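
A plane rotation is the standard example. The NumPy sketch below (the angle and test vectors are arbitrary) checks both $Q^T Q = I$ and preservation of the dot product:

```python
import numpy as np

theta = np.pi / 6   # an arbitrary 30-degree rotation
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

u = np.array([1.0, 2.0])
v = np.array([3.0, -1.0])

print(np.allclose(Q.T @ Q, np.eye(2)))        # True: columns are orthonormal
print(np.allclose((Q @ u) @ (Q @ v), u @ v))  # True: inner product preserved
```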

Definition of a Linear Transformation

In linear algebra, a linear transformation is a function between vector spaces that preserves the operations of vector addition and scalar multiplication. It is a way to transform vectors while maintaining the underlying linear structure of the space.
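
Matrix multiplication is the canonical example of such a function, and the two defining properties (additivity and homogeneity) can be checked directly, as in this sketch (the matrix, vectors, and scalar are arbitrary):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [0.0, 3.0]])
T = lambda v: A @ v   # matrix multiplication defines a linear map

u = np.array([1.0, -1.0])
v = np.array([2.0, 5.0])
c = 4.0

print(np.allclose(T(u + v), T(u) + T(v)))   # True: T(u + v) = T(u) + T(v)
print(np.allclose(T(c * u), c * T(u)))      # True: T(c*u) = c*T(u)
```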

Calculus Fundamentals

Taylor Series Approximation

A Taylor series is a way to represent a function as an infinite sum of terms calculated from the function’s derivatives at a single point. It is used to approximate complex functions with a polynomial. The more terms you include in the series, the more accurate the approximation becomes near the expansion point.
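
For instance, the degree-4 Taylor polynomial of $e^x$ about $x = 0$ already tracks the true value closely near the origin (a SymPy sketch; the evaluation point is arbitrary):

```python
import sympy as sp

x = sp.symbols('x')

# Degree-4 Taylor polynomial of exp(x) about x = 0.
poly = sp.series(sp.exp(x), x, 0, 5).removeO()
print(poly)   # x**4/24 + x**3/6 + x**2/2 + x + 1

# Compare the approximation with the true value at x = 0.5.
print(float(poly.subs(x, 0.5)), float(sp.exp(0.5)))   # ~1.6484 vs ~1.6487
```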

L’Hopital’s Rule for Indeterminate Forms

L’Hopital’s Rule is a method used to evaluate limits that result in indeterminate forms, such as $0/0$ or $\infty/\infty$.

The rule involves taking the derivatives of both the numerator and the denominator and then evaluating the limit again. If the limit of the derivatives is also indeterminate, you can apply the rule repeatedly until you get a determinate result.

Example Application

Let’s find the limit of $(x^2 - 1)/(x - 1)$ as $x$ approaches 1.

  1. Initial Limit: Plugging in $x = 1$ directly yields $(1^2 - 1)/(1 - 1) = 0/0$, which is an indeterminate form.
  2. Apply L’Hopital’s Rule:
    • Differentiate the numerator: $d/dx(x^2 - 1) = 2x$
    • Differentiate the denominator: $d/dx(x - 1) = 1$
    The new limit expression is $\lim_{x \to 1} (2x)/(1)$.
  3. Evaluate the New Limit: Plugging in $x = 1$ into $2x$ gives $2 \cdot 1 = 2$.

Therefore, the limit of $(x^2 - 1)/(x - 1)$ as $x$ approaches 1 is 2.
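
The same computation can be checked symbolically. The SymPy sketch below evaluates both the original limit and the derivative quotient from step 2:

```python
import sympy as sp

x = sp.symbols('x')

# The original 0/0 form.
print(sp.limit((x**2 - 1) / (x - 1), x, 1))   # 2

# The derivative quotient from L'Hopital's Rule gives the same limit.
print(sp.limit(sp.diff(x**2 - 1, x) / sp.diff(x - 1, x), x, 1))   # 2
```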

Euler Integral Functions: Beta and Gamma

There are two types of Euler integral functions: the Beta function and the Gamma function.

  • The Gamma function generalizes the factorial to real and complex arguments, with $\Gamma(n) = (n-1)!$ for positive integers $n$. It is a single-variable function.
  • The Beta function is a special function defined by an integral. It is a two-variable function.

Relation and Properties

The two functions are closely related. The Beta function can be written in terms of the Gamma function:

$$B(p, q) = \frac{\Gamma(p)\Gamma(q)}{\Gamma(p+q)}$$

The Beta function is symmetric, meaning its value does not depend on the order of its arguments:

$$B(p, q) = B(q, p)$$

Other key properties include (each is verified numerically in the sketch after this list):

  • $B(p, q) = B(p, q+1) + B(p+1, q)$
  • $B(p, q+1) = B(p, q) \cdot \frac{q}{p+q}$
  • $B(p+1, q) = B(p, q) \cdot \frac{p}{p+q}$
  • $B(p, q) \cdot B(p+q, 1-q) = \frac{\pi}{p \sin(\pi q)}$
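
A numerical check of the Gamma relation, the symmetry, and the first recurrence (a SciPy sketch; the values of $p$ and $q$ are arbitrary positive numbers):

```python
from scipy.special import beta, gamma

p, q = 2.5, 3.0   # arbitrary positive test values

# B(p, q) = Gamma(p) * Gamma(q) / Gamma(p + q)
print(beta(p, q), gamma(p) * gamma(q) / gamma(p + q))   # equal

# Symmetry: B(p, q) = B(q, p)
print(beta(p, q), beta(q, p))                           # equal

# Recurrence: B(p, q) = B(p, q+1) + B(p+1, q)
print(beta(p, q), beta(p, q + 1) + beta(p + 1, q))      # equal
```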

Partial vs. Total Derivatives

Partial and total derivatives differ in how they consider the variables of a function.

Partial Derivative ($\partial f / \partial x$)

  • A partial derivative of a function of multiple variables is its derivative with respect to one variable, keeping the others constant.
  • It measures the instantaneous rate of change of the function with respect to a single variable while holding all other variables fixed.

Total Derivative ($df / dt$)

  • A total derivative considers the change in a function when all its variables change simultaneously.
  • It measures the instantaneous rate of change of the function with respect to a change in all its input variables, taking into account how those variables are interrelated.
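
The distinction is clearest when every input depends on a single parameter $t$. In the SymPy sketch below (the function $f = x^2 y$ and the substitutions $x = t$, $y = t^2$ are arbitrary examples), the total derivative combines the partials through the chain rule:

```python
import sympy as sp

t, x, y = sp.symbols('t x y')

f = x**2 * y          # f(x, y)
x_t, y_t = t, t**2    # both inputs depend on t

# Partial derivatives hold the other variable fixed.
df_dx = sp.diff(f, x)   # 2*x*y
df_dy = sp.diff(f, y)   # x**2

# Total derivative: df/dt = (df/dx)(dx/dt) + (df/dy)(dy/dt).
total = (df_dx * sp.diff(x_t, t) + df_dy * sp.diff(y_t, t)).subs({x: x_t, y: y_t})
print(sp.simplify(total))                     # 4*t**3

# Cross-check: substitute first, then differentiate.
print(sp.diff(f.subs({x: x_t, y: y_t}), t))   # 4*t**3
```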

Rolle’s Theorem

Rolle’s Theorem states that if a function $f(x)$ is continuous on a closed interval $[a, b]$ and differentiable on the open interval $(a, b)$, and if $f(a) = f(b)$, then there exists at least one point $c$ in the open interval $(a, b)$ such that $f'(c) = 0$.

Geometric Interpretation: If a continuous and differentiable function has the same value at both endpoints of an interval, there must be at least one point within that interval where the tangent to the function’s graph is horizontal (i.e., the slope is zero).
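
A worked instance (the function is an arbitrary example satisfying the hypotheses): $f(x) = x^2 - x$ is continuous and differentiable with $f(0) = f(1) = 0$, and the theorem's guaranteed point is $c = 1/2$:

```python
import sympy as sp

x = sp.symbols('x')
f = x**2 - x   # f(0) = f(1) = 0

# Endpoint values match, so Rolle's Theorem applies on [0, 1].
assert f.subs(x, 0) == f.subs(x, 1)

# Solve f'(c) = 0 for the guaranteed interior point.
print(sp.solve(sp.diff(f, x), x))   # [1/2], which lies in (0, 1)
```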

Vector Calculus and Multivariable Integration

The Jacobian Matrix and Determinant

The Jacobian is a matrix of partial derivatives used in coordinate transformations. It is crucial because its determinant quantifies how a region is stretched or compressed during a transformation, ensuring accurate calculations in multiple integrals.

Components of the Jacobian

For a transformation that maps $(x, y)$ to $(u, v)$, the Jacobian matrix ($J$) collects the partial derivatives:

$$J = \begin{bmatrix} \frac{\partial u}{\partial x} & \frac{\partial u}{\partial y} \\ \frac{\partial v}{\partial x} & \frac{\partial v}{\partial y} \end{bmatrix}$$

The Jacobian determinant ($|J|$) acts as the multiplicative factor that accounts for the coordinate change: the differential area elements are related by $du\,dv = |J|\,dx\,dy$.
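
The classic example is the polar-coordinate map $(r, \theta) \mapsto (x, y)$, whose Jacobian determinant is $r$, giving the familiar $dx\,dy = r\,dr\,d\theta$ (a SymPy sketch):

```python
import sympy as sp

r, theta = sp.symbols('r theta', positive=True)

# Polar-to-Cartesian map.
x = r * sp.cos(theta)
y = r * sp.sin(theta)

J = sp.Matrix([x, y]).jacobian([r, theta])
print(sp.simplify(J.det()))   # r, so dx dy = r dr dtheta
```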

Vector Field Operators: Gradient, Divergence, and Curl

Gradient ($\nabla f$ or $\text{grad } f$)

The gradient of a scalar field $f(x, y, z)$ is a vector field that represents the direction and magnitude of the steepest increase of the scalar field at a specific point.

  • Direction: Points in the direction of the steepest increase.
  • Magnitude: Represents the maximum rate of change.
  • Mathematical Representation: In Cartesian coordinates:

    $$\nabla f = \left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z}\right)$$

Divergence ($\nabla \cdot \mathbf{F}$)

Divergence represents the rate at which a vector field ($\mathbf{F}$) is expanding or contracting at a point.

  • Interpretation: Positive divergence means the field is spreading out (a source); negative divergence means it is converging (a sink).
  • Applications: Used in fluid dynamics and electromagnetism.

Curl ($\nabla \times \mathbf{F}$)

Curl represents the rotation or swirling motion of a vector field ($\mathbf{F}$) at a point.

  • Interpretation: The curl is a vector whose magnitude represents the strength of the rotation and whose direction represents the axis of rotation.
  • Applications: Used in fluid dynamics (vorticity) and electromagnetism (magnetic fields).
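
All three operators are available symbolically. A SymPy sketch (the scalar field $f$ and vector field $\mathbf{F}$ are arbitrary examples):

```python
from sympy.vector import CoordSys3D, gradient, divergence, curl

N = CoordSys3D('N')
x, y, z = N.x, N.y, N.z

f = x**2 * y + z            # a scalar field
F = x*N.i + y*N.j + z*N.k   # a radial vector field

print(gradient(f))     # the vector (2xy, x**2, 1)
print(divergence(F))   # 3: a uniform "source" everywhere
print(curl(F))         # 0: the radial field has no rotation
```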

Applications of Double and Triple Integrals

Double and triple integrals are used to calculate quantities that involve integrating over two or three dimensions.

Double Integrals (2D)

  • Area of a region.
  • Volume under a surface.
  • Surface area.
  • Average value of a function over a plane region.

Triple Integrals (3D)

  • Volume of a solid region.
  • Mass of a solid with a given density function.
  • Coordinates of the center of mass.
  • Moments of inertia.
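
Two representative computations (a SymPy sketch; the integrands and regions are arbitrary examples): a volume under a surface via a double integral, and a mass via a triple integral:

```python
import sympy as sp

x, y, z = sp.symbols('x y z')

# Double integral: volume under z = x*y over the unit square.
print(sp.integrate(x * y, (x, 0, 1), (y, 0, 1)))                  # 1/4

# Triple integral: mass of the unit cube with density x + y + z.
print(sp.integrate(x + y + z, (x, 0, 1), (y, 0, 1), (z, 0, 1)))   # 3/2
```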

Lagrange Multipliers: Constrained Optimization

The method of Lagrange multipliers is a technique used to find the maximum or minimum of a function subject to one or more constraints. It involves introducing a new variable, the Lagrange multiplier, and a new function (the Lagrangian) that combines the original function and the constraint(s).

The method finds the critical points where the gradient of the objective function is parallel to the gradient of the constraint function.
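
A small worked instance (the objective and constraint are arbitrary examples): maximize $f = xy$ subject to $x + y = 4$. Setting the partial derivatives of the Lagrangian to zero recovers $x = y = 2$:

```python
import sympy as sp

x, y, lam = sp.symbols('x y lambda')

f = x * y       # objective function
g = x + y - 4   # constraint g = 0

# Lagrangian L = f - lambda * g; set its partials to zero.
L = f - lam * g
solution = sp.solve([sp.diff(L, x), sp.diff(L, y), g], [x, y, lam])
print(solution)   # x = y = 2 with multiplier lambda = 2, giving f = 4
```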

Stokes’ Theorem (Curl Theorem)

Stokes’ Theorem states that the line integral of a vector field around a closed curve ($C$) is equal to the surface integral of the curl of that vector field over any surface ($S$) bounded by the curve, provided the orientations of $C$ and $S$ are chosen consistently.

$$\oint_C \mathbf{F} \cdot d\mathbf{r} = \iint_S (\nabla \times \mathbf{F}) \cdot d\mathbf{S}$$

It allows conversion between a line integral problem and a potentially easier surface integral problem.
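
A concrete check (an arbitrary example): for $\mathbf{F} = (-y, x, 0)$, with $C$ the unit circle and $S$ the unit disk, $\nabla \times \mathbf{F} = (0, 0, 2)$ and both sides equal $2\pi$:

```python
import sympy as sp

t = sp.symbols('t')

# Parametrize C: x = cos(t), y = sin(t), 0 <= t <= 2*pi.
x, y = sp.cos(t), sp.sin(t)
F = sp.Matrix([-y, x, 0])
dr = sp.Matrix([sp.diff(x, t), sp.diff(y, t), 0])

# Line integral of F around C.
print(sp.integrate(F.dot(dr), (t, 0, 2 * sp.pi)))   # 2*pi

# Surface side: curl F = (0, 0, 2), so the flux is 2 * area(unit disk).
print(2 * sp.pi)                                    # 2*pi
```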

The Divergence Theorem (Gauss’s Theorem)

The Divergence Theorem relates the total flux of a vector field ($\mathbf{F}$) across a closed surface ($\partial V$) to the volume integral of the divergence of that vector field over the volume ($V$) enclosed by the surface.

$$\iint_{\partial V} \mathbf{F} \cdot d\mathbf{S} = \iiint_V (\nabla \cdot \mathbf{F}) dV$$

The theorem states that the total amount of flux flowing out of the volume is equal to the sum of all the sources or sinks (divergence) within the volume.
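
A concrete check (an arbitrary example): for $\mathbf{F} = (x, y, z)$ over the unit cube, $\nabla \cdot \mathbf{F} = 3$, so both sides of the theorem equal 3:

```python
import sympy as sp

x, y, z = sp.symbols('x y z')

# Volume side: integrate div F = 3 over the unit cube.
div_F = sp.diff(x, x) + sp.diff(y, y) + sp.diff(z, z)         # = 3
print(sp.integrate(div_F, (x, 0, 1), (y, 0, 1), (z, 0, 1)))   # 3

# Flux side: F.n = 1 on each of the faces x=1, y=1, z=1 and 0 on the
# faces through the origin, so the outward flux is 3 * (unit face area).
print(3 * sp.integrate(1, (x, 0, 1), (y, 0, 1)))              # 3
```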