Cauchy-Schwarz Inequality

Arpita Bhattacharya
6 min read · Dec 2, 2022


Objective of this article: to state the inequality, discuss a simple proof (there are many), and look at some of its applications across mathematics.

Augustin-Louis Cauchy is the mathematician whose name I heard countless times throughout my Math degree: Cauchy’s theorem, Cauchy’s integral formula, Cauchy sequences, the Cauchy-Riemann equations, the Cauchy distribution, etc. He appeared at least once in every module I took at university, compulsory or optional, especially Analysis, Multivariable Calculus and Probability Theory, and in a few algebraic (Linear Algebra) and geometric modules.

Out of his many theorems and results, one inequality that he published is the Cauchy–Schwarz inequality (also called the Cauchy–Bunyakovsky–Schwarz inequality). Cauchy first published the simple inequality. Bunyakovsky and Schwarz followed, publishing the integral version of the inequality and its modern proof.

Statement

Let’s begin with the statement of the inequality.

The Cauchy-Schwarz inequality states that, for all vectors x and y in R^n,

$$|x \cdot y| \le |x|\,|y|.$$

Let’s dissect this properly, because mathematical notation is not universal. The same notation is often used differently in different places, depending on the context of the mathematical concept.

The Euclidean norm, or length, or magnitude of

$$x = (x_1, x_2, \dots, x_n) \in \mathbb{R}^n$$

is denoted by |x| and is defined by

$$|x| = \sqrt{x_1^2 + x_2^2 + \dots + x_n^2}.$$

In many other contexts and disciplines, the notation ||x|| is used instead of |x|.

The direction of a nonzero vector x is defined to be the unit vector x/|x|. The obvious relation

$$x = |x| \, \frac{x}{|x|}$$

is the mathematical statement of the informal definition of a (nonzero) vector as a quantity which has both magnitude and direction. (Note that the zero vector does not have a direction.)

The Euclidean distance between vectors x and y in R^n is defined as

$$|x - y| = \sqrt{(x_1 - y_1)^2 + \dots + (x_n - y_n)^2}.$$

The Euclidean inner product x.y, also called the dot product or scalar product, of vectors x, y in R^n is defined as

$$x \cdot y = \sum_{i=1}^{n} x_i y_i = x_1 y_1 + \dots + x_n y_n.$$

Other notations for x.y include (x, y) and <x, y>. From the above definitions, it is evident that

$$x \cdot x = |x|^2.$$

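So the Cauchy-Schwarz inequality,

$$|x \cdot y| \le |x|\,|y|,$$

or, to avoid confusing the absolute value of a number with the norm of a vector,

$$|x \cdot y| \le \|x\|\,\|y\|,$$

can be re-written in terms of components as

$$\left| \sum_{i=1}^{n} x_i y_i \right| \le \sqrt{\sum_{i=1}^{n} x_i^2} \, \sqrt{\sum_{i=1}^{n} y_i^2}.$$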
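Before proving the inequality, it can help to see it hold numerically. The following is a minimal Python sketch, not part of the original article: the helper names `dot`, `norm` and `cauchy_schwarz_holds`, the tolerance and the random test vectors are all illustrative assumptions. It implements the definitions of x.y and |x| above and checks |x.y| <= |x||y| on random vectors in R^5.

```python
import math
import random


def dot(x, y):
    """Euclidean inner product x.y = x_1*y_1 + ... + x_n*y_n."""
    return sum(xi * yi for xi, yi in zip(x, y))


def norm(x):
    """Euclidean norm |x| = sqrt(x.x)."""
    return math.sqrt(dot(x, x))


def cauchy_schwarz_holds(x, y, tol=1e-9):
    """Check |x.y| <= |x||y|, allowing a small tolerance for rounding."""
    return abs(dot(x, y)) <= norm(x) * norm(y) + tol


# Hypothetical test data: 1000 random pairs of vectors in R^5.
random.seed(0)
samples = [
    (
        [random.uniform(-10, 10) for _ in range(5)],
        [random.uniform(-10, 10) for _ in range(5)],
    )
    for _ in range(1000)
]

print(all(cauchy_schwarz_holds(x, y) for x, y in samples))
```

A numerical check like this is of course not a proof; it only illustrates what the statement says.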

Proof

Now, let’s prove this. There are many small and simple ways to prove this inequality. I will discuss one of them in this article.

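First we take two non-zero vectors x, y belonging to R^n. (If either vector is zero, both sides of the inequality are 0 and there is nothing to prove.)

Let f(k) be a function of a real number k, defined as

$$f(k) = |ky - x|^2.$$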

ky - x is also a vector. The length of any real vector is never negative, because it is the square root of a sum of squares. So, the length of any real vector is always greater than or equal to 0.

So,

$$f(k) = |ky - x|^2 \ge 0.$$

We know that if we take any vector v,

$$|v|^2 = v \cdot v.$$

So for the vector ky - x,

$$(ky - x) \cdot (ky - x) \ge 0.$$

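We know that the dot product is distributive and commutative, and that scalars can be factored out of it. Using the distributive property of the dot product,

$$ky \cdot ky - ky \cdot x - x \cdot ky + x \cdot x \ge 0.$$

Then, factoring out the scalar k and using the commutative property of the dot product,

$$k^2 (y \cdot y) - 2k (x \cdot y) + x \cdot x \ge 0.$$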

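Let’s take

$$a = y \cdot y, \qquad b = 2(x \cdot y), \qquad c = x \cdot x,$$

so that

$$f(k) = ak^2 - bk + c.$$

This is going to be greater than or equal to 0 for any k.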

Now we evaluate f(k) at k = b/2a. Before we evaluate, we should know for sure that the denominator is not zero. Here a = y.y = |y|^2, where y is a non-zero vector, and the length of a non-zero vector is strictly greater than 0. Hence a is non-zero (in fact positive), and so is 2a.

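Evaluating f(k) = ak^2 - bk + c at k = b/2a,

$$f\!\left(\frac{b}{2a}\right) = a\,\frac{b^2}{4a^2} - b\,\frac{b}{2a} + c = c - \frac{b^2}{4a} \ge 0.$$

Simplifying the inequality (multiplying through by 4a, which is positive) gets us

$$4ac \ge b^2.$$

Putting back the original values a = y.y, b = 2(x.y) and c = x.x,

$$4\,(y \cdot y)(x \cdot x) \ge 4\,(x \cdot y)^2, \quad \text{that is,} \quad |x|^2 |y|^2 \ge (x \cdot y)^2.$$

Taking the square root on the two sides of the inequality,

$$|x|\,|y| \ge |x \cdot y|.$$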

This is the Cauchy-Schwarz inequality. Hence, proved!
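The key step of the proof can be traced with concrete numbers. The sketch below is an illustration under assumptions, not the author's code: the vectors `x` and `y` are arbitrary examples, and `a`, `b`, `c` follow the convention a = y.y, b = 2(x.y), c = x.x, so that f(k) = |ky - x|^2 = ak^2 - bk + c. It evaluates f at k = b/2a and compares the result with the closed form c - b^2/4a.

```python
def dot(x, y):
    """Euclidean inner product of two vectors given as lists."""
    return sum(xi * yi for xi, yi in zip(x, y))


def f(k, x, y):
    """f(k) = |ky - x|^2, computed directly from the components."""
    return sum((k * yi - xi) ** 2 for xi, yi in zip(x, y))


# Hypothetical example vectors in R^3.
x = [1.0, 2.0, 3.0]
y = [4.0, -5.0, 6.0]

a = dot(y, y)      # a = y.y, positive because y is non-zero
b = 2 * dot(x, y)  # b = 2(x.y)
c = dot(x, x)      # c = x.x

k_star = b / (2 * a)            # the value of k used in the proof
direct = f(k_star, x, y)        # |k*y - x|^2 computed from components
closed_form = c - b**2 / (4 * a)

print(direct, closed_form)      # the two values agree up to rounding
print(4 * a * c >= b**2)        # the inequality obtained in the proof
```

The choice k = b/2a is where the quadratic ak^2 - bk + c attains its minimum, which is why evaluating f there gives the sharpest inequality.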

Apart from this proof, there are many other ways in which this inequality can be proved.

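Now, before we move to its applications, let me just point out one thing. What if one vector in the inequality is a scalar multiple of the other? That means, what if

$$x = \lambda y \quad \text{for some real number } \lambda?$$

Then,

$$|x \cdot y| = |\lambda y \cdot y| = |\lambda|\,|y|^2 = |\lambda y|\,|y| = |x|\,|y|.$$

Thus, in the case where one vector in the inequality is a scalar multiple of the other, the Cauchy-Schwarz inequality becomes an equality,

$$|x \cdot y| = |x|\,|y|.$$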
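As a quick sanity check of the equality case, here is a small Python sketch. The vector `y` and the scalar `lam` are made-up example values, chosen so that the arithmetic is exact in floating point; `dot` and `norm` are illustrative helpers.

```python
import math


def dot(x, y):
    """Euclidean inner product of two vectors given as lists."""
    return sum(xi * yi for xi, yi in zip(x, y))


def norm(x):
    """Euclidean norm |x| = sqrt(x.x)."""
    return math.sqrt(dot(x, x))


# Hypothetical example: x is the scalar multiple lam * y.
y = [1.0, -2.0, 2.0]
lam = -2.5
x = [lam * yi for yi in y]

lhs = abs(dot(x, y))     # |x.y|
rhs = norm(x) * norm(y)  # |x||y|

print(math.isclose(lhs, rhs))
```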

Applications

This inequality is applied in many fields and areas of study, such as linear algebra (matrices, vectors, and transformations), probability theory (random variables, expected values, and correlation), physics (the uncertainty principle and photon noise) and engineering (root-mean-square values compared to peak values of a waveform).

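In geometry, it can be used to find the angle between two nonzero vectors. The Cauchy-Schwarz inequality implies that, if vectors x and y are both nonzero, then there exists a unique θ in [0, π] such that

$$\cos\theta = \frac{x \cdot y}{|x|\,|y|}.$$

This θ is the angle between x and y.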
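The angle can be computed directly from cos θ = x.y / (|x||y|). The following is a sketch only; the function name `angle_between` is hypothetical. The Cauchy-Schwarz inequality is what guarantees that the quotient lies in [-1, 1], so that the inverse cosine is defined. The code clamps the value as well, since floating-point rounding can push it slightly outside that interval.

```python
import math


def dot(x, y):
    """Euclidean inner product of two vectors given as lists."""
    return sum(xi * yi for xi, yi in zip(x, y))


def norm(x):
    """Euclidean norm |x| = sqrt(x.x)."""
    return math.sqrt(dot(x, x))


def angle_between(x, y):
    """Return the angle in [0, pi] between two nonzero vectors."""
    nx, ny = norm(x), norm(y)
    if nx == 0 or ny == 0:
        raise ValueError("the angle is undefined for the zero vector")
    cos_theta = dot(x, y) / (nx * ny)
    # Cauchy-Schwarz gives -1 <= cos_theta <= 1 in exact arithmetic;
    # clamp to guard against floating-point rounding.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.acos(cos_theta)


print(angle_between([1, 0], [0, 1]))   # perpendicular vectors: pi/2
print(angle_between([1, 0], [-1, 0]))  # opposite vectors: pi
```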

If you have studied analysis, you will know the triangle inequality,

$$|x + y| \le |x| + |y|.$$

Using simple algebraic, vector and inner product properties, this triangle inequality can be proved from the Cauchy-Schwarz inequality.
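One standard way to carry out that proof, sketched here for vectors x, y in R^n using only x.x = |x|^2 and the distributive and commutative properties of the dot product, is:

```latex
\begin{aligned}
|x + y|^2 &= (x + y) \cdot (x + y) \\
          &= |x|^2 + 2\,(x \cdot y) + |y|^2 \\
          &\le |x|^2 + 2\,|x|\,|y| + |y|^2 \qquad \text{(Cauchy-Schwarz: } x \cdot y \le |x \cdot y| \le |x|\,|y| \text{)} \\
          &= \left( |x| + |y| \right)^2 .
\end{aligned}
```

Both |x + y| and |x| + |y| are non-negative, so taking square roots gives |x + y| <= |x| + |y|.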

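In statistics, specifically in probability theory, the Cauchy-Schwarz inequality is used to prove the covariance inequality for random variables X and Y,

$$\mathrm{Cov}(X, Y)^2 \le \mathrm{Var}(X)\,\mathrm{Var}(Y),$$

where ‘Var’ denotes variance and ‘Cov’ denotes covariance.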
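For a finite data set the covariance inequality, Cov(X, Y)^2 <= Var(X) Var(Y), is the Cauchy-Schwarz inequality applied to the two vectors of deviations from the mean, since the covariance is their dot product divided by n. The sketch below illustrates this with made-up data; `mean`, `covariance` and `variance` are illustrative helpers that use the population (divide by n) convention, which is an assumption.

```python
def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def covariance(xs, ys):
    """Population covariance: the mean product of deviations from the means."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def variance(xs):
    """Population variance, which is the covariance of xs with itself."""
    return covariance(xs, xs)


# Hypothetical sample data.
xs = [2.0, 4.0, 4.0, 5.0, 7.0]
ys = [1.0, 3.0, 2.0, 6.0, 8.0]

cov = covariance(xs, ys)
print(cov**2 <= variance(xs) * variance(ys))
```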

In Multivariable Calculus, which was part of my coursework at Warwick, this inequality was used while defining and proving a relation between |Ax| and |x|, from which the operator norm arises as x ranges over R^n, where A belongs to L(R^n,R^k), the space of linear maps from R^n to R^k.
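One way such a relation can be obtained (an assumption about the argument, not necessarily the one used in the coursework) is to apply the Cauchy-Schwarz inequality to each row of the matrix of A. Each component of Ax is the dot product of a row with x, so |Ax|^2 is at most the sum of the squared row lengths times |x|^2. The sketch below checks the resulting bound |Ax| <= C|x|, where C is the square root of the sum of the squares of all entries of A. The matrix `A`, the vector `x` and the helper names are illustrative.

```python
import math


def dot(x, y):
    """Euclidean inner product of two vectors given as lists."""
    return sum(xi * yi for xi, yi in zip(x, y))


def norm(x):
    """Euclidean norm |x| = sqrt(x.x)."""
    return math.sqrt(dot(x, x))


def mat_vec(A, x):
    """Apply the matrix A (a list of rows) to the vector x."""
    return [dot(row, x) for row in A]


def entrywise_norm(A):
    """Square root of the sum of the squares of all entries of A."""
    return math.sqrt(sum(dot(row, row) for row in A))


# Hypothetical example: a linear map from R^3 to R^2.
A = [[1.0, 2.0, 0.0],
     [0.0, -1.0, 3.0]]
x = [2.0, -1.0, 0.5]

lhs = norm(mat_vec(A, x))         # |Ax|
rhs = entrywise_norm(A) * norm(x)  # C|x|

print(lhs <= rhs)
```

The constant C here is only an upper bound for the operator norm, which is the smallest constant that works for every x.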

In this article, we looked closely at one of the many important results in advanced mathematics, one that is a very useful tool in many proofs and in understanding various concepts in maths.

Written by Arpita Bhattacharya

23 | Masters in Math from Lund University, Sweden | Undergraduate from Warwick Uni, UK | STEM Enthusiast |
