A nice duality problem

In the recent Qualifying Exam, there was a very nice (I mean, tough) problem in the Linear Optimization paper that nobody could solve. I was able to crack it two days later. Here it is:


A matrix A \; (n \times n) is positive semidefinite if x^\intercal A x \geq 0 \; \forall x \in \mathbb{R}^n. Prove that if A is positive semidefinte then the following system has a solution: \{x^\intercal A \geq 0; \ x \geq 0; x \neq 0 \}.


\begin{cases} x^\intercal A \geq 0 \\ x \geq 0 \\ x \neq 0 \end{cases} \Leftrightarrow \begin{cases} x^\intercal A \geq 0 \\ x \geq 0 \\ \mathbf{1}\cdot x > 0 \end{cases} \Leftrightarrow \begin{cases} x^\intercal A \geq 0 \\ x \geq 0 \\ -\mathbf{1}\cdot x < 0 \end{cases}

where \mathbf{1} is a vector in \mathbb{R}^n whose entries are all 1.

Consider the following linear program

\min \ -\mathbf{1}\cdot x

subject to

x^\intercal A \geq 0

x \geq 0

Its dual is

\max \ \mathbf{0}\cdot y

subject to

Ay \leq -1

y \geq 0

Observe that the dual is infeasible, because if there exists a solution y such that Ay \leq -1 < 0 and y \geq 0 then this implies y^\intercal Ay < 0, contradicting the fact that A is positive semidefinite.

Since the dual is infeasible, the primal is either infeasible or unbounded. But x = 0 is a solution to the primal, so it is unbounded. Thus there exists a solution x such that \{-\mathbf{1}\cdot x < 0; x^\intercal A \geq 0; x \geq 0\} Q.E.D.

Now that the math is done, it’s time for some story. All of us were taught that the Farkas lemma is extremely important, and thus all of us tried to use it to solve this problem, to no avail. I was particularly bugged by the condition x \neq 0, which we had never seen in all our exercises. The exam was on a Wednesday. Two days later, on a Friday evening, on a bus ride from school home, I was determined to solve it. Perhaps it was the motion and the rumbling sound of the bus, perhaps it was my subconscious mind having spent enough time on the problem, or perhaps it was just my shining moment, but I finally realized that x \neq 0 means \mathbf{1}\cdot x > 0, given that x \geq 0. Now the next question is how do I use that in an LP? Well, I can’t use it in the constraints, so it must be in the objective function. Let’s try to \max \; \mathbf{1}\cdot x. But wait, I have a \geq constraint, so I must \min \; -\mathbf{1}\cdot x. And the rest is easy. That was probably one of my proudest moments in the PhD program so far.


Vector and function

So I’ve been thinking about the great talk by Shaowei (great ideas always make us think). Apart from the sheaf, another important thing for me is the fact that vector and function are the same thing. A function is an infinite-dimension vector, while a vector is a function that maps from {1, 2, …, n} to .

I came to understand that a function is an infinite dimension vector while studying Gaussian process (a Gaussian process is a Multivariate Normal Distribution with infinite dimension). After Shaowei’s talk, I came to understand the other way round. Now, I have the complete picture.

But a question I had is that if vector and function are the same thing, how come one is used to represent a point and the other is used to represent a collection of points? Well, we use these abstract concepts to represent specific things, but the concepts are not the things they represent. Furthermore, I reckon the symbols tell us that one point in an infinite-dimension space is the same as an infinite number of points in a one-dimensional space. I couldn’t see this connection between this two geometric objects when a coordinate system was in my head, but abstracting that out, the symbols showed the way. And with that thought, it sort of makes sense to get rid of the coordinate system….

An inspiring math talk

I attended a math talk by Assistant Professor Shaowei Lin yesterday. The talk was about the history of algebraic geometry, about how Descartes came up with the coordinate system 500 years ago to symbolically represent geometric objects, about how mathematicians spent the next 500 years embracing the system only to try eliminating it over the last 100 years. And they succeeded. He talked about Bertrand Russell, Alexander Grothendiek and many great achievements of mathematics before, during and after their times. As always, his talk was very interesting and inspiring.

Linear versus affine functions

The term linear function has different meaning in calculus and in linear algebra, which often confuses me. Here’s the clarification, which is a note I made to myself.

In calculus, a linear function is a first degree polynomial: f(x) = ax + b. Obviously, the function is called linear because its graphical representation is a line. However, in linear algebra, this function does not have the linear properties. The notion linear function, or linear transformation, is a mapping that has the following properties:

\begin{array}{l c l l} f(x+y) &=& f(x) + f(y) & (1) \\ f(kx) &= & kf(x) & (2) \end{array}

In this context, f(x) = ax + b is not a linear function. To verify this, we check (1) and (2). On the left hand side,

\begin{array}{l c l} f(x+y) &=&a(x+y)+b\\ f(kx) &=&akx + b\end{array}

while on the right hand side,

\begin{array}{l c l c l}f(x) + f(y) &= &ax + b + ay + b &= &a(x + y) + 2b\\ kf(x) &= &k(ax+b) &= &akx + kb\end{array}

So both (1) and (2) are not satisfied, which shows that f(x) = ax + b is not a linear function. On the other hand, we can prove easily that f(x) = ax is a linear transformation.

The function f(x) = ax + b has another name: it is an affine function, which is a linear function plus a constant.

Note that a linear transformation preserves the origin (zero is mapped to zero) while an affine transformation does not. In other words, a linear function maps a straight light through the origin to another straight line through the origin (effectively, it makes a rotation with an angle a), while an affine function rotates the line by an angle $late a$ and translate it by a distance b.

In higher dimensions, what I’ve described still hold. x and b can be vectors in a vector space and a can be a matrix.

Okay, enough with the equations. Next, I’m just gonna playing with a few functions to give you some examples.

In Figure 1, the line x + y = 0 (black) is mapped to the red line with the linear transformation f(x) = 2x and to the blue line with the affine transformation f(x) = 2x + 3.

Figure 1

In Figure 2, the same linear and affine transformations are applied to the curve y = x^3. Again, we see that the origin is preserved after the linear transformation but not the affine transformation.

Figure 2

And finally, we apply the same transformations to a circle in Figure 3.

Figure 3

I think this is fun.