I am taking Convex Optimization this semester (2021 Spring). The notes are based on my understanding of the course materials provided by Prof. Zhouchen Lin. The notes only offer concise definition and formulas with few proofs and examples.
In this chapter, we introduce convexity. We define convex sets first and will discuss convex functions in the next chapter. We go from affine sets to convex sets and cones. Later on, we will talk about some operations that could preserve convexity. Also, we will discuss about how to separate two convex sets.
Convex Sets
Table of Contents:
Affine and convex sets
Affine
-
Line segment
The line passing through two given points $ \mathbf{x}{1}, \mathbf{x}{2}$ can be described parametrically by θ:
y=θx1+(1−θ)x2=x2+θ(x1−x2)
-
Affine sets
A set is affine if for any $ \mathbf{x}{1}, \mathbf{x}{2}\in {C}, \theta\in\mathbb{R}$, we have θx1+(1−θ)x2∈C.
e.g. The solution set of a linear equation is an affine set:
Ax1=b,Ax2=b⇒A(θx1+(1−θ)x2)=b
In reverse, every affine set can be expressed as the solution set of a system of linear equations.
-
Affine hull
Given a set C, generalize to finite(to ensure the affine combination to be finite) multiple points and we have its affine hull:
affC={θ1x1+…+θkxk∣x1,…,xk∈C,θ1+…+θk=1}
We can define relative interior and relative boundary of an affine set. By relative, we mean that when dimC<dimB(x,r), only B(x,r)∩affC⊆C is required (not the whole ball).
Convex
-
Convex sets
A set is convex if for any x1,x2∈C,0≤θ≤1, we have θx1+(1−θ)x2∈C. (i.e. line segment between any 2 points lies in the set)
- Not necessary to be closed
- ∅ is also convex
-
Practice. How to determine whether a set is convex?
-
By definition
Example: {x∈Rn∣α≤a⊤x≤β} is convex:
a⊤(θx1+(1−θ)x2)=θa⊤x1+(1−θ)a⊤x2θα≤θa⊤x1≤θβ,(1−θ)α≤(1−θ)a⊤x2≤(1−θ)β⇒α≤a⊤(θx1+(1−θ)x2)≤β
-
By constructing the condition of the set to be the form of a convex set.
Example: {x∣∥x−a∥2≤θ∥x−b∥2}(a=b,0≤θ≤1) is convex.
∥x−a∥2≤(x−a)⊤(x−a)≤(1−θ2)x⊤x−2(a⊤−θ2b⊤)x≤x⊤x−1−θ22(a⊤−θ2b⊤)x≤∥∥∥∥∥x−1−θ2a−θ2b∥∥∥∥∥22≤∥∥∥∥∥x−1−θ2a−θ2b∥∥∥∥∥2≤θ∥x−b∥2θ2(x−b)⊤(x−b)θ2b⊤b−a⊤a1−θ2θ2b⊤b−a⊤a(1−θ2)2θ2(a−b)21−θ2θ∥a−b∥2
The set is a n-dimensional ball, which is convex.
-
Convex hull
Convex hull of a set C is the set of all convex combinations of points in C.
Examples:
-
conv{eiej⊤}=xij≥0,∑ijxij=1
-
conv{uu⊤∣∥u∥=1}
By definition of convex hull, S={∑iλiuiui⊤∣λi≥0,∑iλi=1}. If X∈S, then X⪰0,tr(X)=1.
- Sum of semidefinite matrices is still semidefinite.
- tr(∑iλiuiui⊤)=∑iλitr(uiui⊤)=∑iλi=1.
Now we have another set S~={X∣X⪰0,tr(X)=1}. We have already proved that S⊂S~.
Suppose that X∈S~. By eigenvalue decomposition, X=∑iλiuiui⊤. ∑iλi=tr(X)=1. Thus, S~⊂S, S~=S.
-
conv{uv⊤∣∥u∥=∥v∥=1}
S={∑iσuivi⊤∣σ≥0,∑iσi=1}. S~=S~={X∣∥X∥∗≤1}.
-
S⊂S~
∥X∥∗=≤=∥i∑σuivi⊤∥∗by triangle inequality of norm,i∑σi∥uivi⊤∥∗the rank is 1, so its SVD=itself, eigenvalue is (1,0,⋯)i∑σi=1
-
S~⊂S
∥X∥=UΣV⊤=i∑pσiuivi⊤,p=min(m,n)
Since σi is singular value, it satisfies σi≥0. Let δ=1−∑i=1pσi. Take σp+1=σp+2=δ/2, take up+2=−up+1,vp+1=vp+1. Now we have:
X=i=1∑pσiuivi⊤+2δup+1vp+1⊤−2δup+1vp+1⊤
Cone
A set C is a cone if for every x∈C,θ>0, we have θx∈C.
A set C is a convex cone if for any x1,x2∈C,θ1,θ2>0, we have θ1x1+θ2x2∈C.
The conic hull of a set C is the set of all conic combinations of points in C.
{θ1x1+…+θkxk∣xi∈C,θi≥0,i=1,…,k}
-
Dual cone
Let K be a cone, its dual cone K∗ satisfies:
K∗={y∣x⊤y≥0,∀x∈K}
K∗ is always closed convex. K1⊆K2 implies K2∗⊆K1∗. K∗∗ is the closure of the convex hull of K.
Examples:
-
Subspace. K={x∣Ax=0}. K∗={y∣(A⊥)⊤y=0}. Because x=A⊥z.
-
Positive semidefinite cone. Dual cone is itself. tr(XY)≥0 for all X⪰0⟺Y⪰0
-
⇒.
If Y∈/S+n, then there exists q that q⊤Yq=tr(qq⊤Y)<0. Let X=qq⊤⪰0, we have tr(XY)<0. Thus, tr(XY)≥0 for all X⪰0⟹Y⪰0
-
⇐.
tr(YX)=tr(Yi=1∑nλiqiqiT)=i=1∑nλiqiTYqi≥0
-
Norm cone. K={(x,t)∈Rn+1∣∥x∥≤t}. Dual cone: K∗={(u,v)∈Rn+1∣∥u∥∗≤v}
We need to prove that x⊤u+tv≥0 for all ∥x∥≤t⟺∥u∥∗≤v.
-
⇐.
For some t>0, ∥x∥≤t. By definition of dual norm, u⊤(−x/t)≤∥u∥∗≤v, and thus u⊤x+tv≥0.
-
⇒.
If ∥u∥∗>v, then by definition of dual norm, there exists an x that ∥x∥≤1 and x⊤u>v.
Then we have ∥−x∥=∥x∥≤1 and take t=1: u⊤(−x)+v<0, which contradicts with the lefthand side(assumption). Thus, ∥u∥∗≤v.
Given K, we usually guess a dual cone K∗. If it is hard to prove that x⊤y≥0,∀x∈K⇔y∈K∗, we can prove x⊤y≥0,∀y∈K∗⇔x∈K instead.
Operators that preserve convexity
Intersection
Sα is convex ∀α∈A⇒⋂α∈ASα is convex (can be intersection of an infinite number of sets)
Affine function
S⊆Rn is convex, f(x)=Ax+b:Rn→Rm⇒f(S)={f(x)∣x∈S} is convex
Similarly, the inverse image of S under f: f−1(S)={x∣f(x)∈S} is also convex.
Sum and Cartesian product
S1,S2 are convex, then S1+S2={x+y∣x∈S1,y∈S2} and S1×S2={(x1,x2)∣x1∈S1,x2∈S2} are convex.
Examples:
-
Polyhedron
{x∣Ax⪯b,Cx=d}={x∣f(x)∈R+m×{0}},f(x)=(b−Ax,d−Cx)
-
Solution set of A(x)=∑i=1nxiAi⪯B
the inverse image of positive semidefinite cone under f(x)=B−A(x)
-
Hyperbolic cone {x∣x⊤Px≤(c⊤x)2,c⊤x≥0}, P∈S+n
the inverse image of the second order cone {(z,t)∣z⊤z≤t2.t≥0} under f(x)=(P1/2x,c⊤x)
-
Ellipsoid {x∣(x−xc)⊤P−1(x−xc)≤1}
The image of Euclidean ball {u∣∥u∥2≤1} under x=f(u)=P1/2u+xc
The inverse image of the unit ball under u=g(x)=P−1/2(x−xc)
Perspective function
P:Rn+1→Rn,P(z,t)=z/t,z∈Rn,t>0. The perspective function actually scales the vector so that the last component becomes 1, and can then be dropped.
If C Is convex, then P(C)={P(x)∣x∈C} and P−1(C)={(x,t)∈Rn+1∣x/t∈C,t>0} are convex.
Linear-fractional functions
Applying perspective function to affine functions. If g(x)=[AcT]x+[bd], then f(x)=(Ax+b)/(c⊤x+d),domf={x∣c⊤x+d>0}.
Separating and supporting hyperplanes
Separating hyperplane theorem
Two convex sets C∩D=∅⇔∃a=0,b that a⊤x≤b,∀x∈C,a⊤x≥b,∀x∈D.
Example: Separation of an affine and a convex set
C Is convex and D={Fu+g∣u∈Rm} is affine. Suppose that they are disjoint, then there exist a=0 and b such that a⊤x≥b,∀x∈D. Thus, a⊤Fu≥b−a⊤g. Since the righthand side is a number, the linear function a⊤Fu is bounded below iff a⊤F=0.
Thus, there exist a=0 such that F⊤a=0,a⊤x≤a⊤g,∀x∈C.
Strict separation
Two convex sets C∩D=∅⇔∃a=0,b that a⊤x<b,∀x∈C,a⊤x>b,∀x∈D.
Example: Separation of a point and a closed convex set
That is to say, the two sets C,B(x0,ϵ) do not intersect for some ϵ>0. Since B(x0,ϵ)={x0+u∣∥u∣2≤ϵ}, we have
a⊤x≤b,∀x∈Ca⊤(x0+u)≥b,∀∥u∥2≤ϵ
The u minimizes the lefthand side is u=ϵa/∥a∥2, so a⊤x0−ϵ∥a∥2≥b. Therefore, f(x)=a⊤x−ϵ∥a∥2/2−b is negative on C and positive at x0.
Corollary: A closed convex set is the intersection of all half spaces that contain it.
Obviously if a point is in the closed convex, it is in all the half spaces that contain it.
In reverse, if there exists a point x that is of the intersection S but not in the convex set C, then by strict separation, there exists a hyperplane that separates x and the C. Thus, this half space contains C but not x, so x∈/S.
Converse separating hyperplane theorems
Any two convex sets C,D, at least one of which is open, are disjoint iff there exists a separating plane.
Farkas Lemma
For A∈Rm×n,b∈Rm, the following are strong alternatives:
- ∃x∈R+n,Ax=b
- ∃y∈Rm,A⊤y≥0,b⊤y<0
Proof.
1⇒¬2: b⊤y=(x⊤A⊤)y≥0
¬1⇒2: C=cone(A) is a closed convex cone that does not contain b. Hence by separating hyperplane theorem, there exists a y∈Rm that ⟨y,x⟩≥0>⟨y,b⟩,∀x∈C, i.e. A⊤y≥0,b⊤y<0.
Supporting hyperplanes
If a=0 satisfies a⊤x≤a⊤x0,∀x∈C, then the hyperplane {x∣a⊤x=a⊤x0} is the supporting hyperplane to C at point x0. Geometrically, the hyperplane is tangent to C at point x0.
Supporting hyperplane theorem
For any nonempty convex set C and any x0∈∂C, there exists a supporting hyperplane to C at point x0.