affine functions are convex and concave; all norms are convex
Examples on R
affine : ax+b on R, for any a,b∈R
exponential : eax, for any a∈R
powers : xα on R++, for α≥1 or α≤0
powers of absolute value : ∣x∣p on R, for p≥1
negative entropy : −xlogx on R++
Examples on Rn
affine function : f(x)=aTx+b
norms : ∥x∥p=(∑i=1n∣xi∣p)1/p for p≥1; ∥x∥∞=maxk∣xk∣
Proof
Examples on Rm×n
affine function
spectral (maximum singular value) norm
Restriction of a convex function to a line
f:Rn→R is convex if and only if the function g:R→R
is convex in t for any x∈dom f,v∈Rn
We can check convexity of f by checking convexity of functions of one variable
Example
f:Sn→R with f(X)=logdetX,dom f=S++n
where λi are the eigenvalues of X−1/2VX−1/2
g is concave in t (for any choice of X≻0,V); hence f is concave.
Note
Extended-value extension
It is often convenient to extend a convex function to all of Rn by defining its value to be ∞ outside its domain. If f is convex we define its extended-value extension f~ of f is
The extension can simplify notation, since we don’t need to explicitly describe the domain.
First-order condition
f is differentiable if dom f is open and the gradient
exists at each x∈dom f
1st-order condition : differentiable f with convex domain is convex if and only if
Second-order conditions
f is twice differentiable if dom f is open and the Hessian ∇2f(x)∈Sn,
or equivalently
where Hx is a hessian matrix at x.
2nd-order condition : for twice differentiable f with convex domain
f is convex if and only if
if ∇2f(x)≻0,∀x∈dom f, then f is strictly convex.
Example
Quadratic function : f(x)=21xTPx+qTx+r (with P∈Sn)
convex if P⪰0
Least-squares objective : f(x)=∥Ax−b∥22
convex for any A since ATA is always positive semi-definite.
quadratic-over-linear : f(x,y)=x2/y
convex for y>0
log-sum-exp : f(x)=log∑k=1nexpxk
Then,
Therefore,
to show ∇2f(x) is positive semi-definite, we must verify that vT∇2f(x)v≥0,∀v
By Cauchy-Schwarz inequality,
So, vT∇2f(x)v≥0,∀v.
Therefore, ∇2f(x) is a positive semi-definite matrix.
geometric mean : f(x)=(∏k=1nxk)1/n on R++n is concave
Epigraph and sublevel set
α-sublevel set of f:Rn→R:
sublevel sets of convex functions are convex
epigraph of f:Rn→R
Jensen’s inequality
basic inequality : if f is convex, then for 0≤θ≤1
extension : if f is convex, then
for any random variable z
Operations that preserve convexity
practical methods for establishing convexity of a function
by using definition
for twice differentiable functions, show ∇2f(x)⪰0,∀x
show that f is obtained from simple convex functions by operations that preserve convexity
nonnegative weighted sum
composition with affine function
point-wise maximum and supremum
minimization
composition with scalar function
vector composition
perspective
Composition with affine function
is convex if f is convex
Proof
Given x,y∈dom A,θ≥0
Therefore f(Ax+b) is a convex function.
Examples
log barrier for linear inequalities
where dom f={x∣aiTx<bi,i=1,…,m}
any norm of affine function
Pointwise maximum
if f1,…,fm are convex, then f(x)=max{f1(x),…,fm(x)} is convex
Proof
Let x,y∈∩i=1mdom fi,θ≥0
Since fi is a convex function,
for all i=1,…,m
Without loss of generality, let f(θx+(1−θ)y)=fk(θx+(1−θ)y)
Then,
Therefore, f is a convex function.
Examples
point-wise linear function
is a convex function
sum of r largest components of x∈Rn
where x[i] is i’th largest component of x
is a convex function
Actually,
Pointwise supremum
if f(x,y) is convex in x for each y∈A, then
is convex
Proof
Let x1,x2∈dom g,y∈A,θ≥0
Since f is convex in x for each y∈A,
Therefore, g is a convex set.
Actually, we use the property of supremum
Let supf(x)=α,supg(x)=β
Since α≥f(x1) for all x1∈dom f and β≥f(x2) for all x2∈dom g, α+β≥f(x)+g(x) for all x∈dom f+g
So, α+β is a upper bound of f+g
Therefore, supx(f(x)+g(x))≤supxf(x)+supxg(x)
Epigraph interpretation
In terms of epigraphs, the point-wise supremum of functions corresponds to the intersection of epigraphs: with f,g. we have
Thus, the result follows from the fact that the intersection of a family of convex set is convex.
Red and green areas represent the epigraph of f(x,y) for a fixed x. As f is convex in x, both the red and green areas are convex sets. Our objective is to intersect the epigraph with a given point x to determine the maximum (supremum) value for the given x.
Examples
support function of a set C
distance to farthest point in a set C
maximum eigenvalue of symmetric matrix
where X∈Sn
Proof
Since X∈Sn, there exists an orthonormal basis by spectral theorem.
Therefore,
First we want to show that max{yi∣i=1,…,n} is a upper-bound of yTXy
Without loss of generality, max{λi∣i=1,…,n}=λk
Given ϵ>0, we want to show that there exists y∗ such that
where ∥y∗∥2=1
Without loss of generality, λl is the second largest eigenvalue (k=l)
Take yl=2(λk−λl)ϵ,yk=2(λk−λl)2(λk−λl)−ϵ,yj=0 if j=k,l
Then,
Moreover,
Therefore,
Minimization
We have seen that the maximum and supremum of an arbitrary family of convex functions is convex. It turns out the some special forms of minimization also yields convex functions.
if f(x,y) is convex in (x,y) and C is a convex set, then
where g(x)>−∞ for all x
is convex.
We have to compare the condition for the point-wise supremum case. Note that f(x,y) is convex in (x,y). In point-wise supremum case, it is enough to satisfy only x
Proof
Let x1,x2∈dom g,θ≥0
For given ϵ>0, ∃y1,y2∈C such that
Therefore,
Since ϵ is arbitrary,
Therefore, g is a convex function.
Epigraph interpretation
Assume the infimum over y∈C is attained for each x (i.e. {f(x,y)∣y∈C} is closed set for each x). We have
Since the projection of a convex set on some of its components is convex, epi g is a convex set.
Example
f(x,y)=xTAx+2xTBy+yTCy with
We can express g(x)=infyf(x,y) as
where C† is the pseudo-inverse of C. Since g is convex, A−BC†BT⪰0
If C is invertible, then A−BC−1BT is called the Schur complement of C in the matrix
distance to a set : dist(x,S)=infy∈S∥x−y∥ is convex if S is convex.
Composition with scalar functions
composition of g:Rn→R and h:R→R
f is convex if
g convex, h convex, h~ non-decreasing
g concave, h convex, h~ non-increasing
where h~ is a extended value extension
If we assume that n=1 and twice differentiable for g,h we can get an intuition
Think about this example,
This function is trivially increasing. However,
this function is not a increasing function.
Interpretation of h~
To say that h~ is nondecreasing means that for any x,y∈R (x<y), we have h~(x)≤h~(y).
That means y∈dom h⇒x∈dom h
Therefore, the domain of h extends infinitely in the negative direction; it is either R or an interval like (−∞,a) or (∞,a]
Similarly, h~ is non-increasing means that the domain of h extends infinitely in the positive direction; it is either R or an interval like (a,∞) or [a,∞)
Proof
Let g is convex, h is convex, h~ is nondecreasing and x,y∈dom f,θ≥0.
Since x,y∈dom f, x,y∈dom g
Moreover we already know that g is convex,
Similarly since x,y∈dom f,
Since h~ is non-decreasing,
In addition, since θg(x)+(1−θ)g(y)∈dom h and h is a convex function,
Therefore,
Therefore h∘g is a convex function.
Examples
expg(x) is convex if g is convex
1/g(x) is convex if g is concave and positive.
Vector composition
composition of g:Rn→Rk and h:Rk→R
f is convex if
gi affine, h convex
gi convex, h convex, h~ non-decreasing in each argument
gi concave, h convex, h~ non-increasing in each argument
where h~ is a extended value extension
Proof
Without loss of generality, assume that gi convex, h convex, h~ non-decreasing in each argument.
Let x,y∈dom f,θ≥0 (i.e. x,y∈dom gi for each i and g(x),g(y)∈dom h)
Since gi is a convex function and x,y∈dom gi
Since h is convex, πi(dom h) is a convex set for each i.
So,
Therefore,
Moreover, h~ is non-decreasing in each argument and θgi(x)+(1−θ)gi(y)∈πi(dom h),∀i,
Therefore,
In addition,
Therefore h∘g is a convex function.
If gi is a affine function,
Therefore, there is no requirement of non-increasing or non-decreasing condition of h in this case.
We can actually interpret some convex preserving operations by using vector composition.
adding convex function
We can interpret h as a summation of each argument. We already know that that kind of function is convex. (Actually, it is affine)
Moreover h~ nondecreasing in each argument.
Therefore, it is a convex function
maximum values within convex functions
We can interpret h as a maximum of each argument. We already know that max function is convex. Moreover h~ is non-decreasing in each argument.
Therefore, it is a convex function.
Examples
∑i=1mloggi(x) is concave if gi are concave and positive
log∑i=1mexpgi(x) is convex if gi are convex.
Perspective
the perspective of a function f:Rn→R is the function g:Rn×R→R
if f is convex, then g is convex.
Proof
Since f is a convex function, epi f is a convex sets. Moreover, perspective function preserve a convexity. So, epi g is a convex set.
Therefore, g is a convex function.
Example
f(x)=xTx is convex, g(x,t)=xTx/t is convex for t>0
Let f(x) is a making cost, x is the number of product, y is the price for each product.
We want to maximize the profit. So our goal is to find the number of product maximize our benefit.
Note that f∗ is convex (even if f is not)
Proof
yTx−f(x) is affine with respect to y.
Moreover, it is the point-wise supremum of a family of convex (indeed, affine) functions of y. Therefore, it is actually a corollary of the supremum of the convex function is a convex function.
Examples
negative logarithm : f(x)=−logx
strictly convex quadratic : f(x)=(1/2)xTQx with Q∈S++n
Quasi-convex functions
f:Rn→R is quasi-convex if dom f is convex and the sublevel sets
are convex for all α
A function is quasi-concave if −f is quasi-convex. A function that is both quasi-convex and quasi-concave is called quasi-linear.
Convex functions have convex sub-level sets, and so are quasi-convex.
Proof
We already know that epigraph of a convex function is a convex set. Moreover, {(x,y)∣y≤α} is a convex set.
Since the intersection of convex sets is also a convex set.
Therefore,
is a convex set.
Moreover, we already know that projection of certain coordinate is also a convex set. Therefore,
is a convex set.
Therefore, every convex function is a quasi-convex function.
Examples
∣x∣ is quasi-convex on R
ceil(x)=inf{z∈Z∣z≥x} is quasi-linear
logx is quasi-linear on R++
f(x1,x2)=x1x2 is quasi-concave on R++2
linear-fractional function
is quasi-linear
distance ratio
is quasi-convex.
Proof
We want to know whether {x∈dom f∣f(x)≤t} is a convex set or not for all a. (For simplicity, we assume that t is non-negative.)
Take g:R2→R, h1:R→R,h2:R→R as follows
Since h1,h2 are convex and g~ are non-decreasing and non-increasing respectively (if t is negative, g~ are both non-decreasing), g(h1(x),h2(x)) is a convex function.
As we proved above, every convex function is also a quasi-convex funciton.
Therefore,
is a convex set.
Since t is arbitrary,
is a convex set.
Therefore,
Therefore, f is a quasi-convex function.
Properties
modified Jensen’s inequality : for quasi-convex f
Proof
Since f is a quasi-convex function,
are a convex set.
We already know that intersection of two convex sets is also a convex set. Therefore,
is a convex set.
For simplicity, let’s say
Trivially, x,y∈A. Therefore, for given θ≥0,
So,
first-order condition : Suppose f:Rn→R is differentiable with dom f is convex. Then f is quasi-convex if and only if
Proof
Let f is quasi-convex. Then
is a convex set.
Suppose ∃y∈A such that
Then, ∃0<ϵ<1 such that
Since x,y∈A and A is a convex set,
It is contradicts to the definition of A.
Therefore,
We want to show that for given a
is a convex set
case 1 : ∃x∈dom f such that f(x)=a
By assumption,
Since ∇2f(x)T(y−x) is affine with respect to y, {y∣∇2f(x)T(y−x)≤0} is a convex set.
Therefore {x∈dom f∣f(x)≤a} is a convex set.
case 2 : ∄x∈dom f such that f(x)=a
Since f is differentiable, f is continuous for every x. Therefore, there are only two cases.
f(x)>a,∀x
In this case, there is no element in sub-level set. Therefore, it is a vacuously true to satisfy the convexity.
f(x)<a,∀x
In this case, {x∈dom f∣f(x)≤a}=dom f. We already know that dom f is a convex set.
Therefore {x∈dom f∣f(x)≤a} is a convex set.
Log-concave and log-convex functions
a positive function f is log-concave if logf
for 0≤θ≤1
f is log-convex if logf is convex.
Examples
powers : xa on R++ is log-convex for α≤0, log-concave for α≥0
proof
if a≥0,
the above function is log-concave.
if α≤0,
the above function is log-convex.
many common probability densities are log-concave (ex : normal dist)
cumulative Gaussian distribution Φ is log-concave.
Properties
twice differentiable f with convex domain is log-concave if and only if
for all x∈dom f
or equivalently
for all x∈dom f
The condition for concave function is 0 for the right side of the above inequality. In log-concave case, the hessian matrix is less than equal to the rank-one matrix. That means we can have at most one positive eigenvalue of the hessian.
product of log-concave functions is log-concave
sum of log-concave functions is not always log-concave
integration : if f:Rn×Rm→R is log-concave, then
is log-concave.
Consequences of integration property
convolution f∗g of log-concave functions f,g is log-concave
if C⊂Rn convex and y is a random variable with log-concave pdf then
is log-concave
Proof
Write f(x) as integral of product of log-concave functions
where
p is pdf of y
Convexity with respect to generalized inequalities
Suppose K⊂Rm is a proper cone with associated generalized inequality ⪯K. We say f:Rn→Rm is K-convex if for all x,y, and 0≤θ≤1,