Statement of the First-Order Condition for Convexity
For a differentiable function $ f: \mathbb{R}^n \to \mathbb{R} $, $ f $ is convex on a convex set $ C \subseteq \mathbb{R}^n $ if and only if for all $ \mathbf{x}, \mathbf{y} \in C $ the following inequality holds:
\[f(\mathbf{y}) \geq f(\mathbf{x}) + \nabla f(\mathbf{x})^\top (\mathbf{y} - \mathbf{x}) \]Proof
Part 1: Necessity
Assume $ f $ is convex. We need to show that the first-order condition holds.
-
Convexity of $ f $: By definition of convexity, for any $ \mathbf{x}, \mathbf{y} \in C $ and $ \theta $ with $ 0 \leq \theta \leq 1 $, we have:
\[f(\theta \mathbf{y} + (1 - \theta) \mathbf{x}) \leq \theta f(\mathbf{y}) + (1 - \theta) f(\mathbf{x}) \] -
Apply the Definition to a Point Slightly Away from $ \mathbf{x} $: Choose $ \theta $ to be a small positive number $ h $, and set $ \mathbf{z} = \mathbf{x} + h(\mathbf{y} - \mathbf{x}) $. Then, the above inequality becomes:
\[f(\mathbf{x} + h(\mathbf{y} - \mathbf{x})) \leq hf(\mathbf{y}) + (1 - h)f(\mathbf{x}) \] -
Rearrange and Divide by $ h $:
\[\frac{f(\mathbf{x} + h(\mathbf{y} - \mathbf{x})) - f(\mathbf{x})}{h} \leq f(\mathbf{y}) - f(\mathbf{x}) \] -
Take the Limit as $ h \to 0 $:
\[\lim_{h \to 0} \frac{f(\mathbf{x} + h(\mathbf{y} - \mathbf{x})) - f(\mathbf{x})}{h} = \nabla f(\mathbf{x})^\top (\mathbf{y} - \mathbf{x}) \]Therefore, $ f(\mathbf{y}) \geq f(\mathbf{x}) + \nabla f(\mathbf{x})^\top (\mathbf{y} - \mathbf{x}) $.
Part 2: Sufficiency
Given:
The first-order condition states that for all $ \mathbf{x}, \mathbf{y} \in C $:
\[f(\mathbf{y}) \geq f(\mathbf{x}) + \nabla f(\mathbf{x})^\top (\mathbf{y} - \mathbf{x}) \]Goal:
To prove that $ f $ is convex, i.e., for any $ \mathbf{x}, \mathbf{y} \in C $ and for any $ \theta $ with $ 0 \leq \theta \leq 1 $, the following inequality holds:
\[f(\theta \mathbf{x} + (1 - \theta) \mathbf{y}) \leq \theta f(\mathbf{x}) + (1 - \theta) f(\mathbf{y}) \]Proof Steps:
Step 1: Apply First-Order Condition with $ \mathbf{x} $ and $ \mathbf{y} $
For $ \mathbf{z} = \theta \mathbf{x} + (1 - \theta) \mathbf{y} $ and using $ \mathbf{y} $ in the first-order condition, we have:
\[f(\mathbf{y}) \geq f(\mathbf{z}) + \nabla f(\mathbf{z})^\top (\mathbf{y} - \mathbf{z}) \]Step 2: Substitute $ \mathbf{z} $
\[f(\mathbf{y}) \geq f(\theta \mathbf{x} + (1 - \theta) \mathbf{y}) + \nabla f(\theta \mathbf{x} + (1 - \theta) \mathbf{y})^\top (\mathbf{y} - (\theta \mathbf{x} + (1 - \theta) \mathbf{y})) \]Step 3: Expand and Rearrange
\[f(\mathbf{y}) \geq f(\theta \mathbf{x} + (1 - \theta) \mathbf{y}) + \theta\nabla f(\theta \mathbf{x} + (1 - \theta) \mathbf{y})^\top (\mathbf{y} - \mathbf{x}) \]Step 4: Multiply by $ (1 - \theta) $
\[(1 - \theta) f(\mathbf{y}) \geq (1 - \theta) f(\theta \mathbf{x} + (1 - \theta) \mathbf{y}) + \theta(1 - \theta) \nabla f(\theta \mathbf{x} + (1 - \theta) \mathbf{y})^\top (\mathbf{y} - \mathbf{x}) \]Step 5: Apply First-Order Condition with $ \mathbf{x} $ and $ \mathbf{z} $
Similarly, applying the condition with $ \mathbf{x} $ and $ \mathbf{z} $:
\[f(\mathbf{x}) \geq f(\mathbf{z}) + \nabla f(\mathbf{z})^\top (\mathbf{x} - \mathbf{z}) \]Step 6: Substitute $ \mathbf{z} $ and Multiply by $ \theta $
\[\theta f(\mathbf{x}) \geq \theta f(\theta \mathbf{x} + (1 - \theta) \mathbf{y}) + \theta(1 - \theta) \nabla f(\theta \mathbf{x} + (1 - \theta) \mathbf{y})^\top (\mathbf{x} - \mathbf{y}) \]Step 7: Combine and Simplify
Add the inequalities from Steps 4 and 6:
\[\theta f(\mathbf{x}) + (1 - \theta) f(\mathbf{y}) \geq \theta f(\theta \mathbf{x} + (1 - \theta) \mathbf{y}) + (1 - \theta) f(\theta \mathbf{x} + (1 - \theta) \mathbf{y}) + \text{[terms involving gradients]} \]The terms involving the gradients will cancel
out, leaving:
\[\theta f(\mathbf{x}) + (1 - \theta) f(\mathbf{y}) \geq f(\theta \mathbf{x} + (1 - \theta) \mathbf{y}) \] 标签:geq,mathbf,top,nabla,leq,theta,Conditions,Order,Convexity From: https://www.cnblogs.com/maluyelang/p/17834150.html