Abstract: It is well known that the traditional Jensen inequality is proved by lower bounding the given convex function, $f(x)$, by the tangential affine function that passes through the point $(E\{X\},f(E\{X\}))$, where $E\{X\}$ is the expectation of the random variable $X$. While this tangential affine function yields the tightest lower bound among all lower bounds induced by affine functions that are tangential to $f$, it turns out that when the function $f$ is just part of a more complicated expression whose expectation is to be bounded, the tightest lower bound might belong to a tangential affine function that passes through a point different than $(E\{X\},f(E\{X\}))$. In this paper, we take advantage of this observation, by optimizing the point of tangency with regard to the specific given expression, in a variety of cases, and thereby derive several families of inequalities, henceforth referred to as ``Jensen-like'' inequalities, which, to the best of the author's knowledge, are new. The degree of tightness and the potential usefulness of these inequalities are demonstrated in several application examples related to information theory.
What problem does this paper attempt to address?
The paper addresses the fact that the traditional Jensen inequality can be loose when it is used to bound the expectation of a more complicated expression. Specifically, when the convex function \(f(x)\) is only one part of such an expression, the tangent to \(f\) at \(E[X]\) need not give the tightest lower bound. The author instead treats the point of tangency, denoted \(a\), as a free parameter and optimizes it for the specific expression at hand, which yields a new class of "Jensen-like" inequalities with tighter lower bounds in these cases.
### Detailed Explanation:
1. **Limitations of the Traditional Jensen's Inequality**:
- The traditional Jensen's inequality lower-bounds the convex function \(f(x)\) by the tangent line that passes through the point \((E[X], f(E[X]))\).
- When \(f(X)\) is only part of a more complicated expression, this choice of tangent need not be optimal: a tangent at a point other than \(E[X]\) may give a tighter bound on the expectation of the whole expression.
2. **Core Idea of the Paper**:
- The author observes that, for such expressions, the best point of tangency depends on the expression being bounded and is in general different from \(E[X]\).
- Optimizing the point of tangency for each expression gives a tighter lower bound and leads to several families of new "Jensen-like" inequalities.
3. **Application Background**:
- These new inequalities have wide applications in information theory, such as being used to derive information inequalities, data-processing inequalities, and the relationships between conditional entropy and unconditional entropy.
- In addition, these inequalities can also be used for the derivation of single-letter formulas in Shannon theory and the maximum entropy problem.
4. **Summary of Contributions**:
- Multiple new families of "Jensen-like" inequalities applicable to different forms of complex expressions are proposed.
- In many cases, the optimal parameter values can be found in an analytic form.
- Two types of bounds are provided: bounds based on the first two moments and bounds based on the moment-generating function and its derivatives.
- These inequalities are applicable not only to convex functions but also to some non-convex and even concave functions.
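To make the core idea concrete, here is a short sketch for the product form \(E[f(X)g(X)]\). It assumes that \(f\) is convex and twice differentiable, that \(g(x) \ge 0\), and that \(0 < E[g(X)] < \infty\). These assumptions are chosen for illustration, and the result is not necessarily stated in this exact form in the paper. Convexity gives, for every point of tangency \(a\),

\[
f(x) \ge f(a) + f'(a)\,(x - a) \quad \text{for all } x .
\]

Multiplying both sides by \(g(x) \ge 0\) and taking expectations yields

\[
E[f(X)g(X)] \ge f(a)\,E[g(X)] + f'(a)\,\bigl(E[Xg(X)] - a\,E[g(X)]\bigr).
\]

The derivative of the right-hand side with respect to \(a\) is \(f''(a)\,\bigl(E[Xg(X)] - a\,E[g(X)]\bigr)\), which is non-negative for \(a < a^*\) and non-positive for \(a > a^*\), where \(a^* = E[Xg(X)]/E[g(X)]\). The bound is therefore maximized at \(a^*\), giving

\[
E[f(X)g(X)] \ge E[g(X)]\; f\!\left(\frac{E[Xg(X)]}{E[g(X)]}\right).
\]

For \(g \equiv 1\) the optimal point is \(a^* = E[X]\) and the traditional Jensen inequality is recovered; for a non-constant \(g\), \(a^*\) generally differs from \(E[X]\).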
### Example Illustrations:
- **Example 1**: For an expression of the form \(E[f(X)g(X)]\), where \(g(x)\) is a non-negative function, a tighter lower bound can be obtained by optimizing the tangent point.
- **Example 2**: For the random guessing problem, better bounds can be obtained by applying these new inequalities.
- **Example 3**: For the Gaussian channel capacity problem, these inequalities can help evaluate the capacity fluctuations under random signal-to-noise ratios.
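The first example can be checked numerically. The sketch below is only an illustration under assumed inputs: the four-point distribution, the choices \(f(x) = e^x\) and \(g(x) = 1 + x\), and the helper name `tangent_bound` are all hypothetical and do not come from the paper. It evaluates the tangent-based lower bound \(f(a)E[g(X)] + f'(a)\,(E[Xg(X)] - aE[g(X)])\), which holds for any \(a\) when \(f\) is convex and \(g \ge 0\), at \(a = E[X]\) and at \(a = E[Xg(X)]/E[g(X)]\), and compares both with the exact value.

```python
import math

def tangent_bound(a, xs, ps, f, df, g):
    """Lower bound on E[f(X) g(X)] from the tangent to f at the point a.

    Valid when f is convex with derivative df and g is non-negative.
    """
    e_g = sum(p * g(x) for x, p in zip(xs, ps))        # E[g(X)]
    e_xg = sum(p * x * g(x) for x, p in zip(xs, ps))   # E[X g(X)]
    return f(a) * e_g + df(a) * (e_xg - a * e_g)

# Hypothetical discrete distribution and functions, chosen for illustration.
xs = [0.0, 1.0, 2.0, 3.0]
ps = [0.4, 0.3, 0.2, 0.1]
f = math.exp              # convex
df = math.exp             # derivative of exp
g = lambda x: 1.0 + x     # non-negative on the support

exact = sum(p * f(x) * g(x) for x, p in zip(xs, ps))   # E[f(X) g(X)]
mean = sum(p * x for x, p in zip(xs, ps))              # E[X]
e_g = sum(p * g(x) for x, p in zip(xs, ps))
e_xg = sum(p * x * g(x) for x, p in zip(xs, ps))
a_star = e_xg / e_g                                    # candidate optimal tangency point

bound_mean = tangent_bound(mean, xs, ps, f, df, g)     # tangent at E[X]
bound_opt = tangent_bound(a_star, xs, ps, f, df, g)    # tangent at a_star

print(f"exact value       : {exact:.4f}")
print(f"bound at E[X]     : {bound_mean:.4f}")
print(f"bound at a_star   : {bound_opt:.4f}")
```

In this toy setup both bounds lie below the exact value, and the bound at `a_star` is the larger of the two, which is the effect that optimizing the point of tangency is meant to capture.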
In conclusion, this paper derives several families of new "Jensen-like" inequalities by optimizing the point of tangency, addresses the looseness of the traditional Jensen's inequality for more complicated expressions, and demonstrates the usefulness of the new bounds in application examples from information theory.