Abstract: Consider the optimal stopping problem of maximising the expected payoff from selecting the last success in a sequence of independent Bernoulli trials. The total positivity of the Markov chain embedded in the success epochs of the trials is exploited to prove the optimality of the myopic strategy for both unimodal stopping and continuation payoffs. In contrast, the problem is shown to be nonmonotone for oscillating payoffs, alternating between two values. The myopic strategy is demonstrated not to be a threshold rule. Lastly, the optimality of the myopic policy is established for the $\ell$th-to-$m$th last-success problem based on a total positivity argument. An illustrative example is given to compute the asymptotic threshold and winning probability.
What problem does this paper attempt to address?
The paper addresses the last-success selection problem in optimal stopping: how to maximise the expected payoff from selecting the last success in a sequence of independent Bernoulli trials. Exploiting the total positivity of the Markov chain embedded in the success epochs, the author proves that the myopic strategy is optimal when the stopping and continuation payoffs are unimodal. For oscillating payoffs that alternate between two values, the problem is shown to be nonmonotone, and the myopic strategy is not a threshold rule. A total positivity argument also establishes the optimality of the myopic policy for the $\ell$th-to-$m$th last-success problem.
### Core Problems of the Paper
1. **Maximizing the Expected Payoff of the Last Success**: Determine how to select the last success in a sequence of independent Bernoulli trials so that the expected payoff is maximised.
2. **Optimality of the Myopic Strategy**: Prove that the myopic strategy is optimal when the stopping and continuation payoffs are unimodal.
3. **Nonmonotone Problem**: Show that when the payoffs oscillate between two values, the problem is nonmonotone and the myopic strategy is not a threshold rule.
4. **Optimality for the $\ell$th-to-$m$th Last Success**: Establish the optimality of the myopic policy for the $\ell$th-to-$m$th last-success problem.
### Specific Methods and Conclusions
- **Total Positivity Theory**: Use the total positivity of the Markov chain embedded in the success epochs to prove that the myopic strategy is optimal for unimodal payoffs.
- **Unimodality and Nonmonotonicity**: Contrast unimodal payoffs, for which the myopic strategy is optimal, with oscillating payoffs, for which the problem is nonmonotone and the myopic strategy is not a threshold rule.
- **Application Example**: Work through an illustrative example that computes the asymptotic threshold and the winning probability.
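The total positivity property mentioned above can be probed numerically. The following Python sketch is only an illustration under stated assumptions, not the paper's argument: it assumes the embedded chain moves from a success at trial $k$ to the next success at trial $j > k$ with probability $p_j \prod_{k < i < j} (1 - p_i)$, it ignores the absorbing "no further success" state, and it checks only the order-2 minors (TP2), which is a necessary condition for total positivity rather than the full property. The function names `embedded_chain` and `is_tp2` are hypothetical.

```python
from itertools import combinations
from math import prod


def embedded_chain(p):
    """Transition matrix between success epochs (assumed form).

    p[k] is the success probability of trial k + 1 (0-indexed list).
    Entry [k][j] is the probability that the next success after
    trial k + 1 occurs at trial j + 1; it is zero unless j > k.
    """
    n = len(p)
    q = [1.0 - x for x in p]
    return [
        [p[j] * prod(q[k + 1:j]) if j > k else 0.0 for j in range(n)]
        for k in range(n)
    ]


def is_tp2(matrix, tol=1e-12):
    """Return True if every 2x2 minor is nonnegative up to rounding."""
    rows = range(len(matrix))
    cols = range(len(matrix[0]))
    return all(
        matrix[a][c] * matrix[b][d] - matrix[a][d] * matrix[b][c] >= -tol
        for a, b in combinations(rows, 2)
        for c, d in combinations(cols, 2)
    )


# Example: p_k = 1/k for k = 1..8 (an illustrative choice of probabilities).
chain = embedded_chain([1.0 / k for k in range(1, 9)])
print(is_tp2(chain))
```

A brute-force check over all minors is quadratic in both dimensions, so it is only practical for small horizons; it is meant as a sanity check, not a proof.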
### Key Formulas
1. **Definition of the Myopic Strategy**:
\[
\tau^*=\min \{ k : X_k = 1 \text{ and } f_k \geq g_k \}
\]
where $X_k = 1$ indicates a success on trial $k$, $f_k$ is the payoff from stopping at a success on trial $k$, and $g_k$ is the expected payoff from passing over the first $k$ trials and stopping at the next success.
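2. **Success-Probability Model**:
\[
p_k=\frac{w_1}{w_1 + w_2 + k - 1}
\]
where $p_k$ is the success probability of the $k$-th trial and $w_1$, $w_2$ are the parameters of the model.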
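3. **One-Step-Look-Ahead Payoff**:
\[
g_k=\sum_{j > k} P(k, j) f_j
\]
where $P(k, j)$ is the transition probability of the embedded Markov chain from a success on trial $k$ to the next success on trial $j$.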
4. **Total Positivity Matrix Transformation**:
\[
O := D_1 P D_2^{-1}
\]
where $D_1$ and $D_2$ are diagonal matrices chosen so that the transformed matrix remains totally positive.
5. **Threshold Rule**:
\[
j_m=\min \left\{ 1 \leq i \leq n - m + 1 : R_m(i, n) \leq R_{\ell-1}(i, n) \right\} \vee 1
\]
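As a rough numerical illustration of formulas 1 to 3, the Python sketch below computes $f_k$, $g_k$ and the myopic stopping set for one special case. Everything specific here is an assumption made for illustration and is not taken from the paper: a finite horizon of $n$ trials, the classical payoff $f_k = \prod_{i > k} (1 - p_i)$ (payoff 1 if trial $k$ is the last success), the transition probability $P(k, j) = p_j \prod_{k < i < j} (1 - p_i)$, and the parameter choice $w_1 = 1$, $w_2 = 0$, which reduces the model to $p_k = 1/k$. The function names are hypothetical.

```python
from math import prod


def success_prob(k, w1=1.0, w2=0.0):
    """p_k = w1 / (w1 + w2 + k - 1), the success probability of trial k."""
    return w1 / (w1 + w2 + k - 1)


def myopic_stopping_set(n, w1=1.0, w2=0.0):
    """Return (f, g, stop) for trials 1..n; index 0 of f and g is unused.

    Assumes the classical last-success payoff: f[k] is the probability
    that no success occurs after trial k.
    """
    p = [0.0] + [success_prob(k, w1, w2) for k in range(1, n + 1)]
    q = [1.0 - x for x in p]

    def transition(k, j):
        # Next success after trial k occurs at trial j > k.
        return p[j] * prod(q[k + 1:j])

    f = [0.0] + [prod(q[k + 1:n + 1]) for k in range(1, n + 1)]
    g = [0.0] + [
        sum(transition(k, j) * f[j] for j in range(k + 1, n + 1))
        for k in range(1, n + 1)
    ]
    # Myopic rule: stop at a success on trial k whenever f_k >= g_k.
    stop = [k for k in range(1, n + 1) if f[k] >= g[k]]
    return f, g, stop


def threshold_win_probability(n, s, w1=1.0, w2=0.0):
    """Probability of exactly one success among trials s..n, which is the
    winning probability of 'stop at the first success from trial s on'."""
    p = {j: success_prob(j, w1, w2) for j in range(s, n + 1)}
    return sum(
        p[j] * prod(1.0 - p[i] for i in p if i != j) for j in p
    )


f, g, stop = myopic_stopping_set(10)
print(stop)
print(threshold_win_probability(10, stop[0]))
```

In this special case the stopping set is an interval of trials ending at $n$, so the myopic rule is a threshold rule; the source stresses that this need not hold for oscillating payoffs.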
Together, these results characterise when the myopic strategy is optimal for the last-success selection problem and illustrate the theory with a worked example.