Nearly Optimal Bounds for Sample-Based Testing and Learning of $k$-Monotone Functions

Hadley Black
2024-08-20
Abstract: We study monotonicity testing of functions $f \colon \{0,1\}^d \to \{0,1\}$ using sample-based algorithms, which are only allowed to observe the value of $f$ on points drawn independently from the uniform distribution. A classic result by Bshouty-Tamon (J. ACM 1996) proved that monotone functions can be learned with $\exp(\widetilde{O}(\min\{\frac{1}{\varepsilon}\sqrt{d},d\}))$ samples and it is not hard to show that this bound extends to testing. Prior to our work the only lower bound for this problem was $\Omega(\sqrt{\exp(d)/\varepsilon})$ in the small $\varepsilon$ parameter regime, when $\varepsilon = O(d^{-3/2})$, due to Goldreich-Goldwasser-Lehman-Ron-Samorodnitsky (Combinatorica 2000). Thus, the sample complexity of monotonicity testing was wide open for $\varepsilon \gg d^{-3/2}$. We resolve this question, obtaining a nearly tight lower bound of $\exp(\Omega(\min\{\frac{1}{\varepsilon}\sqrt{d},d\}))$ for all $\varepsilon$ at most a sufficiently small constant. In fact, we prove a much more general result, showing that the sample complexity of $k$-monotonicity testing and learning for functions $f \colon \{0,1\}^d \to [r]$ is $\exp(\Omega(\min\{\frac{rk}{\varepsilon}\sqrt{d},d\}))$. For testing with one-sided error we show that the sample complexity is $\exp(\Theta(d))$. Beyond the hypercube, we prove nearly tight bounds (up to polylog factors of $d,k,r,1/\varepsilon$ in the exponent) of $\exp(\widetilde{\Theta}(\min\{\frac{rk}{\varepsilon}\sqrt{d},d\}))$ on the sample complexity of testing and learning measurable $k$-monotone functions $f \colon \mathbb{R}^d \to [r]$ under product distributions. Our upper bound improves upon the previous bound of $\exp(\widetilde{O}(\min\{\frac{k}{\varepsilon^2}\sqrt{d},d\}))$ by Harms-Yoshida (ICALP 2022) for Boolean functions ($r=2$).
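To make the two regimes of the bound concrete, the following case split may help. It only restates the $\min$ in the exponent; constants hidden by $\Omega(\cdot)$ are ignored, and the numerical values of $d$ and $\varepsilon$ below are illustrative assumptions, not figures from the paper.

$$
\min\left\{\frac{rk}{\varepsilon}\sqrt{d},\, d\right\} =
\begin{cases}
\dfrac{rk}{\varepsilon}\sqrt{d} & \text{if } \varepsilon \ge rk/\sqrt{d},\\[6pt]
d & \text{if } \varepsilon \le rk/\sqrt{d}.
\end{cases}
$$

For instance, in the monotone Boolean case with the exponent written as $\frac{1}{\varepsilon}\sqrt{d}$, taking $d = 10^4$ and $\varepsilon = 0.1$ gives $\frac{1}{\varepsilon}\sqrt{d} = 10 \cdot 100 = 10^3 < d$, so the first term determines the exponent. Once $\varepsilon \le 1/\sqrt{d} = 0.01$, the exponent becomes $d$.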
Data Structures and Algorithms
What problem does this paper attempt to address?
This paper studies sample-based testing and learning of monotone functions and their generalization, \(k\)-monotone functions. It focuses on the following points:

1. **Lower bound on sample complexity**:
   - The paper addresses the sample complexity of monotonicity testing when the error parameter satisfies \(\epsilon \gg d^{-3/2}\). Previously, the only known lower bound applied when \(\epsilon = O(d^{-3/2})\), so the sample complexity for larger \(\epsilon\) was wide open.
   - The author proves that for functions \(f:\{0, 1\}^d\rightarrow [r]\), the sample complexity of \(k\)-monotonicity testing and learning is at least \(\exp\left(\Omega\left(\min\left\{\frac{rk}{\epsilon}\sqrt{d}, d\right\}\right)\right)\). This holds not only for Boolean functions (i.e., \(r = 2\)) but also for functions with an arbitrary range size \(r\).
2. **Upper bound on sample complexity**:
   - The paper provides an upper bound that nearly matches the lower bound: \(\exp\left(\widetilde{O}\left(\min\left\{\frac{rk}{\epsilon}\sqrt{d}, d\right\}\right)\right)\), where \(\widetilde{O}\) hides polylogarithmic factors of \(d, k, r, 1/\epsilon\) in the exponent.
   - The bound for testing follows by applying the learning algorithm to the testing problem (i.e., a reduction from testing to learning).
3. **Testing and learning in continuous product spaces**:
   - The paper also studies measurable \(k\)-monotone functions \(f:\mathbb{R}^d\rightarrow [r]\). The author proves that under product distributions, the sample complexity is at most \(\exp\left(\widetilde{O}\left(\min\left\{\frac{rk}{\epsilon}\sqrt{d}, d\right\}\right)\right)\).
   - This nearly matches the lower bound on the hypercube, differing only by polylogarithmic factors in the exponent.
4. **One-sided error testing**:
   - The paper also determines the sample complexity of testing with one-sided error. For monotonicity testing, when the error parameter \(\epsilon\) is small, one-sided error testing requires \(\exp(\Theta(d))\) samples, in contrast to the \(\exp\left(\widetilde{\Theta}\left(\min\left\{\frac{1}{\epsilon}\sqrt{d}, d\right\}\right)\right)\) samples needed for two-sided error testing.

### Main contributions

- **Lower bound results**: The paper gives the first lower bound on sample complexity for \(\epsilon \gg d^{-3/2}\), closing the gap left by earlier work.
- **Upper bound results**: The paper provides an upper bound that nearly matches the lower bound and improves on the previous bound of \(\exp\left(\widetilde{O}\left(\min\left\{\frac{k}{\epsilon^2}\sqrt{d}, d\right\}\right)\right)\) by Harms-Yoshida for Boolean functions.
- **Continuous product spaces**: The paper extends the results to measurable functions on \(\mathbb{R}^d\) under product distributions, showing that the bounds hold beyond the hypercube.
- **One-sided error testing**: The paper shows a significant difference in sample complexity between one-sided and two-sided error testing.

### Conclusion

Through rigorous mathematical analysis, this paper gives a nearly complete picture of the sample complexity of testing and learning monotone functions and their \(k\)-monotone generalization. The results are of theoretical significance and may also inform the design of sample-based algorithms in practice.
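As a minimal illustration of the sample-based model with one-sided error, the sketch below draws uniform samples from the hypercube and rejects only if the labeled samples contain a pair that directly violates monotonicity. This is not the paper's algorithm or proof technique. The function names (`sample_points`, `dominated`, `one_sided_sample_tester`) and the parameter `m` (number of samples) are hypothetical and chosen only for illustration.

```python
import itertools
import random


def sample_points(d, m, rng):
    """Draw m points independently and uniformly from {0,1}^d."""
    return [tuple(rng.randint(0, 1) for _ in range(d)) for _ in range(m)]


def dominated(x, y):
    """Return True if x <= y coordinatewise."""
    return all(a <= b for a, b in zip(x, y))


def one_sided_sample_tester(f, d, m, seed=0):
    """Accept unless the labeled samples contain a monotonicity violation.

    A violation is a pair x <= y (coordinatewise) with f(x) > f(y).
    A monotone f has no such pair, so it is always accepted.
    """
    rng = random.Random(seed)
    labeled = [(x, f(x)) for x in sample_points(d, m, rng)]
    for (x, fx), (y, fy) in itertools.permutations(labeled, 2):
        if dominated(x, y) and fx > fy:
            return False  # witnessed a violating pair: reject
    return True  # no violation seen: accept


if __name__ == "__main__":
    majority = lambda x: int(2 * sum(x) > len(x))  # monotone
    anti_dictator = lambda x: 1 - x[0]             # far from monotone
    print(one_sided_sample_tester(majority, d=5, m=200))
    print(one_sided_sample_tester(anti_dictator, d=3, m=500))
```

One intuition this sketch offers: two independent uniform points \(x, y\) satisfy \(x \le y\) coordinatewise with probability \((3/4)^d\), so in high dimension a tester of this kind rarely sees any comparable pair unless the number of samples is exponential in \(d\). That is consistent with the \(\exp(\Theta(d))\) bound for one-sided error stated above, though it is not a substitute for the paper's argument.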