Handling Varied Objectives by Online Decision Making

Lanjihong Ma,Zhen-Yu Zhang,Yao-Xiang Ding,Zhi-Hua Zhou
DOI: https://doi.org/10.1145/3637528.3671812
2024-01-01
Abstract:Conventional machine learning typically assume a fixed learning objective throughout the learning process. However, for real-world tasks in open and dynamic environments, objectives can change frequently. For example, in autonomous driving, a car has several default modes, but a user's concern for speed and fuel consumption varies depending on road conditions and personal needs. We formulate this problem as learning with varied objectives (LVO), where the goal is to optimize a dynamic weighted combination of multiple sub-objectives by sequentially selecting actions that incur different losses on these sub-objectives. We propose the VaRons algorithm, which estimates the action-wise performance on each sub-objective and adaptively selects decisions according to the dynamic requirements on different sub-objectives. Further, we extend our approach to cases involving contextual representations and propose the ConVaRons algorithm, assuming parameterized linear structure that links contextual features to the main objective. Both the VaRons and ConVaRons are provably minimax optimal with respect to the time horizon T, with ConVaRons showing better dependency with the number of sub-objectives K. Experiments on dynamic classifier and real-world cluster service allocation tasks validate the effectiveness of our methods and support our theoretical findings.
What problem does this paper attempt to address?