Abstract: Devising a dynamic pricing policy with always-valid online statistical learning procedures is an important and as yet unresolved problem. Most existing dynamic pricing policies, which focus on the faithfulness of adopted customer choice models, exhibit a limited capability for adapting to the online uncertainty of learned statistical models during the pricing process. In this article, we propose a novel approach for designing a dynamic pricing policy based on regularized online statistical learning with theoretical guarantees. The new approach overcomes the challenge of continuous monitoring of the online Lasso procedure and possesses several appealing properties. In particular, we make the decisive observation that the always-validity of pricing decisions builds and thrives on the online regularization scheme. Our proposed online regularization scheme equips the proposed optimistic online regularized maximum likelihood pricing (OORMLP) policy with three major advantages: it encodes market noise knowledge into pricing process optimism; it empowers online statistical learning with always-validity over all decision points; and it envelops the prediction error process with time-uniform non-asymptotic oracle inequalities. This type of non-asymptotic inference result allows us to design more sample-efficient and robust dynamic pricing algorithms in practice. In theory, the proposed OORMLP algorithm exploits the sparsity structure of high-dimensional models and secures regret that is logarithmic in the decision horizon. These theoretical advances are made possible by proposing an optimistic online Lasso procedure that resolves dynamic pricing problems at the process level, based on a novel use of non-asymptotic martingale concentration. In experiments, we evaluate OORMLP in different synthetic and real pricing problem settings and demonstrate that OORMLP improves on state-of-the-art methods. Supplementary materials for this article are available online.

Dynamic Pricing in High-Dimensions

Online Learning and Pricing for Multiple Products with Reference Price Effects

Perishability of Data: Dynamic Pricing under Varying-Coefficient Models

Logarithmic Regret in Feature-based Dynamic Pricing

High-Dimensional Dynamic Pricing under Non-Stationarity: Learning and Earning with Change-Point Detection

Low-Rank Bandit Methods for High-Dimensional Dynamic Pricing

Contextual Dynamic Pricing: Algorithms, Optimality, and Local Differential Privacy Constraints

Policy Optimization Using Semi-parametric Models for Dynamic Pricing

Online Regularization toward Always-Valid High-Dimensional Dynamic Pricing

On Dynamic Pricing with Covariates

Dynamic Pricing and Demand Learning on a Large Network of Products: A PAC-Bayesian Approach

Dynamic Pricing and Advertising with Demand Learning

Dynamic Pricing with Demand Covariates

Contextual Dynamic Pricing with Strategic Buyers

Dynamic Pricing with External Information and Inventory Constraint

Dynamic Pricing and Learning with Long-term Reference Effects

Multi-Task Dynamic Pricing in Credit Market with Contextual Information

Dynamic pricing under nested logit demand

Fairness-aware Online Price Discrimination with Nonparametric Demand Models

Dealing with the dimensionality curse in dynamic pricing competition: Using frequent repricing to compensate imperfect market anticipations