Abstract: This technical note introduces business students to modeling discrete choice (e.g., a consumer purchasing brand A versus brand B) using logistic regression and maximum-likelihood estimation. It draws an analogy between modeling discrete choice and building a regression model with a dummy dependent variable, and it uses an example to illustrate the need to estimate the probability of a choice rather than the choice itself, which leads to a special kind of regression: logistic regression. The note presents the concepts of utility and of a random utility choice model, of which the logistic regression model is the most commonly used. It shows how choice probabilities can be constructed from utilities, leading to the logit model. It then presents the maximum-likelihood estimation (MLE) method of fitting the logit model to choice data. Working through a detailed example using Solver and an accompanying spreadsheet model, the note gives students a deep understanding of how MLE works and of how it resembles and differs from standard least-squares estimation in linear regression. The note concludes by presenting estimation results from StatTools, a commercial statistical software package. The note avoids heavy mathematical machinery but still requires rudimentary knowledge of exponential and logarithmic functions, probability, and optimization with Solver, as well as familiarity with “standard” linear regression. Applications include building models of consumer choice, estimating price elasticity, price optimization, product versioning, product line design, and conjoint analysis.

Excerpt

UVA-QA-0779
Nov. 7, 2011

MODELING DISCRETE CHOICE: CATEGORICAL DEPENDENT VARIABLES, LOGISTIC REGRESSION, AND MAXIMUM LIKELIHOOD ESTIMATION

Consider an individual choosing between two or more discrete alternatives: a shopper in a grocery store deciding between apple and orange juice, or a prospective student determining which of several university offers he ought to accept. For the juice manufacturer and the university, the ability to predict the outcome of such choices is of vital importance. In this note, we will discuss how this might be done. The process we will follow bears some similarity to a regular linear regression but also differs substantially, primarily because the choices are discrete; that is, they correspond to a categorical dependent variable in regression.

The Concept of Utility

. . .
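
The path the abstract describes, from utilities to choice probabilities to maximum-likelihood estimation, can be sketched in a few lines of Python. This is only an illustrative sketch, not the note's spreadsheet model: the data are invented, the names `price_gap`, `chose_a`, `prob_a`, `log_likelihood`, `gradient`, and `fit` are hypothetical, and plain gradient ascent stands in for Solver, which uses its own optimization routine. The sketch assumes a binary logit in which the utility difference between brands A and B is V = b0 + b1*x, so that P(A) = exp(V) / (1 + exp(V)).

```python
import math

# Hypothetical data, invented purely for illustration (not from the note):
# price_gap = price of brand B minus price of brand A
# chose_a   = 1 if the shopper bought brand A, 0 if brand B
price_gap = [-1.5, -1.0, -0.5, 0.0, 0.25, 0.5, 1.0, 1.5]
chose_a = [0, 0, 0, 1, 1, 0, 1, 1]


def prob_a(b0, b1, x):
    """Logit choice probability P(A) for utility difference V = b0 + b1*x."""
    v = b0 + b1 * x
    return 1.0 / (1.0 + math.exp(-v))  # same as exp(v) / (1 + exp(v))


def log_likelihood(b0, b1):
    """Sum of the log-probabilities of the choices actually observed."""
    total = 0.0
    for x, y in zip(price_gap, chose_a):
        p = prob_a(b0, b1, x)
        total += math.log(p) if y == 1 else math.log(1.0 - p)
    return total


def gradient(b0, b1):
    """Partial derivatives of the log-likelihood with respect to b0 and b1."""
    g0 = g1 = 0.0
    for x, y in zip(price_gap, chose_a):
        residual = y - prob_a(b0, b1, x)
        g0 += residual
        g1 += residual * x
    return g0, g1


def fit(step=0.1, iters=5000):
    """Maximize the log-likelihood by gradient ascent, starting from (0, 0)."""
    b0 = b1 = 0.0
    for _ in range(iters):
        g0, g1 = gradient(b0, b1)
        b0 += step * g0
        b1 += step * g1
    return b0, b1


b0_hat, b1_hat = fit()
print("estimated coefficients:", b0_hat, b1_hat)
print("log-likelihood at estimate:", log_likelihood(b0_hat, b1_hat))
print("P(A) when prices are equal:", prob_a(b0_hat, b1_hat, 0.0))
```

Unlike least squares, which minimizes squared prediction errors, the fit here chooses the coefficients that make the observed choices as probable as possible; the quantity being maximized is the log-likelihood rather than a sum of squared residuals.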

A Discrete Choice Model for Subset Selection

Modeling Discrete Choice: Categorical Dependent Variables, Logistic Regression, and Maximum Likelihood Estimation

A model of discrete choice based on reinforcement learning under short-term memory

Choice Set Optimization Under Discrete Choice Models of Group Decisions

Assortment Optimization under the Multi-Purchase Multinomial Logit Choice Model

A survey of assortment optimization problems under logit-based discrete choice models

Subset selection for multiple linear regression via optimization

Cost-Sensitive Best Subset Selection for Logistic Regression: A Mixed-Integer Conic Optimization Perspective

Sparse Choice Models

Best-item Learning in Random Utility Models with Subset Choices

Identification and Estimation of Discrete Choice Models with Unobserved Choice Sets

Estimation of Discrete Choice Models: A Machine Learning Approach

Multi-Model Subset Selection

Near-Optimal Policies for Dynamic Multinomial Logit Assortment Selection Models

On the Estimation of Discrete Choice Models to Capture Irrational Customer Behaviors

Price Discounts and Personalized Product Assortments under Multinomial Logit Choice Model: A Robust Approach

Scalable Estimation of Multinomial Response Models with Random Consideration Sets

Discrete Choice under Risk with Limited Consideration

A mathematical programming approach for integrated multiple linear regression subset selection and validation

Revealed Preference at Scale: Learning Personalized Preferences from Assortment Choices

Heterogeneous Choice Sets and Preferences