A PREDICTIVE TEST OF THE NBD MODEL THAT CONTROLS FOR NON-STATIONARITY
Vikas Tibrewala, Bruce Buchanan
Abstract: In this paper a technique is proposed that largely corrects for the effects of non-stationarity in purchase incidence studies that use the Negative Binomial Distribution model. With non-stationarity controlled in this way, it is possible to assess the validity of the NBD model's remaining assumptions of Poisson purchasing at the individual level and a gamma mixing of Poisson rates across the population. A test of the predictive fit of the NBD model is also derived. Using this predictive test and the proposed technique, the authors show how to diagnose some failures of the NBD assumptions. The effects of the technique on the results of conditional trend analyses are explored. The technique is illustrated using simulations and then applied to consumer panel data for seven product classes.
The Negative Binomial Distribution (NBD) model was introduced to marketing by Ehrenberg (1959) as a way of describing the pattern of consumer purchases. Almost thirty years later, the model continues to be of interest, as is evidenced by its appearance in recent articles by Morrison and Schmittlein (1981, 1988), Dunn, Reader, and Wrigley (1983) and Wagner and Taudes (1986). It has also been used as a component in composite models of brand choice and purchase timing, including those of Bass, Jeuland and Wright (1980), Zufryden (1978, 1981) and Schmittlein, Bemmaor, and Morrison (1985). Thus, perhaps because of its relative simplicity and ease of estimation, the NBD enjoys a continued presence in the marketing research literature.
Goodhardt and Ehrenberg (1967) use the NBD to perform "conditional trend analysis." Here buyers are first segmented by their observed number of purchases in a predictor period; their expected purchase frequencies are then predicted for a subsequent criterion period.
As discussed by Morrison and Schmittlein (1988), the NBD assumes that purchase frequencies are in part random, so these conditional predictions include a measure of regression to the mean. Conditional trend analysis thus allows a researcher to make baseline predictions for both heavy and light buyers, and these predictions are corrected for an assumed randomness in the purchase process. Deviations from these baseline predictions might then be interpreted as the results of specific marketing efforts. In this way the differential effects of a marketing activity on heavy and light buyers can be examined.
To use the NBD we must make three assumptions. The first is that each individual purchases in a Poisson fashion with unobservable rate λ. The second is that the dispersion in these individual rates across the population, f(λ), is accurately described by the gamma distribution; though fairly flexible, the gamma is constrained to being unimodal. The last assumption is stationarity, namely, that each individual retains the same purchase rate in the criterion period that he or she held in the predictor period. The first two assumptions have drawn much comment and some criticism, and various generalizations of them have been proposed. With the exception of a fine review by Morrison and Schmittlein (1988), however, the stationarity assumption has received little attention in the literature.
In this paper, we examine the stationarity assumption and its implications for assessing the performance of the NBD as a predictive model of consumer purchase incidence. There is good reason for doing this. Marketing activities are usually designed to accelerate consumers' purchases, that is, to induce non-stationarities. These non-stationarities produce deviations between predicted and actual conditional purchase frequencies, causing the NBD predictions to fail. But any non-stationarity, not just those induced by a marketing activity, will cause the NBD predictions to fail.
And, given the usual level of promotional activity, non-stationarities are probably present in most data sets. Indeed Sabavala (1988) recently noted that "There is no stable period!" on which to validate NBD predictions. How, then, are we to test the NBD so that we can put some faith in its conditional predictions?
Rather than search for a stable period, our approach is to control for possible non-stationarities in the data so that we can test the remaining Poisson and gamma assumptions. If the NBD performs well in this context, then we can ascribe its failure in a non-controlled context to non-stationarity. Of course, such non-stationarity may or may not be the result of a particular marketing activity. That judgment we leave to the researcher.
Our procedure consists of two parts:
i) We develop the sampling properties of NBD conditional predictions under conditions of stationarity. These allow us to perform a formal statistical test of the NBD predictions, as was suggested by Morrison and Schmittlein (1981).
ii) We devise a data manipulation that largely corrects for the effects of non-stationarity in conditional trend analyses. This allows us to create data sets that appear to have come from stationary markets.
With non-stationarity controlled for, we can use the result from part (i) to test the remaining Poisson and gamma assumptions. But, as we will show, our technique does more than just control for non-stationarities. We can use it to identify which of the three NBD assumptions is being violated, if indeed any one of them is. Thus we can use it to make a more informed choice between the simple NBD and more complex models like the Condensed NBD of Chatfield and Goodhardt (1973). In this sense our technique builds on the analytical results presented by Morrison and Schmittlein (1988).
The rest of this paper is organized as follows: First, we discuss more fully the NBD assumptions and their implications.
Next, we present our method to control for the effects of non-stationarity. In essence, this method uses the memoryless property of the Poisson purchase assumption to average individual non-stationarities across the sample. We illustrate the method on four simulated data sets, then apply it to consumer panel data from seven product classes. We are able to show which of these data sets do, and do not, meet the NBD assumptions. Discussion and conclusions follow.
2. The NBD: Assumptions and Issues
The requisite assumptions of the NBD model are well known, but a brief review here will enhance the reader's appreciation of the discussion to follow.
Poisson Purchases
For a Poisson individual with unobservable rate λ, the probability of a purchase occurring within any very small interval of time, dt, is simply the product λ dt. Further, the behavior of the process on any one such interval of time is independent of that on any other, so what happens in the prior instant does not affect the present. In this sense the process is memoryless, as discussed by Meyer (1970, pp 165-66) and Heyman and Sobel (1982, p 511). We can characterize the observable results of a Poisson purchase process in three ways: (i) by the distribution of purchase frequencies for a unit time, (ii) by the density function of interpurchase times, and (iii) by the distribution of a fixed number of purchases across any given interval. The Poisson process implies certain properties for each. To begin, the distribution of observed purchase frequencies in any given unit time follows the Poisson distribution
$P(x; \lambda) = e^{-\lambda} \lambda^{x} / x!$, (1)
where λ is the rate of purchase and x is the observed frequency. The mean of this distribution equals λ, as does the variance, so purchases are highly irregular. This irregularity is also evident in the density of interpurchase times implied by the Poisson process, which is the exponential
$f(t; \lambda) = \lambda e^{-\lambda t}$, (2)
where t is the interpurchase time.
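The first two characterizations can be checked numerically. The following is a minimal sketch, not part of the original analysis: it simulates a single Poisson purchaser by accumulating exponential interpurchase times, then compares the mean and variance of the unit-time purchase counts with the rate. The rate value, the number of simulated periods, the seed, and the function name `simulate_purchase_times` are all illustrative assumptions.

```python
import random
import statistics


def simulate_purchase_times(rate, horizon, rng):
    """Purchase times on [0, horizon) for a Poisson process with the given rate.

    Times are built by accumulating exponential interpurchase gaps.
    """
    times = []
    t = rng.expovariate(rate)
    while t < horizon:
        times.append(t)
        t += rng.expovariate(rate)
    return times


rng = random.Random(0)   # fixed seed so the sketch is reproducible
rate = 2.0               # hypothetical purchase rate (purchases per unit time)
n_periods = 20000        # hypothetical number of simulated unit-time periods

# Purchase frequency in each unit-time period.
counts = [len(simulate_purchase_times(rate, 1.0, rng)) for _ in range(n_periods)]
mean_count = statistics.fmean(counts)
var_count = statistics.pvariance(counts)

# Independent draws of the interpurchase time.
gaps = [rng.expovariate(rate) for _ in range(n_periods)]
mean_gap = statistics.fmean(gaps)

print(f"mean count   = {mean_count:.3f} (rate = {rate})")
print(f"var of count = {var_count:.3f} (rate = {rate})")
print(f"mean gap     = {mean_gap:.3f} (1/rate = {1 / rate})")
```

Under these assumptions the sample mean and variance of the counts should both lie close to the rate, and the mean gap close to its reciprocal, which is the mean of the exponential interpurchase-time density in equation (2).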
This density is memoryless: for a given rate, the timing of future purchases is independent of those in the past. Thus it implies no purchase regularity. Also, its mode is at zero, implying that the most likely time for the next purchase, given that one has just occurred, is the following instant. Finally, because the probability of a Poisson event on any very small interval equals λ dt, the distribution of a fixed number of Poisson purchases across an observation period can be viewed as a number of iid draws from a uniform distribution. This property is discussed more fully by Parzen (1962, pp 140-41) and by Schmittlein and Morrison (1985).
The major criticism of the Poisson assumption is that it implies too little regularity in purchase timing (Herniter 1971). If people consume goods in a regular fashion, then we would expect them to purchase goods in a regular fashion as well. With this in mind, Chatfield and Goodhardt (1973) proposed the "condensed Poisson model," where the density of individual interpurchase times is Erlang-2, which has a mode greater than zero and implies some purchase regularity. In many situations, the condensed Poisson process would seem to be quite plausible. However, neither Chatfield and Goodhardt (1973) nor Morrison and Schmittlein (1981) found it to yield consistently better fit or predictions than the Poisson model. For more discussion on this point see Morrison and Schmittlein (1988).
Gamma Heterogeneity
The heterogeneity in rates across the population is assumed to be described by the gamma density
$f(\lambda; r, \alpha) = \frac{1}{\Gamma(r)} \alpha^{r} \lambda^{r-1} e^{-\alpha \lambda}$, (3)
where r and α are known as the shape and scale parameters, respectively. Because the gamma density is unimodal, there are certain forms of heterogeneity that it will not capture very well. Accordingly, generalizations of this mixing model have been proposed. Morrison (1969) combined a gamma mixing density with a "spike" at zero to describe the presence of hard-core non-buyers in the sample.
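As an aside on the basic gamma mixing assumption, the sketch below draws each household's rate from a gamma density with shape r and parameter α entering as in equation (3), then generates one unit-time Poisson count per household. It is an illustration under assumed values, not the authors' procedure: r = 2, α = 1, the sample size, the seed, and the helper name `poisson_count` are all hypothetical. The comparison values are the standard NBD moments, mean r/α and variance r/α + r/α², and the NBD probability of zero purchases, (α/(α+1))^r.

```python
import random
import statistics


def poisson_count(rate, rng):
    """Number of Poisson events in one unit of time, via exponential gaps."""
    count = 0
    t = rng.expovariate(rate)
    while t < 1.0:
        count += 1
        t += rng.expovariate(rate)
    return count


rng = random.Random(1)   # fixed seed so the sketch is reproducible
r, alpha = 2.0, 1.0      # hypothetical gamma parameters
n_households = 20000     # hypothetical sample size

# random.gammavariate takes (shape, scale); alpha acts as an inverse scale in
# the density alpha^r * lam^(r-1) * exp(-alpha*lam) / Gamma(r), so pass 1/alpha.
rates = [rng.gammavariate(r, 1.0 / alpha) for _ in range(n_households)]
counts = [poisson_count(lam, rng) for lam in rates]

mean_count = statistics.fmean(counts)
var_count = statistics.pvariance(counts)
share_zero = counts.count(0) / n_households

nbd_mean = r / alpha
nbd_var = r / alpha + r / alpha ** 2
nbd_p0 = (alpha / (alpha + 1.0)) ** r

print(f"mean: simulated {mean_count:.3f}, NBD {nbd_mean:.3f}")
print(f"variance: simulated {var_count:.3f}, NBD {nbd_var:.3f}")
print(f"P(0): simulated {share_zero:.3f}, NBD {nbd_p0:.3f}")
```

The variance of the mixed counts exceeds the mean, in contrast to the single-rate Poisson case, because the gamma dispersion in rates is added to the Poisson variation within each household.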
Robbins (1977) proposed an empirical Bayes estimator that allows for an arbitrary mixing of Poisson processes. A recent analysis of Navy recruiter productivity by Carroll, Lee, and Rao (1986) applies both of these generalizations.
NBD Model
For a fixed t