THE LYDIA E. PINKHAM MEDICINE COMPANY
The Lydia E. Pinkham Medicine Company was a family-owned concern whose income was derived largely from the sale of Lydia Pinkham's Vegetable Compound. Perhaps students today could use some of the compound to relieve stress; unfortunately, it is no longer sold. Lydia Pinkham's picture was on the label, and the compound was marketed to women. Ads for the compound included this invitation: "write freely and fully to Mrs. Pinkham, at Lynn, Mass., and secure the advice which she offers free of charge to all women. This is the advice that has brought sunshine into many homes which nervousness and irritability had nearly wrecked." In fact, the company ensured that a female employee answered every letter. Women did write to Mrs. Pinkham. Their claims included this one: "Without [Lydia Pinkham's Vegetable Compound] I would by this time have been dead or, worse, insane.... I had given up on myself; as I had tried so many things, I believed nothing would ever do me any good. But, thanks to your medicine, I am now well and strong; in fact, another person entirely." This testimonial and others were reproduced in print ads for the compound.
The unique nature of the company (one dominant product that accounted for most of the company's sales, no sales staff, and a large proportion of sales revenues invested in advertising) and the availability of data on both sales and advertising led the Committee on Price Determination of the National Bureau of Economic Research (NBER) in 1943 to recommend that the data be subjected to thorough analysis. The research was not undertaken for several years (Palda 1964). Analysts have studied the data using causal models that include the advertising data and other economic variables (similar to those presented in Chapter 8). However, several researchers have suggested that Box-Jenkins approaches using only the sales data provide comparable, or even superior, predictions when compared with the causal approaches (see, e.g., Kyle 1978). The sales data are appealing to study for two reasons:
1. The product itself was unchanged for the span of the data; that is, there are no shifts in the series caused by changes in the product.
2. There was no change in the sales force over the span of the data, and the proportion of revenues spent on advertising was fairly constant. Thus, there are no shifts in the data caused by special promotions or other marketing phenomena.
Typically, actual data are not this "clean" in terms of product and marketing continuity.
The task at hand, then, is to determine which Box-Jenkins (ARIMA) model is the "best" for these data. The model will be developed using the 1907-1948 data and tested using the 1949-1960 data shown in Table 9-18.
MODEL IDENTIFICATION
A computer program capable of ARIMA modeling was used to examine the data for 1907 through 1948; the data for 1949 through 1960 were held back to examine the forecasting ability of the selected model. Preliminary tests suggest that the data are stationary (that is, there is no apparent trend), so differencing is not employed. After examining the autocorrelations and partial autocorrelations, it was determined that an AR model was most appropriate for the data. (The autocorrelations (ACF) and partial autocorrelations (PACF) for 10 periods are given in Table 9-19.) The autocorrelations and partial autocorrelations seemed to be consistent with those of an AR(2) process. To verify the order p of the AR component, Akaike's information criterion (AIC) (see Equation 9.7) was used with autoregressive models of orders p = 1, 2, and 3. The AIC leads to the choice of an AR(2) model for the Lydia Pinkham data.
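The identification step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the program used in the case: the series below is simulated (it is NOT the Lydia Pinkham data from Table 9-18), the AR models are fit by ordinary least squares rather than by a Box-Jenkins package, and the AIC is written in one common form, ln(residual variance) + 2r/n, which may differ in detail from Equation 9.7. All function names are hypothetical.

```python
import math
import random


def sample_acf(series, lag):
    """Sample autocorrelation r_k of a series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    c0 = sum((y - mean) ** 2 for y in series)
    ck = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return ck / c0


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_ar(series, p, first=None):
    """Least-squares fit of Y_t = c + phi_1 Y_{t-1} + ... + phi_p Y_{t-p}.

    Observations from index `first` onward are used as targets
    (default: p). Returns ([c, phi_1, ..., phi_p], residuals).
    """
    first = p if first is None else first
    rows = [[1.0] + [series[t - j] for j in range(1, p + 1)]
            for t in range(first, len(series))]
    targets = series[first:]
    k = p + 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    coefs = solve_linear(xtx, xty)
    residuals = [y - sum(c * x for c, x in zip(coefs, r))
                 for r, y in zip(rows, targets)]
    return coefs, residuals


def aic(residuals, n_params):
    """Assumed AIC form: ln(residual variance) + 2 * n_params / n."""
    n = len(residuals)
    sigma2 = sum(e * e for e in residuals) / n
    return math.log(sigma2) + 2.0 * n_params / n


# Simulated stand-in series (hypothetical AR(2) coefficients, not real sales).
random.seed(0)
sales = [1000.0, 1000.0]
for _ in range(60):
    sales.append(200.0 + 1.2 * sales[-1] - 0.4 * sales[-2] + random.gauss(0.0, 50.0))

print("ACF, lags 1-5:", [round(sample_acf(sales, k), 2) for k in range(1, 6)])
for p in (1, 2, 3):
    # Use the same target observations for every order so the AICs are comparable.
    coefs, resid = fit_ar(sales, p, first=3)
    print("AR(%d) AIC = %.3f" % (p, aic(resid, p + 1)))
```

The order with the smallest AIC would be the one selected. Fitting every candidate on the same target observations is a design choice made here so that the criteria are compared on an equal footing.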
TABLE 9-18 Lydia E. Pinkham Medicine Data
TABLE 9-19
MODEL ESTIMATION AND TESTING OF MODEL ADEQUACY
A computer program was used to estimate the parameters (including a constant term) of the AR(2) model using the data for 1907 through 1948. Using the estimated parameters, the resulting model is
where the numbers in parentheses beneath the autoregressive coefficients are their estimated standard deviations. Each autoregressive coefficient is significantly different from zero for any reasonable significance level. The residual autocorrelations were all small, and each was within its 95% error limits. The Ljung-Box chi-square statistics had p-values of .63, .21, and .64 for groups of lags m = 12, 24, and 36, respectively. The AR(2) model appears to be adequate for the Lydia Pinkham sales data.
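The adequacy checks described above can be illustrated with a short sketch. It assumes the usual Ljung-Box form Q = n(n + 2) Σ r_k² / (n − k), summed over lags k = 1, ..., m, and uses ±2/√n as a rough stand-in for the 95% error limits; a statistics package may compute the limits differently. The residuals are simulated, not those of the fitted Lydia Pinkham model, and the function names are hypothetical.

```python
import math
import random


def residual_acf(residuals, max_lag):
    """Residual autocorrelations r_1, ..., r_max_lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    c0 = sum((e - mean) ** 2 for e in residuals)
    return [
        sum((residuals[t] - mean) * (residuals[t - k] - mean) for t in range(k, n)) / c0
        for k in range(1, max_lag + 1)
    ]


def ljung_box_q(residuals, m):
    """Ljung-Box statistic Q = n(n+2) * sum_{k=1..m} r_k^2 / (n - k)."""
    n = len(residuals)
    r = residual_acf(residuals, m)
    return n * (n + 2) * sum(r[k - 1] ** 2 / (n - k) for k in range(1, m + 1))


def lags_outside_limits(residuals, max_lag):
    """Lags whose autocorrelation exceeds the rough +/- 2/sqrt(n) band."""
    limit = 2.0 / math.sqrt(len(residuals))
    r = residual_acf(residuals, max_lag)
    return [k for k, rk in enumerate(r, start=1) if abs(rk) > limit]


# Simulated white-noise residuals (hypothetical, for illustration only).
random.seed(1)
resid = [random.gauss(0.0, 1.0) for _ in range(40)]
print("Q(12) =", round(ljung_box_q(resid, 12), 2))
print("lags outside limits:", lags_outside_limits(resid, 12))
```

Q is commonly referred to a chi-square distribution with m minus the number of estimated parameters as its degrees of freedom; a large p-value, like those reported above, gives no evidence against the model.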
FORECASTING WITH THE MODEL
The final step in the analysis of the AR(2) model for these data is to forecast the values for 1949 through 1960 one period ahead. (For example, data through 1958 are used in computing the forecast for 1959.) The forecasting equation is
and the one-step-ahead forecasts and the forecast errors are shown in Table 9-20.
TABLE 9-20
In addition to the one-step-ahead forecasts, some accuracy measures were computed. The forecasts from the AR(2) model have a mean absolute percentage error (MAPE) of 6.9% and a mean absolute deviation (MAD) of $112 (thousands of current dollars). These figures compare favorably to the accuracy measures from the causal models developed by other researchers.
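As a sketch of how one-step-ahead forecasts and the two accuracy measures are computed, the following Python fragment may help. The constant and AR coefficients are placeholders chosen for easy arithmetic, not the estimates from this case, and the four data values are invented. Each forecast uses the two preceding actual values, matching the one-period-ahead scheme described above.

```python
def one_step_forecasts(series, const, phi1, phi2, first):
    """Forecast series[t] from the two preceding ACTUAL values, for t >= first."""
    return [const + phi1 * series[t - 1] + phi2 * series[t - 2]
            for t in range(first, len(series))]


def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast)) / len(actual)


def mad(actual, forecast):
    """Mean absolute deviation, in the units of the data."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


# Invented values and placeholder coefficients (not the case's data or estimates).
sales = [100.0, 110.0, 124.0, 125.0]
forecasts = one_step_forecasts(sales, const=5.0, phi1=1.5, phi2=-0.5, first=2)
holdout = sales[2:]

print("forecasts:", forecasts)
print("errors:   ", [a - f for a, f in zip(holdout, forecasts)])
print("MAPE (%): ", round(mape(holdout, forecasts), 2))
print("MAD:      ", mad(holdout, forecasts))
```

Because MAPE is scale free while MAD is in the units of the data, reporting both, as the case does, makes it easier to compare against models fit by other researchers.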
SUMMARY AND CONCLUSIONS
A parsimonious (smallest number of parameters) AR(2) model has been fit to the Lydia Pinkham data for the years 1907 through 1948. This model has produced fairly accurate one-step-ahead forecasts for the years 1949 through 1960.
There is some evidence that the Lydia Pinkham data may be nonstationary. For example, the sample autocorrelations tend to be large (persist) for several lags. Difference the data. Construct a time series plot of the differences. Using Minitab or a similar program, fit an ARIMA(1, 1, 0) model to annual sales for 1907 to 1948. Generate one-step-ahead forecasts for the years 1949 to 1960. Which model, AR(2) or ARIMA(1,1,0), do you think is better for the Lydia Pinkham data?
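The mechanics of the differencing approach can be sketched as follows. This is only an outline under simplifying assumptions: the ARIMA(1, 1, 0) model is written without a constant as (Y_t − Y_{t−1}) = φ (Y_{t−1} − Y_{t−2}) + e_t, φ is estimated by least squares on the differences rather than by the routine in Minitab, and the numbers are invented. Function names are hypothetical.

```python
def difference(series):
    """First differences Y_t - Y_{t-1}."""
    return [series[t] - series[t - 1] for t in range(1, len(series))]


def fit_ar1_no_constant(diffs):
    """Least-squares slope of d_t on d_{t-1}, with no constant term."""
    num = sum(diffs[t] * diffs[t - 1] for t in range(1, len(diffs)))
    den = sum(d * d for d in diffs[:-1])
    return num / den


def arima110_forecast(series, phi, t):
    """One-step-ahead forecast of series[t] using actual values through t - 1."""
    return series[t - 1] + phi * (series[t - 1] - series[t - 2])


# Invented series whose differences halve each period (not the sales data).
y = [0.0, 8.0, 12.0, 14.0, 15.0]
phi_hat = fit_ar1_no_constant(difference(y))
print("differences:", difference(y))
print("phi estimate:", phi_hat)
print("forecast of next value:", arima110_forecast(y, phi_hat, len(y)))
```

Applying the same MAPE and MAD calculations to these forecasts over 1949 to 1960 would give a like-for-like comparison with the AR(2) results.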