EssaysVolume13
Time Series Forecasting Tools
Richard A. Stanford
Furman University
Greenville, SC 29613
Copyright 2024 by Richard A. Stanford
All rights reserved. No part of this work may be reproduced, stored, or transmitted by any means without written permission
of the author except for brief excerpts used in critical analyses
and reviews. Unauthorized reproduction of any part of this work
is illegal and is punishable under the copyright laws of the
United States of America.
CONTENTS
NOTE: You may click on the symbol <> at the end of any section to return to the CONTENTS.
1. Forecasting Tools
2. Criteria for Model Selection
3. Time Series Forecasting Rules
4. Moving Average Forecasting Models
5. Econometric Forecasting Techniques
6. Time Series Decomposition
7. Autocorrelation in Time Series
8. ARIMA Modeling
9. Illustration
Note: analyses described in this work may be conducted with the app RASstat that is available from rstanford@furman.edu.
<>
Decision theorists distinguish two types of control: passive and active. Active
control may be exerted if there is some possibility of altering the direction or
magnitude of some process under the firm's management. Even if such active control is
not possible under known technologies (e.g., the weather, the implementation of a
government policy, a competitor's R&D effort), there may be some possibility of exerting
passive control. Passive control is anticipatory of the likely consequence--for example
to move out of the way or to erect barriers or other means of protection from the
phenomenon, or to move to take advantage of it.
Almost all of the firm's activities, as well as most natural phenomena, are amenable to
some form of active or passive control by the firm's management. Each alternative form
and degree of control action is likely to result in a range of possible outcomes rather
than a unique result which can be known with certainty in advance of choosing from among
the possible control alternatives. Decision risk is inherent in the dispersion of the
outcome possibilities.
This chapter examines a technique for reducing the risk associated with selection
from among the various available control alternatives. The means for reducing risk is to
attain more information about the possible outcomes by estimating or forecasting both the
most likely outcome of a control action, and the range within which the outcome is likely
to occur.
Event-timing forecasts can perhaps be best approached by seeking to identify some sort of
leading indicator (e.g., the date at which a terminal illness is diagnosed).
Qualitative-outcome forecasting is best approached by seeking to add to the stock of
information upon which probability assessments may be based (e.g., almost 90 percent of
all sales personnel in our industry are females). Quantitative-magnitude forecasts may
be approached by examining series of historical data about the target phenomenon and
related matters. This chapter focuses on the latter class of questions by describing
analytical models which can analyze and extrapolate so-called time series of data.
a. Except for the nature of the tools used, there is nothing particularly "scientific"
about forecasting.
b. The computational techniques and models which are described in the ensuing chapters
are in fact only tools of analysis.
c. The exercise of judgment can never be escaped, even when following the most
formal computational approach possible.
A successful forecaster who uses computational tools has to have exercised good judgment
in selecting and adjusting the tools. In this the forecaster has utilized science in
practicing the art.
Perhaps very few real-world relationships can be represented as simply as a straight-line
model incorporating only two variables. Thus, the model may be extended to encompass
several other variables, e.g., y = a + b1x1 + b2x2 + ... + bnxn.
In these models, the dependent variable y is related (by hypothesis, or as a matter of
empirical verification) to one or more independent variables, which are represented
by the x symbols. The other symbols in the equations (a, b1, b2,
etc.) are the so-called parameters of the relationship. They specify the way in which
the dependent variable is related to the independent variable(s).
We shall organize the time-series models examined in this chapter into two classes.
Naive models are those which construct the forecast of a series by extrapolation
of the same series. In effect, earlier values of the same series constitute the
independent variable which is then used to forecast future values of the series as
dependent variable. Multivariate models are those in which the object series
is treated as a dependent variable. In multivariate models, forecasts are constructed
with reference to one or more additional data series specified as independent variables.
Whatever the source of information, the numerical data "cake" may be sliced in either
of two possible directions. First, observations of various aspects of a phenomenon or
process may be recorded at a point in time, but across the population of subjects. Data
collected in this fashion are generally referred to as cross-sectional data.
Alternately, observations of the behavior of a single entity may be recorded at various
points in time, either at regular intervals or irregularly. If the time interval between
subsequent observations is a constant, for example, a month or a year, the collection
of data may be referred to as a time series.
Although some of the models described in this chapter, in particular those estimated
by regression, may be made to address cross-sectional data, we shall take as our
primary mission the application of forecasting models to time series data where the
standard interval between observations is the month or the year.
Although the forecasting analyst might simply guess at the optimal form of the
equation of relationship and the likely values of the parameters (such as the
naive forecasting rules described below), both the equation form and the parameter
values can usually be estimated more accurately with reference to historical data
for the phenomenon. Thus, an historical data base is useful both to the specification
and to the validation phases of model construction.
If the data were collected as cross-sectional data (i.e., at a point in time but
across the various subjects in the population), then the rows of the matrix would
correspond to the included subjects. Alternately, if data were collected as time-series
data (i.e., over time but for a single subject), the rows of the matrix would represent
the succession of times at which observations were taken. The focus of this chapter is
upon time-series data, so we shall normally presume that the data matrix is dimensioned
for some number of variables horizontally, and some number of time-periods vertically.
How many rows of data should a forecasting data matrix contain? Again, there is no
single answer. The more observations available for analysis, the more reliable the
results will likely be. An often used rule of thumb is that a regression model with
one independent variable should span at least two dozen observations, and more if
additional independent variables are included.
If the series to be forecasted is thought to exhibit a seasonal pattern, the data
matrix should contain enough rows to accommodate several (three or more) years of
monthly data so that the identified seasonal pattern may be confirmed several times
over. If the object series is thought to exhibit a cyclical pattern, the matrix
should have enough rows to accommodate several such cycles. Since U.S. business cycles
over the past century have averaged around five years in duration (from peak through
trough to the next peak), the matrix may have to extend to 180 or more rows containing
monthly data to encompass three such cycles.
Irrespective of the number of rows and columns, the data matrix must be densely
populated with data to be analyzed with forecasting models. "Densely populated"
means that there can be no vacant cells in the matrix, that is, no missing
or absent data. Should some data be missing, either the missing values must be found
or the columns (variables) containing the vacant cells must be deleted from the analysis.
Alternately, the matrix may be shortened in length at either end to exclude the vacant
cells. In any case, it is strictly not legitimate to interpolate, average, "guesstimate,"
or otherwise invent data to fill the vacant cells.
For illustrative purposes, we shall analyze sequence plots of time series components with
a seven-year collection of monthly data for a series which we shall denote Y1. Data for
the Y1 series are reproduced in Table 2-1 at the end of this chapter.
Random Noise. Every time series which is not subject to administrative control can
be expected to exhibit dispersion about its mean (arithmetic average) when graphed on a
sequence plot. This dispersion may exhibit certain discernible patterns, several of which
are described below and which may be isolated from the series employing commonly available
techniques. The dispersion which remains after all discernible patterns have been isolated
and removed from a series may be described as an irregular or random noise component.
If the series contained no discernible patterns to be isolated and removed, then the
series itself may be described as a random noise series.
The technical specification of a purely random noise series is that there is no
correlation among observations within the series (i.e., autocorrelation) so that no
observation within the series can be used to forecast the values of other observations
within the same series. The only relevant forecast which can be made for future values
of a purely random noise series is the mean of the series.
Figure 2-1 illustrates the sequence plot for the Y1 raw-data series. While series Y1,
identified by the asterisk (*) symbols, obviously exhibits a great deal of random noise
behavior, there are also present within the series one or more other behavioral patterns.
Trend may be detected computationally by dividing the time series into a convenient
number of ranges, say three or four, and computing the means for each of the ranges.
Trend is present if the means for the successive ranges consistently increase (or
decrease).
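A minimal Python sketch of this range-means check (the function name and defaults are illustrative only, not a RASstat routine):

    def range_means(series, n_ranges=4):
        """Split the series into n_ranges consecutive ranges and return the mean of each."""
        size = len(series) // n_ranges
        means = []
        for r in range(n_ranges):
            start = r * size
            end = start + size if r < n_ranges - 1 else len(series)
            chunk = series[start:end]
            means.append(sum(chunk) / len(chunk))
        return means

    # Consistently rising (or falling) range means suggest the presence of trend;
    # means that rise and fall suggest cyclical behavior (discussed below).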
Trend in economic or business series may be attributed to a growth or contraction
process, or to inflation or deflation if the series consists of money-value data
which have not been deflated to eliminate the effects of change in the purchasing
power of the unit of currency.
Cyclical Behavior. The term "cycle" perhaps suggests greater regularity or
periodicity than should be expected in economic or business time series. Given the
irregularity of duration of so-called "business cycles," some analysts have preferred
to avoid the term "cycle" in favor of some alternative such as business fluctuations.
Also, the term "cycle" when applied to a business or economic phenomenon implies some
underlying causative mechanism, about which there is a notable lack of consensus among
economists and business analysts. The debate has gone in the direction of considering
whether there is a natural and irrevocable wave-like process in commercial contexts,
or only discrete fluctuations resulting from impacts of exogenous occurrences or policy
actions. Even so, we shall employ the term "cycle" to refer to the behavioral component
of a time series consisting of alternations in runs of values above and below the mean
of the series, or above and below the fitted trend line if trend is present.
Cyclical behavior may be detected computationally in either of two ways. One way is
similar to that for detecting trend. The difference is that if cyclical behavior is
present, the means of the successive ranges of the time series will not consistently
increase or decrease. The means for four consecutive 21-month ranges of the 84-month
series Y1 are 242.43, 238.21, 251.62, and 258.79, thus suggesting the presence of a
cyclical pattern. Since the means do generally increase, the implication is that there
is a positive trend present as well.
A second method of detecting cyclical behavior is to count the number of "runs" of
values above and below the mean of the series, and compare the actual number of runs
with the statistically expected number of runs if the series were a purely random
noise series. The expected number of runs for a purely random noise series may be
computed by the formula 2p(n-p)/n + 1, where n is the number of items in the series
and p is the number of items above the mean of the series. If the actual number of
runs is substantially smaller than the expected number of runs, there is reason to
believe that some sort of cyclical behavior is present in the series. The actual
and expected number of runs for the Y1 series are 7 and 40.66, respectively, thus
confirming the presence of cyclical behavior within the series.
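A sketch of this runs test in Python, using the expected-runs formula given above (an illustrative helper, not the RASstat routine):

    def runs_test(series):
        """Count runs above/below the series mean and the expected count for a random series."""
        mean = sum(series) / len(series)
        above = [value > mean for value in series]
        n = len(series)
        p = sum(above)                                  # observations above the mean
        actual = 1 + sum(1 for t in range(1, n) if above[t] != above[t - 1])
        expected = 2 * p * (n - p) / n + 1              # expected runs for a purely random series
        return actual, expected

    # Far fewer actual runs than expected suggests cyclical behavior in the series.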
Seasonality. What has become known in U.S. commercial circumstances as a "business
cycle" (in the U.K. it is known as a "trade cycle") is a fluctuation in business activity
from trough through peak to the next trough of five to six years duration. Seasonality
is a special type of cyclical behavior which comes closer to meeting the requisites of
periodicity than do any of the longer-duration waves in commercial activity. Seasonality
is cyclical behavior with a period of one year, and which repeats itself, possibly with
differences in amplitude, year after year with little to no difference in timing.
Seasonality is attributable to the passing of the seasons, and to custom or convention
in the timing of events such as the start of school and particular holidays. The seasonal
character may change over time due to changing customs or conventions. For example,
during the latter half of the twentieth century the start of school has been advancing
from early September to late August in most parts of the U.S., thus causing the
back-to-school spurt in retail sales activity to occur somewhat earlier during the year.
Seasonality cannot be observed at all in annual data, and can be seen only imperfectly
in quarterly data. It is most often analyzed and identified in monthly data recorded
in successive 12-month intervals, but can also be detected in weekly and daily data.
Business firms that record in-house data over ten equal periods per year (instead of
twelve months) may also expect to see seasonal behavior in their data.
It may be possible to discern other possible patterns of non-random behavior in time
series data collected over shorter intervals, for example in weekly or daily data. To
the extent that people are paid on a monthly basis (instead of weekly, biweekly, or
semi-monthly), we should not be surprised to observe a spurt of retail activity during
the week following the most common monthly pay day. Daily data can also exhibit
so-called "trading-day" variation associated with mid- or end-of-week activity.
Such variations of course cannot be observed in annual, quarterly, or monthly data.
Most monthly time series of economic or business data typically contain two or more
of these components. It is often possible to discern the presence of some of the
components by visual inspection of the sequence plot of the series. Trend will usually
be quite conspicuous, as apparent in Figure 2-1. It may also be possible to discern
cyclical swings around an imaginary or fitted trend line, but it typically is much
more difficult to discern seasonality distinct from random noise. Computational
techniques (e.g., time series decomposition) for separating out the components of a monthly
time series are described in Chapter 6. Figure 2-2 illustrates a sequence plot of
a combination trend-cycle series plotted from the data in Table 2-1.
How can one identify and assess the strength of autocorrelation in a time series?
One means is to compute the coefficients of autocorrelation between pairs of entries
within the series. If the analyst is interested in the autocorrelation between adjacent
entries, the autocorrelation should be specified to order k=1. For the correlation
between every other entry in the series, the autocorrelation should be specified to
order k=2. Autocorrelation order k=3 would be for the correlation between each entry
and the third from it in the series, and so on.
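A minimal sketch of computing the order-k coefficient from deviations about the series mean (an illustrative helper; the function name is hypothetical):

    def autocorrelation(series, k):
        """Lag-k autocorrelation coefficient computed from deviations about the series mean."""
        n = len(series)
        mean = sum(series) / n
        dev = [value - mean for value in series]
        numerator = sum(dev[t] * dev[t - k] for t in range(k, n))
        denominator = sum(d * d for d in dev)
        return numerator / denominator

    # k=1 relates adjacent entries, k=2 every other entry, and so on.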
Computed autocorrelation coefficients may serve as a basis for judging whether a time
series may be modeled by autoregressive (AR), moving average (MA), or some combination
(ARMA or ARIMA). The reader may find a discussion of the computation of such coefficients
and the criteria by which they may be interpreted in Chapter 7.
Another type of graphic display which might be useful to the analyst in examining the
possible influence of other phenomena on the object series is the scatter diagram.
A scatter diagram can relate the object series to another series on the premise that the
behavior of the object series is in some way governed or influenced by the behavior of
the other series. The object series may then be construed as a "dependent" variable,
and the other series as an "independent" or deterministic variable. It is conventional
(but not essential) to plot the dependent variable on the vertical axis of the scatter
diagram, and the independent variable on the horizontal axis.
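A scatter diagram can be produced with a few lines of matplotlib, as in this sketch; the x1 values are invented placeholders, and the y1 values are simply the first six entries of Table 2-1:

    import matplotlib.pyplot as plt

    # Placeholder observations for the candidate predictor and the object series.
    x1 = [10.1, 11.3, 12.0, 12.8, 13.5, 14.2]
    y1 = [233.4, 236.6, 239.8, 242.7, 245.2, 252.2]

    plt.scatter(x1, y1)                       # dependent (object) series on the vertical axis
    plt.xlabel("X1 (independent variable)")
    plt.ylabel("Y1 (object series)")
    plt.title("Scatter diagram of Y1 against X1")
    plt.show()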
The scatter diagram should convey a visual impression of whether or not there is indeed
any relationship between the two variables. If there is no significant relationship,
points plotted in the coordinate space will be randomly dispersed, exhibiting no
discernible pattern or path. In this case, no further consideration need be given to
that independent variable.
The closer the plotted points lie along a path with a discernible direction, the
stronger is the relationship between the two variables. The relationship might be
direct (indicated by an upward slope of the path) or inverse (shown by a downward slope),
linear (best represented by a straight line) or curvilinear (a curved path). If all
of the plotted points happened to lie precisely along a particular line (straight or
curved), we could say that there is a "perfect" relationship between the two variables
(but this would be so rare in economic and business contexts as to be suspect).
If the analyst has reason to believe that the object series is governed or influenced
by two or more other phenomena for which comparable time series are available, then
one scatter diagram relating the object series to each of the prospective deterministic
variables should be constructed. However, it may turn out that none of the prospective
deterministic variables by itself exhibits a visually identifiable relationship with the
dependent variable in a scatter diagram.
Naive rules, in their simplicity, are relatively low-cost approaches to forecasting,
but if a method is effective one should not hesitate to employ it because of its
simplicity or naivete. Naive rules are more effective at short-term than at long-term
forecasting. The longer the forecasting span or gap, the less accurate the naive
forecast is likely to be, and the greater the attendant risk in basing decisions upon such
forecasts.
The simple, naive rules described below may be made to address the trend and
seasonality factors which may be present in a time series, but naive rules are rarely
able to account for any cyclical behavior present in a series. There are four classes
of simple, naive rules:
a. those which address neither seasonality nor trend (the default forecasting rules);
b. those which address the trend factor, but assume seasonality to be insignificant;
c. those which address the seasonality factor, but assume trend to be insignificant; and
d. those which attempt to address both the trend and seasonality components of the time
series.
It is possible to formalize the default forecasting approach into an algebraic rule,
e.g., Rule A.1: yt+i = yt, the forecast that the series will repeat its most recent value.
Rules A.1 and A.2 are so naive and simple that one might doubt the wisdom of
formalizing them. But they serve three purposes: (1) to reveal the use of the
symbolic representations in the most rudimentary format; (2) to serve as a point of
departure in the development of subsequent rules; and (3) to constitute a benchmark
rule against which the performance effectiveness of other forecast rules may be
compared.
In the class B rules surveyed below, the presumption is that no type of variation
other than trend (e.g., seasonal or cyclical) is a significant factor within the
series. The forecasting technique used in the class B rules is to construct a
trend-adjustment factor to apply to the series observation which constitutes the basis
for the forecast.
The simplest technique for attempting to account for change in a forecasting rule
is to add the most recent absolute change between two observations,
(yt - yt-1), to the most recent observation of the series,
yt, in order to compose the forecast of the next value of the series,
yt+1. If the analyst wishes to forecast a value several periods, i,
beyond the most recent observation, all that is necessary is to multiply the computed
change by i, which shall henceforth be referred to as the "forecast gap." This may
be represented algebraically as
Rule B.1: yt+i = yt + i(yt - yt-1).
Suppose that there is reason to believe that the process of change in the series
is that of growth, so that a relative change may be more meaningful than an absolute
change. Rule B.2 is a modification of Rule B.1 to compose the forecasted value by
applying the most recent relative change, yt/yt-1, to the most
recent observation:
Rule B.2: yt+i = yt(yt/yt-1)^i.
Rule B.3 represents an effort to address the problem of incorporating long-term
change. Instead of using only the most recent absolute change as an adjustment factor,
Rule B.3 uses the average of all successive observation increments in the series, and
thereby employs information spanning the entire data series. The algebraic formulation
of the rule is:
Rule B.3: yt+i = yt + i(Σ(yk - yk-1)/(m-1)), summed from k=2 to m.
Like Rule B.3, Rule B.4 utilizes information spanning the entire data series.
Rule B.4 is a modification of Rule B.2 to use the average period-to-period relative
changes over the entire data series (instead of the most-recent single-period relative
change) as the trend adjustment factor. The algebraic formulation is as follows:
Rule B.4: yt+i = yt(Σ(yk/yk-1)/(m-1))^i, summed from k=2 to m.
These four simple, naive rules are purported to account for trend variation in a
data series; they are not the only possible ways to treat trend variation, only the
simplest. Moving average rules, described below, can also address the trend
phenomenon.
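For concreteness, the four class B rules might be coded as in the following Python sketch; the function names are illustrative, and y is assumed to be a list of observations with the most recent value last:

    def rule_b1(y, i=1):
        """Rule B.1: extend the most recent absolute change i periods ahead."""
        return y[-1] + i * (y[-1] - y[-2])

    def rule_b2(y, i=1):
        """Rule B.2: compound the most recent relative change over the forecast gap."""
        return y[-1] * (y[-1] / y[-2]) ** i

    def rule_b3(y, i=1):
        """Rule B.3: extend the average absolute change over the whole series."""
        m = len(y)
        avg_change = sum(y[k] - y[k - 1] for k in range(1, m)) / (m - 1)
        return y[-1] + i * avg_change

    def rule_b4(y, i=1):
        """Rule B.4: compound the average period-to-period relative change."""
        m = len(y)
        avg_ratio = sum(y[k] / y[k - 1] for k in range(1, m)) / (m - 1)
        return y[-1] * avg_ratio ** i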
Rule C.1 illustrates the simplest and most naive method of attempting to account
for seasonality in a time series:
Rule C.1: yt+i = yt+i-12 (the value observed twelve months before the target period).
Rule D.2: yt+i = yt+i-12(Σ(yk/yk-1)/(m-1))^i, summed from k=2 to m.
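Rules C.1 and D.2 might be sketched in the same style, assuming monthly data and a forecast gap of no more than twelve periods:

    def rule_c1(y, i=1):
        """Rule C.1: forecast the value observed twelve months before the target period."""
        target = len(y) - 1 + i            # zero-based index of the period being forecast
        return y[target - 12]              # requires monthly data and i <= 12

    def rule_d2(y, i=1):
        """Rule D.2: seasonal base from twelve months earlier, scaled by the average relative change."""
        m = len(y)
        avg_ratio = sum(y[k] / y[k - 1] for k in range(1, m)) / (m - 1)
        return y[(m - 1 + i) - 12] * avg_ratio ** i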
This text surveys the tools and techniques of economic
forecasting. These tools, together with those of simulation modeling, should provide
the analyst with the ability both to assess the condition of the organization and to plan
strategically for its successful operation. Our objective in the present chapter is to
survey the field of economic forecasting and acquaint the reader with the possible
applications in decision making settings.
The Need to Forecast
When and why is forecasting needed? Simply put, when the result of an action is of
consequence, but cannot be known in advance with precision, forecasting may reduce
decision risk by supplying additional information about the possible outcome.
The potential benefit of forecasting lies in the realm of decision making to exert
control over some process.
Types of Forecasts
There are basically three types of questions about future states which efforts at forecasting
might address. When might an event occur? What are the qualitative characteristics of the
outcome of an expected event? What will be the magnitude of a quantity at a future point in
time? An example of the first question is "When will the next vacancy occur in the sales
department?" An example of the second is "What will be the sex of the next salesperson
employed?" And an example of the third is "What will be the likely volume of sales during
the third quarter of the year?"
Forecasting by Default
All rational decision makers engage in forecasting behavior, whether by intent or by
default. Many times decision makers do not formally, intentionally, or even consciously,
construct a forecast of likely outcomes before making their decisions. But they must
have informally indulged in the implicit forecast that the future will be like the
recent past. The default forecast, i.e., that the future will be like the recent
past, may be entirely adequate for most of the simpler and less-consequential
decision-making circumstances of daily life and commercial operation. Many aspects of
the world are more complex and dynamic, however. The more complex and dynamic a
decision situation, and the more consequential the likely outcome, the less likely
is the default forecast to be adequate.
Forecasting by Intent
The dynamism of the world, the range of alternative courses of action, the consequence of
outcomes, and the variability of outcomes have led analysts to develop a variety of time
series forecasting techniques. The availability of these more formalized techniques has
enabled decision makers to engage more readily in intentional forecasting analyses before
having to make consequential decisions. While such time series forecasting techniques
have in the past tended to remain within the preserves of professional analysts with
whom the decision makers have contracted for consulting services, there is technically
no reason why the ultimate decision makers themselves cannot grasp and wield the known
and proven forecasting tools.
Computational vs. Judgmental Forecasting
A purely judgmental approach to forecasting might avoid the use of computational
techniques altogether. The exclusively judgmental approach relies upon the perceptiveness,
insight, and experience of the forecaster to produce the forecast of the future state.
Depending upon how consequential the decision and how able the forecaster, a purely
judgmental approach may yield satisfactory forecasts. In cases of more consequential
decisions or more dynamic situations, computational forecasting methods may be in order.
However, it does not necessarily follow that a computational approach will be able to
improve upon informed judgment.
Science vs. Art
Lawrence Salzman has provided a meaningful distinction between science and art in the
forecasting realm. Salzman suggests that after the usual adjustments are made to the
data in a computational approach, "From then on the science melts to a degree, and the
liquid part is called art. We can define the artist in most general terms as one who
knows the science of his subject and is able to adapt it to his needs."
(Computerized Economic Analysis, McGraw-Hill, 1968, p. 73) As Salzman further
notes, computational forecasting techniques are little more than potentially valuable
tools; they can enable the forecaster "to gain insight and help him to make more
sophisticated value judgments." Three important points follow from Salzman's
distinction:
Model Method
The forecasting techniques described in the ensuing pages of this chapter are those of
model methodology. A brief review of model formats will provide a platform for
extending them in the realm of forecasting. If continuous variation may be depicted in
two- or three-dimensional graphic space, the model may be represented in mathematical
form as a generalized functional-notation statement, e.g., y = f(x) or y = f(x1, x2, ..., xn).
Models for Forecasting
Equations (2), (4), and (5) above are possible formats (among many others) for forecasting
models, but to be useful as such a couple of adjustments must be made. First, a lead-lag
structure needs to be built into the independent-dependent variable relationship.
Otherwise, we shall find ourselves in the difficult position of having to predict
values of the independent variables before we can forecast values of the dependent
variable. The structuring of a lead-lag relationship may be accomplished by pairing
dependent variable values with independent variable observations occurring one or
more periods earlier in time. Equation (2) may thus be recast as yt = f(xt-i), pairing the
dependent variable at period t with the independent variable observed i periods earlier.
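If the data matrix is held in a pandas DataFrame, the lead-lag pairing can be set up by shifting the independent-variable column; in this sketch the column names and numbers are placeholders:

    import pandas as pd

    # Placeholder data matrix; Y1 is the object series and X1 a candidate predictor.
    df = pd.DataFrame({"Y1": [233.4, 236.6, 239.8, 242.7, 245.2],
                       "X1": [10.1, 11.3, 12.0, 12.8, 13.5]})

    lag = 1                                   # lead-lag interval in periods
    df["X1_lagged"] = df["X1"].shift(lag)     # pairs y(t) with x(t - lag)
    model_data = df.dropna()                  # drop the rows left vacant by the shift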
Data Sources and Types
Where does one find data? What kinds of data are needed? How much data? Data may be
obtained by conducting experiments, from surveying opinions, preferences, or expectations,
or from historical sources. With notable exceptions, social scientists have generally
shied from experimentalism, preferring to rely upon surveys and recorded history for
their data. The on-going natural courses of human interaction constitute the most
prolific of stochastic processes. But human experience becomes data only when human
beings go to the trouble (and expense) of observing and recording that information for
later consideration. There are cost implications to the recording of history, no less
so than the conducting of surveys.
Specification of a Forecasting Model
The process of specifying a forecasting model involves (1) selecting the variables
to be included, (2) selecting the form of the equation of relationship, and (3) estimating
the values of the parameters in that equation. After the model is specified, (4) its
performance characteristics should be verified or validated by comparison of its
forecasts with historical data for the phenomenon it was designed to forecast.
The Data Matrix
As noted above, data may be collected from experiments, by conducting surveys, or
from historical sources. Regardless of the source of the data, a convenient form in
which to organize it for statistical analysis is the data matrix, a
rectangular array of numbers presented in a row-and-column format. The columns and rows
may be assigned either identity as desired, but for our purposes it will be
convenient to construe the columns as "variables," and the rows as "cases" or
"observations."
Data Requirements
How much data are required in order to specify the forecasting models described in
this chapter? Unfortunately, there is no single answer to this question. If after
examining these models the reader decides that some of the naive models should be
adequate, the maximum number of required columns in the data matrix is only one, that
of the series to be forecasted. Should the reader want to try the multivariate models
described in the last section of the chapter, the data matrix must have at least two
columns, one to contain the so-called object series, and one or more additional
columns to contain the independent variable or predictor series.
Transformations
After the matrix has been populated with original data, the analyst may find that other
versions or transformations of the data in certain columns are needed in some of the
forecasting models. For example, it may be desirable to lag or lead all of the values
in one of the columns relative to data in the other columns. This would require that
the data in that column be shifted upward or downward by the requisite number of rows.
In a regression model, it may be desirable to use the squared or logarithmic values of
the data in some column as a dependent or independent variable. If such data
transformations are required, it will be convenient in structuring the data matrix
to allow several vacant columns beyond the original data columns to receive the
transformed data.
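Typical transformations of this kind might be sketched with pandas and numpy as follows (column names and values are again placeholders; lagging was illustrated earlier):

    import numpy as np
    import pandas as pd

    # Placeholder data matrix holding the original columns.
    df = pd.DataFrame({"Y1": [233.4, 236.6, 239.8, 242.7, 245.2],
                       "X1": [10.1, 11.3, 12.0, 12.8, 13.5]})

    df["X1_sq"] = df["X1"] ** 2               # squared values for a curvilinear term
    df["Y1_log"] = np.log(df["Y1"])           # logarithmic transform of the object series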
What's Ahead
Once a potentially forecastable data series has been entered into a matrix, the analyst
must consider criteria for selecting appropriate forecasting models. Chapter 2 surveys
the criteria which are available for this purpose.
<>
Once data have been captured for the time series to be forecasted, the analyst's next step
is to select a model (or models) which has potential for producing successful forecasts.
Various statistical and graphic techniques may be useful to the analyst in the selection
process.
Sequence Plots
The best place to start with any time series forecasting analysis is to graph sequence
plots of the time series to be forecasted. A sequence plot is a graph of the data series
values, usually on the vertical axis, against time (or the matrix row counter) usually on
the horizontal axis. The purpose of the sequence plot is to give the analyst a visual
impression of the nature of the time series. This visual impression should suggest to the
analyst whether there are certain behavioral "components" present within the time series.
The conventional approach to time series analysis is to presume that any time series may
consist of several possible components, depending upon the standard intervals over which
observations were recorded.
Figure 2-1. Sequence plot of trend for Series Y1.
VARIABLE: ORIGINAL = *
STANDARD DEVIATION = 9.9184
MEAN = 248.2333
VARIABLE: TREND = .
STANDARD DEVIATION = 6.6418
MEAN = 248.6420
[Sequence plot of series Y1 (*) and the fitted linear trend line (.) for observations 7 through 84, plotted as standardized deviations (-3 to +3) from the series mean.]
VAR 1 EXPECTED NUMBER OF RUNS IF SERIES IS RANDOM: 40
VAR 1 ACTUAL NUMBER OF RUNS: 7
Trend. The sequence plot of a time series may be said to exhibit trend if the
data path is not approximately level, but appears to change consistently in the same
direction (which may be upward or downward). It is common to fit a so-called "trend
line" to the data path so as to minimize the deviations (actually the squares of the
deviations) of the fitted line from the plot of the points. Regression techniques for
fitting a trend line are described below. The trend is said to be linear if the slope
of the fitted trend line does not change, and non-linear if the slope does change over
the course of the series. The dot (.) symbols in Figure 2-1 identify the path of a
fitted positive-slope, linear trend line.
Figure 2-2. Trend-cycle series derived from series Y1.
VARIABLE: TREND-CYCLE = +
STANDARD DEVIATION = 8.6293
MEAN = 247.9816
VARIABLE: TREND = .
STANDARD DEVIATION = 6.6418
MEAN = 248.6420
[Sequence plot of the trend-cycle series (+) derived from Y1 and the fitted linear trend line (.) for observations 7 through 84, plotted as standardized deviations (-3 to +3) from the series mean.]
Autocorrelation
In any time series containing non-random patterns of behavior, it is likely that any
particular item in the series is related in some fashion to other items in the same
series. If there is a consistent relationship between entries in the series, e.g.,
the 5th item is like the 1st, the 6th is like the 2nd, and so on, then it should be
possible to use information about the relationship to forecast future values of the
series, i.e., the 33rd item should be like the 29th. In this case we may say that
the series has some ability to forecast itself because of autocorrelation (or
self-correlation) among values within the series.
Other Independent Variables
Many time series are adequately self-predictive employing naive forecasting rules,
moving average models, autoregressive models, or some combination. However, if none
of these approaches can adequately forecast the time series, then there are two
remaining possibilities. Either the series is characterized so extensively by random
noise that its behavior is simply not forecastable, or the behavior of the series is
influenced by some other or "outside" phenomena not included in the model. The latter
case implies the existence of specification errors. Analysts are reluctant (perhaps as
a matter of pride) to admit that a series is so random in behavior as to be
unforecastable. So, ruling out this possibility for the moment, analysts are led to
search for other data series which might explain the behavior of the object series.
The Correlation Matrix
A statistical correlation matrix, an example of which is illustrated in Figure 2-3, may
enable the analyst to assess the strength of relationships between the object series
and other possible series and among the other series. Each number in the correlation
matrix, identified by a column header and a row descriptor, indicates the degree of
relationship (i.e., the correlation) between the respective variables.
Figure 2-3. Correlation matrix for variables Y1, X1, and X2.
MATRIX OF COEFFICIENTS OF CORRELATION BETWEEN PAIRS OF VARIABLES IDENTIFIED HORIZONTALLY
AND VERTICALLY:
1 2 3
1 1.0000 0.9087 -0.1977
2 0.9087 1.0000 -0.5767
3 -0.1977 -0.5767 1.0000
The range of the correlation coefficient, usually denoted by the symbol r or R, is from
-1 to +1. Correlation coefficients close to zero (positive or negative) imply negligible
relationship, and will correspond to a scatter diagram within which plotted points are
dispersed across the coordinate space in no discernible pattern. Correlation coefficients
approaching 1 (positive or negative) indicate a strong relationship, corresponding to a
scatter diagram with plotted points lying close to a line which can represent the
relationship. Positive correlation coefficients imply a direct relationship (i.e.,
both variables changing in the same direction); negative correlation coefficients
suggest an inverse relationship (one variable increases while the other decreases).
The principal diagonal of the correlation matrix is populated by unity numbers (1.0000),
signifying perfect correlation between each variable and itself.
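A correlation matrix like that of Figure 2-3 can be computed directly, as in this sketch; the observations here are placeholders, not the series behind Figure 2-3:

    import numpy as np

    # Placeholder observations; in practice these are the columns of the data matrix.
    y1 = [233.4, 236.6, 239.8, 242.7, 245.2, 252.2]
    x1 = [10.1, 11.3, 12.0, 12.8, 13.5, 14.2]
    x2 = [5.2, 5.0, 4.9, 4.7, 4.6, 4.4]

    data = np.column_stack([y1, x1, x2])       # rows are observations, columns are Y1, X1, X2
    corr = np.corrcoef(data, rowvar=False)     # 3 x 3 matrix of pairwise correlation coefficients
    print(np.round(corr, 4))                   # the principal diagonal is 1.0000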
What's Ahead
Chapters 1 and 2 have examined the possibility of forecasting various aspects of the firm's
situation, and the criteria for selection of a forecasting model. A range of naive
time series forecasting rules and techniques which may be of use to the manager is
surveyed in Chapters 3 and 4. Chapter 5 elaborates multivariate forecasting techniques.
Chapters 6, 7 and 8 describe forecasting techniques which employ combinations of approaches.
Table 2-1. Monthly data for series Y1.
MTH YEAR1 YEAR2 YEAR3 YEAR4 YEAR5 YEAR6 YEAR7
1 233.4 249.4 234.1 240.7 245.6 252.1 253.3
2 236.6 229.2 234.1 231.3 244.9 253.1 252.0
3 239.8 229.5 229.7 245.3 248.3 254.3 256.2
4 242.7 231.3 233.6 244.5 251.9 255.3 257.0
5 245.2 233.7 234.1 249.1 253.9 256.4 259.3
6 252.2 238.6 238.4 249.7 260.3 259.4 265.9
7 252.3 239.7 237.0 248.1 255.1 257.5 259.9
8 252.2 241.1 236.2 248.6 255.8 260.1 261.8
9 251.1 238.0 237.1 249.1 255.8 257.3 264.6
10 251.1 240.8 235.5 247.4 255.5 256.8 264.4
11 252.2 240.2 236.0 248.2 255.1 253.6 262.7
12 251.7 242.3 232.8 246.6 254.3 257.2 263.9
<>
In Chapters 1 and 2 we examined the forecasting environment and the criteria for selecting
potentially effective forecasting models. In this chapter we survey various types of time
series forecasting rules which have found widespread use in the managerial decision
context.
Naive Rules
Naive rules are simple but potentially effective time-series forecasting techniques.
They are rules in the sense that they are prespecified so that no parameter values need be
estimated. The naivete is implicit in the fact that the basis for any naive forecast of a
time series is the time series itself. The series is used to predict itself, that is,
historical values of the series are used to compose or construct future values of the same
series. The technique of naive forecasting is therefore extrapolation.
Classes of Simple, Naive Rules
Every time series exhibits variation in observed values between each observation and
the next across the entire span of the series. Any time series may be presumed to consist
of one or more types of variation: seasonal, cyclical, trend, and irregular. The method
of analysis is to attempt to account for each of the types of variation present in the
series. Some analysts describe the irregular variation as "random noise" if it meets
certain criteria. If a series is composed exclusively of such random noise variation,
it may not be possible to forecast its values reliably using any of the rules described
in this text.
A. Default Forecast Rules
Rational people often make decisions without first engaging in any sort of explicit
forecasting effort. When they do so, they engage in what we described in Chapter 1 as
default forecasting, the presumption that the future state will be similar to the present
or recent past. The default forecast may be adequate for dealing with many of the
minimal-consequence decisions of daily life. Thus, default forecasting is not necessarily
irrational.
B. Rules which Address the Trend Factor
Trend is the phenomenon of a long-term change in a recorded data series, generally
in the same direction throughout the span of the series. The presence of trend may not
be discernible in a few consecutive observations within the series, especially if other
types of variation (seasonal, cyclical, or purely random) are present. A sequence plot
of a time series (the time series values plotted vertically with respect to time itself
on the horizontal axis) will usually reveal the presence of trend as a gentle upward or
downward "drift" of the data path. An upward sloping trend path in a real-value time
series may be indicative of a growth phenomenon; a downward-sloping path suggests
contraction. In a money-value time series, an upward-sloping path may represent some
combination of real growth and inflation; a downward-sloping trend path might indicate
contraction with deflation.
C. Rules which Address the Seasonality Factor
Seasonality is a pattern of variation within a time series which repeats itself
from year to year. Seasonality may be associated with agricultural functions, seasonal
weather patterns, custom and convention, or religious or secular holidays. It is
important to remember that a seasonal pattern in one time series may or may not resemble
that in another time series.
D. Rules which Account for Both Trend and Seasonality
Even though the approach to seasonality employed in Rule C.1 is naive to the point
of shortsightedness, it may enable a valuable modification to the trend adjustment factors
of Rules B.3 and B.4. These rules modified to take into account the seasonal aspects
of the most recent year may be recast as:
Rule D.1: yt+i = yt+i-12 + i(Σ(yk - yk-1)/(m-1)), summed from k=2 to m.
Rule D.2: yt+i = yt+i-12(Σ(yk/yk-1)/(m-1))^i, summed from k=2 to m.
We have described nine simple, naive rules, but we have not exhausted the possibilities for simple, naive format rules. Readers are encouraged to devise and try versions of these approaches which are specific to their own forecasting contexts.
The Mean Squared Error
Any of the simple, naive rules which we have described in this chapter could yield satisfactory forecasting results, even though each suffers some conceptual deficiencies. Some method is needed for making comparisons among them. The Mean Squared Error (MSE) for any forecasting rule is a measure of the average forecast error when the rule is applied to the original data series for which the rule was developed. Once a time-series rule has been conceptualized, it may be tested against any particular time series by using it to forecast as many values within the time series as possible. Although it would be more appropriate from a conceptual standpoint to validate the rule with data from outside the range used for rule construction, such additional data may not be readily available.
For Rules A.1 through B.4, if the original time series is m observations long and the forecast gap is i periods, then forecasts of the last m-i-1 observations can be made. For Rules C.1 through D.2, forecasts can be made for the last m-i-12 observations. For each of these observations, the forecast error is the difference between the forecasted value and the actual observation.
The MSE can thus be computed for any time-series rule or model which can be conceptualized. The rule or model with the smallest MSE would thus have the best forecasting record over the period encompassed by the original time series. Will it also be the most effective rule for forecasting beyond the end of the series? The answer depends upon whether the same conditions persist beyond the end of the series.
The Standard Error of the Estimate (SEE), the square root of the MSE, is the standard deviation of the forecast errors. Its usefulness is in specifying a tentative confidence interval for a point estimate forecast (we should note that many statisticians feel that confidence intervals estimated for time series are questionable). Assuming that the forecast errors are normally distributed and that past trends continue into the future, there is a 95 percent chance that a future value, when it occurs, will lie within approximately two standard errors of the point estimate made with the rule. Other approximate confidence intervals may also be computed.
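The MSE and SEE calculations described above might be sketched as follows, where forecast stands for any rule written as a function of the history observed so far and the forecast gap i (compatible with the hypothetical rule sketches shown earlier):

    import math

    def mse_and_see(series, forecast, i=1, start=13):
        """Mean squared error and standard error of the estimate for a forecasting rule.

        forecast(history, i) returns the forecast i periods beyond the last value in history."""
        squared_errors = []
        for t in range(start, len(series) - i):
            error = forecast(series[: t + 1], i) - series[t + i]
            squared_errors.append(error ** 2)
        mse = sum(squared_errors) / len(squared_errors)
        return mse, math.sqrt(mse)             # SEE is the square root of the MSE

    # Example with the no-change (Rule A.1 style) forecast: the most recent observation.
    # mse, see = mse_and_see(y1_series, lambda history, i: history[-1], i=1)
    # An approximate 95 percent interval for a point forecast f is f - 2*see to f + 2*see.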
On Horse Races
While the theoretical approach might permit economy in the specification of models and the empirical analysis, it may also result in a certain opportunity loss. What if a "satisfactory" model is identified before the best model available is discovered? The analyst may even run through the entire model "stable" before identifying a "satisfactory" model. In this latter situation, the horse-race approach might just as well have been used from the start.
The horse-race approach might permit us to avoid or minimize the amount of conceptual analysis, but require a larger volume of empirical computation. The resulting lowest-MSE model might still not yield satisfactory forecasting results, in which case the analyst must either give up or pursue the theoretical approach to develop yet other possible models.
So we are left with a dilemma: time each horse separately until a fast-enough horse is found; or run a horse race to find a winner? The analyst will have to make a procedural choice at this point. But we must offer one parting caution: a horse which is fast enough for one rider, or which wins one race, will not necessarily be fast enough for any other rider or to win another race. One should not jump to the conclusion that a universal forecasting rule or model has been found simply because it can do an adequate job on one time series.
What's Ahead
This brings us to an end of our survey of time series forecasting rules. A combination of these techniques, time-series decomposition, is described in Chapter 6. Another more sophisticated time series forecasting technique, ARIMA modeling, is described in Chapter 8. These and similar techniques, together with the capability of simulation modeling, may provide the informational basis of both strategic planning and tactical decision making. We now shift to moving average modeling in Chapter 4.<>
In this chapter we examine a class of naive forecasting models which are more complex than the forecasting rules described in Chapter 3. Moving average models, which function to generate a new series by computing moving averages of the original series, are oriented primarily toward removing the seasonal and irregular components or isolating the trend-cycle components of a time series. The newly generated series is a "smoothed" version of the original series.
The Smoothing Process
Moving average models function to smooth the original time series by averaging a rolling subset of elements of the original series. The subset of the original series consists of an arbitrarily selected number of consecutive observations. The subset "rolls" or "moves" forward through the series starting from the earliest observation in the series, adding a new element at the leading edge while deleting the earliest element at the trailing edge, with each successive averaging process.
The effect of the moving average process is to ameliorate the degree of variation within the original series by composing the new smoothed series. It is possible to follow a first smoothing of a series with another smoothing of the successor series. The second smoothing may be followed by yet other smoothings. The moving average process may be used for two purposes: to remove unwanted variation from a time series, and as a forecasting model.
Removing Unwanted Variation
Moving average routines may be designed to remove the seasonal and random noise variation within a time series. If the moving average routine is used repeatedly on each newly-generated series, it may succeed in removing most of any cyclical variation present. What is left of the original series after early smoothings to remove seasonal and random or irregular components is a successor series retaining some combination of trend and cyclical behavior. If no trend or cyclical behavior is present in the time series, the smoothings may leave a successor series which plots as a nearly horizontal line against time on the horizontal axis. Assuming the presence of trend and cyclical behavior in the original series, the moving average process provides a method of isolating it.
While successive applications of an efficient moving-average routine may result in filtering out all variation other than the trend and cyclical behavior from an original series, this may not be the objective. Rather, the analyst may wish to filter out only the seasonal or only the irregular variation. Either may be targeted by judiciously selecting the number of elements to be included in the moving average subset, and by designing an appropriate weighting system to accomplish the objective. For example, the U.S. Department of Commerce typically uses an unweighted moving average to filter out the seasonality from a series, then a judiciously designed weighted moving average to filter out the irregular variation.
An unweighted moving average with a relatively small number of elements (say five to seven) will have its smoothing effect without destroying the seasonality present in a series. A moving average with a larger number of elements (eleven or more) with weights designed to emphasize the elements toward the center of the subset will likely be even more efficient in removing the irregular variation, but will tend also to destroy any seasonality still present.
If the analyst's intention is to deseasonalize a time series, a number of moving-average elements in the neighborhood of eleven to thirteen is called for. An odd number of elements is more easily handled than is an even number due to the need to center the moving averages relative to the object series. Also, an appropriately-designed weighting scheme applied to the elements of the moving average may serve to improve the efficiency of the seasonality removal process.
Unweighted Moving Average Models
We shall designate all unweighted moving average models with the number of elements to be specified by the analyst as Class U.k models. The general form of the unweighted, centered moving average model with an odd number of subset elements may be specified as
Model U.k: MAt = (1/k) Σ yj, j from t-((k-1)/2) to t+((k-1)/2),
where y is an observation in the original series at row t, k is the number of elements in the moving average, and j is the subset element counter.
Subjectively-Designed Weighting Factors
To this point we have made only passing references to the possibility of applying weighting factors to the elements of the moving average subset. If no explicit weights are used, then implicit weights of unity (value 1) are applied to each element in the subset, and the sum of the subset values must be divided by the sum of the weights (the number of elements times the weight of each) in computing each average.
The analyst may choose to use subjectively-determined, non-unitary weights to be applied to the subset elements in computing the averages. A typical scheme is to design the element weighting system so that the sum of the weights is unity (or 100 percent). In this case, each element is multiplied by its assigned fractional (or decimal value) weight, and it is unnecessary to divide the sum of the weighted values by the sum of the weights in order to compute the average, unless toward the end of the series the number of elements is diminishing.
For our purposes, all weighted moving average (WMA) models where the analyst both specifies the number of elements and subjectively determines the weights will be designated as Class W.k models. The general format of the Class W.k models may be specified as,
Model W.k: MAt = Σ Wp yj, j from t-((k-1)/2) to t+((k-1)/2), p from 1 to k,
where W is an element weighting factor applied to the jth element in the moving average, and p is the element counter subscript.
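Both the Class U.k and Class W.k averages can be sketched with a single Python routine; passing no weights gives the unweighted case (the function name is illustrative):

    def centered_moving_average(y, k, weights=None):
        """Centered moving average with an odd number of elements k; unweighted if no weights given."""
        half = (k - 1) // 2
        if weights is None:
            weights = [1.0 / k] * k                     # Class U.k: equal weights summing to one
        smoothed = []
        for t in range(half, len(y) - half):
            window = y[t - half : t + half + 1]
            smoothed.append(sum(w * v for w, v in zip(weights, window)) / sum(weights))
        return smoothed                                 # k - 1 fewer elements than the original series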
Computed Weighting Systems
Instead of subjectively designing a set of weighting factors, the analyst might opt for any of several commonly-used computed weighting systems. One such system was designed around the turn of the twentieth century by an actuary, J. Spencer, to smooth insurance policy-holder data so that insurance companies could devise premium rate structures associated with policy-holder age. The so-called Spencer Weighted Moving Average technique has been used extensively in a wide variety of applications, and continues to be used today. We shall not in this text attempt to specify the formulae for generating a set of Spencer weights; the interested reader may consult Lawrence Salzman's Computerized Economic Analysis (McGraw-Hill, 1968) for a full exposition of the method.
Exponentially-Weighted Moving Averages
The moving average models described thus far may be satisfactory for treatment of a time series which, if divided into subsets, would exhibit approximately the same mean for each subset as for the series as a whole. Consistently changing means from one subset to the next would imply the presence of a gradual trend factor.
Suppose, however, that some outside influence has affected the data for the series so that consecutive subsets of the series would exhibit significantly different means which appear to have little or no relationship to each other. Exponentially weighted moving average (EWMA) models provide some ability to adapt to the changing-mean phenomenon. The basic form of an EWMA model is,
Model E.1: St = a·yt + (1 - a)·St-1, where St is the smoothed value at period t and a is a smoothing constant between 0 and 1.
Model E.2:
The EWMA process smooths seasonal and irregular variation out of an original series, and may cause loss of some of the trend/cyclical variation as well. For forecasting purposes, it may be desirable to modify the EWMA model to avoid the trend loss and attempt to account for seasonal behavior. While such modifications are feasible, there appears to be no way to make an EWMA forecasting model account for cyclical variation.
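A sketch of the basic EWMA recursion (Model E.1 as reconstructed above), with smoothing constant a between 0 and 1:

    def ewma(y, a=0.3):
        """Exponentially weighted moving average: s[t] = a*y[t] + (1 - a)*s[t-1]."""
        smoothed = [y[0]]                  # start the smoothed series at the first observation
        for value in y[1:]:
            smoothed.append(a * value + (1 - a) * smoothed[-1])
        return smoothed

    # A larger smoothing constant a gives more weight to recent observations and
    # adapts more quickly to a shifting mean.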
Moving Averages as Forecasting Models
Any of the moving average routines described in this section may be used as forecasting models with a variable forecasting gap (i.e., lag between the value forecasted and the base value upon which it is constructed). Using the symbol i for the forecast gap, t for the subscript of the observation upon which the forecast is based, y to represent the forecasted value of the original series, and MA to represent any of the moving averages described in this chapter, the forecast model may be specified as
yt+i = MAt,
or, if seasonality is thought to be present in the series being forecasted,
yt+i = MAt+i-12.
What's Ahead
This brings us to an end of our survey of naive time series forecasting techniques. A combination of these techniques, time-series decomposition, is described in Chapter 6. Another more sophisticated time series forecasting technique, ARIMA modeling, is described in Chapter 8. These and similar techniques, together with the capability of simulation modeling, may provide the informational basis for reducing decision risk in both strategic planning and tactical decision making. We now shift to econometric forecasting techniques in Chapter 5.<>
The term "econometric" refers to the application of statistical regression techniques to the process of economic modeling. The purpose of the regression analysis is to estimate the values of the parameters in a model which best fits the characteristics of a phenomenon which is the object of analysis.
Our approach to the forecasting capabilities of regression analysis is purely from a user perspective. Being a user of statistics requires knowing what a statistical procedure is supposed to do, how to enter data into its computational routine, how to capture the computed results, and how to interpret the results to give them contextual meaning. The subject of this chapter addresses only the user requirements; one who is interested in the mathematical theory behind the regression analysis should consult a statistics text. Our intent here is to examine its application only to the forecasting context.
We shall examine simple and multiple regression forecasting models in a subsequent section. Our initial task is to develop the concept of univariate regression and demonstrate its applicability to forecasting. Within a time-series framework, a univariate regression analysis may be contrived from a simple regression context by (a) introducing an artificial independent variable associated with the sequence of observations in the object series, or (b) deriving a series of observations from the object or dependent variable series to serve as the independent variable. The former is referred to as trend regression; the latter as auto- (or self-) regression.
Trend Regression
In trend regression, the independent variable is taken to be the observation counter or any linearly-increasing numeric observation identifier. Suppose that for the time-series context we settle upon the convention of using the row-wise observation counter, represented by the symbol t. The trend regression model may then be represented in functional-notation format as y = f(t), estimated in linear form as y(t) = a + b*t.
The perceptive reader may object that this is essentially a standard simple regression model into which a sequential observation identifier has been inserted as the independent variable. While this is of course correct, it is nonetheless also true that no information other than the object series and its observation sequence is necessary in order to accomplish the trend regression.
The sequence plot illustrated in Figure 1-1 of Chapter 1 exhibits a great deal of "scatter" about a gently upward-sloping path from left to right as time "passes" on the horizontal axis. If the plotted points are connected in sequence, the emerging line has a jagged appearance due to the presence of random noise within the series. If one were seeking a mathematical function to represent the behavior of the data series over time, i.e., the trend behavior of the series, it would be very difficult (likely impossible) to construct a single equation for the jagged line. Alternately, it is possible to draw a smooth line, free-hand style, through the data plot. The smooth line might be straight or curvilinear, but in either case it is easier to devise or construct a mathematical function to represent it than one to represent the jagged line formed by sequentially connecting the plotted points.
Depending upon the amount of variation in the object series, a trend regression equation may be able to estimate, with some error, values of existing observations within the series, and to predict values of hypothetical observations beyond the end of the series. This latter possibility constitutes the potential of trend regression to serve as a forecasting technique. Given such a trend regression equation developed from a "least-squares" regression procedure on data for a time series, future values of the series may be forecasted by inserting the observation counter corresponding to the target date into the regression equation and solving for the dependent variable value. The error involved in estimating or predicting such values constitutes a potentially serious problem, especially if there is much cyclical, seasonal, or purely random variation present within the series.
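A short Python sketch of trend-regression forecasting follows; the illustrative series and the three-period forecast horizon are assumptions for demonstration.

    import numpy as np

    y = np.array([233.4, 236.6, 239.8, 242.7, 245.2, 252.2, 252.3, 252.2, 251.1, 251.1])
    t = np.arange(1, len(y) + 1)          # the observation counter serves as the regressor

    b, a = np.polyfit(t, y, 1)            # least-squares slope (b) and intercept (a)
    target = len(y) + 3                   # a target date three observations past the series
    print(f"y = {a:.4f} + {b:.4f}*t; forecast for t = {target}: {a + b * target:.2f}")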
Trend regression used for forecasting purposes can attempt to account only for the long-term average change in a series. Trend regression by itself cannot account for any cyclical, seasonal, or random variation present in the historical data series. Forecasts made with the trend regression equation will thus diverge from the actual value of the series when it does occur by the amount of seasonal, cyclical, and irregular influence.
The Autoregressive Model
An alternative regression approach is based on the premise that each observation in a time series is related in a consistent and identifiable way to one or more previous observations of the same series. In other words, the best predictor of any particular observation of a time series may be some earlier value(s) of the same series. The simple statistical regression model may be employed to try to discover such a relationship if it exists. With the object series as dependent variable, the approach is to generate an independent variable series from the object series by shifting the dependent variable data downward in the data matrix by the number of rows corresponding to the required order of autoregression in order to compose the independent variable data series. The form of such a relationship may be expressed in functional-notation format as y(t) = f(y(t-k)), where k is the order of the autoregression.
The autoregressive model may employ the structure of the multiple regression model to add terms for successively earlier observations of the object series. These will serve as independent variable values relative to each observation of the object series. The general form of the kth-order auto-regressive model is y(t) = a + b(1)*y(t-1) + b(2)*y(t-2) + ... + b(k)*y(t-k).
An autoregressive model within which the parameter values are neither zeros nor exceed unity in absolute value may have forecasting potential. In order to use such an autoregressive model for forecasting purposes, the analyst needs to know only the values of the k observations prior to the forecast target value. A forecast gap may be built into the relationship between the object-series dependent variable and its autoregressive terms. Once the appropriate prior-period values of the series are known, they may be entered into the autoregressive model so that it may be solved for the predicted value of the dependent variable. The final step in the process is then to assess the level of confidence which can be placed in the forecasts so constructed.
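A minimal sketch of fitting a second-order autoregression by ordinary least squares and producing a one-step-ahead forecast appears below; the lag-matrix helper, the choice k=2, and the illustrative data are assumptions, and the statsmodels library is used only as one convenient OLS routine.

    import numpy as np
    import statsmodels.api as sm

    def fit_ar(y, k):
        # Fit y(t) = a + b(1)*y(t-1) + ... + b(k)*y(t-k) by ordinary least squares,
        # building the lagged columns by shifting the series down 1..k rows.
        y = np.asarray(y, dtype=float)
        X = np.column_stack([y[k - j:len(y) - j] for j in range(1, k + 1)])
        return sm.OLS(y[k:], sm.add_constant(X)).fit()

    series = [15.7, 14.3, 13.8, 13.6, 12.4, 13.9, 15.4, 13.8, 14.1, 13.5, 14.1, 15.0]
    fit = fit_ar(series, k=2)
    a, b1, b2 = fit.params
    print("one-step-ahead forecast:", a + b1 * series[-1] + b2 * series[-2])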
Multiple Regression Models
Some series can be adequately forecasted with reference to trend or earlier values of the same series. But other series can be forecasted only inadequately in this manner. As we noted above, there are two possibilities for these series: either they are characterized so extensively by random noise that they are unforecastable, or there are one or more other phenomena which govern or influence the behavior of the series. If comparable time series for these other phenomena can be acquired, then conventional simple or multiple regression procedures may be implemented to model and forecast the object series.
Once a multiple regression model has been specified and the parameter values estimated, the analyst may discern the predictive ability of each of the included independent variables by examining the inference statistics for each of them. Any independent variable which in the judgment of the analyst does not make a satisfactory contribution to the explanation of the behavior of the dependent variable series may then be deleted from the model when the model is respecified.
Some statistical software packages include options for stepwise deletion of inadequately contributing independent variables from the model according to some criterion specified by the programmer or the analyst. In the stepwise regression procedure, the full model including all variables selected by the analyst is first estimated. Then the model is automatically respecified in subsequent steps, omitting one variable in each step, until only one independent variable remains in the model. The analyst may then inspect the sequence of model specifications, looking for a significant drop in the overall level of explanation of the behavior of the dependent variable. Once this loss is identified, the model specified prior to the deletion of the independent variable resulting in the significant loss is the optimal model.
Non-linear Regression Models
The simple regression model, linear in its equation (2) format, can be extended to the nonlinear forms of exponential and geometric relationships by use of the logarithmic transformation. The exponential form, y = a*e^(b*x), becomes linear in logarithms as ln(y) = ln(a) + b*x, and the geometric form, y = a*x^b, becomes linear as ln(y) = ln(a) + b*ln(x), so that either may be estimated with an ordinary least-squares procedure applied to the transformed data.
As we have already shown in regard to autoregression, the multiple regression model can be extended to other contexts. Another potentially productive extension of multiple regression is into the realm of the polynomial relationship. The polynomial equation includes one independent variable raised to successively higher powers. For example, a quadratic polynomial equation includes linear and second-order (or squared) terms in the format y = a + b(1)*x + b(2)*x^2.
The analyst should consider a polynomial form of relationship when the scatter diagram exhibits a curved path which is not apparently amenable to exponential or geometric modeling. As a general criterion, the analyst should specify a polynomial equation of order k equal to the number of directions of curvature apparent in the scatter diagram, plus 1. For example, if the scatter diagram exhibits one direction of curvature, then a k=2, or second-order, regression model should be specified. If the scatter diagram exhibits two directions of curvature, a k=3 or third-order (cubic) model of form y = a + b(1)*x + b(2)*x^2 + b(3)*x^3 should be specified.
Finally, we should note that the multiple regression format can accommodate a mixture of all of the formats described to this point. For example, suppose the analyst finds that trend is a significant predictor of the behavior of the object series, but that the explanation needs to be supplemented by the presence of two other independent variables, x1 and x2, the first linear and the other in a second-order relationship. Such a regression model might have the form y = a + b(1)*t + b(2)*x1 + b(3)*x2 + b(4)*x2^2.
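A sketch of estimating such a mixed model follows; the data are synthetic, generated only to make the example self-contained, and the statsmodels OLS routine is one of several libraries that could be used.

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(0)                    # hypothetical data for illustration
    n = 60
    t = np.arange(1, n + 1)                           # trend counter
    x1 = rng.normal(10.0, 2.0, n)                     # predictor entered linearly
    x2 = rng.normal(5.0, 1.0, n)                      # predictor entered to second order
    y = 200 + 0.4 * t + 1.5 * x1 + 0.3 * x2**2 + rng.normal(0.0, 2.0, n)

    # y = a + b1*t + b2*x1 + b3*x2 + b4*x2^2
    X = sm.add_constant(np.column_stack([t, x1, x2, x2**2]))
    print(sm.OLS(y, X).fit().params)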
Inferences about the Regression Model
Regression analysis purports to provide answers to a very specific question: "What is the nature of the relationship between the dependent variable and an independent variable?" The question is answered by estimating values of the parameters in a best-fit equation. But regression analysis begs two other very important questions: "Is there a significant relationship between the selected variables?" and, if so, "How strong (or close, or reliable) is the relationship?" If there is no significant relationship, or even if the existing relationship is only trivial, an automated regression analysis will dumbly estimate the values of the parameters. It is therefore necessary to delve into the significance of the estimated relationships. The existence and significance of an hypothesized relationship should perhaps be brought into question even before the regression analysis is conducted. It is the purpose of statistical inference analysis to assess the strength or quality of an estimated regression model.
Two statistics conventionally computed in inference analysis when regression models are estimated, the mean squared error and the standard error of the estimate, may be used for comparing regression forecasting models with the naive models described in other sections of this chapter and for specification of forecast confidence intervals (e.g., 95 and 99 percent).
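The sketch below computes the mean squared error, the standard error of the estimate, and a rough normal-approximation forecast interval; the function name, the degrees-of-freedom convention, and the tiny illustrative data are assumptions for demonstration.

    import numpy as np

    def forecast_interval(actual, fitted, n_params, point_forecast, z=1.96):
        # MSE and standard error of the estimate from the residuals, then a rough
        # interval about a point forecast (z = 1.96 for ~95 percent, 2.576 for ~99 percent).
        resid = np.asarray(actual, dtype=float) - np.asarray(fitted, dtype=float)
        mse = np.sum(resid**2) / (len(resid) - n_params)
        see = np.sqrt(mse)
        return mse, see, (point_forecast - z * see, point_forecast + z * see)

    actual = [233.4, 236.6, 239.8, 242.7, 245.2]
    fitted = [234.0, 236.1, 240.2, 242.0, 245.9]
    print(forecast_interval(actual, fitted, n_params=2, point_forecast=250.0))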
What's Ahead
This brings us to an end of our survey of econometric forecasting techniques. Regression techniques are employed in time-series decomposition, as described in Chapter 6. Combined autoregressive and moving average forecasting techniques are described in Chapter 8. Regression techniques, together with the capability of simulation modeling, may provide the informational basis of both strategic planning and tactical decision making.<>
In this chapter we describe a forecasting approach, time series decomposition, which brings together approaches introduced in Chapters 4 and 5.
Time Series Decomposition
Pioneering development of the techniques for decomposing a time series into constituent components was conducted for and by the U. S. Bureaus of the Census and Labor Statistics during the first half of the twentieth century. The object of such work was the seasonal adjustment of time series data. A beneficial spin-off has been the ability to analyze cyclical behavior. The two approaches emerging from the early pioneering work are today formally known as the Census Method II (in several variants) and the BLS method. The two methods are similar except in the ways in which they isolate one of the components. The ensuing discussion in this chapter follows the procedures of the BLS Method, but we will note how the Census Method II accomplishes the same end. While the objective of governmental agencies may be the seasonal adjustment of time series data, the techniques which have been developed can also constitute a powerful approach to forecasting time series behavior.
Components of a Time Series
The fundamental underlying assumption of this approach is that every time series is composed of a number of component parts which are in some way related to one another or the whole. As described in Chapter 2, the conventionally defined component parts are trend (T), cyclical (C), seasonal (S), and irregular (I) or random variation. It is possible that these four components are each independent of all of the others, so that the behavior of the series is simply the sum of its parts which are additively related. The majority of analysts familiar with the approach seem to be of the opinion that the component parts are unlikely to be perfectly independent of one another, and are therefore multiplicatively related.
Perhaps the easiest way to explain the process of time series decomposition is to describe the way in which a time series, known as the object series, may be decomposed into the component parts. From an object series written to column 1 of a data matrix are generated six additional series, four of which are the T, C, S, and I components. The seven series taken together constitute a "decomposition matrix."
Decomposition Techniques
The techniques used to decompose the object series are trend regression and "ratio to moving average" computations as developed by the Department of Commerce and the Bureau of the Census. The objective of these techniques is to identify both seasonality and cyclical behavior so that the former may be removed and the latter isolated for other purposes. Once the components of the time series have been separated into their own series, the reverse of the decomposition process, or recomposition, may be employed to construct forecasts of future values of the object series.
Before the decomposition process is started, the analyst should generate a sequence plot to identify the range over which trend is unidirectional. The trend estimation should then be conducted only over this range. The steps employed in the decomposition process are as follows (a programming sketch of the steps appears after the list):
a. The original or object series, regarded as containing all four components which are assumed to be multiplicatively related, i.e., OBJECT = T x C x S x I, is written to column 1 of the decomposition matrix.
b. A second series, written to column 2 of the decomposition matrix, is generated by smoothing the object series with a centered moving average. This is the first of two smoothing stages. If the smoothing is to be done by moving average, it is conventional to specify from 12 to 15 elements in the moving average set. The elements may be unweighted or weighted (user specified), or the user may choose a computed weight set. The problem of loss of data at the end of a centered moving average series may be handled by letting the number of elements in the set diminish to the number of remaining rows as the end of the series is approached. The interpretation is that seasonal and irregular influences are smoothed from the object series, leaving a combination TxC series which is written to column 2 of the decomposition matrix. The column 2 entries are numbers of the same magnitude as the column 1 object series.
c. A third series, written to column 3 of the matrix, is generated by computing the ratios of entries in column 1 (the object series, TxCxSxI) to the corresponding entries in column 2 (TxC), which by cancellation (or division) leaves a combination SxI series. This process is thus the basis for the name of the technique, "ratio-to-moving average." The column 3 entries are index numbers which vary about unity (1).
d. A fourth series, written to column 4 of the decomposition matrix, is generated by smoothing the SxI series in column 3 to eliminate the irregular influences, leaving an isolated seasonal series, S. This is the second-stage smoothing process. If the analyst opts for a moving average to accomplish the second-stage smoothing, it is conventional to employ about half as many elements as in the first-stage moving average. Again, the elements may be unweighted or weighted as specified by the user. All entries in the S series are totaled and averaged by months to constitute a set of twelve seasonal adjustment factors. The column 4 entries are index numbers which vary about unity.
e. A fifth series, written to column 5 in the decomposition matrix, is generated by computing the ratios of entries in column 3 (SxI) to the corresponding entries in column 4 (S), which by cancellation (or division) yields an isolated irregular, I, series. The column 5 entries are index numbers which vary about unity. (The Census Method accomplishes this by dividing a seasonally-adjusted original series, TxCxI, by a trend-cycle series, TxC, thus isolating the irregular component.)
f. A sixth series, written to column 6 in the decomposition matrix, is generated by trend regression on the unidirectional range of the object series. This series is interpreted as an isolated T series. The column 6 entries are numbers of the same magnitude as the object series numbers.
g. Finally, a seventh series, written to column 7 of the decomposition matrix, is generated by computing the ratios of the entries in the second column (TxC) to the corresponding entries in the sixth column (T), which by cancellation yields an isolated cyclical, C, series. The column 7 entries are index numbers which vary about unity.
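The sketch below follows the seven-column layout of the list in simplified form: it uses unweighted centered moving averages, does not shrink the windows at the ends of the series, and generates a synthetic series only so that the example is self-contained. The function and variable names are assumptions, not part of the BLS or Census procedures.

    import numpy as np
    import pandas as pd

    def decompose(y, first_window=12, second_window=5):
        # Ratio-to-moving-average decomposition of a multiplicative series y = TxCxSxI.
        y = pd.Series(np.asarray(y, dtype=float))
        t = np.arange(1, len(y) + 1)
        tc = y.rolling(first_window, center=True).mean()    # col 2: first smoothing -> TxC
        si = y / tc                                          # col 3: ratio (1)/(2) -> SxI
        s = si.rolling(second_window, center=True).mean()    # col 4: second smoothing -> S
        irr = si / s                                         # col 5: ratio (3)/(4) -> I
        b, a = np.polyfit(t, y, 1)
        trend = pd.Series(a + b * t)                         # col 6: trend regression -> T
        c = tc / trend                                       # col 7: ratio (2)/(6) -> C
        return pd.DataFrame({"TxCxSxI": y, "TxC": tc, "SxI": si,
                             "S": s, "I": irr, "T": trend, "C": c})

    rng = np.random.default_rng(5)
    months = np.arange(84)
    series = 230 + 0.4 * months + 3 * np.sin(2 * np.pi * months / 12) + rng.normal(0, 1, 84)
    dm = decompose(series)
    # Twelve average seasonal adjustment factors (assuming the series begins in January)
    print(dm["S"].groupby(months % 12 + 1).mean())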
Table 6-1 contains a decomposition matrix resulting from applying these procedures to series Y1 (introduced in Chapter 1) as the object series (the computations were done by the author's proprietary software). Columns 4 and 7 exhibit clearly-defined seasonal and cyclical patterns, respectively, and column 5 contains few runs of numbers above or below unity. Therefore, it may be judged that the smoothings employed in this decomposition were relatively effective.
Table 6-1. Time series decomposition matrix for Series Y1.
TIME SERIES DECOMPOSITION ANALYSIS DECOMPOSITION OF Y1 SERIES, 12 & 5 ELEMENT UNWEIGHTED MOVING AVERAGE THE CALCULATED LINEAR TREND EQUATION IS: Y = 228.3455 + 0.4058 * X ORIGINAL 12 MONTH RATIO 5 MTH RATIO REGRESS RATIO SERIES MV.AV. (1)/(2) MV.AV. (3)/(4) EST (1) (2)/(6) X DATE TxCxSxI TxC SxI S I T C (1) (2) (3) (4) (5) (6) (7) 1 1 233.4000 2 2 236.6000 3 3 239.8000 4 4 242.7000 5 5 245.2000 6 6 252.2000 7 7 252.3000 246.7080 1.0227 8 8 252.2000 248.0420 1.0168 9 9 251.1000 247.4250 1.0149 1.0199 0.9951 231.9980 1.0665 10 10 251.1000 246.5670 1.0184 1.0211 0.9973 232.4030 1.0609 11 11 252.2000 245.6170 1.0268 1.0226 1.0041 232.8090 1.0550 12 12 251.7000 244.6580 1.0288 1.0087 1.0199 233.2150 1.0491 13 1 249.4000 243.5250 1.0241 0.9950 1.0293 233.6210 1.0424 14 2 229.2000 242.4750 0.9453 0.9820 0.9625 234.0260 1.0361 15 3 229.5000 241.5500 0.9501 0.9714 0.9781 234.4320 1.0304 16 4 231.3000 240.4580 0.9619 0.9665 0.9952 234.8380 1.0239 17 5 233.7000 239.6000 0.9754 0.9791 0.9962 235.2440 1.0185 18 6 238.6000 238.6000 1.0000 0.9929 1.0072 235.6500 1.0125 19 7 239.7000 237.8170 1.0079 1.0014 1.0065 236.0550 1.0075 20 8 241.1000 236.5420 1.0193 1.0096 1.0096 236.4610 1.0003 21 9 238.0000 236.9500 1.0044 1.0121 0.9924 236.8670 1.0004 22 10 240.8000 236.9670 1.0162 1.0148 1.0013 237.2730 0.9987 23 11 240.2000 237.1580 1.0128 1.0084 1.0044 237.6780 0.9978 24 12 242.3000 237.1920 1.0215 1.0051 1.0163 238.0840 0.9963 25 1 234.1000 237.1750 0.9870 0.9961 0.9909 238.4900 0.9945 26 2 234.1000 236.9500 0.9880 0.9911 0.9968 238.8960 0.9919 27 3 229.7000 236.5420 0.9711 0.9852 0.9857 239.3020 0.9885 28 4 233.6000 236.4670 0.9879 0.9901 0.9978 239.7070 0.9865 29 5 234.1000 236.0250 0.9918 0.9943 0.9976 240.1130 0.9830 30 6 238.4000 235.6750 1.0116 1.0007 1.0108 240.5190 0.9799 31 7 237.0000 234.8830 1.0090 1.0048 1.0042 240.9250 0.9749 32 8 236.2000 235.4330 1.0033 1.0055 0.9977 241.3300 0.9756 33 9 237.1000 235.2000 1.0081 1.0020 1.0060 241.7360 0.9730 34 10 235.5000 236.5000 0.9958 0.9953 1.0004 242.1420 0.9767 35 11 236.0000 237.4080 0.9941 0.9956 0.9985 242.5480 0.9787 36 12 232.8000 238.6580 0.9755 0.9863 0.9890 242.9540 0.9823 37 1 240.7000 239.6000 1.0046 0.9902 1.0145 243.3590 0.9846 38 2 231.3000 240.5250 0.9616 0.9930 0.9684 243.7650 0.9867 39 3 245.3000 241.5580 1.0155 1.0025 1.0130 244.1710 0.9893 40 4 244.5000 242.5580 1.0080 1.0058 1.0022 244.5770 0.9917 41 5 249.1000 243.5500 1.0228 1.0154 1.0073 244.9820 0.9942 42 6 249.7000 244.5670 1.0210 1.0143 1.0066 245.3880 0.9967 43 7 248.1000 245.7170 1.0097 1.0142 0.9956 245.7940 0.9997 44 8 248.6000 246.1250 1.0101 1.0096 1.0005 246.2000 0.9997 45 9 249.1000 247.2580 1.0074 1.0054 1.0020 246.6060 1.0026 46 10 247.4000 247.5080 0.9996 1.0019 0.9976 247.0010 1.0020 47 11 248.2000 248.1250 1.0003 0.9969 1.0035 247.4170 1.0029 48 12 246.6000 248.5250 0.9923 0.9913 1.0010 247.8230 1.0028 49 1 245.6000 249.4080 0.9847 0.9896 0.9951 248.2290 1.0048 50 2 244.9000 249.9920 0.9796 0.9901 0.9894 248.6340 1.0055 51 3 248.3000 250.5920 0.9909 0.9933 0.9975 249.0400 1.0062 52 4 251.9000 251.1500 1.0030 1.0026 1.0004 249.4460 1.0068 53 5 253.9000 251.8250 1.0082 1.0083 0.9999 249.8520 1.0079 54 6 260.3000 252.4000 1.0313 1.0119 1.0192 250.2580 1.0086 55 7 255.1000 253.0420 1.0081 1.0125 0.9957 250.6630 1.0095 56 8 255.8000 253.5830 1.0087 1.0114 0.9974 251.0690 1.0100 57 9 255.8000 254.2670 1.0060 1.0052 1.0008 251.4750 1.0111 58 10 255.5000 254.7670 1.0029 1.0028 1.0001 251.8810 1.0115 59 11 255.1000 255.0500 1.0002 0.9987 1.0015 252.2860 1.0110 60 12 
254.3000 255.2580 0.9962 0.9957 1.0006 252.6920 1.0102 61 1 252.1000 255.1830 0.9879 0.9940 0.9939 253.0980 1.0082 62 2 253.1000 255.3830 0.9911 0.9935 0.9976 253.5040 1.0074 63 3 254.3000 255.7420 0.9944 0.9946 0.9998 253.9100 1.0072 64 4 255.3000 255.8670 0.9978 0.9997 0.9980 254.3150 1.0061 65 5 256.4000 255.9750 1.0017 1.0026 0.9990 254.7210 1.0049 66 6 259.4000 255.8500 1.0139 1.0068 1.0070 255.1270 1.0028 67 7 257.5000 256.0920 1.0055 1.0082 0.9973 255.5330 1.0022 68 8 260.1000 256.1920 1.0153 1.0083 1.0069 255.9380 1.0010 69 9 257.3000 256.1000 1.0047 1.0033 1.0014 256.3440 0.9990 70 10 256.8000 256.2580 1.0021 1.0027 0.9995 256.7500 0.9981 71 11 253.6000 256.4000 0.9891 0.9966 0.9925 257.1560 0.9971 72 12 257.2000 256.6420 1.0022 0.9915 1.0108 257.5620 0.9964 73 1 253.3000 257.1830 0.9849 0.9900 0.9948 257.9670 0.9970 74 2 252.0000 257.3830 0.9791 0.9913 0.9877 258.3730 0.9962 75 3 256.2000 257.5250 0.9949 0.9913 1.0036 258.7790 0.9952 76 4 257.4000 258.1330 0.9956 0.9992 0.9964 259.1850 0.9959 77 5 259.3000 258.7670 1.0021 1.0033 0.9988 259.5900 0.9968 78 6 265.9000 259.5250 1.0246 1.0051 1.0193 259.9960 0.9982 79 7 259.9000 260.0830 0.9993 1.0083 0.9910 260.4020 0.9988 80 8 261.8000 260.7000 1.0042 1.0096 0.9946 260.8080 0.9996 81 9 264.6000 261.5700 1.0116 1.0046 1.0069 261.2140 1.0014 82 10 264.4000 262.1670 1.0085 1.0052 1.0033 261.6190 1.0021 83 11 262.7000 262.8130 0.9996 1.0055 0.9941 262.0250 1.0030 84 12 263.9000 263.3140 1.0022 1.0034 0.9988 262.4310 1.0034 Table 6-1, continued. AVERAGE SEASONAL ADJUSTMENT FACTORS: MONTH SEASONAL 1 0.9921 2 0.9898 3 0.9894 4 0.9937 5 1.0002 6 1.0050 7 1.0079 8 1.0087 9 1.0072 10 1.0059 11 1.0031 12 0.9971 FOR ORIGINAL SERIES, MEAN SQUARED ERROR IS: 5.8641 STANDARD ERROR OF THE ESTIMATE: 2.4216
Forecasting by Recomposition
Given the data in columns 4 through 7 of the decomposition matrix, the seasonal adjustment factors computed from column 4, and the trend (or multiple) regression equation which generated column 6, a forecast may be constructed. The primary technique of decomposition was division; the primary technique of forecasting then is the opposite, recomposition by multiplication (a programming sketch of the arithmetic appears after the list):
1. A trend estimate (T) of the value of the object series in the target month is made by entering the target month row number into the trend regression equation as described in Chapter 5; this is a "first-approximation" forecast.
2. The trend estimate is multiplied by a cyclical adjustment factor which is either entered by the analyst or selected from the values in column 7. In the latter case, the analyst exercises judgment to pick cyclical adjustment factors which are like what the analyst expects to be the cyclical characteristics of the target month. The resulting product is a trend-cycle (TxC) forecast.
3. The TxC forecast is multiplied by the correct seasonal adjustment factor, given the identity of the target month. The result (TxCxS) is a trend-cycle forecast to which a seasonal multiplier has been applied.
4. Finally, the TxCxS forecast may be multiplied by an irregular adjustment factor if the user has reason to anticipate any unusual condition or event which will affect the situation. The I multiplier could be selected from a row of column 5 which the analyst thinks likely to be similar in properties to the target period. The final product is a TxCxSxI forecast.
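The recomposition arithmetic is simple enough to show in a few lines. The function name is an assumption; the coefficient and factor values are taken from Tables 6-1 and 6-2, so the result approximately reproduces the month-87 forecast (small differences arise because the printed trend coefficients are rounded).

    def recompose_forecast(a, b, target_row, seasonal, cyclical=1.0, irregular=1.0):
        # Step 1: trend estimate T from the trend regression equation; steps 2-4:
        # multiply by cyclical, seasonal, and (optional) irregular adjustment factors.
        trend = a + b * target_row
        return trend * cyclical * seasonal * irregular

    print(recompose_forecast(a=228.3455, b=0.4058, target_row=87,
                             seasonal=0.9894, cyclical=0.9846, irregular=1.0100))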
Table 6-2 shows recomposition forecasts for months 87 through 92 for the original object series Y1 which ended at month 84.
Table 6-2. Recomposition Forecasts for Series Y1.
 X   MONTH   TREND FORECAST (T)   SEASON (S)   CYCLIC (C)   IRREG (I)   REVISED FORECAST (STCI)
87     3         263.6482           0.9894       0.9846       1.0100         259.3816
88     4         264.0540           0.9937       0.9867       1.0120         261.9989
89     5         264.4598           1.0002       0.9893       1.0030         262.4561
90     6         264.8656           1.0050       0.9917       0.9910         261.6063
91     7         265.2713           1.0079       0.9942       0.9860         262.0808
92     8         265.6771           1.0087       0.9967       0.9850         263.0751
Assessment
Time series recomposition is a complex albeit still naive approach to time series forecasting. It may appear to be a purely technical approach to forecasting a value at a date beyond the end of the original series; however, to compose a successful forecast the analyst must exercise a great deal of judgment in specifying weights and choosing adjustment factors.
A cautionary word is in order. One who would use time series recomposition procedures as a means for forecasting a time series must of necessity become a student of the history of the period covered by the original time series. It is only with an intimate historical knowledge of the period that the analyst can hope to make appropriate adjustments to the trend estimate of the target date value so as to compose an accurate forecast.
What's Next
The next two chapters bring together regression and moving average techniques into an integrated approach to forecasting time series.<>
Autocorrelation
In any time series containing non-random patterns of behavior, it is likely that any particular item in the series is related in some fashion to other items in the same series. If there is a consistent relationship between entries in the series, e.g., the 5th item is like the 1st, the 6th is like the 2nd, and so on, then it should be possible to use information about the relationship to forecast future values of the series, i.e., the 33rd item should be like the 29th. In this case we may say that the series has some ability to forecast itself because of autocorrelation (or self-correlation) among values within the series.
Autocorrelation Coefficients
One means for identifying and assessing the strength of autocorrelation in a time series is to compute the coefficients of autocorrelation between pairs of entries within the series. If the analyst is interested in the autocorrelation between adjacent entries, the autocorrelation should be specified to order k=1. For the correlation between every other entry in the series, the autocorrelation should be specified to order k=2. Autocorrelation order k=3 would be for the correlation between each entry and the third from it in the series, and so on.
The conventional formula for computing such autocorrelations is
r(k) = SUM[(y(t) - ybar)*(y(t+k) - ybar), t = 1 to n-k] / SUM[(y(t) - ybar)^2, t = 1 to n],
where ybar is the mean of the n observations in the series.
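A direct translation of that formula into Python follows; the function name and the short illustrative series are assumptions for demonstration.

    import numpy as np

    def autocorr(y, k):
        # Order-k sample autocorrelation: covariance of entries k apart over the series variance.
        y = np.asarray(y, dtype=float)
        dev = y - y.mean()
        return np.sum(dev[:-k] * dev[k:]) / np.sum(dev**2)

    series = [15.7, 14.3, 13.8, 13.6, 12.4, 13.9, 15.4, 13.8, 14.1, 13.5, 14.1, 15.0]
    print([round(autocorr(series, k), 4) for k in (1, 2, 3)])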
The Correlogram
In a statistical package designed for time series analysis, the user can specify some high order of autocorrelation, say k=20, for which autocorrelation coefficients are desired. Once the autocorrelation coefficients are computed, they may be plotted against the order k to constitute a simple autocorrelation correlogram such as that illustrated in Figure 7-1 for Series Y1. Before considering an interpretation of the information contained in the simple correlogram, let us develop the idea for a second type of correlogram.
Figure 7-1. Simple autocorrelation correlogram for Series Y1 (bar rendering omitted; coefficient values shown).
 K   SIMPLE AUTOCORR      K   SIMPLE AUTOCORR
 1      .8858            11      .5186
 2      .8121            12      .5276
 3      .7008            13      .5171
 4      .6207            14      .4436
 5      .5555            15      .3853
 6      .5249            16      .3105
 7      .4768            17      .2687
 8      .4508            18      .2243
 9      .4636            19      .1862
10      .4794            20      .1592
A partial autocorrelation coefficient for order k measures the strength of correlation among pairs of entries in the time series while accounting for (i.e., removing the effects of) all autocorrelations below order k. For example, the partial autocorrelation coefficient for order k=5 is computed in such a manner that the effects of the k=1, 2, 3, and 4 partial autocorrelations have been excluded. The partial autocorrelation coefficient of any particular order is the same as the autoregression coefficient (described in Chapter 5) of the same order. Figure 7-2 illustrates a partial autocorrelation correlogram for Series Y1.
Figure 7-2. Partial autocorrelation correlogram for Series Y1 (bar rendering omitted; coefficient values shown).
 K   PARTIAL AUTOCORR     K   PARTIAL AUTOCORR
 1      .8858            11      .0925
 2      .1278            12     -.0658
 3     -.1878            13     -.0579
 4      .0352            14     -.2402
 5      .0811            15     -.0432
 6      .1222            16     -.0119
 7     -.0826            17      .0542
 8      .0297            18     -.0238
 9      .2613            19     -.0684
10      .0875            20      .0480
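Simple and partial correlograms of this kind can be produced with standard library routines; the sketch below uses the statsmodels acf and pacf functions and a rough 1.96/sqrt(n) significance bound. The synthetic trending series is an assumption made only to keep the example self-contained.

    import numpy as np
    from statsmodels.tsa.stattools import acf, pacf

    rng = np.random.default_rng(1)
    y = 230 + np.cumsum(rng.normal(0.2, 1.0, 84))     # illustrative trending monthly series
    simple = acf(y, nlags=20)
    partial = pacf(y, nlags=20)
    bound = 1.96 / np.sqrt(len(y))                    # rough 95-percent significance bound
    for k in range(1, 21):
        flag = "*" if abs(partial[k]) > bound else " "
        print(f"k={k:2d}  simple {simple[k]:+.4f}   partial {partial[k]:+.4f} {flag}")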
Selection Criteria
Several criteria may be specified for choosing a model format, given the simple and partial autocorrelation correlograms for a series:
(a) If none of the simple autocorrelations is significantly different from zero, the series is essentially a random number or white-noise series which is not amenable to autoregressive modeling.
(b) If the simple autocorrelations decrease linearly, passing through zero to become negative, or if the simple autocorrelations exhibit a wave-like cyclical pattern, passing through zero several times, the series is not stationary; it must be differenced one or more times before it may be modeled with an autoregressive process.
(c) If the simple autocorrelations exhibit seasonality, i.e., there are autocorrelation peaks every dozen or so (in monthly data) lags, the series is not stationary; it must be differenced with a gap approximately equal to the seasonal interval before further modeling.
(d) If the simple autocorrelations decrease exponentially but approach zero gradually, while the partial autocorrelations are significantly non-zero through some small number of lags beyond which they are not significantly different from zero, the series should be modeled with an autoregressive process.
(e) If the partial autocorrelations decrease exponentially but approach zero gradually, while the simple autocorrelations are significantly non-zero through some small number of lags beyond which they are not significantly different from zero, the series should be modeled with a moving average process.
(f) If the partial and simple autocorrelations both converge upon zero for successively longer lags, but neither actually reaches zero after any particular lag, the series may be modeled by a combination autoregressive and moving average process.
The simple autocorrelation correlogram for series Y1 exhibits a characteristic decline and approach toward zero, thus suggesting that Series Y1 can be modeled by an AR process of some order. The particular order can be inferred by counting the number of significantly non-zero partial autocorrelations, in this case 1. Therefore, Series Y1 should be modeled by an AR(1) equation.
What's Next
Techniques for combining autoregressive and moving average approaches are elaborated in Chapter 8 and its appendix.<>
Integrating Autoregressive with Moving Average Approaches
In Chapter 4 we introduced moving average models and in Chapter 5 we showed how multiple regression analysis could be extended to the autoregressive context. These two apparently separate modeling approaches are encompassed by a class of models known more generally as ARMA or ARIMA models. The process of ARIMA modeling serves to integrate the two approaches to modeling.
Theoretically, any time series that contains no trend or from which trend has been removed can be represented as consisting of two parts, a self-deterministic part and a disturbance component. The self-deterministic part of the series should be forecastable from its own past by an autoregressive (AR) model with some number of terms, p, of the form y(t) = a + b(1)*y(t-1) + b(2)*y(t-2) + ... + b(p)*y(t-p).
An autoregressive model of order p is conventionally classified as AR(p). A moving average model with q terms is classified as MA(q). A combination model containing p autoregressive terms and q moving average terms is classified as ARMA(p,q). If the object series is differenced d times to achieve stationarity, the model is classified as ARIMA(p,d,q), where the symbol "I" signifies "integrated." An ARIMA(p,0,q) is the same as an ARMA(p,q) model; likewise, an ARIMA(p,0,0) is the same as an AR(p) model, and an ARIMA(0,0,q) is the same as an MA(q) model.
Various approaches have been developed for ARIMA modeling. The procedure which has become the standard for estimating ARIMA models was proposed by G. E. P. Box and G. M. Jenkins (Time Series Analysis: Forecasting and Control, San Francisco: Holden-Day, 1970). The procedure involves making successive approximations through three stages: identification, estimation, and diagnostic checking.
Stages in the Analysis
In the identification stage, the analyst's job is to ensure that the series is sufficiently stationary (free of trend and seasonality), and to specify the appropriate number of autoregressive terms, p, and moving average terms, q. Statisticians have developed the concept of the autocorrelation correlogram (introduced in Chapter 7) to serve as the basis for judgment of the stationarity of the series and to provide criteria for specification of p and q.
In the estimation stage, the analyst's job is to estimate the parameters (coefficient values) of the specified numbers, p and q, of autoregressive and moving average terms. This is usually accomplished by implementing some form of regression analysis.
Once the parameter values of the specified model have been estimated, the third stage of diagnostic checking is undertaken. The objective of diagnostic checking is to ascertain whether the model "fits" the historical data well enough. To accomplish diagnostic checking, the model is used to forecast all of the extant values in the series. The model is judged to fit the series well if the differences between the actual series values and the forecasted values are small enough and sufficiently random. If the differences (also known as "residuals") are judged not to be sufficiently small or random, they may contain additional information which, if captured by further analysis, can enhance the forecastability of the model. To capture the additional information, the analyst must "return to the drawing board" to respecify the model and reestimate the parameters.
If an adequate model cannot be specified solely in terms of autoregressive and moving average terms, the analyst may resort to inclusion of other variables in the model as described in the discussion of multiple regression models in Chapter 5. Indeed, in the eyes of Rational Expectations theorists, the analyst would be remiss if other variables were not included to encompass all available information.
Criteria for Identification
The autocorrelation correlograms introduced in Chapter 7 may serve as criteria to judge:
a. whether a series is sufficiently stationary;
b. whether the appropriate model is an AR, an MA, or some combination; and
c. what order, p or q, of AR or MA model will likely best fit the data.
The analyst should proceed by constructing both simple and partial autocorrelation correlograms for the object series to some generous order, k, perhaps between 12 and 20. If upon inspection of the simple autocorrelation correlogram the analyst notes that all of the simple autocorrelations lie within the selected confidence interval, the analyst should conclude that the series is essentially a random number series which is not amenable to either AR or MA modeling. The analyst may then resort to multiple regression modeling as espoused by Rational Expectations theorists in expectation of identifying other data series which contain information useful in predicting the behavior of the object series.
Assuming that all of the simple autocorrelations do not fall within the selected confidence interval, there may still be a possibility of establishing an effective ARIMA model. If the simple autocorrelations start from quite high levels and descend linearly, eventually passing through zero to become negative, the implication is that the series is non-stationary, i.e., that it contains a significant trend component. This possibility may be confirmed by generating a sequence plot as described in Chapter 1, or by conducting a time series decomposition as described in Chapter 6.
If the analyst wishes to continue in applying ARIMA techniques to the object series, it must first be converted to a stationary series. Stationarity is usually accomplished by differencing the series, i.e., by performing a difference transformation upon the original series. The differenced series is a new series consisting of the increments between consecutive items in the original series. The differenced series is less likely to exhibit trend than is the original series. Correlograms should then be constructed for the differenced series. If the simple autocorrelations for the differenced series still exhibit nonstationarity (rare in economic data), it may be subjected to a second differencing. The analyst should continue with the differencing process until satisfied that the resulting series is sufficiently free of trend (or stationary) before proceeding to the specification of an ARIMA model.
In a series containing seasonal behavior, the simple or partial autocorrelations exhibit spurts of values at the seasonal intervals (e.g., every fourth autocorrelation in quarterly data, or every twelfth autocorrelation in monthly data), even if the autocorrelations otherwise seem to converge upon zero. Before attempting to specify an ARMA model of a highly-seasonal series, the analyst should first difference the series with a gap equal to what appears to be the seasonal interval, then construct correlograms for the differenced series to confirm stationarity. After sufficient stationarity has been attained, the analyst may inspect the simple and partial correlograms to draw an inference about the appropriate form of the model. Fortunately, the simple and partial correlograms for a series which is best modeled by an AR process exhibit patterns which are practically the opposite of those for a series which is best modeled by moving averages.
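A short sketch of the differencing transformations discussed above follows; the synthetic trending series is an assumption made only so that the example runs on its own, and pandas and statsmodels are used merely as convenient routines.

    import numpy as np
    import pandas as pd
    from statsmodels.tsa.stattools import acf

    rng = np.random.default_rng(2)
    y = pd.Series(100 + 0.5 * np.arange(120) + rng.normal(0, 1, 120))   # trending monthly series
    d1 = y.diff().dropna()          # first difference: increments between consecutive items
    d12 = y.diff(12).dropna()       # gap-12 difference for monthly seasonality
    print(acf(d1, nlags=12).round(3))   # recheck the correlogram on the differenced series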
Figure 8-1 shows the theoretical autocorrelation patterns for a series best modeled by autoregression. It should be noted that the simple autocorrelations gradually approach zero, but the partial autocorrelations are significantly non-zero to some point (in this case through k=2) beyond which they are not significantly different from zero. With experience the analyst will recognize that the series exhibiting correlogram patterns similar to those illustrated in Figure 8-1 should be modeled by an AR equation of degree indicated by the number of significantly non-zero partial autocorrelations. The series for which autocorrelations are illustrated in Figure 8-1 should then be modeled by an AR(2) equation.
Figure 8-1. Theoretical simple and partial autocorrelations for a series best modeled by an AR process.
[Correlogram plots for Figure 8-1: the simple autocorrelations decline gradually toward zero as k increases; the partial autocorrelations are significantly non-zero through k=2 and are near zero thereafter.]
By way of contrast, a series which exhibits the opposite patterns for the simple and partial correlograms, as illustrated in Figure 8-2, should be modeled by a moving average process. In Figure 8-2, it should be apparent that the partial autocorrelations gradually decrease and approach zero, but the simple autocorrelations drop after one significantly non-zero value beyond which higher-ordered simple autocorrelations are not significantly different from zero. This series then is best modeled in an MA(1) specification, because there is one significantly non-zero simple autocorrelation.
Figure 8-2. Theoretical simple and partial autocorrelations for a series best modeled by an MA process.
[Correlogram plots for Figure 8-2: the simple autocorrelations drop to near zero after one significantly non-zero value at k=1; the partial autocorrelations decline gradually toward zero as k increases.]
We shall not illustrate a third case which can easily be described. If the simple and partial autocorrelations both gradually approach zero, but neither drops off to be insignificantly different from zero beyond some definite order, then a combination ARMA (or ARIMA if the series has been differenced) model should be specified.
The correlograms illustrated in Figures 8-1 and 8-2 are for theoretical autocorrelation distributions. Correlograms constructed for any real historical series should not be expected to follow precisely the theoretical patterns illustrated for the AR and MA models, but rather to exhibit some scatter about the theoretical paths. The confidence interval is therefore very important to the analyst in attempting to recognize when either the simple or the partial autocorrelations drop to values which are not significantly different from zero. The analyst may have to inspect correlograms for a great many real series in order to become comfortable in his ability to recognize patterns which are best modeled by AR, MA, or some combination specification. Although we have illustrated the 95-percent confidence interval in Figures 8-1 and 8-2, the analyst may feel more comfortable with some other confidence interval.
The theoretical autocorrelation patterns which we have been illustrating have all exhibited "positive autocorrelation" in the sense that observations have followed a pattern of "runs" of observations above and below the mean for the series. Each of the so-called "runs" consists of a number (greater than one) of observations above or below the mean, followed by some number of observations below or above the mean. We have not illustrated correlograms for series characterized by "negative autocorrelation" where successive observations regularly alternate above and below the mean, with no runs longer than single observations. Series characterized by negative autocorrelation are not as common in economic and business contexts as are positively autocorrelated series. Suffice it to note that the correlograms for a negatively autocorrelated series differ from those illustrated for positively autocorrelated series in that successive simple and partial autocorrelations alternate between positive and negative signs. Otherwise, they can be expected to converge upon zero in a similar manner to that in which stationary, positively autocorrelated series do.
The correlograms illustrated in Chapter 7 for Series Y1 suggest that Series Y1 should be modeled with an AR process of order 1. In order to illustrate the Box-Jenkins approach to ARIMA modeling, we shall introduce a new series, Y2, for which data are specified in Table 8-1.
Table 8-1. Monthly data for series Y2
MONTH   YEAR 1   YEAR 2   YEAR 3   YEAR 4   YEAR 5   YEAR 6
  1      15.7     14.5     13.1     11.9     14.9     12.8
  2      14.3     13.7     14.6     11.8     15.6     18.6
  3      13.8     14.1     13.1     12.6     15.8     15.9
  4      13.6     14.8     13.6     13.3     14.9     14.1
  5      12.4     14.1     11.6     13.1     16.6     14.8
  6      13.9     14.2     11.1     14.1     15.9     15.6
  7      15.4     14.1     11.7     12.9     15.6     15.4
  8      13.8     14.1     11.1     13.5     15.4
  9      14.1     14.1     11.9     15.0     15.6
 10      13.5     13.4     11.6     14.6     14.9
 11      14.1     14.2     11.3     13.9     15.9
 12      15.0     13.1     12.8     14.0     14.2
The Box-Jenkins Procedure for Estimating an ARIMA Model
1. The first step in the Box-Jenkins procedure is to generate a sequence plot and sufficiently high-order correlograms for the object series. Figure 8-3 shows the correlograms through k=12 for series Y2.
Figure 8-3. Simple and partial autocorrelation correlograms for Series Y2.
 K   SIMPLE AUTOCORR   PARTIAL AUTOCORR
 1       .6352             .6352
 2       .5243             .2026
 3       .6153             .3864
 4       .5530             .0831
 5       .5413             .1690
 6       .4279            -.1718
 7       .3330             .1331
 8       .3313            -.0908
 9       .3029             .0058
10       .1803            -.1518
11       .1232            -.0347
12       .0731            -.1015
(Bar renderings of the correlograms are omitted; the coefficient values are shown.)
2. If the sequence plot of the object series exhibits noticeable trend, seasonal,
or cyclical behavior, then differencing of the series is in order. Inspection of
the plots illustrated in Figure 8-3 suggests that series Y2 is adequately stationary,
so it was judged that no differencing of series Y2 is necessary. However, if a
differenced series still exhibits noticeable trend or cyclical behavior (rare in
economic and business time series), it should be differenced again to generate a
"second difference" series, and step 1 should be repeated. All further analysis
should be conducted on the series generated in the last differencing, but a forecast
made with a model specified for a differenced series will be for the change from the
last period of the original series, not for the level of the series.
3. The patterns of the correlograms generated in step 1 should constitute the basis for a judgment as to whether an AR, MA, ARMA, or ARIMA format will be most suitable for modeling the object series. It appears from the correlograms in Figure 8-3 that an ARMA specification of autoregressive order p=9 and moving average order q=3 will be appropriate as a first approximation.
4. Modern statistical packages such as SPSS and EViews contain automated routines for estimating ARIMA models. Figure 8-4 illustrates the EViews display for an ARIMA(9,0,3) regression model. The probability column in Figure 8-4 indicates that the coefficients of terms AR(1) and AR(2) are statistically significant at or very near the .05 level, but that the coefficient of term AR(3) is not. Nor are the coefficients of any autoregressive terms beyond AR(5). All three of the moving average terms are significant at the .05 level or below.
Figure 8-4. EViews estimate of ARIMA(9,0,3) model of Y2.
Dependent Variable: Y2
Method: Least Squares
Sample (adjusted): 1984:10 1989:07
Included observations: 58 after adjusting endpoints
Convergence achieved after 67 iterations
Variable     Coef.        Std. Err.    t-Stat.      Prob.
C            14.91734     1.579435     9.444736     0.0000
AR(1)        -0.435004    0.217974     -1.995670    0.0520
AR(2)        -0.501838    0.140726     -3.566066    0.0009
AR(3)        0.303713     0.233046     1.303234     0.1991
AR(4)        0.521362     0.145279     3.588703     0.0008
AR(5)        0.620274     0.122892     5.047310     0.0000
AR(6)        0.357173     0.224648     1.589921     0.1189
AR(7)        0.025419     0.207039     0.122776     0.9028
AR(8)        -0.334811    0.187344     -1.787149    0.0807
AR(9)        0.132355     0.143150     0.924589     0.3601
MA(1)        0.796780     0.264760     3.009443     0.0043
MA(2)        1.424237     0.159093     8.952225     0.0000
MA(3)        0.750934     0.316381     2.373514     0.0219
R-squared            0.791949     Mean dependent var      13.99483
Adjusted R-squared   0.736469     S.D. dependent var       1.501511
S.E. of regression   0.770805     Akaike info criterion    2.511732
Sum squared resid   26.73628      Schwarz criterion        2.973555
Log likelihood     -59.84021      F-statistic             14.27444
Durbin-Watson stat   1.893651     Prob(F-statistic)        0.000000
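For readers working outside EViews, an analogous first-approximation fit can be made with the statsmodels library, as in the sketch below. The estimates will not match Figure 8-4 exactly because the estimation algorithms differ; the data are the 67 observations of Table 8-1 arranged chronologically, and everything else in the sketch is an assumption for illustration.

    import numpy as np
    from statsmodels.tsa.arima.model import ARIMA

    y2 = np.array([
        15.7, 14.3, 13.8, 13.6, 12.4, 13.9, 15.4, 13.8, 14.1, 13.5, 14.1, 15.0,   # year 1
        14.5, 13.7, 14.1, 14.8, 14.1, 14.2, 14.1, 14.1, 14.1, 13.4, 14.2, 13.1,   # year 2
        13.1, 14.6, 13.1, 13.6, 11.6, 11.1, 11.7, 11.1, 11.9, 11.6, 11.3, 12.8,   # year 3
        11.9, 11.8, 12.6, 13.3, 13.1, 14.1, 12.9, 13.5, 15.0, 14.6, 13.9, 14.0,   # year 4
        14.9, 15.6, 15.8, 14.9, 16.6, 15.9, 15.6, 15.4, 15.6, 14.9, 15.9, 14.2,   # year 5
        12.8, 18.6, 15.9, 14.1, 14.8, 15.6, 15.4])                                 # year 6

    fit = ARIMA(y2, order=(9, 0, 3)).fit()      # first-approximation ARMA(9,3) specification
    print(fit.summary())
    print(fit.resid[:5])                        # residuals are available for diagnostic checking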
5. The model is then respecified to retain only independent variables AR(1), AR(2),
AR(3), AR(5), AR(6), and all of the moving average terms through q=3. This model is
illustrated in Figure 8-5. The probability column in Figure 8-5 reveals that the AR(6),
illustrated in Figure 8-5. The probability column in Figure 8-5 reveals that the AR(6),
MA(1), and MA(3) terms now are not statistically significant at the .05 level.
Figure 8-5. EViews estimation of ARIMA model of Y2 with selected (statistically significant) AR and MA terms.
Dependent Variable: Y2
Method: Least Squares
Sample (adjusted): 1984:07 1989:07
Included observations: 61 after adjusting endpoints
Convergence achieved after 14 iterations
Variable     Coef.        Std. Err.    t-Stat.      Prob.
C            14.30390     1.041311     13.73644     0.0000
AR(1)        0.641780     0.292500     2.194119     0.0327
AR(2)        -0.598591    0.273481     -2.188785    0.0331
AR(3)        0.526515     0.194278     2.710116     0.0091
AR(5)        0.382142     0.183147     2.086528     0.0419
AR(6)        -0.122431    0.146278     -0.836980    0.4064
MA(1)        -0.311070    0.308225     -1.009230    0.3175
MA(2)        0.548049     0.283719     1.931663     0.0589
MA(3)        0.106984     0.234677     0.455877     0.6504
R-squared            0.619447     Mean dependent var      14.01639
Adjusted R-squared   0.560900     S.D. dependent var       1.474808
S.E. of regression   0.977276     Akaike info criterion    2.927358
Sum squared resid   49.66360      Schwarz criterion        3.238798
Log likelihood     -80.28440      F-statistic             10.58038
Durbin-Watson stat   1.992037     Prob(F-statistic)        0.000000
The analyst should continue through subsequent rounds of respecification to remove
non-contributing terms (i.e., those which are statistically insignificant) until all of
the remaining terms in the model are statistically significant at the selected criterion
level (usually .05). This final model can then be used to forecast point estimates of the
original series (if no differencing was needed) or the change in the original series
(if differencing was done), with an appropriate confidence interval.
The model again is respecified to retain only independent variables AR(1), AR(2), AR(3), AR(5), and MA(2) as illustrated in Figure 8-6. In this model, all of the independent variable terms are statistically significant. The adjusted R-square increased from 0.5609 in the Figure 8-5 model to 0.5907 in the Figure 8-6 model. This final version is a "parsimonious" model because all terms which do not contribute significantly to the explanation of the dependent variable have been deleted, and the model retains only the minimal number of terms which are statistically significant explainers of the behavior of the dependent variable.
Figure 8-6. EViews estimation of Parsimonious ARIMA model of Y2 retaining only statistically significant terms.
Dependent Variable: Y2
Method: Least Squares
Sample (adjusted): 1984:06 1989:07
Included observations: 62 after adjusting endpoints
Convergence achieved after 12 iterations
Variable     Coef.        Std. Err.    t-Stat.      Prob.
C            14.51484     1.229312     11.80729     0.0000
AR(1)        0.359086     0.107763     3.332168     0.0015
AR(2)        -0.869511    0.099188     -8.766287    0.0000
AR(3)        0.693095     0.104875     6.608793     0.0000
AR(5)        0.613063     0.114746     5.342776     0.0000
MA(2)        0.916126     0.070391     13.01490     0.0000
R-squared            0.624281     Mean dependent var      14.01452
Adjusted R-squared   0.590734     S.D. dependent var       1.462745
S.E. of regression   0.935775     Akaike info criterion    2.796881
Sum squared resid   49.03774      Schwarz criterion        3.002733
Log likelihood     -80.70332      F-statistic             18.60948
Durbin-Watson stat   2.003398     Prob(F-statistic)        0.000000
Diagnostic Checking
Once the ARIMA model has been specified and the parameters estimated, the model should be checked for adequacy. One way to do this is to use the model to forecast all of the known values of the data series, compute the differences (i.e., the residuals) between the known and forecasted values, and generate the simple autocorrelation correlogram for the residuals. If none of the residual autocorrelations is significantly different from zero, the model may be judged adequate. Figure 8-7 shows the EViews correlogram for the model illustrated in Figure 8-6. Since none of the simple autocorrelations is significantly non-zero, it is unlikely that there is other information contained in the residuals which, if captured by further analysis, might enhance the ability of the model to forecast.
Figure 8-7. EViews Correlogram for Parsimonious ARIMA model of Y2.
Sample: 1984:06 1989:07   Included observations: 62
 LAG      AC        PAC
  1     -0.008    -0.008
  2      0.088     0.087
  3      0.054     0.056
  4      0.126     0.120
  5      0.073     0.069
  6      0.045     0.026
  7      0.079     0.058
  8     -0.054    -0.081
  9      0.142     0.113
 10     -0.021    -0.028
 11      0.022    -0.012
 12     -0.048    -0.053
(The EViews bar renderings of the autocorrelations and partial correlations are omitted; the coefficient values are shown.)
Another approach to diagnostic checking is to estimate a model with higher-ordered
autoregressive and moving average terms, then observe (i.e., draw an inference from the
t-statistic) whether the regression coefficients of the additional terms are statistically
significant.
Yet another approach to diagnostic checking is to employ the Chi-square statistic as a diagnostic criterion. The analyst may compute a test statistic employing the equation
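One common chi-square diagnostic of this kind is the Ljung-Box Q statistic on the first m residual autocorrelations; whether this is the exact statistic the author has in mind is an assumption, and the function name and synthetic residuals below are likewise illustrative.

    import numpy as np
    from statsmodels.tsa.stattools import acf

    def ljung_box_q(resid, m):
        # Ljung-Box Q on the first m residual autocorrelations; compare with a
        # chi-square distribution (degrees of freedom reduced by the number of
        # estimated ARMA parameters).
        resid = np.asarray(resid, dtype=float)
        n = len(resid)
        r = acf(resid, nlags=m)[1:]
        return n * (n + 2) * np.sum(r**2 / (n - np.arange(1, m + 1)))

    rng = np.random.default_rng(3)
    print(ljung_box_q(rng.normal(size=62), m=12))   # unexceptional value expected for white noise
    # statsmodels also provides this test directly: statsmodels.stats.diagnostic.acorr_ljungbox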
Appendix A8 describes an alternate approach to ARIMA modeling which can be implemented with ordinary least squares regression.
An Alternate Procedure for Estimating an ARIMA Model
A simple ARIMA model estimation technique was first proposed by J. Durbin in 1960 ("Estimation of Parameters in Time-Series Regression Models," Journal of the Royal Statistical Society, Series B, Volume 22, pp. 139-153, 1960). The Durbin procedure is one which can be implemented using ordinary least squares regression and which often yields a satisfactory specification of the ARIMA model.
After the introduction of the Box-Jenkins procedure in 1970, the Durbin approach has typically been used to make a first-approximation "guess" of the parameters of the autoregressive and moving average terms to be employed subsequently in the Box-Jenkins procedure. While the Durbin approach is simpler, though rougher, than the Box-Jenkins approach, it often yields a satisfactory specification of an ARIMA model and is thus recommended to forecasting non-professionals. The analyst should go through the same first three steps as in the Box-Jenkins procedure described earlier in this chapter.
4. Assuming that the appropriate model format is an ARMA or ARIMA, the analyst should enter data into an ordinary least squares (OLS) regression routine and select as dependent variable the series generated in the last differencing (or the object series if no differencing was needed). Then an autoregressive model with a "generous number" of terms, perhaps k=6 or higher order, should be specified. The analyst should also have the residuals written to the next available column of the data matrix so that they can be used in the second stage of the Durbin estimation procedure. The analyst should note that the first k rows of the residuals column are empty (have zero values) because of the lagging done in the autoregression. Figure A8-1 illustrates the display of an OLS autoregressive k=6 model for Series Y2, i.e., an ARIMA(6,0,0) model.
Figure A8-1. Autoregressive (k=6) model for Series Y2.
AUTOREGRESSION MODEL: Y(t) = a + b(1)*Y(t-1) + b(2)*Y(t-2) + ... + b(k)*Y(t-k)
DEPENDENT VARIABLE (Y) IS MATRIX COLUMN: 2 Y2
ORDER (K) OF EQUATION: 6
COEF OF MULTIPLE CORRELATION (R): .7656   CORRECTED R: .7406
COEF OF MULTIPLE DETERMINATION (RSQ): .5861   CORRECTED RSQ: .5485
STANDARD ERROR OF THE ESTIMATE: 1.0002   MSE: 1.0003
ANALYSIS OF VARIANCE:        SUMS OF SQUARES   DEGREES OF FREEDOM
  TOTAL:                     130.5036          60
  REMOVED BY REGRESSION:      76.4860           6
  RESIDUAL:                   54.0176          54
F-VALUE: 12.7435
INDEP VAR    SIMPLE R   COEF (b)   S.E. COEF   T-VALUE   SIGNIFICANCE
1 Y2 - 1      .6578      .3767      .1348       2.7947    .0070
2 Y2 - 2      .5529     -.0533      .1377       -.3871    .7020
3 Y2 - 3      .6585      .4308      .1396       3.0863    .0034
4 Y2 - 4      .6053      .0545      .1404        .3879    .7014
5 Y2 - 5      .5820      .2189      .1405       1.5580    .1207
6 Y2 - 6      .4991     -.1520      .1479      -1.0277    .3090
CONSTANT (a): 1.7979
CORRELATIONS AMONG THE INDEPENDENT VARIABLES:
1.0000  .6392  .5384  .6544  .5876  .5691
 .6392 1.0000  .6341  .5380  .6337  .5473
 .5384  .6341 1.0000  .6349  .5328  .6510
 .6544  .5380  .6349 1.0000  .6417  .5722
 .5876  .6337  .5328  .6417 1.0000  .6322
 .5691  .5473  .6510  .5722  .6322 1.0000
5. Stage 2 of the Durbin estimating procedure is implemented by specifying an autoregression model on data beyond the first k empty rows in the residuals column (i.e., beginning at row k+1). For the object series, the analyst should specify an autoregressive model of somewhat higher order than is likely to be needed, e.g., k=3. In addition to the autoregressive terms for the object series, the analyst should include in the model another autoregressive variable selected from the matrix column containing the residuals from the first-stage autoregression. These residuals constitute the disturbances upon which the parameters of the moving average terms can be estimated. The regression on the residuals column should be specified to the same order as the autoregressive terms, e.g., k=3. No other independent variables should be added to the model. The regression model being estimated in this example should then have six terms: three autoregressive terms (k=3) and three disturbance terms (the residuals lagged to order k=3). Figure A8-2 illustrates the display of the results for the six-term regression model, i.e., an ARIMA(3,0,3) model. (A regression sketch of both stages follows Figure A8-2.)
Figure A8-2. Autoregressive model of Series Y2, including three autoregressive terms and three disturbance terms.
AUTOREGRESSION MODEL: Y(t) = a + b(1)*Y(t-1) + b(2)*Y(t-2) + ... + b(k)*Y(t-k) + b(k+1)*X(1) + b(k+2)*X(2) + ...
DEPENDENT VARIABLE (Y) IS MATRIX COLUMN: 2 Y2
ORDER (K) OF EQUATION: 3
COEF OF MULTIPLE CORRELATION (R): .7915   CORRECTED R: .7685
COEF OF MULTIPLE DETERMINATION (RSQ): .6265   CORRECTED RSQ: .5906
STANDARD ERROR OF THE ESTIMATE: .9701   MSE: .9412
ANALYSIS OF VARIANCE:        SUMS OF SQUARES   DEGREES OF FREEDOM
  TOTAL:                     128.5084          57
  REMOVED BY REGRESSION:      80.5091           6
  RESIDUAL:                   47.9994          54
F-VALUE: 14.2570
INDEP VAR        SIMPLE R   COEF (b)   S.E. COEF   T-VALUE   SIGNIFICANCE
1 Y2 - 1          .6716      1.5043     .4678       3.2155    .0025
2 Y2 - 2          .5831      -.0910     .3502       -.2599    .7917
3 Y2 - 3          .6714      -.5039     .3785      -1.3310    .1855
4 AR6RES - 1     -.2397      1.2044     .4996       2.4108    .0182
5 AR6RES - 2     -.0971       .3183     .3206        .9928    .6740
6 AR6RES - 3     -.3577      -.9216     .3724      -2.4747    .0156
CONSTANT (a): 1.2273
CORRELATIONS AMONG THE INDEPENDENT VARIABLES:
1.0000  .6579  .5657 -.6454 -.1939 -.0695
 .6579 1.0000  .6479 -.1331 -.6436 -.2084
 .5657  .6479 1.0000 -.1279 -.1334 -.6668
 .6454  .1331  .1279 1.0000 -.1307  .1175
-.1939 -.6436 -.1334 -.1307 1.0000 -.1324
-.0695 -.2084 -.6668  .1175 -.1324 1.0000
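For readers following the Durbin route with general-purpose software rather than the TSA app, stages 1 and 2 just described can be sketched as ordinary OLS regressions. The following is a sketch in Python using pandas and statsmodels, assuming the default data file DATA1.DAT and an arbitrary object column; all function and variable names are illustrative.

    import pandas as pd
    import statsmodels.api as sm

    def lagged_frame(series, k, prefix):
        # Lags 1..k of a series, in columns prefix1..prefixk.
        return pd.DataFrame({f"{prefix}{i}": series.shift(i) for i in range(1, k + 1)})

    def durbin_stage1(y, k=6):
        # Stage 1: OLS autoregression of generous order; keep the residuals.
        data = pd.concat([y.rename("y"), lagged_frame(y, k, "y_lag")], axis=1).dropna()
        fit = sm.OLS(data["y"], sm.add_constant(data.drop(columns="y"))).fit()
        return fit, fit.resid.reindex(y.index)      # first k rows come back as missing

    def durbin_stage2(y, resid, k_ar=3, k_ma=3):
        # Stage 2: regress the series on its own lags and on lags of the stage-1
        # residuals, which stand in for the moving-average disturbances.
        data = pd.concat(
            [y.rename("y"), lagged_frame(y, k_ar, "y_lag"), lagged_frame(resid, k_ma, "e_lag")],
            axis=1,
        ).dropna()
        return sm.OLS(data["y"], sm.add_constant(data.drop(columns="y"))).fit()

    y = pd.read_csv("DATA1.DAT")["TU"]              # any object column will do
    stage1_fit, stage1_resid = durbin_stage1(y, k=6)
    stage2_fit = durbin_stage2(y, stage1_resid, k_ar=3, k_ma=3)
    print(stage2_fit.summary())                     # inspect t-values before pruning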
The analyst should inspect the inference statistics for the estimated model, anticipating that some of the autoregressive terms may be redundant as judged by the significance levels (inferred from their t-values). The model should then be respecified, reducing the order of the autoregressive and residual terms as seems appropriate, until it may be judged that an optimal model remains. In Figure A8-2 it appears that only the first-order autoregressive term and the first- and third-order disturbance (residual) terms are significant.
The analyst may then specify a final model retaining only those autoregressive and residual terms which are judged to be statistically significant. Figure A8-3 illustrates the display of the results of generating an ARIMA(1,0,3) model for Series Y2. Three of the four variables in this model are statistically significant below the .01 level.
Figure A8-3. An ARIMA(1,0,3) model for Series Y2.
AUTOREGRESSION MODEL: Y(t) = a + b(1)*Y(t-1) + b(2)*Y(t-2) + ... + b(k)*Y(t-k) + b(k+1)*X(1) + b(k+2)*X(2) + ...
DEPENDENT VARIABLE (Y) IS MATRIX COLUMN: 2 Y2
ORDER (K) OF EQUATION: 2
COEF OF MULTIPLE CORRELATION (R): .7822   CORRECTED R: .7682
COEF OF MULTIPLE DETERMINATION (RSQ): .6118   CORRECTED RSQ: .5902
STANDARD ERROR OF THE ESTIMATE: .9702   MSE: .9413
ANALYSIS OF VARIANCE:        SUMS OF SQUARES   DEGREES OF FREEDOM
  TOTAL:                     128.5084          57
  REMOVED BY REGRESSION:      78.6179           4
  RESIDUAL:                   49.8905          53
F-VALUE: 20.8795   SIG: .0000
INDEP VAR        SIMPLE R   COEF (b)   S.E. COEF   T-VALUE   SIGNIFICANCE
1 Y2 - 1          .6716       .9248     .1215       7.6133    .0000
2 AR6RES - 1     -.2397       .6047     .1853       3.2633    .0022
3 AR6RES - 2     -.0971       .1955     .1448       1.3502    .1792
4 AR6RES - 3     -.3577      -.4305     .1347      -3.1968    .0026
CONSTANT (a): 1.8233
CORRELATIONS AMONG THE INDEPENDENT VARIABLES:
1.0000 -.6454 -.1939 -.0695
-.6454 1.0000 -.1307  .1175
-.1939 -.1307 1.0000 -.1324
-.0695  .1175 -.1324 1.0000
INDEPENDENT VARIABLE NUMBER 1: 15.6000
INDEPENDENT VARIABLE NUMBER 2: -.4721
INDEPENDENT VARIABLE NUMBER 3: .8635
INDEPENDENT VARIABLE NUMBER 4: .1992
POINT-ESTIMATE FORECAST OF DEPENDENT VARIABLE: 15.2734
95 PERCENT CONFIDENCE INTERVAL: 13.3330 TO 17.2139
Very little explanatory ability was lost in deleting the statistically insignificant variables (Y2 - 2 and Y2 - 3) of the ARIMA(3,0,3) model illustrated in Figure A8-2. Variable AR6RES - 2 might also be deleted from this model (this could be accomplished by backward stepwise regression) without further significant loss of explanatory ability. The ARIMA(1,0,3) model illustrated in Figure A8-3 yields a higher value for the coefficient of determination (R²) and a smaller standard error of the estimate than did the AR(6) model illustrated in Figure A8-1. The residuals from this final model should be written to the data matrix in the next available column so that the adequacy of the model may be judged in the final stage of diagnostic checking.
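Continuing the earlier regression sketch, the pruning just described amounts to refitting with only the significant lags and keeping the residuals for diagnostic checking. A minimal sketch with illustrative names, assuming y and stage1_resid from that earlier code:

    import pandas as pd
    import statsmodels.api as sm

    # Final pruned model: first own lag plus the first and third residual lags,
    # in the spirit of the ARIMA(1,0,3) model of Figure A8-3.
    data = pd.DataFrame({
        "y": y,
        "y_lag1": y.shift(1),
        "e_lag1": stage1_resid.shift(1),
        "e_lag3": stage1_resid.shift(3),
    }).dropna()
    final_fit = sm.OLS(data["y"], sm.add_constant(data.drop(columns="y"))).fit()
    print(final_fit.summary())
    final_resid = final_fit.resid        # kept for the diagnostic correlogram that follows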
Diagnostic Checking
Figure A8-4 displays the simple correlogram for the residuals of the final model specified in Figure A8-3; none of the simple autocorrelations is significantly non-zero, so there is no other information contained in the residuals which might improve the ability of the model to forecast.
Figure A8-4. Correlogram for the residuals of the ARIMA(1,0,3) model of series Y2.
 LAG   SIMPLE AUTOCORR
   1      .0203
   2     -.0240
   3      .0090
   4      .0456
   5      .1138
   6     -.0373
   7      .0009
   8      .0495
   9      .1531
  10     -.0537
  11     -.0019
  12      .0074
Another approach to diagnostic checking is to estimate a model with higher-ordered
autoregressive and moving average terms, then observe (i.e., draw an inference from the
t-statistic) whether the regression coefficients of the additional terms are statistically
significant. Figure A8-2 shows such a model; it can be seen that the coefficients of the
additional terms are not significant at the 0.05 level.
What's Ahead
This brings to a close our survey of time series forecasting tools. Now it is up to the reader to try using them to forecast series of interest. Good luck!<>
System Access
TSA.exe, a computer app that can process the time series analyses and forecasts described in this book, is available upon request by email to rstanford@furman.edu.
TSA Menu
The TSA app is menu-driven by selecting menu item numbers. The main menu appears as follows:
1 DATA MANAGEMENT (TSFT 1)
2 DATA DESCRIPTION (TSFT 2)
3 UNIVARIATE FORECASTING MODELS (TSFT 3)
4 MOVING AVERAGE FORECASTING MODELS (TSFT 4, 7)
5 REGRESSION FORECASTING MODELS (TSFT 3, 5)
6 TIME SERIES DECOMPOSITION (TSFT 6)
9 ABOUT TSA
0 EXIT
ENTER SELECTION >>>
Data File Preparation
The TSA app is structured to process monthly data. It is dimensioned for a maximum of 960 rows (80 years of monthly data) and 20 columns (variables). A data file that is readable by this app may be prepared using EXCEL, SHEETS, or a Windows accessory such as NOTEPAD or WORDPAD. The data file should be saved as a CSV (comma separated values) UNICODE TEXT DOCUMENT with file type specified as .DAT, e.g., filename.DAT.
To illustrate TSA features, a default data file, DATA1.DAT, is included with program file TSA.exe. The data file contains seven years of monthly labor force data for a metropolitan statistical area (Greenville-Anderson-Mauldin S.C. MSA, 1981-1987). The contents of DATA1.DAT are illustrated following:
CLF,TE,TU,PU,MFG,TX,AP,CE,WRT,CON,AHE,AHW
233.4000,229.0000,4.4000,1.9000,98.8000,48.3000,10.3000,9.3000,37.2000,15.0000,2.9700,40.8000
236.6000,232.2000,4.4000,1.9000,99.9000,48.5000,10.7000,9.5000,37.1000,15.9000,2.9700,41.2000
239.8000,235.6000,4.2000,1.8000,100.5000,48.6000,10.8000,9.6000,37.7000,16.6000,2.9800,41.0000
242.7000,238.0000,4.7000,1.9000,100.3000,48.4000,10.5000,9.6000,39.1000,17.5000,2.9900,41.5000
245.2000,240.4000,4.8000,2.0000,100.8000,48.5000,10.5000,9.7000,38.9000,18.1000,2.9900,40.4000
252.2000,244.9000,7.3000,2.9000,102.4000,49.2000,10.5000,9.9000,39.4000,18.9000,3.0000,41.5000
252.3000,246.0000,6.3000,2.5000,100.4000,47.5000,10.2000,10.1000,39.6000,19.6000,3.0000,41.9000
252.2000,246.5000,5.5000,2.2000,103.4000,49.3000,10.4000,10.2000,39.8000,19.6000,3.0200,40.7000
251.1000,245.6000,5.5000,2.2000,102.4000,49.2000,10.0000,10.3000,40.1000,19.4000,3.1200,40.6000
251.1000,256.8000,4.9000,1.9000,102.9000,49.6000,10.0000,10.2000,40.5000,19.0000,3.1300,40.0000
252.2000,246.8000,5.3000,2.1000,103.5000,49.8000,10.1000,10.2000,41.5000,18.8000,3.1700,41.2000
251.7000,247.0000,4.7000,1.9000,103.4000,49.7000,10.0000,10.1000,42.5000,18.5000,3.2000,41.9000
249.4000,244.5000,4.9000,2.0000,103.4000,49.8000,10.2000,10.1000,40.0000,16.9000,3.2000,40.5000
229.2000,221.0000,8.2000,3.6000,102.4000,48.6000,10.0000,10.1000,39.8000,17.4000,3.2000,40.4000
229.5000,221.1000,7.4000,3.2000,102.3000,48.1000,10.0000,10.3000,39.9000,17.9000,3.2000,39.6000
231.3000,223.9000,7.4000,3.2000,102.3000,48.3000,9.7000,10.3000,40.1000,18.0000,3.2100,38.9000
233.7000,226.4000,7.3000,3.1000,102.8000,48.5000,9.6000,10.4000,40.3000,18.8000,3.2800,40.3000
238.6000,229.3000,9.3000,3.9000,104.4000,49.1000,9.8000,10.5000,40.5000,19.3000,3.4000,40.8000
239.7000,230.8000,8.9000,3.7000,102.7000,47.4000,9.5000,10.6000,40.4000,19.4000,3.4300,40.9000
241.1000,232.7000,8.4000,3.5000,104.6000,48.8000,9.7000,10.6000,40.6000,19.2000,3.4400,40.6000
238.0000,229.0000,9.0000,3.8000,103.5000,48.2000,9.6000,10.7000,40.7000,18.8000,3.4800,40.0000
240.8000,230.1000,10.7000,4.4000,102.6000,47.3000,9.6000,10.7000,40.8000,18.5000,3.4600,38.5000
240.2000,225.1000,15.1000,6.3000,98.0000,44.0000,9.5000,9.7000,41.0000,18.5000,3.4800,39.0000
242.3000,225.1000,17.2000,7.1000,97.3000,44.5000,9.2000,9.5000,41.4000,18.2000,3.4600,38.0000
.
.
.
253.3000,242.1000,11.2000,4.4000,105.6000,43.9000,10.3000,8.5000,50.3000,15.2000,4.8700,40.6000
252.0000,241.3000,10.7000,4.2000,104.5000,43.3000,10.2000,8.5000,50.0000,15.0000,4.9100,40.5000
256.2000,245.2000,11.0000,4.3000,104.7000,42.9000,10.3000,8.5000,50.1000,14.9000,4.8900,40.5000
257.0000,246.2000,10.8000,4.2000,104.8000,42.8000,10.2000,8.5000,50.2000,15.3000,4.8800,39.1000
259.3000,249.0000,10.3000,4.0000,104.9000,42.6000,10.2000,8.5000,50.0000,15.4000,4.9300,40.7000
265.9000,254.1000,11.8000,4.4000,106.1000,42.8000,10.2000,8.8000,49.7000,15.8000,4.9700,40.9000
259.9000,249.9000,10.0000,3.8000,103.7000,41.0000,9.9000,8.8000,49.6000,16.2000,5.0400,40.6000
261.8000,252.6000,9.2000,3.5000,104.9000,42.0000,10.1000,8.6000,49.9000,16.5000,5.1700,40.6000
264.6000,255.0000,9.6000,3.6000,104.4000,41.7000,10.1000,8.6000,50.1000,16.5000,5.1600,40.8000
264.4000,254.5000,9.9000,3.7000,104.3000,41.5000,10.0000,8.6000,50.0000,16.5000,5.2000,40.9000
262.7000,252.2000,10.5000,4.0000,103.6000,41.0000,9.7000,8.6000,50.6000,16.2000,5.2800,41.4000
263.9000,253.4000,10.5000,4.0000,106.8000,42.5000,9.7000,8.7000,53.0000,16.6000,5.3500,41.3000
The first row lists 12 column labels (alphabetic). The number of alphabetic labels tells the TSA app the number of data columns to find.
The next 24 rows illustrated contain the data for the first 24 months, with 12 column entries each, no embedded blanks or alphabetic characters, and no blanks or commas at the ends of rows.
Monthly data rows 25 through 72 have been omitted from this illustration. The last 12 rows illustrated contain the data in rows 73 through 84 (the seventh year of monthly data).
When data are entered into NOTEPAD or WORDPAD as illustrated and following these instructions, the data should be saved in the folder containing TSA.exe to a file name with extension .DAT. The file illustrated here, DATA1.DAT, is included as a default sample along with TSA.exe, but any other data file that is properly prepared may be opened instead of the default file.
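Because the .DAT file is an ordinary CSV file, it can be checked with general-purpose tools before it is opened in TSA. A minimal sketch in Python, assuming the default file DATA1.DAT described above:

    import pandas as pd

    df = pd.read_csv("DATA1.DAT")
    print(df.shape)                         # expect (84, 12) for the default file
    print(df.columns.tolist())              # the 12 alphabetic labels from the first row
    print((df.dtypes != "object").all())    # False indicates stray text or blanks in a data row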
Runtime Errors
Please note that if a "runtime error" occurs upon opening a data file in the app, the user should reload the data file in EXCEL or SHEETS, or in NOTEPAD or WORDPAD, and make sure that each data row contains the same number of entries as there are column labels in the alphabetic first row of the data file. The user should also check for any inadvertent alphabetic characters or embedded blanks within or at the ends of data rows.
ARIMA Modeling with TSA
ARIMA modeling combines an AUTOREGRESSION on a time series of observations with a MOVING AVERAGE of that time series to forecast values within or beyond the end of that series. The conventional designation of an ARIMA forecasting model is p-d-q, i.e., autoregression of order p with a moving average of q elements and d-period differencing of an object series. Autoregression is warranted if the analyst suspects that observations in the object series are influenced by previous observations of the same series. A moving average may be included if the analyst suspects that there are disturbances in the object series that may be diminished or 'smoothed' with a moving average of the series.
If the analyst judges that the object series is not sufficiently stationary (i.e., it exhibits upward or downward trends over ranges of data), he or she may choose to substitute a series of d-period differences of the object series in place of the object series. For example, an ARIMA(3,1,3) forecasting model would entail an autoregression of order 3 and a 3-element moving average applied to an object series that has been differenced by 1 period (each observation less the previous observation). An ARIMA(2,0,3) model would involve an autoregression of order 2 and a 3-element moving average on an undifferenced (i.e., original) object series.
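The same p-d-q notation can be expressed directly in general-purpose software, which may serve as a cross-check on a model built interactively in TSA. A minimal sketch using Python's statsmodels, assuming the default data file DATA1.DAT and an arbitrary object column (all names are illustrative):

    import pandas as pd
    from statsmodels.tsa.arima.model import ARIMA

    y = pd.read_csv("DATA1.DAT")["TU"]           # any column of the default file
    fit_313 = ARIMA(y, order=(3, 1, 3)).fit()    # ARIMA(3,1,3): AR order 3, 1-period differencing, MA 3 elements
    fit_203 = ARIMA(y, order=(2, 0, 3)).fit()    # ARIMA(2,0,3): AR order 2, undifferenced series, MA 3 elements
    print(fit_313.summary())
    print(fit_203.forecast(steps=6))             # six periods beyond the end of the series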
In the TSA system, data are arrayed in a matrix by columns (variables) and rows (observations) that are saved in a data file that may be opened and accessed for analysis. Once the data file is opened and data are accessible, an ARIMA forecasting model may be implemented in the TSA system in three steps. For illustrative purposes, a default data file, DATA1.DAT, is included with TSA.exe. DATA1.DAT contains an array of 12 columns (variables) and 84 rows (monthly observations) of employment data for a metropolitan statistical area.
1. Choose Option 4 and implement a moving average (M.A.) of the selected object series (a column of data in the data file). The M.A. is automatically saved in the next unused column (up to 20) in the data array beginning at an indicated row after allowing for missing initial observations. In the data array of default data file DATA1.DAT, a 3-element M.A. would have no observations in the first 3 rows, so the M.A. series would begin in row 4 of column 13.
2. Choose Data Management (Option 1), Transformations (sub-option 6), and Differencing (sub-sub-option 6), and specify a period gap (typically 1 period). The differenced series is automatically saved in the next unused column (up to 20) in the data array. In the data array of the default data file DATA1.DAT, a 1-period differenced series of the object series would miss an observation in row 1 and so begin in row 2 of column 14. (Steps 1 and 2 are sketched in general-purpose code at the end of this section.)
Upon examining the reported autoregression results and correlograms, the user may choose to respecify the number of moving average elements (step 1), whether differencing is needed and the number of periods by which the object series is differenced (step 2), and the autoregression order (step 4). Once an ARIMA model has been specified to the satisfaction of the user, the user may implement forecasts of values within or beyond the end of the object series.
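A minimal sketch of steps 1 and 2 done by hand with Python's pandas, assuming the default data file DATA1.DAT and an arbitrary object column (note that pandas leaves the first window-1 rows of a moving average blank, while TSA's row convention described above may differ slightly):

    import pandas as pd

    df = pd.read_csv("DATA1.DAT")
    df["TU_MA3"] = df["TU"].rolling(window=3).mean()   # step 1: 3-element moving average, stored as a new column
    df["TU_D1"] = df["TU"].diff(periods=1)             # step 2: 1-period differencing, stored as a new column
    print(df[["TU", "TU_MA3", "TU_D1"]].head(6))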
<>
TSA is a menu-driven app that enables analysis of multiple columns of time series data.
TSA is programmed to read a standard CSV (comma separated values) file that may have been created in Microsoft EXCEL, Google SHEETS, or a text editor and saved in CSV format as a UNICODE TEXT DOCUMENT with file type specified as .DAT, e.g., filename.DAT.
TSA is dimensioned for a maximum of 20 columns and 960 rows of data. In a file readable by TSA, alphabetic column headings are separated by commas in the first row and numeric data are separated by commas in subsequent rows:
ROW 1: HEADER1,HEADER2,HEADER3
ROW 2: 1000.0,2000.0,3000.0
ROW 3: 4000.0,5000.0,6000.0
.
.
ROW LAST: 9799.9,9899.9,9999.9
TSA Features:
- data matrix maximum 20 columns, 960 rows (80 years of monthly data)
- ability to read EXCEL or SHEETS files saved in CSV format
  (data may be downloaded from private or government sources into EXCEL files)
- automatic detection of column headings and numbers of columns and rows
- ability to transform columns
- ability to select a column range for analysis
- ability to lag data in a selected column relative to data in other columns
- ability to construct and analyze time series models
- ability to decompose a monthly TxCxSxI time series into components
- menu options keyed to Time Series Forecasting Tools (TSFT) by R. Stanford
  (parentheses following TSA menu items refer to TSFT chapters)
- app and default data file available from rstanford@furman.edu
(must be downloaded to a folder on the user's computer where the user's data files will be saved)