EssaysVolume13
Time Series Forecasting Tools
Richard A. Stanford
Furman University
Greenville, SC 29613
Copyright 2024 by Richard A. Stanford
All rights reserved. No part of this work may be reproduced, stored, or transmitted by any means without written permission
of the author except for brief excerpts used in critical analyses
and reviews. Unauthorized reproduction of any part of this work
is illegal and is punishable under the copyright laws of the
United States of America.
CONTENTS
NOTE: You may click on the symbol <> at the end of any section to return to the CONTENTS.
1. Forecasting Tools
2. Criteria for Model Selection
3. Time Series Forecasting Rules
4. Moving Average Forecasting Models
5. Econometric Forecasting Techniques
6. Time Series Decomposition
7. Autocorrelation in Time Series
8. ARIMA Modeling
9. Illustration
Note: analyses described in this work may be conducted with the app RASstat that is available from rstanford@furman.edu.
<>
Decision theorists distinguish two types of control: passive and active. Active
control may be exerted if there is some possibility of altering the direction or
magnitude of some process under the firm's management. Even if such active control is
not possible under known technologies (e.g., the weather, the implementation of a
government policy, a competitor's R&D effort), there may be some possibility of exerting
passive control. Passive control is anticipatory of the likely consequence--for example
to move out of the way or to erect barriers or other means of protection from the
phenomenon, or to move to take advantage of it.
Almost all of the firm's activities, as well as most natural phenomena, are amenable to
some form of active or passive control by the firm's management. Each alternative form
and degree of control action is likely to result in a range of possible outcomes rather
than a unique result which can be known with certainty in advance of choosing from among
the possible control alternatives. Decision risk is inherent in the dispersion of the
outcome possibilities.
This chapter examines a technique for reducing the risk associated with selection
from among the various available control alternatives. The means for reducing risk is to
attain more information about the possible outcomes by estimating or forecasting both the
most likely outcome of a control action, and the range within which the outcome is likely
to occur.
Event-timing forecasts can perhaps be best approached by seeking to identify some sort of
leading indicator (e.g., the date at which a terminal illness is diagnosed).
Qualitative-outcome forecasting is best approached by seeking to add to the stock of
information upon which probability assessments may be based (e.g., almost 90 percent of
all sales personnel in our industry are females). Quantitative-magnitude forecasts may
be approached by examining series of historical data about the target phenomenon and
related matters. This chapter focuses on the latter class of questions by describing
analytical models which can analyze and extrapolate so-called time series of data.
a. Except for the nature of the tools used, there is nothing particularly "scientific"
about forecasting.
b. The computational techniques and models which are described in the ensuing chapters
are in fact only tools of analysis.
c. The exercise of judgment can never be escaped, even when following the most
formal computational approach possible.
A successful forecaster who uses computational tools has to have exercised good judgment
in selecting and adjusting the tools. In this the forecaster has utilized science in
practicing the art.
Perhaps very few real-world relationships can be represented as simply as a straight-line
model incorporating only two variables. Thus, the model may be extended to encompass
several other variables, e.g., y = a + b1x1 + b2x2 + ... + bnxn.
In these models, the dependent variable y is related (by hypothesis, or as a matter of
empirical verification) to one or more independent variables, which are represented
by the x symbols. The other symbols in the equations (a, b1, b2,
etc.) are the so-called parameters of the relationship. They specify the way in which
the dependent variable is related to the independent variable(s).
We shall organize the time-series models examined in this chapter into two classes.
Naive models are those which construct the forecast of a series by extrapolation
of the same series. In effect, earlier values of the same series constitute the
independent variable which is then used to forecast future values of the series as
dependent variable. Multivariate models are those in which the object series
is treated as a dependent variable. In multivariate models, forecasts are constructed
with reference to one or more additional data series specified as independent variables.
Whatever the source of information, the numerical data "cake" may be sliced in either
of two possible directions. First, observations of various aspects of a phenomenon or
process may be recorded at a point in time, but across the population of subjects. Data
collected in this fashion are generally referred to as cross-sectional data.
Alternately, observations of the behavior of a single entity may be recorded at various
points in time, either at regular intervals or irregularly. If the time interval between
subsequent observations is a constant, for example, a month or a year, the collection
of data may be referred to as a time series.
Although some of the models described in this chapter, in particular those estimated
by regression, may be made to address cross-sectional data, we shall take as our
primary mission the application of forecasting models to time series data where the
standard interval between observations is the month or the year.
Although the forecasting analyst might simply guess at the optimal form of the
equation of relationship and the likely values of the parameters (such as the
naive forecasting rules described below), both the equation form and the parameter
values can usually be estimated more accurately with reference to historical data
for the phenomenon. Thus, an historical data base is useful both to the specification
and to the validation phases of model construction.
If the data were collected as cross-sectional data (i.e., at a point in time but
across the various subjects in the population), then the rows of the matrix would
correspond to the included subjects. Alternately, if data were collected as time-series
data (i.e., over time but for a single subject), the rows of the matrix would represent
the succession of times at which observations were taken. The focus of this chapter is
upon time-series data, so we shall normally presume that the data matrix is dimensioned
for some number of variables horizontally, and some number of time-periods vertically.
How many rows of data should a forecasting data matrix contain? Again, there is no
single answer. The more observations available for analysis, the more reliable the
results will likely be. An often used rule of thumb is that a regression model with
one independent variable should span at least two dozen observations, and more if
additional independent variables are included.
If the series to be forecasted is thought to exhibit a seasonal pattern, the data
matrix should contain enough rows to accommodate several (three or more) years of
monthly data so that the identified seasonal pattern may be confirmed several times
over. If the object series is thought to exhibit a cyclical pattern, the matrix
should have enough rows to accommodate several such cycles. Since U.S. business cycles
over the past century have averaged around five years in duration (from peak through
trough to the next peak), the matrix may have to extend to 180 or more rows containing
monthly data to encompass three such cycles.
Irrespective of the number of rows and columns, the data matrix must be densely
populated with data to be analyzed with forecasting models. "Densely populated"
means that there can be no vacant cells in the matrix, that is, no missing
or absent data. Should some data be missing, either the missing values must be found
or the columns (variables) containing the vacant cells must be deleted from the analysis.
Alternately, the matrix may be shortened in length at either end to exclude the vacant
cells. In any case, it is strictly not legitimate to interpolate, average, "guesstimate,"
or otherwise invent data to fill the vacant cells.
For illustrative purposes, we shall analyze sequence plots of time series components with
a seven-year collection of monthly data for a series which we shall denote Y1. Data for
the Y1 series are reproduced in Table 2-1 at the end of this chapter.
Random Noise. Every time series which is not subject to administrative control can
be expected to exhibit dispersion about its mean (arithmetic average) when graphed on a
sequence plot. This dispersion may exhibit certain discernible patterns, several of which
are described below and which may be isolated from the series employing commonly available
techniques. The dispersion which remains after all discernible patterns have been isolated
and removed from a series may be described as an irregular or random noise component.
If the series contained no discernible patterns to be isolated and removed, then the
series itself may be described as a random noise series.
The technical specification of a purely random noise series is that there is no
correlation among observations within the series (i.e., autocorrelation) so that no
observation within the series can be used to forecast the values of other observations
within the same series. The only relevant forecast which can be made for future values
of a purely random noise series is the mean of the series.
Figure 2-1 illustrates the sequence plot for the Y1 raw-data series. While series Y1,
identified by the asterisk (*) symbols, obviously exhibits a great deal of random noise
behavior, there are also present within the series one or more other behavioral patterns.
Trend may be detected computationally by dividing the time series into a convenient
number of ranges, say three or four, and computing the means for each of the ranges.
Trend is present if the means for the successive ranges consistently increase (or
decrease).
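A minimal Python sketch of this range-means check (the function name and defaults are illustrative only, not a RASstat routine):

    def range_means(series, n_ranges=4):
        """Split the series into n_ranges consecutive ranges and return the mean of each."""
        size = len(series) // n_ranges
        means = []
        for r in range(n_ranges):
            start = r * size
            end = start + size if r < n_ranges - 1 else len(series)
            chunk = series[start:end]
            means.append(sum(chunk) / len(chunk))
        return means

    # Consistently rising (or falling) range means suggest the presence of trend;
    # means that rise and fall suggest cyclical behavior (discussed below).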
Trend in economic or business series may be attributed to a growth or contraction
process, or to inflation or deflation if the series consists of money-value data
which have not been deflated to eliminate the effects of change in the purchasing
power of the unit of currency.
Cyclical Behavior. The term "cycle" perhaps suggests greater regularity or
periodicity than should be expected in economic or business time series. Given the
irregularity of duration of so-called "business cycles," some analysts have preferred
to avoid the term "cycle" in favor of some alternative such as business fluctuations.
Also, the term "cycle" when applied to a business or economic phenomenon implies some
underlying causative mechanism, about which there is a notable lack of consensus among
economists and business analysts. The debate has gone in the direction of considering
whether there is a natural and irrevocable wave-like process in commercial contexts,
or only discrete fluctuations resulting from impacts of exogenous occurrences or policy
actions. Even so, we shall employ the term "cycle" to refer to the behavioral component
of a time series consisting of alternations in runs of values above and below the mean
of the series, or above and below the fitted trend line if trend is present.
Cyclical behavior may be detected computationally in either of two ways. One way is
similar to that for detecting trend. The difference is that if cyclical behavior is
present, the means of the successive ranges of the time series will not consistently
increase or decrease. The means for four consecutive 21-month ranges of the 84-month
series Y1 are 242.43, 238.21, 251.62, and 258.79, thus suggesting the presence of a
cyclical pattern. Since the means do generally increase, the implication is that there
is a positive trend present as well.
A second method of detecting cyclical behavior is to count the number of "runs" of
values above and below the mean of the series, and compare the actual number of runs
with the statistically expected number of runs if the series were a purely random
noise series. The expected number of runs for a purely random noise series may be
computed by the formula 2p(n-p)/n + 1, where n is the number of items in the series
and p is the number of items above the mean of the series. If the actual number of
runs is substantially smaller than the expected number of runs, there is reason to
believe that some sort of cyclical behavior is present in the series. The actual
and expected number of runs for the Y1 series are 7 and 40.66, respectively, thus
confirming the presence of cyclical behavior within the series.
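A sketch of this runs test in Python, using the expected-runs formula given above (an illustrative helper, not the RASstat routine):

    def runs_test(series):
        """Count runs above/below the series mean and the expected count for a random series."""
        mean = sum(series) / len(series)
        above = [value > mean for value in series]
        n = len(series)
        p = sum(above)                                  # observations above the mean
        actual = 1 + sum(1 for t in range(1, n) if above[t] != above[t - 1])
        expected = 2 * p * (n - p) / n + 1              # expected runs for a purely random series
        return actual, expected

    # Far fewer actual runs than expected suggests cyclical behavior in the series.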
Seasonality. What has become known in U.S. commercial circumstances as a "business
cycle" (in the U.K. it is known as a "trade cycle") is a fluctuation in business activity
from trough through peak to the next trough of five to six years duration. Seasonality
is a special type of cyclical behavior which comes closer to meeting the requisites of
periodicity than do any of the longer-duration waves in commercial activity. Seasonality
is cyclical behavior with a period of one year, and which repeats itself, possibly with
differences in amplitude, year after year with little to no difference in timing.
Seasonality is attributable to the passing of the seasons, and to custom or convention
in the timing of events such as the start of school and particular holidays. The seasonal
character may change over time due to changing customs or conventions. For example,
during the latter half of the twentieth century the start of school has been advancing
from early September to late August in most parts of the U.S., thus causing the
back-to-school spurt in retail sales activity to occur somewhat earlier during the year.
Seasonality cannot be observed at all in annual data, and can be seen only imperfectly
in quarterly data. It is most often analyzed and identified in monthly data recorded
in successive 12-month intervals, but can also be detected in weekly and daily data.
Business firms that record in-house data over ten equal periods per year (instead of
twelve months) may also expect to see seasonal behavior in their data.
It may be possible to discern other possible patterns of non-random behavior in time
series data collected over shorter intervals, for example in weekly or daily data. To
the extent that people are paid on a monthly basis (instead of weekly, biweekly, or
semi-monthly), we should not be surprised to observe a spurt of retail activity during
the week following the most common monthly pay day. Daily data can also exhibit
so-called "trading-day" variation associated with mid- or end-of-week activity.
Such variations of course cannot be observed in annual, quarterly, or monthly data.
Most monthly time series of economic or business data typically contain two or more
of these components. It is often possible to discern the presence of some of the
components by visual inspection of the sequence plot of the series. Trend will usually
be quite conspicuous, as apparent in Figure 2-1. It may also be possible to discern
cyclical swings around an imaginary or fitted trend line, but it typically is much
more difficult to discern seasonality distinct from random noise. Computational
techniques (e.g., time series decomposition) for separating out the components of a monthly
time series are described in Chapter 6. Figure 2-2 illustrates a sequence plot of
a combination trend-cycle series plotted from the data in Table 2-1.
How can one identify and assess the strength of autocorrelation in a time series?
One means is to compute the coefficients of autocorrelation between pairs of entries
within the series. If the analyst is interested in the autocorrelation between adjacent
entries, the autocorrelation should be specified to order k=1. For the correlation
between every other entry in the series, the autocorrelation should be specified to
order k=2. Autocorrelation order k=3 would be for the correlation between each entry
and the third from it in the series, and so on.
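A minimal sketch of computing the order-k coefficient from deviations about the series mean (an illustrative helper; the function name is hypothetical):

    def autocorrelation(series, k):
        """Lag-k autocorrelation coefficient computed from deviations about the series mean."""
        n = len(series)
        mean = sum(series) / n
        dev = [value - mean for value in series]
        numerator = sum(dev[t] * dev[t - k] for t in range(k, n))
        denominator = sum(d * d for d in dev)
        return numerator / denominator

    # k=1 relates adjacent entries, k=2 every other entry, and so on.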
Computed autocorrelation coefficients may serve as a basis for judging whether a time
series may be modeled by autoregressive (AR), moving average (MA), or some combination
(ARMA or ARIMA). The reader may find a discussion of the computation of such coefficients
and the criteria by which they may be interpreted in Chapter 7.
Another type of graphic display which might be useful to the analyst in examining the
possible influence of other phenomena on the object series is the scatter diagram.
A scatter diagram can relate the object series to another series on the premise that the
behavior of the object series is in some way governed or influenced by the behavior of
the other series. The object series may then be construed as a "dependent" variable,
and the other series as an "independent" or deterministic variable. It is conventional
(but not essential) to plot the dependent variable on the vertical axis of the scatter
diagram, and the independent variable on the horizontal axis.
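A scatter diagram can be produced with a few lines of matplotlib, as in this sketch; the x1 values are invented placeholders, and the y1 values are simply the first six entries of Table 2-1:

    import matplotlib.pyplot as plt

    # Placeholder observations for the candidate predictor and the object series.
    x1 = [10.1, 11.3, 12.0, 12.8, 13.5, 14.2]
    y1 = [233.4, 236.6, 239.8, 242.7, 245.2, 252.2]

    plt.scatter(x1, y1)                       # dependent (object) series on the vertical axis
    plt.xlabel("X1 (independent variable)")
    plt.ylabel("Y1 (object series)")
    plt.title("Scatter diagram of Y1 against X1")
    plt.show()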
The scatter diagram should convey a visual impression of whether or not there is indeed
any relationship between the two variables. If there is no significant relationship,
points plotted in the coordinate space will be randomly dispersed, exhibiting no
discernible pattern or path. In this case, no further consideration need be given to
that independent variable.
The closer the plotted points lie along a path with a discernible direction, the
stronger is the relationship between the two variables. The relationship might be
direct (indicated by an upward slope of the path) or inverse (shown by a downward slope),
linear (best represented by a straight line) or curvilinear (a curved path). If all
of the plotted points happened to lie precisely along a particular line (straight or
curved), we could say that there is a "perfect" relationship between the two variables
(but this would be so rare in economic and business contexts as to be suspect).
If the analyst has reason to believe that the object series is governed or influenced
by two or more other phenomena for which comparable time series are available, then
one scatter diagram relating the object series to each of the prospective deterministic
variables should be constructed. However, it may turn out that none of the prospective
deterministic variables by itself exhibits a visually identifiable relationship with the
dependent variable in a scatter diagram.
Naive rules, in their simplicity, are relatively low-cost approaches to forecasting,
but if a method is effective one should not hesitate to employ it because of its
simplicity or naivete. Naive rules are more effective at short-term than at long-term
forecasting. The longer the forecasting span or gap, the less accurate the naive
forecast is likely to be, and the greater the attendant risk in basing decisions upon such
forecasts.
The simple, naive rules described below may be made to address the trend and
seasonality factors which may be present in a time series, but naive rules are rarely
able to account for any cyclical behavior present in a series. There are four classes
of simple, naive rules:
a. those which address neither seasonality nor trend (the default forecasting rules);
b. those which address the trend factor, but assume seasonality to be insignificant;
c. those which address the seasonality factor, but assume trend to be insignificant; and
d. those which attempt to address both the trend and seasonality components of the time
series.
It is possible to formalize the default forecasting approach into an algebraic rule,
e.g., Rule A.1: yt+i = yt, the forecast that the series will repeat its most recent value.
Rules A.1 and A.2 are so naive and simple that one might doubt the wisdom of
formalizing them. But they serve three purposes: (1) to reveal the use of the
symbolic representations in the most rudimentary format; (2) to serve as a point of
departure in the development of subsequent rules; and (3) to constitute a benchmark
rule against which the performance effectiveness of other forecast rules may be
compared.
In the class B rules surveyed below, the presumption is that no type of variation
other than trend (e.g., seasonal or cyclical) is a significant factor within the
series. The forecasting technique used in the class B rules is to construct a
trend-adjustment factor to apply to the series observation which constitutes the basis
for the forecast.
The simplest technique for attempting to account for change in a forecasting rule
is to add the most recent absolute change between two observations,
(yt - yt-1), to the most recent observation of the series,
yt, in order to compose the forecast of the next value of the series,
yt+1. If the analyst wishes to forecast a value several periods, i,
beyond the most recent observation, all that is necessary is to multiply the computed
change by i, which shall henceforth be referred to as the "forecast gap." This may
be represented algebraically as
Rule B.1: yt+i = yt + i(yt - yt-1).
Suppose that there is reason to believe that the process of change in the series
is that of growth, so that a relative change may be more meaningful than an absolute
change. Rule B.2 is a modification of Rule B.1 to compose the forecasted value by
applying the most recent relative change, yt/yt-1, to the most
recent observation:
Rule B.2: yt+i = yt(yt/yt-1)^i.
Rule B.3 represents an effort to address the problem of incorporating long-term
change. Instead of using only the most recent absolute change as an adjustment factor,
Rule B.3 uses the average of all successive observation increments in the series, and
thereby employs information spanning the entire data series. The algebraic formulation
of the rule is:
Rule B.3: yt+i = yt + i(Σ(yk - yk-1)/(m-1)), summed from k=2 to m.
Like Rule B.3, Rule B.4 utilizes information spanning the entire data series.
Rule B.4 is a modification of Rule B.2 to use the average period-to-period relative
changes over the entire data series (instead of the most-recent single-period relative
change) as the trend adjustment factor. The algebraic formulation is as follows:
Rule B.4: yt+i = yt(Σ(yk/yk-1)/(m-1))^i, summed from k=2 to m.
These four simple, naive rules are purported to account for trend variation in a
data series; they are not the only possible ways to treat trend variation, only the
simplest. Moving average rules, described below, can also address the trend
phenomenon.
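For concreteness, the four class B rules might be coded as in the following Python sketch; the function names are illustrative, and y is assumed to be a list of observations with the most recent value last:

    def rule_b1(y, i=1):
        """Rule B.1: extend the most recent absolute change i periods ahead."""
        return y[-1] + i * (y[-1] - y[-2])

    def rule_b2(y, i=1):
        """Rule B.2: compound the most recent relative change over the forecast gap."""
        return y[-1] * (y[-1] / y[-2]) ** i

    def rule_b3(y, i=1):
        """Rule B.3: extend the average absolute change over the whole series."""
        m = len(y)
        avg_change = sum(y[k] - y[k - 1] for k in range(1, m)) / (m - 1)
        return y[-1] + i * avg_change

    def rule_b4(y, i=1):
        """Rule B.4: compound the average period-to-period relative change."""
        m = len(y)
        avg_ratio = sum(y[k] / y[k - 1] for k in range(1, m)) / (m - 1)
        return y[-1] * avg_ratio ** i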
Rule C.1 illustrates the simplest and most naive method of attempting to account
for seasonality in a time series:
Rule C.1: yt+i = yt+i-12 (the value observed twelve months before the target period).
Rule D.2: yt+i = yt+i-12(Σ(yk/yk-1)/(m-1))^i, summed from k=2 to m.
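Rules C.1 and D.2 might be sketched in the same style, assuming monthly data and a forecast gap of no more than twelve periods:

    def rule_c1(y, i=1):
        """Rule C.1: forecast the value observed twelve months before the target period."""
        target = len(y) - 1 + i            # zero-based index of the period being forecast
        return y[target - 12]              # requires monthly data and i <= 12

    def rule_d2(y, i=1):
        """Rule D.2: seasonal base from twelve months earlier, scaled by the average relative change."""
        m = len(y)
        avg_ratio = sum(y[k] / y[k - 1] for k in range(1, m)) / (m - 1)
        return y[(m - 1 + i) - 12] * avg_ratio ** i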
This text surveys the tools and techniques of economic
forecasting. These tools, together with those of simulation modeling, should provide
the analyst with the ability both to assess the condition of the organization and to plan
strategically for its successful operation. Our objective in the present chapter is to
survey the field of economic forecasting and acquaint the reader with the possible
applications in decision making settings.
The Need to Forecast
When and why is forecasting needed? Simply put, when the result of an action is of
consequence, but cannot be known in advance with precision, forecasting may reduce
decision risk by supplying additional information about the possible outcome.
The potential benefit of forecasting lies in the realm of decision making to exert
control over some process.
Types of Forecasts
There are basically three types of questions about future states which efforts at forecasting
might address. When might an event occur? What are the qualitative characteristics of the
outcome of an expected event? What will be the magnitude of a quantity at a future point in
time? An example of the first question is "When will the next vacancy occur in the sales
department?" An example of the second is "What will be the sex of the next salesperson
employed?" And an example of the third is "What will be the likely volume of sales during
the third quarter of the year?"
Forecasting by Default
All rational decision makers engage in forecasting behavior, whether by intent or by
default. Many times decision makers do not formally, intentionally, or even consciously,
construct a forecast of likely outcomes before making their decisions. But they must
have informally indulged in the implicit forecast that the future will be like the
recent past. The default forecast, i.e., that the future will be like the recent
past, may be entirely adequate for most of the simpler and less-consequential
decision-making circumstances of daily life and commercial operation. Many aspects of
the world are more complex and dynamic, however. The more complex and dynamic a
decision situation, and the more consequential the likely outcome, the less likely
is the default forecast to be adequate.
Forecasting by Intent
The dynamism of the world, the range of alternative courses of action, the consequence of
outcomes, and the variability of outcomes have led analysts to develop a variety of time
series forecasting techniques. The availability of these more formalized techniques has
enabled decision makers to engage more readily in intentional forecasting analyses before
having to make consequential decisions. While such time series forecasting techniques
have in the past tended to remain within the preserves of professional analysts with
whom the decision makers have contracted for consulting services, there is technically
no reason why the ultimate decision makers themselves cannot grasp and wield the known
and proven forecasting tools.
Computational vs. Judgmental Forecasting
A purely judgmental approach to forecasting might avoid the use of computational
techniques altogether. The exclusively judgmental approach relies upon the perceptiveness,
insight, and experience of the forecaster to produce the forecast of the future state.
Depending upon how consequential the decision and how able the forecaster, a purely
judgmental approach may yield satisfactory forecasts. In cases of more consequential
decisions or more dynamic situations, computational forecasting methods may be in order.
However, it does not necessarily follow that a computational approach will be able to
improve upon informed judgment.
Science vs. Art
Lawrence Salzman has provided a meaningful distinction between science and art in the
forecasting realm. Salzman suggests that after the usual adjustments are made to the
data in a computational approach, "From then on the science melts to a degree, and the
liquid part is called art. We can define the artist in most general terms as one who
knows the science of his subject and is able to adapt it to his needs."
(Computerized Economic Analysis, McGraw-Hill, 1968, p. 73) As Salzman further
notes, computational forecasting techniques are little more than potentially valuable
tools; they can enable the forecaster "to gain insight and help him to make more
sophisticated value judgments." Three important points follow from Salzman's
distinction:
Model Method
The forecasting techniques described in the ensuing pages of this chapter are those of
model methodology. A brief review of model formats will provide a platform for
extending them in the realm of forecasting. If continuous variation may be depicted in
two- or three-dimensional graphic space, the model may be represented in mathematical
form as a generalized functional-notation statement, e.g., y = f(x) or y = f(x1, x2, ..., xn).
Models for Forecasting
Equations (2), (4), and (5) above are possible formats (among many others) for forecasting
models, but to be useful as such a couple of adjustments must be made. First, a lead-lag
structure needs to be built into the independent-dependent variable relationship.
Otherwise, we shall find ourselves in the difficult position of having to predict
values of the independent variables before we can forecast values of the dependent
variable. The structuring of a lead-lag relationship may be accomplished by pairing
dependent variable values with independent variable observations occurring one or
more periods earlier in time. Equation (2) may thus be recast as yt = f(xt-i), pairing the
dependent variable at period t with the independent variable observed i periods earlier.
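If the data matrix is held in a pandas DataFrame, the lead-lag pairing can be set up by shifting the independent-variable column; in this sketch the column names and numbers are placeholders:

    import pandas as pd

    # Placeholder data matrix; Y1 is the object series and X1 a candidate predictor.
    df = pd.DataFrame({"Y1": [233.4, 236.6, 239.8, 242.7, 245.2],
                       "X1": [10.1, 11.3, 12.0, 12.8, 13.5]})

    lag = 1                                   # lead-lag interval in periods
    df["X1_lagged"] = df["X1"].shift(lag)     # pairs y(t) with x(t - lag)
    model_data = df.dropna()                  # drop the rows left vacant by the shift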
Data Sources and Types
Where does one find data? What kinds of data are needed? How much data? Data may be
obtained by conducting experiments, from surveying opinions, preferences, or expectations,
or from historical sources. With notable exceptions, social scientists have generally
shied from experimentalism, preferring to rely upon surveys and recorded history for
their data. The on-going natural courses of human interaction constitute the most
prolific of stochastic processes. But human experience becomes data only when human
beings go to the trouble (and expense) of observing and recording that information for
later consideration. There are cost implications to the recording of history, no less
so than the conducting of surveys.
Specification of a Forecasting Model
The process of specifying a forecasting model involves (1) selecting the variables
to be included, (2) selecting the form of the equation of relationship, and (3) estimating
the values of the parameters in that equation. After the model is specified, (4) its
performance characteristics should be verified or validated by comparison of its
forecasts with historical data for the phenomenon it was designed to forecast.
The Data Matrix
As noted above, data may be collected from experiments, by conducting surveys, or
from historical sources. Regardless of the source of the data, a convenient form in
which to organize it for statistical analysis is the data matrix, a
rectangular array of numbers presented in a row-and-column format. The columns and rows
may be assigned either identity as desired, but for our purposes it will be
convenient to construe the columns as "variables," and the rows as "cases" or
"observations."
Data Requirements
How much data are required in order to specify the forecasting models described in
this chapter? Unfortunately, there is no single answer to this question. If after
examining these models the reader decides that some of the naive models should be
adequate, the maximum number of required columns in the data matrix is only one, that
of the series to be forecasted. Should the reader want to try the multivariate models
described in the last section of the chapter, the data matrix must have at least two
columns, one to contain the so-called object series, and one or more additional
columns to contain the independent variable or predictor series.
Transformations
After the matrix has been populated with original data, the analyst may find that other
versions or transformations of the data in certain columns are needed in some of the
forecasting models. For example, it may be desirable to lag or lead all of the values
in one of the columns relative to data in the other columns. This would require that
the data in that column be shifted upward or downward by the requisite number of rows.
In a regression model, it may be desirable to use the squared or logarithmic values of
the data in some column as a dependent or independent variable. If such data
transformations are required, it will be convenient in structuring the data matrix
to allow several vacant columns beyond the original data columns to receive the
transformed data.
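Typical transformations of this kind might be sketched with pandas and numpy as follows (column names and values are again placeholders; lagging was illustrated earlier):

    import numpy as np
    import pandas as pd

    # Placeholder data matrix holding the original columns.
    df = pd.DataFrame({"Y1": [233.4, 236.6, 239.8, 242.7, 245.2],
                       "X1": [10.1, 11.3, 12.0, 12.8, 13.5]})

    df["X1_sq"] = df["X1"] ** 2               # squared values for a curvilinear term
    df["Y1_log"] = np.log(df["Y1"])           # logarithmic transform of the object series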
What's Ahead
Once a potentially forecastable data series has been entered into a matrix, the analyst
must consider criteria for selecting appropriate forecasting models. Chapter 2 surveys
the criteria which are available for this purpose.
<>
Once data have been captured for the time series to be forecasted, the analyst's next step
is to select a model (or models) which has potential for producing successful forecasts.
Various statistical and graphic techniques may be useful to the analyst in the selection
process.
Sequence Plots
The best place to start with any time series forecasting analysis is to graph sequence
plots of the time series to be forecasted. A sequence plot is a graph of the data series
values, usually on the vertical axis, against time (or the matrix row counter) usually on
the horizontal axis. The purpose of the sequence plot is to give the analyst a visual
impression of the nature of the time series. This visual impression should suggest to the
analyst whether there are certain behavioral "components" present within the time series.
The conventional approach to time series analysis is to presume that any time series may
consist of several possible components, depending upon the standard intervals over which
observations were recorded.
Figure 2-1. Sequence plot of trend for Series Y1.
VARIABLE: ORIGINAL = *
STANDARD DEVIATION = 9.9184
MEAN = 248.2333
VARIABLE: TREND = .
STANDARD DEVIATION = 6.6418
MEAN = 248.6420
[Sequence plot of series Y1 (*) and the fitted linear trend line (.) for observations 7 through 84, plotted as standardized deviations (-3 to +3) from the series mean.]
VAR 1 EXPECTED NUMBER OF RUNS IF SERIES IS RANDOM: 40
VAR 1 ACTUAL NUMBER OF RUNS: 7
Trend. The sequence plot of a time series may be said to exhibit trend if the
data path is not approximately level, but appears to change consistently in the same
direction (which may be upward or downward). It is common to fit a so-called "trend
line" to the data path so as to minimize the deviations (actually the squares of the
deviations) of the fitted line from the plot of the points. Regression techniques for
fitting a trend line are described below. The trend is said to be linear if the slope
of the fitted trend line does not change, and non-linear if the slope does change over
the course of the series. The dot (.) symbols in Figure 2-1 identify the path of a
fitted positive-slope, linear trend line.
Figure 2-2. Trend-cycle series derived from series Y1.
VARIABLE: TREND-CYCLE = +
STANDARD DEVIATION = 8.6293
MEAN = 247.9816
VARIABLE: TREND = .
STANDARD DEVIATION = 6.6418
MEAN = 248.6420
[Sequence plot of the trend-cycle series (+) derived from Y1 and the fitted linear trend line (.) for observations 7 through 84, plotted as standardized deviations (-3 to +3) from the series mean.]
Autocorrelation
In any time series containing non-random patterns of behavior, it is likely that any
particular item in the series is related in some fashion to other items in the same
series. If there is a consistent relationship between entries in the series, e.g.,
the 5th item is like the 1st, the 6th is like the 2nd, and so on, then it should be
possible to use information about the relationship to forecast future values of the
series, i.e., the 33rd item should be like the 29th. In this case we may say that
the series has some ability to forecast itself because of autocorrelation (or
self-correlation) among values within the series.
Other Independent Variables
Many time series are adequately self-predictive employing naive forecasting rules,
moving average models, autoregressive models, or some combination. However, if none
of these approaches can adequately forecast the time series, then there are two
remaining possibilities. Either the series is characterized so extensively by random
noise that its behavior is simply not forecastable, or the behavior of the series is
influenced by some other or "outside" phenomena not included in the model. The latter
case implies the existence of specification errors. Analysts are reluctant (perhaps as
a matter of pride) to admit that a series is so random in behavior as to be
unforecastable. So, ruling out this possibility for the moment, analysts are led to
search for other data series which might explain the behavior of the object series.
The Correlation Matrix
A statistical correlation matrix, an example of which is illustrated in Figure 2-3, may
enable the analyst to assess the strength of relationships between the object series
and other possible series and among the other series. Each number in the correlation
matrix, identified by a column header and a row descriptor, indicates the degree of
relationship (i.e., the correlation) between the respective variables.
Figure 2-3. Correlation matrix for variables Y1, X1, and X2.
MATRIX OF COEFFICIENTS OF CORRELATION BETWEEN PAIRS OF VARIABLES IDENTIFIED HORIZONTALLY
AND VERTICALLY:
1 2 3
1 1.0000 0.9087 -0.1977
2 0.9087 1.0000 -0.5767
3 -0.1977 -0.5767 1.0000
The range of the correlation coefficient, usually denoted by the symbol r or R, is from
-1 to +1. Correlation coefficients close to zero (positive or negative) imply negligible
relationship, and will correspond to a scatter diagram within which plotted points are
dispersed across the coordinate space in no discernible pattern. Correlation coefficients
approaching 1 (positive or negative) indicate a strong relationship, corresponding to a
scatter diagram with plotted points lying close to a line which can represent the
relationship. Positive correlation coefficients imply a direct relationship (i.e.,
both variables changing in the same direction); negative correlation coefficients
suggest an inverse relationship (one variable increases while the other decreases).
The principal diagonal of the correlation matrix is populated by unity numbers (1.0000),
signifying perfect correlation between each variable and itself.
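A correlation matrix like that of Figure 2-3 can be computed directly, as in this sketch; the observations here are placeholders, not the series behind Figure 2-3:

    import numpy as np

    # Placeholder observations; in practice these are the columns of the data matrix.
    y1 = [233.4, 236.6, 239.8, 242.7, 245.2, 252.2]
    x1 = [10.1, 11.3, 12.0, 12.8, 13.5, 14.2]
    x2 = [5.2, 5.0, 4.9, 4.7, 4.6, 4.4]

    data = np.column_stack([y1, x1, x2])       # rows are observations, columns are Y1, X1, X2
    corr = np.corrcoef(data, rowvar=False)     # 3 x 3 matrix of pairwise correlation coefficients
    print(np.round(corr, 4))                   # the principal diagonal is 1.0000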
What's Ahead
Chapters 1 and 2 have examined the possibility of forecasting various aspects of the firm's
situation, and the criteria for selection of a forecasting model. A range of naive
time series forecasting rules and techniques which may be of use to the manager is
surveyed in Chapters 3 and 4. Chapter 5 elaborates multivariate forecasting techniques.
Chapters 6, 7 and 8 describe forecasting techniques which employ combinations of approaches.
Table 2-1. Monthly data for series Y1.
MTH YEAR1 YEAR2 YEAR3 YEAR4 YEAR5 YEAR6 YEAR7
1 233.4 249.4 234.1 240.7 245.6 252.1 253.3
2 236.6 229.2 234.1 231.3 244.9 253.1 252.0
3 239.8 229.5 229.7 245.3 248.3 254.3 256.2
4 242.7 231.3 233.6 244.5 251.9 255.3 257.0
5 245.2 233.7 234.1 249.1 253.9 256.4 259.3
6 252.2 238.6 238.4 249.7 260.3 259.4 265.9
7 252.3 239.7 237.0 248.1 255.1 257.5 259.9
8 252.2 241.1 236.2 248.6 255.8 260.1 261.8
9 251.1 238.0 237.1 249.1 255.8 257.3 264.6
10 251.1 240.8 235.5 247.4 255.5 256.8 264.4
11 252.2 240.2 236.0 248.2 255.1 253.6 262.7
12 251.7 242.3 232.8 246.6 254.3 257.2 263.9
<>
In Chapters 1 and 2 we examined the forecasting environment and the criteria for selecting
potentially effective forecasting models. In this chapter we survey various types of time
series forecasting rules which have found widespread use in the managerial decision
context.
Naive Rules
Naive rules are simple but potentially effective time-series forecasting techniques.
They are rules in the sense that they are prespecified so that no parameter values need be
estimated. The naivete is implicit in the fact that the basis for any naive forecast of a
time series is the time series itself. The series is used to predict itself, that is,
historical values of the series are used to compose or construct future values of the same
series. The technique of naive forecasting is therefore extrapolation.
Classes of Simple, Naive Rules
Every time series exhibits variation in observed values between each observation and
the next across the entire span of the series. Any time series may be presumed to consist
of one or more types of variation: seasonal, cyclical, trend, and irregular. The method
of analysis is to attempt to account for each of the types of variation present in the
series. Some analysts describe the irregular variation as "random noise" if it meets
certain criteria. If a series is composed exclusively of such random noise variation,
it may not be possible to forecast its values reliably using any of the rules described
in this text.
A. Default Forecast Rules
Rational people often make decisions without first engaging in any sort of explicit
forecasting effort. When they do so, they engage in what we described in Chapter 1 as
default forecasting, the presumption that the future state will be similar to the present
or recent past. The default forecast may be adequate for dealing with many of the
minimal-consequence decisions of daily life. Thus, default forecasting is not necessarily
irrational.
B. Rules which Address the Trend Factor
Trend is the phenomenon of a long-term change in a recorded data series, generally
in the same direction throughout the span of the series. The presence of trend may not
be discernible in a few consecutive observations within the series, especially if other
types of variation (seasonal, cyclical, or purely random) are present. A sequence plot
of a time series (the time series values plotted vertically with respect to time itself
on the horizontal axis) will usually reveal the presence of trend as a gentle upward or
downward "drift" of the data path. An upward sloping trend path in a real-value time
series may be indicative of a growth phenomenon; a downward-sloping path suggests
contraction. In a money-value time series, an upward-sloping path may represent some
combination of real growth and inflation; a downward-sloping trend path might indicate
contraction with deflation.
C. Rules which Address the Seasonality Factor
Seasonality is a pattern of variation within a time series which repeats itself
from year to year. Seasonality may be associated with agricultural functions, seasonal
weather patterns, custom and convention, or religious or secular holidays. It is
important to remember that a seasonal pattern in one time series may or may not resemble
that in another time series.
D. Rules which Account for Both Trend and Seasonality
Even though the approach to seasonality employed in Rule C.1 is naive to the point
of shortsightedness, it may enable a valuable modification to the trend adjustment factors
of Rules B.3 and B.4. These rules modified to take into account the seasonal aspects
of the most recent year may be recast as:
Rule D.1: yt+i = yt+i-12 + i(Σ(yk - yk-1)/(m-1)), summed from k=2 to m.
Rule D.2: yt+i = yt+i-12(Σ(yk/yk-1)/(m-1))^i, summed from k=2 to m.
We have described nine simple, naive rules, but we have not exhausted the possibilities for simple, naive format rules. Readers are encouraged to devise and try versions of these approaches which are specific to their own forecasting contexts.
The Mean Squared Error
Any of the simple, naive rules which we have described in this chapter could yield satisfactory forecasting results, even though each suffers some conceptual deficiencies. Some method is needed for making comparisons among them. The Mean Squared Error (MSE) for any forecasting rule is a measure of the average forecast error when the rule is applied to the original data series for which the rule was developed. Once a time-series rule has been conceptualized, it may be tested against any particular time series by using it to forecast as many values within the time series as possible. Although it would be more appropriate from a conceptual standpoint to validate the rule with data from outside the range used for rule construction, such additional data may not be readily available.
For Rules A.1 through B.4, if the original time series is m observations long and the forecast gap is i periods, then forecasts of the last m-i-1 observations can be made. For Rules C.1 through D.2, forecasts can be made for the last m-i-12 observations. For each of these observations, the forecast error is the difference between the forecasted value and the actual observation.
The MSE can thus be computed for any time-series rule or model which can be conceptualized. The rule or model with the smallest MSE would thus have the best forecasting record over the period encompassed by the original time series. Will it also be the most effective rule for forecasting beyond the end of the series? The answer depends upon whether the same conditions persist beyond the end of the series.
The Standard Error of the Estimate (SEE), the square root of the MSE, is the standard deviation of the forecast errors. Its usefulness is in specifying a tentative confidence interval for a point estimate forecast (we should note that many statisticians feel that confidence intervals estimated for time series are questionable). Assuming that the forecast errors are normally distributed and that past trends continue into the future, there is a 95 percent chance that a future value, when it occurs, will lie within approximately two standard errors of the point estimate made with the rule. Other approximate confidence intervals may also be computed.
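The MSE and SEE calculations described above might be sketched as follows, where forecast stands for any rule written as a function of the history observed so far and the forecast gap i (compatible with the hypothetical rule sketches shown earlier):

    import math

    def mse_and_see(series, forecast, i=1, start=13):
        """Mean squared error and standard error of the estimate for a forecasting rule.

        forecast(history, i) returns the forecast i periods beyond the last value in history."""
        squared_errors = []
        for t in range(start, len(series) - i):
            error = forecast(series[: t + 1], i) - series[t + i]
            squared_errors.append(error ** 2)
        mse = sum(squared_errors) / len(squared_errors)
        return mse, math.sqrt(mse)             # SEE is the square root of the MSE

    # Example with the no-change (Rule A.1 style) forecast: the most recent observation.
    # mse, see = mse_and_see(y1_series, lambda history, i: history[-1], i=1)
    # An approximate 95 percent interval for a point forecast f is f - 2*see to f + 2*see.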
On Horse Races
While the theoretical approach might permit economy in the specification of models and the empirical analysis, it may also result in a certain opportunity loss. What if a "satisfactory" model is identified before the best model available is discovered? The analyst may even run through the entire model "stable" before identifying a "satisfactory" model. In this latter situation, the horse-race approach might just as well have been used from the start.
The horse-race approach might permit us to avoid or minimize the amount of conceptual analysis, but require a larger volume of empirical computation. The resulting lowest-MSE model might still not yield satisfactory forecasting results, in which case the analyst must either give up or pursue the theoretical approach to develop yet other possible models.
So we are left with a dilemma: time each horse separately until a fast-enough horse is found; or run a horse race to find a winner? The analyst will have to make a procedural choice at this point. But we must offer one parting caution: a horse which is fast enough for one rider, or which wins one race, will not necessarily be fast enough for any other rider or to win another race. One should not jump to the conclusion that a universal forecasting rule or model has been found simply because it can do an adequate job on one time series.
What's Ahead
This brings us to an end of our survey of time series forecasting rules. A combination of these techniques, time-series decomposition, is described in Chapter 6. Another more sophisticated time series forecasting technique, ARIMA modeling, is described in Chapter 8. These and similar techniques, together with the capability of simulation modeling, may provide the informational basis of both strategic planning and tactical decision making. We now shift to moving average modeling in Chapter 4.<>
In this chapter we examine a class of naive forecasting models which are more complex than the forecasting rules described in Chapter 3. Moving average models, which function to generate a new series by computing moving averages of the original series, are oriented primarily toward removing the seasonal and irregular components or isolating the trend-cycle components of a time series. The newly generated series is a "smoothed" version of the original series.
The Smoothing Process
Moving average models function to smooth the original time series by averaging a rolling subset of elements of the original series. The subset of the original series consists of an arbitrarily selected number of consecutive observations. The subset "rolls" or "moves" forward through the series starting from the earliest observation in the series, adding a new element at the leading edge while deleting the earliest element at the trailing edge, with each successive averaging process.
The effect of the moving average process is to ameliorate the degree of variation within the original series by composing the new smoothed series. It is possible to follow a first smoothing of a series with another smoothing of the successor series. The second smoothing may be followed by yet other smoothings. The moving average process may be used for two purposes: to remove unwanted variation from a time series, and as a forecasting model.
Removing Unwanted Variation
Moving average routines may be designed to remove the seasonal and random noise variation within a time series. If the moving average routine is used repeatedly on each newly-generated series, it may succeed in removing most of any cyclical variation present. What is left of the original series after early smoothings to remove seasonal and random or irregular components is a successor series retaining some combination of trend and cyclical behavior. If no trend or cyclical behavior is present in the time series, the smoothings may leave a successor series which plots as a nearly horizontal line against time on the horizontal axis. Assuming the presence of trend and cyclical behavior in the original series, the moving average process provides a method of isolating it.
While successive applications of an efficient moving-average routine may result in filtering out all variation other than the trend and cyclical behavior from an original series, this may not be the objective. Rather, the analyst may wish to filter out only the seasonal or only the irregular variation. Either may be targeted by judiciously selecting the number of elements to be included in the moving average subset, and by designing an appropriate weighting system to accomplish the objective. For example, the U.S. Department of Commerce typically uses an unweighted moving average to filter out the seasonality from a series, then a judiciously designed weighted moving average to filter out the irregular variation.
An unweighted moving average with a relatively small number of elements (say five to seven) will have its smoothing effect without destroying the seasonality present in a series. A moving average with a larger number of elements (eleven or more) with weights designed to emphasize the elements toward the center of the subset will likely be even more efficient in removing the irregular variation, but will tend also to destroy any seasonality still present.
If the analyst's intention is to deseasonalize a time series, a number of moving-average elements in the neighborhood of eleven to thirteen is called for. An odd number of elements is more easily handled than is an even number due to the need to center the moving averages relative to the object series. Also, an appropriately-designed weighting scheme applied to the elements of the moving average may serve to improve the efficiency of the seasonality removal process.
Unweighted Moving Average Models
We shall designate all unweighted moving average models with the number of elements to be specified by the analyst as Class U.k models. The general form of the unweighted, centered moving average model with an odd number of subset elements may be specified as
Model U.k: MAt = (1/k) Σ yj, j from t-((k-1)/2) to t+((k-1)/2),
where y is an observation in the original series at row t, k is the number of elements in the moving average, and j is the subset element counter.
Subjectively-Designed Weighting Factors
To this point we have made only passing references to the possibility of applying weighting factors to the elements of the moving average subset. If no explicit weights are used, then implicit weights of unity (value 1) are applied to each element in the subset, and the sum of the subset values must be divided by the sum of the weights (the number of elements times the weight of each) in computing each average.
The analyst may choose to use subjectively-determined, non-unitary weights to be applied to the subset elements in computing the averages. A typical scheme is to design the element weighting system so that the sum of the weights is unity (or 100 percent). In this case, each element is multiplied by its assigned fractional (or decimal value) weight, and it is unnecessary to divide the sum of the weighted values by the sum of the weights in order to compute the average, unless toward the end of the series the number of elements is diminishing.
For our purposes, all weighted moving average (WMA) models where the analyst both specifies the number of elements and subjectively determines the weights will be designated as Class W.k models. The general format of the Class W.k models may be specified as,
Model W.k: MAt = Σ Wp yj, j from t-((k-1)/2) to t+((k-1)/2), p from 1 to k,
where W is an element weighting factor applied to the jth element in the moving average, and p is the element counter subscript.
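Both the Class U.k and Class W.k averages can be sketched with a single Python routine; passing no weights gives the unweighted case (the function name is illustrative):

    def centered_moving_average(y, k, weights=None):
        """Centered moving average with an odd number of elements k; unweighted if no weights given."""
        half = (k - 1) // 2
        if weights is None:
            weights = [1.0 / k] * k                     # Class U.k: equal weights summing to one
        smoothed = []
        for t in range(half, len(y) - half):
            window = y[t - half : t + half + 1]
            smoothed.append(sum(w * v for w, v in zip(weights, window)) / sum(weights))
        return smoothed                                 # k - 1 fewer elements than the original series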
Computed Weighting Systems
Instead of subjectively designing a set of weighting factors, the analyst might opt for any of several commonly-used computed weighting systems. One such system was designed around the turn of the twentieth century by an actuary, J. Spencer, to smooth insurance policy-holder data so that insurance companies could devise premium rate structures associated with policy-holder age. The so-called Spencer Weighted Moving Average technique has been used extensively in a wide variety of applications, and continues to be used today. We shall not in this text attempt to specify the formulae for generating a set of Spencer weights; the interested reader may consult Lawrence Salzman's Computerized Economic Analysis (McGraw-Hill, 1968) for a full exposition of the method.
Exponentially-Weighted Moving Averages
The moving average models described thus far may be satisfactory for treatment of a time series which, if divided into subsets, would exhibit approximately the same mean for each subset as for the series as a whole. Consistently changing means from one subset to the next would imply the presence of a gradual trend factor.
Suppose, however, that some outside influence has affected the data for the series so that consecutive subsets of the series would exhibit significantly different means which appear to have little or no relationship to each other. Exponentially weighted moving average (EWMA) models provide some ability to adapt to the changing-mean phenomenon. The basic form of an EWMA model is,
Model E.1: St = a·yt + (1 - a)·St-1, where St is the smoothed value at period t and a is a smoothing constant between 0 and 1.
Model E.2:
The EWMA process smooths seasonal and irregular variation out of an original series, and may cause loss of some of the trend/cyclical variation as well. For forecasting purposes, it may be desirable to modify the EWMA model to avoid the trend loss and attempt to account for seasonal behavior. While such modifications are feasible, there appears to be no way to make an EWMA forecasting model account for cyclical variation.
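A sketch of the basic EWMA recursion (Model E.1 as reconstructed above), with smoothing constant a between 0 and 1:

    def ewma(y, a=0.3):
        """Exponentially weighted moving average: s[t] = a*y[t] + (1 - a)*s[t-1]."""
        smoothed = [y[0]]                  # start the smoothed series at the first observation
        for value in y[1:]:
            smoothed.append(a * value + (1 - a) * smoothed[-1])
        return smoothed

    # A larger smoothing constant a gives more weight to recent observations and
    # adapts more quickly to a shifting mean.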
Moving Averages as Forecasting Models
Any of the moving average routines described in this section may be used as forecasting models with a variable forecasting gap (i.e., lag between the value forecasted and the base value upon which it is constructed). Using the symbol i for the forecast gap, t for the subscript of the observation upon which the forecast is based, y to represent the forecasted value of the original series, and MA to represent any of the moving averages described in this chapter, the forecast model may be specified as
yt+i = MAt,
or, if seasonality is thought to be present in the series being forecasted,
yt+i = MAt+i-12.
What's Ahead
This brings us to an end of our survey of naive time series forecasting techniques. A combination of these techniques, time-series decomposition, is described in Chapter 6. Another more sophisticated time series forecasting technique, ARIMA modeling, is described in Chapter 8. These and similar techniques, together with the capability of simulation modeling, may provide the informational basis for reducing decision risk in both strategic planning and tactical decision making. We now shift to econometric forecasting techniques in Chapter 5.<>
The term "econometric" refers to the application of statistical regression techniques to the process of economic modeling. The purpose of the regression analysis is to estimate the values of the parameters in a model which best fits the characteristics of a phenomenon which is the object of analysis.
Our approach to the forecasting capabilities of regression analysis is purely from a user perspective. Being a user of statistics requires knowing what a statistical procedure is supposed to do, how to enter data into its computational routine, how to capture the computed results, and how to interpret the results to give them contextual meaning. The subject of this chapter addresses only the user requirements; one who is interested in the mathematical theory behind the regression analysis should consult a statistics text. Our intent here is to examine its application only to the forecasting context.
We shall examine simple and multiple regression forecasting models in a subsequent section. Our initial task is to develop the concept of univariate regression and demonstrate its applicability to forecasting. Within a time-series framework, a univariate regression analysis may be contrived from a simple regression context by (a) introducing an artificial independent variable associated with the sequence of observations in the object series, or (b) deriving a series of observations from the object or dependent variable series to serve as the independent variable. The former is referred to as trend regression; the latter as auto- (or self-) regression.
Trend Regression
In trend regression, the independent variable is taken to be the observation counter or any linearly-increasing numeric observation identifier. Suppose that for the time-series context we settle upon the convention of using the row-wise observation counter, represented by the symbol t. The trend regression model may then be represented in functional-notation format as y = f(t), estimated in linear form as y(t) = a + b*t.
The perceptive reader may object that this is essentially a standard simple regression model into which a sequential observation identifier has been inserted as the independent variable. While this is of course correct, it is nonetheless also true that no information other than the object series and its observation sequence is necessary in order to accomplish the trend regression.
The sequence plot illustrated in Figure 1-1 of Chapter 1 exhibits a great deal of "scatter" about a gently upward-sloping path from left to right as time "passes" on the horizontal axis. If the plotted points are connected in sequence, the emerging line has a jagged appearance due to the presence of random noise within the series. If one were seeking a mathematical function to represent the behavior of the data series over time, i.e., the trend behavior of the series, it would be very difficult (likely impossible) to construct a single equation for the jagged line. Alternately, it is possible to draw a smooth line, free-hand style, through the data plot. The smooth line might be straight or curvilinear, but in either case it is easier to devise or construct a mathematical function to represent it than one to represent the jagged line formed by sequentially connecting the plotted points.
Depending upon the amount of variation in the object series, a trend regression equation may be able to estimate, with some error, values of existing observations within the series, and to predict values of hypothetical observations beyond the end of the series. This latter possibility constitutes the potential of trend regression to serve as a forecasting technique. Given such a trend regression equation developed from a "least-squares" regression procedure on data for a time series, future values of the series may be forecasted by inserting the observation counter corresponding to the target date into the regression equation and solving for the dependent variable value. The error involved in estimating or predicting such values constitutes a potentially serious problem, especially if there is much cyclical, seasonal, or purely random variation present within the series.
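A short Python sketch of trend-regression forecasting follows; the illustrative series and the three-period forecast horizon are assumptions for demonstration.

    import numpy as np

    y = np.array([233.4, 236.6, 239.8, 242.7, 245.2, 252.2, 252.3, 252.2, 251.1, 251.1])
    t = np.arange(1, len(y) + 1)          # the observation counter serves as the regressor

    b, a = np.polyfit(t, y, 1)            # least-squares slope (b) and intercept (a)
    target = len(y) + 3                   # a target date three observations past the series
    print(f"y = {a:.4f} + {b:.4f}*t; forecast for t = {target}: {a + b * target:.2f}")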
Trend regression used for forecasting purposes can attempt to account only for the long-term average change in a series. Trend regression by itself cannot account for any cyclical, seasonal, or random variation present in the historical data series. Forecasts made with the trend regression equation will thus diverge from the actual value of the series when it does occur by the amount of seasonal, cyclical, and irregular influence.
The Autoregressive Model
An alternative regression approach is based on the premise that each observation in a time series is related in a consistent and identifiable way to one or more previous observations of the same series. In other words, the best predictor of any particular observation of a time series may be some earlier value(s) of the same series. The simple statistical regression model may be employed to try to discover such a relationship if it exists. With the object series as dependent variable, the approach is to generate an independent variable series from the object series by shifting the dependent variable data downward in the data matrix by the number of rows corresponding to the required order of autoregression in order to compose the independent variable data series. The form of such a relationship may be expressed in functional-notation format as y(t) = f(y(t-k)), where k is the order of the autoregression.
The autoregressive model may employ the structure of the multiple regression model to add terms for successively earlier observations of the object series. These will serve as independent variable values relative to each observation of the object series. The general form of the kth-order auto-regressive model is y(t) = a + b(1)*y(t-1) + b(2)*y(t-2) + ... + b(k)*y(t-k).
An autoregressive model within which the parameter values are neither zeros nor exceed unity in absolute value may have forecasting potential. In order to use such an autoregressive model for forecasting purposes, the analyst needs to know only the values of the k observations prior to the forecast target value. A forecast gap may be built into the relationship between the object-series dependent variable and its autoregressive terms. Once the appropriate prior-period values of the series are known, they may be entered into the autoregressive model so that it may be solved for the predicted value of the dependent variable. The final step in the process is then to assess the level of confidence which can be placed in the forecasts so constructed.
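A minimal sketch of fitting a second-order autoregression by ordinary least squares and producing a one-step-ahead forecast appears below; the lag-matrix helper, the choice k=2, and the illustrative data are assumptions, and the statsmodels library is used only as one convenient OLS routine.

    import numpy as np
    import statsmodels.api as sm

    def fit_ar(y, k):
        # Fit y(t) = a + b(1)*y(t-1) + ... + b(k)*y(t-k) by ordinary least squares,
        # building the lagged columns by shifting the series down 1..k rows.
        y = np.asarray(y, dtype=float)
        X = np.column_stack([y[k - j:len(y) - j] for j in range(1, k + 1)])
        return sm.OLS(y[k:], sm.add_constant(X)).fit()

    series = [15.7, 14.3, 13.8, 13.6, 12.4, 13.9, 15.4, 13.8, 14.1, 13.5, 14.1, 15.0]
    fit = fit_ar(series, k=2)
    a, b1, b2 = fit.params
    print("one-step-ahead forecast:", a + b1 * series[-1] + b2 * series[-2])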
Multiple Regression Models
Some series can be adequately forecasted with reference to trend or earlier values of the same series. But other series can be forecasted only inadequately in this manner. As we noted above, there are two possibilities for these series: either they are characterized so extensively by random noise that they are unforecastable, or there are one or more other phenomena which govern or influence the behavior of the series. If comparable time series for these other phenomena can be acquired, then conventional simple or multiple regression procedures may be implemented to model and forecast the object series.
Once a multiple regression model has been specified and the parameter values estimated, the analyst may discern the predictive ability of each of the included independent variables by examining the inference statistics for each of them. Any independent variable which in the judgment of the analyst does not make a satisfactory contribution to the explanation of the behavior of the dependent variable series may then be deleted from the model when the model is respecified.
Some statistical software packages include options for stepwise deletion of inadequately contributing independent variables from the model according to some criterion specified by the programmer or the analyst. In the stepwise regression procedure, the full model including all variables selected by the analyst is first estimated. Then the model is automatically respecified in subsequent steps, omitting one variable in each step, until only one independent variable remains in the model. The analyst may then inspect the sequence of model specifications, looking for a significant drop in the overall level of explanation of the behavior of the dependent variable. Once this loss is identified, the model specified prior to the deletion of the independent variable resulting in the significant loss is the optimal model.
Non-linear Regression Models
The simple regression model, linear in its equation (2) format, can be extended to the nonlinear forms of exponential and geometric relationships by use of the logarithmic transformation. The exponential form, y = a*e^(b*x), becomes linear in logarithms as ln(y) = ln(a) + b*x, and the geometric form, y = a*x^b, becomes linear as ln(y) = ln(a) + b*ln(x), so that either may be estimated with an ordinary least-squares procedure applied to the transformed data.
As we have already shown in regard to autoregression, the multiple regression model can be extended to other contexts. Another potentially productive extension of multiple regression is into the realm of the polynomial relationship. The polynomial equation includes one independent variable raised to successively higher powers. For example, a quadratic polynomial equation includes linear and second-order (or squared) terms in the format y = a + b(1)*x + b(2)*x^2.
The analyst should consider a polynomial form of relationship when the scatter diagram exhibits a curved path which is not apparently amenable to exponential or geometric modeling. As a general criterion, the analyst should specify a polynomial equation of order k equal to the number of directions of curvature apparent in the scatter diagram, plus 1. For example, if the scatter diagram exhibits one direction of curvature, then a k=2, or second-order, regression model should be specified. If the scatter diagram exhibits two directions of curvature, a k=3 or third-order (cubic) model of form y = a + b(1)*x + b(2)*x^2 + b(3)*x^3 should be specified.
Finally, we should note that the multiple regression format can accommodate a mixture of all of the formats described to this point. For example, suppose the analyst finds that trend is a significant predictor of the behavior of the object series, but that the explanation needs to be supplemented by the presence of two other independent variables, x1 and x2, the first linear and the other in a second-order relationship. Such a regression model might have the form y = a + b(1)*t + b(2)*x1 + b(3)*x2 + b(4)*x2^2.
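A sketch of estimating such a mixed model follows; the data are synthetic, generated only to make the example self-contained, and the statsmodels OLS routine is one of several libraries that could be used.

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(0)                    # hypothetical data for illustration
    n = 60
    t = np.arange(1, n + 1)                           # trend counter
    x1 = rng.normal(10.0, 2.0, n)                     # predictor entered linearly
    x2 = rng.normal(5.0, 1.0, n)                      # predictor entered to second order
    y = 200 + 0.4 * t + 1.5 * x1 + 0.3 * x2**2 + rng.normal(0.0, 2.0, n)

    # y = a + b1*t + b2*x1 + b3*x2 + b4*x2^2
    X = sm.add_constant(np.column_stack([t, x1, x2, x2**2]))
    print(sm.OLS(y, X).fit().params)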
Inferences about the Regression Model
Regression analysis purports to provide answers to a very specific question: "What is the nature of the relationship between the dependent variable and an independent variable?" The question is answered by estimating values of the parameters in a best-fit equation. But regression analysis begs two other very important questions: "Is there a significant relationship between the selected variables?" and, if so, "How strong (or close, or reliable) is the relationship?" If there is no significant relationship, or even if the existing relationship is only trivial, an automated regression analysis will dumbly estimate the values of the parameters. It is therefore necessary to delve into the significance of the estimated relationships. The existence and significance of an hypothesized relationship should perhaps be brought into question even before the regression analysis is conducted. It is the purpose of statistical inference analysis to assess the strength or quality of an estimated regression model.
Two statistics conventionally computed in inference analysis when regression models are estimated, the mean squared error and the standard error of the estimate, may be used for comparing regression forecasting models with the naive models described in other sections of this chapter and for specification of forecast confidence intervals (e.g., 95 and 99 percent).
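The sketch below computes the mean squared error, the standard error of the estimate, and a rough normal-approximation forecast interval; the function name, the degrees-of-freedom convention, and the tiny illustrative data are assumptions for demonstration.

    import numpy as np

    def forecast_interval(actual, fitted, n_params, point_forecast, z=1.96):
        # MSE and standard error of the estimate from the residuals, then a rough
        # interval about a point forecast (z = 1.96 for ~95 percent, 2.576 for ~99 percent).
        resid = np.asarray(actual, dtype=float) - np.asarray(fitted, dtype=float)
        mse = np.sum(resid**2) / (len(resid) - n_params)
        see = np.sqrt(mse)
        return mse, see, (point_forecast - z * see, point_forecast + z * see)

    actual = [233.4, 236.6, 239.8, 242.7, 245.2]
    fitted = [234.0, 236.1, 240.2, 242.0, 245.9]
    print(forecast_interval(actual, fitted, n_params=2, point_forecast=250.0))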
What's Ahead
This brings us to an end of our survey of econometric forecasting techniques. Regression techniques are employed in time-series decomposition, as described in Chapter 6. Combined autoregressive and moving average forecasting techniques are described in Chapter 8. Regression techniques, together with the capability of simulation modeling, may provide the informational basis of both strategic planning and tactical decision making.<>
In this chapter we describe a forecasting approach, time series decomposition, which brings together approaches introduced in Chapters 4 and 5.
Time Series Decomposition
Pioneering development of the techniques for decomposing a time series into constituent components was conducted for and by the U. S. Bureaus of the Census and Labor Statistics during the first half of the twentieth century. The object of such work was the seasonal adjustment of time series data. A beneficial spin-off has been the ability to analyze cyclical behavior. The two approaches emerging from the early pioneering work are today formally known as the Census Method II (in several variants) and the BLS method. The two methods are similar except in the ways in which they isolate one of the components. The ensuing discussion in this chapter follows the procedures of the BLS Method, but we will note how the Census Method II accomplishes the same end. While the objective of governmental agencies may be the seasonal adjustment of time series data, the techniques which have been developed can also constitute a powerful approach to forecasting time series behavior.
Components of a Time Series
The fundamental underlying assumption of this approach is that every time series is composed of a number of component parts which are in some way related to one another or the whole. As described in Chapter 2, the conventionally defined component parts are trend (T), cyclical (C), seasonal (S), and irregular (I) or random variation. It is possible that these four components are each independent of all of the others, so that the behavior of the series is simply the sum of its parts which are additively related. The majority of analysts familiar with the approach seem to be of the opinion that the component parts are unlikely to be perfectly independent of one another, and are therefore multiplicatively related.
Perhaps the easiest way to explain the process of time series decomposition is to describe the way in which a time series, known as the object series, may be decomposed into the component parts. From an object series written to column 1 of a data matrix are generated six additional series, four of which are the T, C, S, and I components. The seven series taken together constitute a "decomposition matrix."
Decomposition Techniques
The techniques used to decompose the object series are trend regression and "ratio to moving average" computations as developed by the Department of Commerce and the Bureau of the Census. The objective of these techniques is to identify both seasonality and cyclical behavior so that the former may be removed and the latter isolated for other purposes. Once the components of the time series have been separated into their own series, the reverse of the decomposition process, or recomposition, may be employed to construct forecasts of future values of the object series.
Before the decomposition process is started, the analyst should generate a sequence plot to identify the range over which trend is unidirectional. The trend estimation should then be conducted only over this range. The steps employed in the decomposition process are as follows (a programming sketch of the steps appears after the list):
a. The original or object series, regarded as containing all four components which are assumed to be multiplicatively related, i.e., OBJECT = T x C x S x I, is written to column 1 of the decomposition matrix.
b. A second series, written to column 2 of the decomposition matrix, is generated by smoothing the object series with a centered moving average. This is the first of two smoothing stages. If the smoothing is to be done by moving average, it is conventional to specify from 12 to 15 elements in the moving average set. The elements may be unweighted or weighted (user specified), or the user may choose a computed weight set. The problem of loss of data at the end of a centered moving average series may be handled by letting the number of elements in the set diminish to the number of remaining rows as the end of the series is approached. The interpretation is that seasonal and irregular influences are smoothed from the object series, leaving a combination TxC series which is written to column 2 of the decomposition matrix. The column 2 entries are numbers of the same magnitude as the column 1 object series.
c. A third series, written to column 3 of the matrix, is generated by computing the ratios of entries in column 1 (the object series, TxCxSxI) to the corresponding entries in column 2 (TxC), which by cancellation (or division) leaves a combination SxI series. This process is thus the basis for the name of the technique, "ratio-to-moving average." The column 3 entries are index numbers which vary about unity (1).
d. A fourth series, written to column 4 of the decomposition matrix, is generated by smoothing the SxI series in column 3 to eliminate the irregular influences, leaving an isolated seasonal series, S. This is the second-stage smoothing process. If the analyst opts for a moving average to accomplish the second-stage smoothing, it is conventional to employ about half as many elements as in the first-stage moving average. Again, the elements may be unweighted or weighted as specified by the user. All entries in the S series are totaled and averaged by months to constitute a set of twelve seasonal adjustment factors. The column 4 entries are index numbers which vary about unity.
e. A fifth series, written to column 5 in the decomposition matrix, is generated by computing the ratios of entries in column 3 (SxI) to the corresponding entries in column 4 (S), which by cancellation (or division) yields an isolated irregular, I, series. The column 5 entries are index numbers which vary about unity. (The Census Method accomplishes this by dividing a seasonally-adjusted original series, TxCxI, by a trend-cycle series, TxC, thus isolating the irregular component.)
f. A sixth series, written to column 6 in the decomposition matrix, is generated by trend regression on the unidirectional range of the object series. This series is interpreted as an isolated T series. The column 6 entries are numbers of the same magnitude as the object series numbers.
g. Finally, a seventh series, written to column 7 of the decomposition matrix, is generated by computing the ratios of the entries in the second column (TxC) to the corresponding entries in the sixth column (T), which by cancellation yields an isolated cyclical, C, series. The column 7 entries are index numbers which vary about unity.
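The sketch below follows the seven-column layout of the list in simplified form: it uses unweighted centered moving averages, does not shrink the windows at the ends of the series, and generates a synthetic series only so that the example is self-contained. The function and variable names are assumptions, not part of the BLS or Census procedures.

    import numpy as np
    import pandas as pd

    def decompose(y, first_window=12, second_window=5):
        # Ratio-to-moving-average decomposition of a multiplicative series y = TxCxSxI.
        y = pd.Series(np.asarray(y, dtype=float))
        t = np.arange(1, len(y) + 1)
        tc = y.rolling(first_window, center=True).mean()    # col 2: first smoothing -> TxC
        si = y / tc                                          # col 3: ratio (1)/(2) -> SxI
        s = si.rolling(second_window, center=True).mean()    # col 4: second smoothing -> S
        irr = si / s                                         # col 5: ratio (3)/(4) -> I
        b, a = np.polyfit(t, y, 1)
        trend = pd.Series(a + b * t)                         # col 6: trend regression -> T
        c = tc / trend                                       # col 7: ratio (2)/(6) -> C
        return pd.DataFrame({"TxCxSxI": y, "TxC": tc, "SxI": si,
                             "S": s, "I": irr, "T": trend, "C": c})

    rng = np.random.default_rng(5)
    months = np.arange(84)
    series = 230 + 0.4 * months + 3 * np.sin(2 * np.pi * months / 12) + rng.normal(0, 1, 84)
    dm = decompose(series)
    # Twelve average seasonal adjustment factors (assuming the series begins in January)
    print(dm["S"].groupby(months % 12 + 1).mean())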
Table 6-1 contains a decomposition matrix resulting from applying these procedures to series Y1 (introduced in Chapter 1) as the object series (the computations were done by the author's proprietary software). Columns 4 and 7 exhibit clearly-defined seasonal and cyclical patterns, respectively, and column 5 contains few runs of numbers above or below unity. Therefore, it may be judged that the smoothings employed in this decomposition were relatively effective.
Table 6-1. Time series decomposition matrix for Series Y1.
TIME SERIES DECOMPOSITION ANALYSIS DECOMPOSITION OF Y1 SERIES, 12 & 5 ELEMENT UNWEIGHTED MOVING AVERAGE THE CALCULATED LINEAR TREND EQUATION IS: Y = 228.3455 + 0.4058 * X ORIGINAL 12 MONTH RATIO 5 MTH RATIO REGRESS RATIO SERIES MV.AV. (1)/(2) MV.AV. (3)/(4) EST (1) (2)/(6) X DATE TxCxSxI TxC SxI S I T C (1) (2) (3) (4) (5) (6) (7) 1 1 233.4000 2 2 236.6000 3 3 239.8000 4 4 242.7000 5 5 245.2000 6 6 252.2000 7 7 252.3000 246.7080 1.0227 8 8 252.2000 248.0420 1.0168 9 9 251.1000 247.4250 1.0149 1.0199 0.9951 231.9980 1.0665 10 10 251.1000 246.5670 1.0184 1.0211 0.9973 232.4030 1.0609 11 11 252.2000 245.6170 1.0268 1.0226 1.0041 232.8090 1.0550 12 12 251.7000 244.6580 1.0288 1.0087 1.0199 233.2150 1.0491 13 1 249.4000 243.5250 1.0241 0.9950 1.0293 233.6210 1.0424 14 2 229.2000 242.4750 0.9453 0.9820 0.9625 234.0260 1.0361 15 3 229.5000 241.5500 0.9501 0.9714 0.9781 234.4320 1.0304 16 4 231.3000 240.4580 0.9619 0.9665 0.9952 234.8380 1.0239 17 5 233.7000 239.6000 0.9754 0.9791 0.9962 235.2440 1.0185 18 6 238.6000 238.6000 1.0000 0.9929 1.0072 235.6500 1.0125 19 7 239.7000 237.8170 1.0079 1.0014 1.0065 236.0550 1.0075 20 8 241.1000 236.5420 1.0193 1.0096 1.0096 236.4610 1.0003 21 9 238.0000 236.9500 1.0044 1.0121 0.9924 236.8670 1.0004 22 10 240.8000 236.9670 1.0162 1.0148 1.0013 237.2730 0.9987 23 11 240.2000 237.1580 1.0128 1.0084 1.0044 237.6780 0.9978 24 12 242.3000 237.1920 1.0215 1.0051 1.0163 238.0840 0.9963 25 1 234.1000 237.1750 0.9870 0.9961 0.9909 238.4900 0.9945 26 2 234.1000 236.9500 0.9880 0.9911 0.9968 238.8960 0.9919 27 3 229.7000 236.5420 0.9711 0.9852 0.9857 239.3020 0.9885 28 4 233.6000 236.4670 0.9879 0.9901 0.9978 239.7070 0.9865 29 5 234.1000 236.0250 0.9918 0.9943 0.9976 240.1130 0.9830 30 6 238.4000 235.6750 1.0116 1.0007 1.0108 240.5190 0.9799 31 7 237.0000 234.8830 1.0090 1.0048 1.0042 240.9250 0.9749 32 8 236.2000 235.4330 1.0033 1.0055 0.9977 241.3300 0.9756 33 9 237.1000 235.2000 1.0081 1.0020 1.0060 241.7360 0.9730 34 10 235.5000 236.5000 0.9958 0.9953 1.0004 242.1420 0.9767 35 11 236.0000 237.4080 0.9941 0.9956 0.9985 242.5480 0.9787 36 12 232.8000 238.6580 0.9755 0.9863 0.9890 242.9540 0.9823 37 1 240.7000 239.6000 1.0046 0.9902 1.0145 243.3590 0.9846 38 2 231.3000 240.5250 0.9616 0.9930 0.9684 243.7650 0.9867 39 3 245.3000 241.5580 1.0155 1.0025 1.0130 244.1710 0.9893 40 4 244.5000 242.5580 1.0080 1.0058 1.0022 244.5770 0.9917 41 5 249.1000 243.5500 1.0228 1.0154 1.0073 244.9820 0.9942 42 6 249.7000 244.5670 1.0210 1.0143 1.0066 245.3880 0.9967 43 7 248.1000 245.7170 1.0097 1.0142 0.9956 245.7940 0.9997 44 8 248.6000 246.1250 1.0101 1.0096 1.0005 246.2000 0.9997 45 9 249.1000 247.2580 1.0074 1.0054 1.0020 246.6060 1.0026 46 10 247.4000 247.5080 0.9996 1.0019 0.9976 247.0010 1.0020 47 11 248.2000 248.1250 1.0003 0.9969 1.0035 247.4170 1.0029 48 12 246.6000 248.5250 0.9923 0.9913 1.0010 247.8230 1.0028 49 1 245.6000 249.4080 0.9847 0.9896 0.9951 248.2290 1.0048 50 2 244.9000 249.9920 0.9796 0.9901 0.9894 248.6340 1.0055 51 3 248.3000 250.5920 0.9909 0.9933 0.9975 249.0400 1.0062 52 4 251.9000 251.1500 1.0030 1.0026 1.0004 249.4460 1.0068 53 5 253.9000 251.8250 1.0082 1.0083 0.9999 249.8520 1.0079 54 6 260.3000 252.4000 1.0313 1.0119 1.0192 250.2580 1.0086 55 7 255.1000 253.0420 1.0081 1.0125 0.9957 250.6630 1.0095 56 8 255.8000 253.5830 1.0087 1.0114 0.9974 251.0690 1.0100 57 9 255.8000 254.2670 1.0060 1.0052 1.0008 251.4750 1.0111 58 10 255.5000 254.7670 1.0029 1.0028 1.0001 251.8810 1.0115 59 11 255.1000 255.0500 1.0002 0.9987 1.0015 252.2860 1.0110 60 12 
254.3000 255.2580 0.9962 0.9957 1.0006 252.6920 1.0102 61 1 252.1000 255.1830 0.9879 0.9940 0.9939 253.0980 1.0082 62 2 253.1000 255.3830 0.9911 0.9935 0.9976 253.5040 1.0074 63 3 254.3000 255.7420 0.9944 0.9946 0.9998 253.9100 1.0072 64 4 255.3000 255.8670 0.9978 0.9997 0.9980 254.3150 1.0061 65 5 256.4000 255.9750 1.0017 1.0026 0.9990 254.7210 1.0049 66 6 259.4000 255.8500 1.0139 1.0068 1.0070 255.1270 1.0028 67 7 257.5000 256.0920 1.0055 1.0082 0.9973 255.5330 1.0022 68 8 260.1000 256.1920 1.0153 1.0083 1.0069 255.9380 1.0010 69 9 257.3000 256.1000 1.0047 1.0033 1.0014 256.3440 0.9990 70 10 256.8000 256.2580 1.0021 1.0027 0.9995 256.7500 0.9981 71 11 253.6000 256.4000 0.9891 0.9966 0.9925 257.1560 0.9971 72 12 257.2000 256.6420 1.0022 0.9915 1.0108 257.5620 0.9964 73 1 253.3000 257.1830 0.9849 0.9900 0.9948 257.9670 0.9970 74 2 252.0000 257.3830 0.9791 0.9913 0.9877 258.3730 0.9962 75 3 256.2000 257.5250 0.9949 0.9913 1.0036 258.7790 0.9952 76 4 257.4000 258.1330 0.9956 0.9992 0.9964 259.1850 0.9959 77 5 259.3000 258.7670 1.0021 1.0033 0.9988 259.5900 0.9968 78 6 265.9000 259.5250 1.0246 1.0051 1.0193 259.9960 0.9982 79 7 259.9000 260.0830 0.9993 1.0083 0.9910 260.4020 0.9988 80 8 261.8000 260.7000 1.0042 1.0096 0.9946 260.8080 0.9996 81 9 264.6000 261.5700 1.0116 1.0046 1.0069 261.2140 1.0014 82 10 264.4000 262.1670 1.0085 1.0052 1.0033 261.6190 1.0021 83 11 262.7000 262.8130 0.9996 1.0055 0.9941 262.0250 1.0030 84 12 263.9000 263.3140 1.0022 1.0034 0.9988 262.4310 1.0034 Table 6-1, continued. AVERAGE SEASONAL ADJUSTMENT FACTORS: MONTH SEASONAL 1 0.9921 2 0.9898 3 0.9894 4 0.9937 5 1.0002 6 1.0050 7 1.0079 8 1.0087 9 1.0072 10 1.0059 11 1.0031 12 0.9971 FOR ORIGINAL SERIES, MEAN SQUARED ERROR IS: 5.8641 STANDARD ERROR OF THE ESTIMATE: 2.4216
Forecasting by Recomposition
Given the data in columns 4 through 7 of the decomposition matrix, the seasonal adjustment factors computed from column 4, and the trend (or multiple) regression equation which generated column 6, a forecast may be constructed. The primary technique of decomposition was division; the primary technique of forecasting then is the opposite, recomposition by multiplication (a programming sketch of the arithmetic appears after the list):
1. A trend estimate (T) of the value of the object series in the target month is made by entering the target month row number into the trend regression equation as described in Chapter 5; this is a "first-approximation" forecast.
2. The trend estimate is multiplied by a cyclical adjustment factor which is either entered by the analyst or selected from the values in column 7. In the latter case, the analyst exercises judgment to pick cyclical adjustment factors which are like what the analyst expects to be the cyclical characteristics of the target month. The resulting product is a trend-cycle (TxC) forecast.
3. The TxC forecast is multiplied by the correct seasonal adjustment factor, given the identity of the target month. The result (TxCxS) is a trend-cycle forecast to which a seasonal multiplier has been applied.
4. Finally, the TxCxS forecast may be multiplied by an irregular adjustment factor if the user has reason to anticipate any unusual condition or event which will affect the situation. The I multiplier could be selected from a row of column 5 which the analyst thinks likely to be similar in properties to the target period. The final product is a TxCxSxI forecast.
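The recomposition arithmetic is simple enough to show in a few lines. The function name is an assumption; the coefficient and factor values are taken from Tables 6-1 and 6-2, so the result approximately reproduces the month-87 forecast (small differences arise because the printed trend coefficients are rounded).

    def recompose_forecast(a, b, target_row, seasonal, cyclical=1.0, irregular=1.0):
        # Step 1: trend estimate T from the trend regression equation; steps 2-4:
        # multiply by cyclical, seasonal, and (optional) irregular adjustment factors.
        trend = a + b * target_row
        return trend * cyclical * seasonal * irregular

    print(recompose_forecast(a=228.3455, b=0.4058, target_row=87,
                             seasonal=0.9894, cyclical=0.9846, irregular=1.0100))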
Table 6-2 shows recomposition forecasts for months 87 through 92 for the original object series Y1 which ended at month 84.
Table 6-2. Recomposition Forecasts for Series Y1.
 X   MONTH   TREND FORECAST (T)   SEASON (S)   CYCLIC (C)   IRREG (I)   REVISED FORECAST (STCI)
87     3         263.6482           0.9894       0.9846       1.0100         259.3816
88     4         264.0540           0.9937       0.9867       1.0120         261.9989
89     5         264.4598           1.0002       0.9893       1.0030         262.4561
90     6         264.8656           1.0050       0.9917       0.9910         261.6063
91     7         265.2713           1.0079       0.9942       0.9860         262.0808
92     8         265.6771           1.0087       0.9967       0.9850         263.0751
Assessment
Time series recomposition is a complex albeit still naive approach to time series forecasting. It may appear to be a purely technical approach to forecasting a value at a date beyond the end of the original series; however, to compose a successful forecast the analyst must exercise a great deal of judgment in specifying weights and choosing adjustment factors.
A cautionary word is in order. One who would use time series recomposition procedures as a means for forecasting a time series must of necessity become a student of the history of the period covered by the original time series. It is only with an intimate historical knowledge of the period that the analyst can hope to make appropriate adjustments to the trend estimate of the target date value so as to compose an accurate forecast.
What's Next
The next two chapters bring together regression and moving average techniques into an integrated approach to forecasting time series.<>
Autocorrelation
In any time series containing non-random patterns of behavior, it is likely that any particular item in the series is related in some fashion to other items in the same series. If there is a consistent relationship between entries in the series, e.g., the 5th item is like the 1st, the 6th is like the 2nd, and so on, then it should be possible to use information about the relationship to forecast future values of the series, i.e., the 33rd item should be like the 29th. In this case we may say that the series has some ability to forecast itself because of autocorrelation (or self-correlation) among values within the series.
Autocorrelation Coefficients
One means for identifying and assessing the strength of autocorrelation in a time series is to compute the coefficients of autocorrelation between pairs of entries within the series. If the analyst is interested in the autocorrelation between adjacent entries, the autocorrelation should be specified to order k=1. For the correlation between every other entry in the series, the autocorrelation should be specified to order k=2. Autocorrelation order k=3 would be for the correlation between each entry and the third from it in the series, and so on.
The conventional formula for computing such autocorrelations is
r(k) = SUM[(y(t) - ybar)*(y(t+k) - ybar), t = 1 to n-k] / SUM[(y(t) - ybar)^2, t = 1 to n],
where ybar is the mean of the n observations in the series.
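A direct translation of that formula into Python follows; the function name and the short illustrative series are assumptions for demonstration.

    import numpy as np

    def autocorr(y, k):
        # Order-k sample autocorrelation: covariance of entries k apart over the series variance.
        y = np.asarray(y, dtype=float)
        dev = y - y.mean()
        return np.sum(dev[:-k] * dev[k:]) / np.sum(dev**2)

    series = [15.7, 14.3, 13.8, 13.6, 12.4, 13.9, 15.4, 13.8, 14.1, 13.5, 14.1, 15.0]
    print([round(autocorr(series, k), 4) for k in (1, 2, 3)])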
The Correlogram
In a statistical package designed for time series analysis, the user can specify some high order of autocorrelation, say k=20, for which autocorrelation coefficients are desired. Once the autocorrelation coefficients are computed, they may be plotted against the order k to constitute a simple autocorrelation correlogram such as that illustrated in Figure 7-1 for Series Y1. Before considering an interpretation of the information contained in the simple correlogram, let us develop the idea for a second type of correlogram.
Figure 7-1. Simple autocorrelation correlogram for Series Y1 (bar rendering omitted; coefficient values shown).
 K   SIMPLE AUTOCORR      K   SIMPLE AUTOCORR
 1      .8858            11      .5186
 2      .8121            12      .5276
 3      .7008            13      .5171
 4      .6207            14      .4436
 5      .5555            15      .3853
 6      .5249            16      .3105
 7      .4768            17      .2687
 8      .4508            18      .2243
 9      .4636            19      .1862
10      .4794            20      .1592
A partial autocorrelation coefficient for order k measures the strength of correlation among pairs of entries in the time series while accounting for (i.e., removing the effects of) all autocorrelations below order k. For example, the partial autocorrelation coefficient for order k=5 is computed in such a manner that the effects of the k=1, 2, 3, and 4 partial autocorrelations have been excluded. The partial autocorrelation coefficient of any particular order is the same as the autoregression coefficient (described in Chapter 5) of the same order. Figure 7-2 illustrates a partial autocorrelation correlogram for Series Y1.
Figure 7-2. Partial autocorrelation correlogram for Series Y1 (bar rendering omitted; coefficient values shown).
 K   PARTIAL AUTOCORR     K   PARTIAL AUTOCORR
 1      .8858            11      .0925
 2      .1278            12     -.0658
 3     -.1878            13     -.0579
 4      .0352            14     -.2402
 5      .0811            15     -.0432
 6      .1222            16     -.0119
 7     -.0826            17      .0542
 8      .0297            18     -.0238
 9      .2613            19     -.0684
10      .0875            20      .0480
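Simple and partial correlograms of this kind can be produced with standard library routines; the sketch below uses the statsmodels acf and pacf functions and a rough 1.96/sqrt(n) significance bound. The synthetic trending series is an assumption made only to keep the example self-contained.

    import numpy as np
    from statsmodels.tsa.stattools import acf, pacf

    rng = np.random.default_rng(1)
    y = 230 + np.cumsum(rng.normal(0.2, 1.0, 84))     # illustrative trending monthly series
    simple = acf(y, nlags=20)
    partial = pacf(y, nlags=20)
    bound = 1.96 / np.sqrt(len(y))                    # rough 95-percent significance bound
    for k in range(1, 21):
        flag = "*" if abs(partial[k]) > bound else " "
        print(f"k={k:2d}  simple {simple[k]:+.4f}   partial {partial[k]:+.4f} {flag}")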
Selection Criteria
Several criteria may be specified for choosing a model format, given the simple and partial autocorrelation correlograms for a series:
(a) If none of the simple autocorrelations is significantly different from zero, the series is essentially a random number or white-noise series which is not amenable to autoregressive modeling.
(b) If the simple autocorrelations decrease linearly, passing through zero to become negative, or if the simple autocorrelations exhibit a wave-like cyclical pattern, passing through zero several times, the series is not stationary; it must be differenced one or more times before it may be modeled with an autoregressive process.
(c) If the simple autocorrelations exhibit seasonality, i.e., there are autocorrelation peaks every dozen or so (in monthly data) lags, the series is not stationary; it must be differenced with a gap approximately equal to the seasonal interval before further modeling.
(d) If the simple autocorrelations decrease exponentially but approach zero gradually, while the partial autocorrelations are significantly non-zero through some small number of lags beyond which they are not significantly different from zero, the series should be modeled with an autoregressive process.
(e) If the partial autocorrelations decrease exponentially but approach zero gradually, while the simple autocorrelations are significantly non-zero through some small number of lags beyond which they are not significantly different from zero, the series should be modeled with a moving average process.
(f) If the partial and simple autocorrelations both converge upon zero for successively longer lags, but neither actually reaches zero after any particular lag, the series may be modeled by a combination autoregressive and moving average process.
The simple autocorrelation correlogram for series Y1 exhibits a characteristic decline and approach toward zero, thus suggesting that Series Y1 can be modeled by an AR process of some order. The particular order can be inferred by counting the number of significantly non-zero partial autocorrelations, in this case 1. Therefore, Series Y1 should be modeled by an AR(1) equation.
What's Next
Techniques for combining autoregressive and moving average approaches are elaborated in Chapter 8 and its appendix.<>
Integrating Autoregressive with Moving Average Approaches
In Chapter 4 we introduced moving average models and in Chapter 5 we showed how multiple regression analysis could be extended to the autoregressive context. These two apparently separate modeling approaches are encompassed by a class of models known more generally as ARMA or ARIMA models. The process of ARIMA modeling serves to integrate the two approaches to modeling.
Theoretically, any time series that contains no trend or from which trend has been removed can be represented as consisting of two parts, a self-deterministic part and a disturbance component. The self-deterministic part of the series should be forecastable from its own past by an autoregressive (AR) model with some number of terms, p, of the form y(t) = a + b(1)*y(t-1) + b(2)*y(t-2) + ... + b(p)*y(t-p).
An autoregressive model of order p is conventionally classified as AR(p). A moving average model with q terms is classified as MA(q). A combination model containing p autoregressive terms and q moving average terms is classified as ARMA(p,q). If the object series is differenced d times to achieve stationarity, the model is classified as ARIMA(p,d,q), where the symbol "I" signifies "integrated." An ARIMA(p,0,q) is the same as an ARMA(p,q) model; likewise, an ARIMA(p,0,0) is the same as an AR(p) model, and an ARIMA(0,0,q) is the same as an MA(q) model.
Various approaches have been developed for ARIMA modeling. The procedure which has become the standard for estimating ARIMA models was proposed by G. E. P. Box and G. M. Jenkins (Time Series Analysis: Forecasting and Control, San Francisco: Holden-Day, 1970). The procedure involves making successive approximations through three stages: identification, estimation, and diagnostic checking.
Stages in the Analysis
In the identification stage, the analyst's job is to ensure that the series is sufficiently stationary (free of trend and seasonality), and to specify the appropriate number of autoregressive terms, p, and moving average terms, q. Statisticians have developed the concept of the autocorrelation correlogram (introduced in Chapter 7) to serve as the basis for judgment of the stationarity of the series and to provide criteria for specification of p and q.
In the estimation stage, the analyst's job is to estimate the parameters (coefficient values) of the specified numbers, p and q, of autoregressive and moving average terms. This is usually accomplished by implementing some form of regression analysis.
Once the parameter values of the specified model have been estimated, the third stage of diagnostic checking is undertaken. The objective of diagnostic checking is to ascertain whether the model "fits" the historical data well enough. To accomplish diagnostic checking, the model is used to forecast all of the extant values in the series. The model is judged to fit the series well if the differences between the actual series values and the forecasted values are small enough and sufficiently random. If the differences (also known as "residuals") are judged not to be sufficiently small or random, they may contain additional information which, if captured by further analysis, can enhance the forecastability of the model. To capture the additional information, the analyst must "return to the drawing board" to respecify the model and reestimate the parameters.
If an adequate model cannot be specified solely in terms of autoregressive and moving average terms, the analyst may resort to inclusion of other variables in the model as described in the discussion of multiple regression models in Chapter 5. Indeed, in the eyes of Rational Expectations theorists, the analyst would be remiss if other variables were not included to encompass all available information.
Criteria for Identification
The autocorrelation correlograms introduced in Chapter 7 may serve as criteria to judge:
a. whether a series is sufficiently stationary;
b. whether the appropriate model is an AR, an MA, or some combination; and
c. what order, p or q, of AR or MA model will likely best fit the data.
The analyst should proceed by constructing both simple and partial autocorrelation correlograms for the object series to some generous order, k, perhaps between 12 and 20. If upon inspection of the simple autocorrelation correlogram the analyst notes that all of the simple autocorrelations lie within the selected confidence interval, the analyst should conclude that the series is essentially a random number series which is not amenable to either AR or MA modeling. The analyst may then resort to multiple regression modeling as espoused by Rational Expectations theorists in expectation of identifying other data series which contain information useful in predicting the behavior of the object series.
Assuming that all of the simple autocorrelations do not fall within the selected confidence interval, there may still be a possibility of establishing an effective ARIMA model. If the simple autocorrelations start from quite high levels and descend linearly, eventually passing through zero to become negative, the implication is that the series is non-stationary, i.e., that it contains a significant trend component. This possibility may be confirmed by generating a sequence plot as described in Chapter 1, or by conducting a time series decomposition as described in Chapter 6.
If the analyst wishes to continue in applying ARIMA techniques to the object series, it must first be converted to a stationary series. Stationarity is usually accomplished by differencing the series, i.e., by performing a difference transformation upon the original series. The differenced series is a new series consisting of the increments between consecutive items in the original series. The differenced series is less likely to exhibit trend than is the original series. Correlograms should then be constructed for the differenced series. If the simple autocorrelations for the differenced series still exhibit nonstationarity (rare in economic data), it may be subjected to a second differencing. The analyst should continue with the differencing process until satisfied that the resulting series is sufficiently free of trend (or stationary) before proceeding to the specification of an ARIMA model.
In a series containing seasonal behavior, the simple or partial autocorrelations exhibit spurts of values at the seasonal intervals (e.g., every fourth autocorrelation in quarterly data, or every twelfth autocorrelation in monthly data), even if the autocorrelations otherwise seem to converge upon zero. Before attempting to specify an ARMA model of a highly-seasonal series, the analyst should first difference the series with a gap equal to what appears to be the seasonal interval, then construct correlograms for the differenced series to confirm stationarity. After sufficient stationarity has been attained, the analyst may inspect the simple and partial correlograms to draw an inference about the appropriate form of the model. Fortunately, the simple and partial correlograms for a series which is best modeled by an AR process exhibit patterns which are practically the opposite of those for a series which is best modeled by moving averages.
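A short sketch of the differencing transformations discussed above follows; the synthetic trending series is an assumption made only so that the example runs on its own, and pandas and statsmodels are used merely as convenient routines.

    import numpy as np
    import pandas as pd
    from statsmodels.tsa.stattools import acf

    rng = np.random.default_rng(2)
    y = pd.Series(100 + 0.5 * np.arange(120) + rng.normal(0, 1, 120))   # trending monthly series
    d1 = y.diff().dropna()          # first difference: increments between consecutive items
    d12 = y.diff(12).dropna()       # gap-12 difference for monthly seasonality
    print(acf(d1, nlags=12).round(3))   # recheck the correlogram on the differenced series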
Figure 8-1 shows the theoretical autocorrelation patterns for a series best modeled by autoregression. It should be noted that the simple autocorrelations gradually approach zero, but the partial autocorrelations are significantly non-zero to some point (in this case through k=2) beyond which they are not significantly different from zero. With experience the analyst will recognize that the series exhibiting correlogram patterns similar to those illustrated in Figure 8-1 should be modeled by an AR equation of degree indicated by the number of significantly non-zero partial autocorrelations. The series for which autocorrelations are illustrated in Figure 8-1 should then be modeled by an AR(2) equation.
Figure 8-1. Theoretical simple and partial autocorrelations for a series best modeled by an AR process.
[Correlogram plots for Figure 8-1: the simple autocorrelations decline gradually toward zero as k increases; the partial autocorrelations are significantly non-zero through k=2 and are near zero thereafter.]
By way of contrast, a series which exhibits the opposite patterns for the simple and partial correlograms, as illustrated in Figure 8-2, should be modeled by a moving average process. In Figure 8-2, it should be apparent that the partial autocorrelations gradually decrease and approach zero, but the simple autocorrelations drop after one significantly non-zero value beyond which higher-ordered simple autocorrelations are not significantly different from zero. This series then is best modeled in an MA(1) specification, because there is one significantly non-zero simple autocorrelation.
Figure 8-2. Theoretical simple and partial autocorrelations for a series best modeled by an MA process.
[Correlogram plots for Figure 8-2: the simple autocorrelations drop to near zero after one significantly non-zero value at k=1; the partial autocorrelations decline gradually toward zero as k increases.]
We shall not illustrate a third case which can easily be described. If the simple and partial autocorrelations both gradually approach zero, but neither drops off to be insignificantly different from zero beyond some definite order, then a combination ARMA (or ARIMA if the series has been differenced) model should be specified.
The correlograms illustrated in Figures 8-1 and 8-2 are for theoretical autocorrelation distributions. Correlograms constructed for any real historical series should not be expected to follow precisely the theoretical patterns illustrated for the AR and MA models, but rather to exhibit some scatter about the theoretical paths. The confidence interval is therefore very important to the analyst in attempting to recognize when either the simple or the partial autocorrelations drop to values which are not significantly different from zero. The analyst may have to inspect correlograms for a great many real series in order to become comfortable in his ability to recognize patterns which are best modeled by AR, MA, or some combination specification. Although we have illustrated the 95-percent confidence interval in Figures 8-1 and 8-2, the analyst may feel more comfortable with some other confidence interval.
The theoretical autocorrelation patterns which we have been illustrating have all exhibited "positive autocorrelation" in the sense that observations have followed a pattern of "runs" of observations above and below the mean for the series. Each of the so-called "runs" consists of a number (greater than one) of observations above or below the mean, followed by some number of observations below or above the mean. We have not illustrated correlograms for series characterized by "negative autocorrelation" where successive observations regularly alternate above and below the mean, with no runs longer than single observations. Series characterized by negative autocorrelation are not as common in economic and business contexts as are positively autocorrelated series. Suffice it to note that the correlograms for a negatively autocorrelated series differ from those illustrated for positively autocorrelated series in that successive simple and partial autocorrelations alternate between positive and negative signs. Otherwise, they can be expected to converge upon zero in a similar manner to that in which stationary, positively autocorrelated series do.
The correlograms illustrated in Chapter 7 for Series Y1 suggest that Series Y1 should be modeled with an AR process of order 1. In order to illustrate the Box-Jenkins approach to ARIMA modeling, we shall introduce a new series, Y2, for which data are specified in Table 8-1.
Table 8-1. Monthly data for series Y2
MONTH   YEAR 1   YEAR 2   YEAR 3   YEAR 4   YEAR 5   YEAR 6
  1      15.7     14.5     13.1     11.9     14.9     12.8
  2      14.3     13.7     14.6     11.8     15.6     18.6
  3      13.8     14.1     13.1     12.6     15.8     15.9
  4      13.6     14.8     13.6     13.3     14.9     14.1
  5      12.4     14.1     11.6     13.1     16.6     14.8
  6      13.9     14.2     11.1     14.1     15.9     15.6
  7      15.4     14.1     11.7     12.9     15.6     15.4
  8      13.8     14.1     11.1     13.5     15.4
  9      14.1     14.1     11.9     15.0     15.6
 10      13.5     13.4     11.6     14.6     14.9
 11      14.1     14.2     11.3     13.9     15.9
 12      15.0     13.1     12.8     14.0     14.2
The Box-Jenkins Procedure for Estimating an ARIMA Model
1. The first step in the Box-Jenkins procedure is to generate a sequence plot and sufficiently high-order correlograms for the object series. Figure 8-3 shows the correlograms through k=12 for series Y2.
Figure 8-3. Simple and partial autocorrelation correlograms for Series Y2.
 K   SIMPLE AUTOCORR   PARTIAL AUTOCORR
 1       .6352             .6352
 2       .5243             .2026
 3       .6153             .3864
 4       .5530             .0831
 5       .5413             .1690
 6       .4279            -.1718
 7       .3330             .1331
 8       .3313            -.0908
 9       .3029             .0058
10       .1803            -.1518
11       .1232            -.0347
12       .0731            -.1015
(Bar renderings of the correlograms are omitted; the coefficient values are shown.)
2. If the sequence plot of the object series exhibits noticeable trend, seasonal,
or cyclical behavior, then differencing of the series is in order. Inspection of
the plots illustrated in Figure 8-3 suggests that series Y2 is adequately stationary,
so it was judged that no differencing of series Y2 is necessary. However, if a
differenced series still exhibits noticeable trend or cyclical behavior (rare in
economic and business time series), it should be differenced again to generate a
"second difference" series, and step 1 should be repeated. All further analysis
should be conducted on the series generated in the last differencing, but a forecast
made with a model specified for a differenced series will be for the change from the
last period of the original series, not for the level of the series.
3. The patterns of the correlograms generated in step 1 should constitute the basis for a judgment as to whether an AR, MA, ARMA, or ARIMA format will be most suitable for modeling the object series. It appears from the correlograms in Figure 8-3 that an ARMA specification of autoregressive order p=9 and moving average order q=3 will be appropriate as a first approximation.
4. Modern statistical packages such as SPSS and EViews contain automated routines for estimating ARIMA models. Figure 8-4 illustrates the EViews display for an ARIMA(9,0,3) regression model. The probability column in Figure 8-4 indicates that the coefficients of terms AR(1) and AR(2) are statistically significant at or very near the .05 level, but that the coefficient of term AR(3) is not. Nor are the coefficients of any autoregressive terms beyond AR(5). All three of the moving average terms are significant at the .05 level or below.
Figure 8-4. EViews estimate of ARIMA(9,0,3) model of Y2.
Dependent Variable: Y2
Method: Least Squares
Sample (adjusted): 1984:10 1989:07
Included observations: 58 after adjusting endpoints
Convergence achieved after 67 iterations
Variable     Coef.        Std. Err.    t-Stat.      Prob.
C            14.91734     1.579435     9.444736     0.0000
AR(1)        -0.435004    0.217974     -1.995670    0.0520
AR(2)        -0.501838    0.140726     -3.566066    0.0009
AR(3)        0.303713     0.233046     1.303234     0.1991
AR(4)        0.521362     0.145279     3.588703     0.0008
AR(5)        0.620274     0.122892     5.047310     0.0000
AR(6)        0.357173     0.224648     1.589921     0.1189
AR(7)        0.025419     0.207039     0.122776     0.9028
AR(8)        -0.334811    0.187344     -1.787149    0.0807
AR(9)        0.132355     0.143150     0.924589     0.3601
MA(1)        0.796780     0.264760     3.009443     0.0043
MA(2)        1.424237     0.159093     8.952225     0.0000
MA(3)        0.750934     0.316381     2.373514     0.0219
R-squared            0.791949     Mean dependent var      13.99483
Adjusted R-squared   0.736469     S.D. dependent var       1.501511
S.E. of regression   0.770805     Akaike info criterion    2.511732
Sum squared resid   26.73628      Schwarz criterion        2.973555
Log likelihood     -59.84021      F-statistic             14.27444
Durbin-Watson stat   1.893651     Prob(F-statistic)        0.000000
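For readers working outside EViews, an analogous first-approximation fit can be made with the statsmodels library, as in the sketch below. The estimates will not match Figure 8-4 exactly because the estimation algorithms differ; the data are the 67 observations of Table 8-1 arranged chronologically, and everything else in the sketch is an assumption for illustration.

    import numpy as np
    from statsmodels.tsa.arima.model import ARIMA

    y2 = np.array([
        15.7, 14.3, 13.8, 13.6, 12.4, 13.9, 15.4, 13.8, 14.1, 13.5, 14.1, 15.0,   # year 1
        14.5, 13.7, 14.1, 14.8, 14.1, 14.2, 14.1, 14.1, 14.1, 13.4, 14.2, 13.1,   # year 2
        13.1, 14.6, 13.1, 13.6, 11.6, 11.1, 11.7, 11.1, 11.9, 11.6, 11.3, 12.8,   # year 3
        11.9, 11.8, 12.6, 13.3, 13.1, 14.1, 12.9, 13.5, 15.0, 14.6, 13.9, 14.0,   # year 4
        14.9, 15.6, 15.8, 14.9, 16.6, 15.9, 15.6, 15.4, 15.6, 14.9, 15.9, 14.2,   # year 5
        12.8, 18.6, 15.9, 14.1, 14.8, 15.6, 15.4])                                 # year 6

    fit = ARIMA(y2, order=(9, 0, 3)).fit()      # first-approximation ARMA(9,3) specification
    print(fit.summary())
    print(fit.resid[:5])                        # residuals are available for diagnostic checking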
5. The model is then respecified to retain only independent variables AR(1), AR(2),
AR(3), AR(5), AR(6), and all of the moving average terms through q=3. This model is
illustrated in Figure 8-5. The probability column in Figure 8-5 reveals that the AR(6),
illustrated in Figure 8-5. The probability column in Figure 8-5 reveals that the AR(6),
MA(1), and MA(3) terms now are not statistically significant at the .05 level.
Figure 8-5. EViews estimation of ARIMA model of Y2 with selected (statistically significant) AR and MA terms.
Dependent Variable: Y2
Method: Least Squares
Sample (adjusted): 1984:07 1989:07
Included observations: 61 after adjusting endpoints
Convergence achieved after 14 iterations
Variable     Coef.        Std. Err.    t-Stat.      Prob.
C            14.30390     1.041311     13.73644     0.0000
AR(1)        0.641780     0.292500     2.194119     0.0327
AR(2)        -0.598591    0.273481     -2.188785    0.0331
AR(3)        0.526515     0.194278     2.710116     0.0091
AR(5)        0.382142     0.183147     2.086528     0.0419
AR(6)        -0.122431    0.146278     -0.836980    0.4064
MA(1)        -0.311070    0.308225     -1.009230    0.3175
MA(2)        0.548049     0.283719     1.931663     0.0589
MA(3)        0.106984     0.234677     0.455877     0.6504
R-squared            0.619447     Mean dependent var      14.01639
Adjusted R-squared   0.560900     S.D. dependent var       1.474808
S.E. of regression   0.977276     Akaike info criterion    2.927358
Sum squared resid   49.66360      Schwarz criterion        3.238798
Log likelihood     -80.28440      F-statistic             10.58038
Durbin-Watson stat   1.992037     Prob(F-statistic)        0.000000
The analyst should continue through subsequent rounds of respecification to remove
non-contributing terms (i.e., those which are statistically insignificant) until all of
the remaining terms in the model are statistically significant at the selected criterion
level (usually .05). This final model can then be used to forecast point estimates of the
original series (if no differencing was needed) or the change in the original series
(if differencing was done), with an appropriate confidence interval.
The model again is respecified to retain only independent variables AR(1), AR(2), AR(3), AR(5), and MA(2) as illustrated in Figure 8-6. In this model, all of the independent variable terms are statistically significant. The adjusted R-square increased from 0.5609 in the Figure 8-5 model to 0.5907 in the Figure 8-6 model. This final version is a "parsimonious" model because all terms which do not contribute significantly to the explanation of the dependent variable have been deleted, and the model retains only the minimal number of terms which are statistically significant explainers of the behavior of the dependent variable.
Figure 8-6. EViews estimation of Parsimonious ARIMA model of Y2 retaining only statistically significant terms.
Dependent Variable: Y2
Method: Least Squares
Sample (adjusted): 1984:06 1989:07
Included observations: 62 after adjusting endpoints
Convergence achieved after 12 iterations
Variable     Coef.        Std. Err.    t-Stat.      Prob.
C            14.51484     1.229312     11.80729     0.0000
AR(1)        0.359086     0.107763     3.332168     0.0015
AR(2)        -0.869511    0.099188     -8.766287    0.0000
AR(3)        0.693095     0.104875     6.608793     0.0000
AR(5)        0.613063     0.114746     5.342776     0.0000
MA(2)        0.916126     0.070391     13.01490     0.0000
R-squared            0.624281     Mean dependent var      14.01452
Adjusted R-squared   0.590734     S.D. dependent var       1.462745
S.E. of regression   0.935775     Akaike info criterion    2.796881
Sum squared resid   49.03774      Schwarz criterion        3.002733
Log likelihood     -80.70332      F-statistic             18.60948
Durbin-Watson stat   2.003398     Prob(F-statistic)        0.000000
Diagnostic Checking
Once the ARIMA model has been specified and the parameters estimated, the model should be checked for adequacy. One way to do this is to use the model to forecast all of the known values of the data series, compute the differences (i.e., the residuals) between the known and forecasted values, and generate the simple autocorrelation correlogram for the residuals. If none of the residual autocorrelations is significantly different from zero, the model may be judged adequate. Figure 8-7 shows the EViews correlogram for the model illustrated in Figure 8-6. Since none of the simple autocorrelations is significantly non-zero, it is unlikely that there is other information contained in the residuals which, if captured by further analysis, might enhance the ability of the model to forecast.
Figure 8-7. EViews Correlogram for Parsimonious ARIMA model of Y2.
Sample: 1984:06 1989:07   Included observations: 62
 LAG      AC        PAC
  1     -0.008    -0.008
  2      0.088     0.087
  3      0.054     0.056
  4      0.126     0.120
  5      0.073     0.069
  6      0.045     0.026
  7      0.079     0.058
  8     -0.054    -0.081
  9      0.142     0.113
 10     -0.021    -0.028
 11      0.022    -0.012
 12     -0.048    -0.053
(The EViews bar renderings of the autocorrelations and partial correlations are omitted; the coefficient values are shown.)
Another approach to diagnostic checking is to estimate a model with higher-ordered
autoregressive and moving average terms, then observe (i.e., draw an inference from the
t-statistic) whether the regression coefficients of the additional terms are statistically
significant.
Yet another approach to diagnostic checking is to employ the Chi-square statistic as a diagnostic criterion. The analyst may compute a test statistic employing the equation
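One common chi-square diagnostic of this kind is the Ljung-Box Q statistic on the first m residual autocorrelations; whether this is the exact statistic the author has in mind is an assumption, and the function name and synthetic residuals below are likewise illustrative.

    import numpy as np
    from statsmodels.tsa.stattools import acf

    def ljung_box_q(resid, m):
        # Ljung-Box Q on the first m residual autocorrelations; compare with a
        # chi-square distribution (degrees of freedom reduced by the number of
        # estimated ARMA parameters).
        resid = np.asarray(resid, dtype=float)
        n = len(resid)
        r = acf(resid, nlags=m)[1:]
        return n * (n + 2) * np.sum(r**2 / (n - np.arange(1, m + 1)))

    rng = np.random.default_rng(3)
    print(ljung_box_q(rng.normal(size=62), m=12))   # unexceptional value expected for white noise
    # statsmodels also provides this test directly: statsmodels.stats.diagnostic.acorr_ljungbox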
Appendix A8 describes an alternate approach to ARIMA modeling which can be implemented with ordinary least squares regression.
An Alternate Procedure for Estimating an ARIMA Model
A simple ARIMA model estimation technique was first proposed by J. Durbin in 1960 ("Estimation of Parameters in Time-Series Regression Models," Journal of the Royal Statistical Society, Series B, Volume 22, pp. 139-153, 1960). The Durbin procedure is one which can be implemented using ordinary least squares regression and which often yields a satisfactory specification of the ARIMA model.
After the introduction of the Box-Jenkins procedure in 1970, the Durbin approach has typically been used to make a first-approximation "guess" of the parameters of the autoregressive and moving average terms to be employed subsequently in the Box-Jenkins procedure. While the Durbin approach is simpler, though rougher, than the Box-Jenkins approach, it often yields a satisfactory specification of an ARIMA model and is thus recommended to forecasting non-professionals. The analyst should go through the same first three steps as in the Box-Jenkins procedure described earlier in this chapter.
4. Assuming that the appropriate model format is an ARMA or ARIMA, the analyst should enter data into an ordinary least squares (OLS) regression routine and select as dependent variable the series generated in the last differencing (or the object series if no differencing was needed). Then an autoregressive model with a "generous number" of terms, perhaps k=6 or higher order, should be specified. The analyst should also have the residuals written to the next available column of the data matrix so that they can be used in the second stage of the Durbin estimation procedure. The analyst should note that the first k rows of the residuals column are empty (have zero values) because of the lagging done in the autoregression. Figure A8-1 illustrates the display of an OLS autoregressive k=6 model for Series Y2, i.e., an ARIMA(6,0,0) model.
Figure A8-1. Autoregressive (k=6) model for Series Y2.
AUTOREGRESSION MODEL: Y(t) = a + b(1)*Y(t-1) + b(2)*Y(t-2) + ... + b(k)*Y(t-k)
DEPENDENT VARIABLE (Y) IS MATRIX COLUMN: 2 Y2
ORDER (K) OF EQUATION: 6
COEF OF MULTIPLE CORRELATION (R): .7656   CORRECTED R: .7406
COEF OF MULTIPLE DETERMINATION (RSQ): .5861   CORRECTED RSQ: .5485
STANDARD ERROR OF THE ESTIMATE: 1.0002   MSE: 1.0003
ANALYSIS OF VARIANCE:        SUMS OF SQUARES   DEGREES OF FREEDOM
  TOTAL:                     130.5036          60
  REMOVED BY REGRESSION:      76.4860           6
  RESIDUAL:                   54.0176          54
F-VALUE: 12.7435
INDEP VAR    SIMPLE R   COEF (b)   S.E. COEF   T-VALUE   SIGNIFICANCE
1 Y2 - 1      .6578      .3767      .1348       2.7947    .0070
2 Y2 - 2      .5529     -.0533      .1377       -.3871    .7020
3 Y2 - 3      .6585      .4308      .1396       3.0863    .0034
4 Y2 - 4      .6053      .0545      .1404        .3879    .7014
5 Y2 - 5      .5820      .2189      .1405       1.5580    .1207
6 Y2 - 6      .4991     -.1520      .1479      -1.0277    .3090
CONSTANT (a): 1.7979
CORRELATIONS AMONG THE INDEPENDENT VARIABLES:
1.0000  .6392  .5384  .6544  .5876  .5691
 .6392 1.0000  .6341  .5380  .6337  .5473
 .5384  .6341 1.0000  .6349  .5328  .6510
 .6544  .5380  .6349 1.0000  .6417  .5722
 .5876  .6337  .5328  .6417 1.0000  .6322
 .5691  .5473  .6510  .5722  .6322 1.0000
5. Stage 2 of the Durbin estimating procedure is implemented by specifying an autoregression model on data beyond the first k empty rows in the residuals column (i.e., beginning at row k+1). For the object series, the analyst should specify an autoregressive model of somewhat higher order than is likely to be needed, e.g., k=3. In addition to the autoregressive terms for the object series, the analyst should include in the model another autoregressive variable selected from the matrix column containing the residuals from the first-stage autoregression. These residuals constitute the disturbances upon which the parameters of the moving average terms can be estimated. The regression on the residuals column should be specified to the same order as the autoregressive terms, e.g., k=3. No other independent variables should be added to the model. The regression model being estimated in this example should then have six terms: three autoregressive terms (k=3) and three disturbance terms (the residuals lagged to order k=3). Figure A8-2 illustrates the display of the results for the six-term regression model, i.e., an ARIMA(3,0,3) model. (A regression sketch of both stages follows Figure A8-2.)
Figure A8-2. Autoregressive model of Series Y2, including three autoregressive terms and three disturbance terms.
AUTOREGRESSION MODEL: Y(t) = a + b(1)*Y(t-1) + b(2)*Y(t-2) + ... + b(k)*Y(t-k) + b(k+1)*X(1) + b(k+2)*X(2) + ...
DEPENDENT VARIABLE (Y) IS MATRIX COLUMN: 2 Y2
ORDER (K) OF EQUATION: 3
COEF OF MULTIPLE CORRELATION (R): .7915   CORRECTED R: .7685
COEF OF MULTIPLE DETERMINATION (RSQ): .6265   CORRECTED RSQ: .5906
STANDARD ERROR OF THE ESTIMATE: .9701   MSE: .9412
ANALYSIS OF VARIANCE:        SUMS OF SQUARES   DEGREES OF FREEDOM
  TOTAL:                     128.5084          57
  REMOVED BY REGRESSION:      80.5091           6
  RESIDUAL:                   47.9994          54
F-VALUE: 14.2570
INDEP VAR        SIMPLE R   COEF (b)   S.E. COEF   T-VALUE   SIGNIFICANCE
1 Y2 - 1          .6716      1.5043     .4678       3.2155    .0025
2 Y2 - 2          .5831      -.0910     .3502       -.2599    .7917
3 Y2 - 3          .6714      -.5039     .3785      -1.3310    .1855
4 AR6RES - 1     -.2397      1.2044     .4996       2.4108    .0182
5 AR6RES - 2     -.0971       .3183     .3206        .9928    .6740
6 AR6RES - 3     -.3577      -.9216     .3724      -2.4747    .0156
CONSTANT (a): 1.2273
CORRELATIONS AMONG THE INDEPENDENT VARIABLES:
1.0000  .6579  .5657 -.6454 -.1939 -.0695
 .6579 1.0000  .6479 -.1331 -.6436 -.2084
 .5657  .6479 1.0000 -.1279 -.1334 -.6668
 .6454  .1331  .1279 1.0000 -.1307  .1175
-.1939 -.6436 -.1334 -.1307 1.0000 -.1324
-.0695 -.2084 -.6668  .1175 -.1324 1.0000
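For readers following the Durbin route with general-purpose software rather than the TSA app, stages 1 and 2 just described can be sketched as ordinary OLS regressions. The following is a sketch in Python using pandas and statsmodels, assuming the default data file DATA1.DAT and an arbitrary object column; all function and variable names are illustrative.

    import pandas as pd
    import statsmodels.api as sm

    def lagged_frame(series, k, prefix):
        # Lags 1..k of a series, in columns prefix1..prefixk.
        return pd.DataFrame({f"{prefix}{i}": series.shift(i) for i in range(1, k + 1)})

    def durbin_stage1(y, k=6):
        # Stage 1: OLS autoregression of generous order; keep the residuals.
        data = pd.concat([y.rename("y"), lagged_frame(y, k, "y_lag")], axis=1).dropna()
        fit = sm.OLS(data["y"], sm.add_constant(data.drop(columns="y"))).fit()
        return fit, fit.resid.reindex(y.index)      # first k rows come back as missing

    def durbin_stage2(y, resid, k_ar=3, k_ma=3):
        # Stage 2: regress the series on its own lags and on lags of the stage-1
        # residuals, which stand in for the moving-average disturbances.
        data = pd.concat(
            [y.rename("y"), lagged_frame(y, k_ar, "y_lag"), lagged_frame(resid, k_ma, "e_lag")],
            axis=1,
        ).dropna()
        return sm.OLS(data["y"], sm.add_constant(data.drop(columns="y"))).fit()

    y = pd.read_csv("DATA1.DAT")["TU"]              # any object column will do
    stage1_fit, stage1_resid = durbin_stage1(y, k=6)
    stage2_fit = durbin_stage2(y, stage1_resid, k_ar=3, k_ma=3)
    print(stage2_fit.summary())                     # inspect t-values before pruning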
The analyst should inspect the inference statistics for the estimated model, anticipating that some of the autoregressive terms may be redundant as judged by the significance levels (inferred from their t-values). The model should then be respecified, reducing the order of the autoregressive and residual terms as seems appropriate, until it may be judged that an optimal model remains. In Figure A8-2 it appears that only the first-order autoregressive term and the first- and third-order disturbance (residual) terms are significant.
The analyst may then specify a final model retaining only those autoregressive and residual terms which are judged to be statistically significant. Figure A8-3 illustrates the display of the results of generating an ARIMA(1,0,3) model for Series Y2. Three of the four variables in this model are statistically significant below the .01 level.
Figure A8-3. An ARIMA(1,0,3) model for Series Y2.
AUTOREGRESSION MODEL: Y(t) = a + b(1)*Y(t-1) + b(2)*Y(t-2) + ... + b(k)*Y(t-k) + b(k+1)*X(1) + b(k+2)*X(2) + ...
DEPENDENT VARIABLE (Y) IS MATRIX COLUMN: 2 Y2
ORDER (K) OF EQUATION: 2
COEF OF MULTIPLE CORRELATION (R): .7822   CORRECTED R: .7682
COEF OF MULTIPLE DETERMINATION (RSQ): .6118   CORRECTED RSQ: .5902
STANDARD ERROR OF THE ESTIMATE: .9702   MSE: .9413
ANALYSIS OF VARIANCE:        SUMS OF SQUARES   DEGREES OF FREEDOM
  TOTAL:                     128.5084          57
  REMOVED BY REGRESSION:      78.6179           4
  RESIDUAL:                   49.8905          53
F-VALUE: 20.8795   SIG: .0000
INDEP VAR        SIMPLE R   COEF (b)   S.E. COEF   T-VALUE   SIGNIFICANCE
1 Y2 - 1          .6716       .9248     .1215       7.6133    .0000
2 AR6RES - 1     -.2397       .6047     .1853       3.2633    .0022
3 AR6RES - 2     -.0971       .1955     .1448       1.3502    .1792
4 AR6RES - 3     -.3577      -.4305     .1347      -3.1968    .0026
CONSTANT (a): 1.8233
CORRELATIONS AMONG THE INDEPENDENT VARIABLES:
1.0000 -.6454 -.1939 -.0695
-.6454 1.0000 -.1307  .1175
-.1939 -.1307 1.0000 -.1324
-.0695  .1175 -.1324 1.0000
INDEPENDENT VARIABLE NUMBER 1: 15.6000
INDEPENDENT VARIABLE NUMBER 2: -.4721
INDEPENDENT VARIABLE NUMBER 3: .8635
INDEPENDENT VARIABLE NUMBER 4: .1992
POINT-ESTIMATE FORECAST OF DEPENDENT VARIABLE: 15.2734
95 PERCENT CONFIDENCE INTERVAL: 13.3330 TO 17.2139
Very little explanatory ability was lost in deleting the statistically insignificant variables (Y2 - 2 and Y2 - 3) of the ARIMA(3,0,3) model illustrated in Figure A8-2. Variable AR6RES - 2 might also be deleted from this model (this could be accomplished by backward stepwise regression) without further significant loss of explanatory ability. The ARIMA(1,0,3) model illustrated in Figure A8-3 yields a higher value for the coefficient of determination (R²) and a smaller standard error of the estimate than did the AR(6) model illustrated in Figure A8-1. The residuals from this final model should be written to the data matrix in the next available column so that the adequacy of the model may be judged in the final stage of diagnostic checking.
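Continuing the earlier regression sketch, the pruning just described amounts to refitting with only the significant lags and keeping the residuals for diagnostic checking. A minimal sketch with illustrative names, assuming y and stage1_resid from that earlier code:

    import pandas as pd
    import statsmodels.api as sm

    # Final pruned model: first own lag plus the first and third residual lags,
    # in the spirit of the ARIMA(1,0,3) model of Figure A8-3.
    data = pd.DataFrame({
        "y": y,
        "y_lag1": y.shift(1),
        "e_lag1": stage1_resid.shift(1),
        "e_lag3": stage1_resid.shift(3),
    }).dropna()
    final_fit = sm.OLS(data["y"], sm.add_constant(data.drop(columns="y"))).fit()
    print(final_fit.summary())
    final_resid = final_fit.resid        # kept for the diagnostic correlogram that follows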
Diagnostic Checking
Figure A8-4 displays the simple correlogram for the residuals of the final model specified in Figure A8-3; none of the simple autocorrelations is significantly non-zero, so there is no other information contained in the residuals which might improve the ability of the model to forecast.
Figure A8-4. Correlogram for the residuals of the ARIMA(1,0,3) model of series Y2.
 LAG   SIMPLE AUTOCORR
   1      .0203
   2     -.0240
   3      .0090
   4      .0456
   5      .1138
   6     -.0373
   7      .0009
   8      .0495
   9      .1531
  10     -.0537
  11     -.0019
  12      .0074
Another approach to diagnostic checking is to estimate a model with higher-ordered
autoregressive and moving average terms, then observe (i.e., draw an inference from the
t-statistic) whether the regression coefficients of the additional terms are statistically
significant. Figure A8-2 shows such a model; it can be seen that the coefficients of the
additional terms are not significant at the 0.05 level.
What's Ahead
This brings to a close our survey of time series forecasting tools. Now it is up to the reader to try using them to forecast series of interest. Good luck!<>
System Access
TSA.exe, a computer app that can process the time series analyses and forecasts described in this book, is available upon request by email to rstanford@furman.edu.
TSA Menu
The TSA app is menu-driven by selecting menu item numbers. The main menu appears as follows:
1 DATA MANAGEMENT (TSFT 1)
2 DATA DESCRIPTION (TSFT 2)
3 UNIVARIATE FORECASTING MODELS (TSFT 3)
4 MOVING AVERAGE FORECASTING MODELS (TSFT 4, 7)
5 REGRESSION FORECASTING MODELS (TSFT 3, 5)
6 TIME SERIES DECOMPOSITION (TSFT 6)
9 ABOUT TSA
0 EXIT
ENTER SELECTION >>>
Data File Preparation
The TSA app is structured to process monthly data. It is dimensioned for a maximum of 960 rows (80 years of monthly data) and 20 columns (variables). A data file that is readable by this app may be prepared using EXCEL, SHEETS, or a Windows accessory such as NOTEPAD or WORDPAD. The data file should be saved as a CSV (comma separated values) UNICODE TEXT DOCUMENT with file type specified as .DAT, e.g., filename.DAT.
To illustrate TSA features, a default data file, DATA1.DAT, is included with program file TSA.exe. The data file contains seven years of monthly labor force data for a metropolitan statistical area (Greenville-Anderson-Mauldin S.C. MSA, 1981-1987). The contents of DATA1.DAT are illustrated following:
CLF,TE,TU,PU,MFG,TX,AP,CE,WRT,CON,AHE,AHW
233.4000,229.0000,4.4000,1.9000,98.8000,48.3000,10.3000,9.3000,37.2000,15.0000,2.9700,40.8000
236.6000,232.2000,4.4000,1.9000,99.9000,48.5000,10.7000,9.5000,37.1000,15.9000,2.9700,41.2000
239.8000,235.6000,4.2000,1.8000,100.5000,48.6000,10.8000,9.6000,37.7000,16.6000,2.9800,41.0000
242.7000,238.0000,4.7000,1.9000,100.3000,48.4000,10.5000,9.6000,39.1000,17.5000,2.9900,41.5000
245.2000,240.4000,4.8000,2.0000,100.8000,48.5000,10.5000,9.7000,38.9000,18.1000,2.9900,40.4000
252.2000,244.9000,7.3000,2.9000,102.4000,49.2000,10.5000,9.9000,39.4000,18.9000,3.0000,41.5000
252.3000,246.0000,6.3000,2.5000,100.4000,47.5000,10.2000,10.1000,39.6000,19.6000,3.0000,41.9000
252.2000,246.5000,5.5000,2.2000,103.4000,49.3000,10.4000,10.2000,39.8000,19.6000,3.0200,40.7000
251.1000,245.6000,5.5000,2.2000,102.4000,49.2000,10.0000,10.3000,40.1000,19.4000,3.1200,40.6000
251.1000,256.8000,4.9000,1.9000,102.9000,49.6000,10.0000,10.2000,40.5000,19.0000,3.1300,40.0000
252.2000,246.8000,5.3000,2.1000,103.5000,49.8000,10.1000,10.2000,41.5000,18.8000,3.1700,41.2000
251.7000,247.0000,4.7000,1.9000,103.4000,49.7000,10.0000,10.1000,42.5000,18.5000,3.2000,41.9000
249.4000,244.5000,4.9000,2.0000,103.4000,49.8000,10.2000,10.1000,40.0000,16.9000,3.2000,40.5000
229.2000,221.0000,8.2000,3.6000,102.4000,48.6000,10.0000,10.1000,39.8000,17.4000,3.2000,40.4000
229.5000,221.1000,7.4000,3.2000,102.3000,48.1000,10.0000,10.3000,39.9000,17.9000,3.2000,39.6000
231.3000,223.9000,7.4000,3.2000,102.3000,48.3000,9.7000,10.3000,40.1000,18.0000,3.2100,38.9000
233.7000,226.4000,7.3000,3.1000,102.8000,48.5000,9.6000,10.4000,40.3000,18.8000,3.2800,40.3000
238.6000,229.3000,9.3000,3.9000,104.4000,49.1000,9.8000,10.5000,40.5000,19.3000,3.4000,40.8000
239.7000,230.8000,8.9000,3.7000,102.7000,47.4000,9.5000,10.6000,40.4000,19.4000,3.4300,40.9000
241.1000,232.7000,8.4000,3.5000,104.6000,48.8000,9.7000,10.6000,40.6000,19.2000,3.4400,40.6000
238.0000,229.0000,9.0000,3.8000,103.5000,48.2000,9.6000,10.7000,40.7000,18.8000,3.4800,40.0000
240.8000,230.1000,10.7000,4.4000,102.6000,47.3000,9.6000,10.7000,40.8000,18.5000,3.4600,38.5000
240.2000,225.1000,15.1000,6.3000,98.0000,44.0000,9.5000,9.7000,41.0000,18.5000,3.4800,39.0000
242.3000,225.1000,17.2000,7.1000,97.3000,44.5000,9.2000,9.5000,41.4000,18.2000,3.4600,38.0000
.
.
.
253.3000,242.1000,11.2000,4.4000,105.6000,43.9000,10.3000,8.5000,50.3000,15.2000,4.8700,40.6000
252.0000,241.3000,10.7000,4.2000,104.5000,43.3000,10.2000,8.5000,50.0000,15.0000,4.9100,40.5000
256.2000,245.2000,11.0000,4.3000,104.7000,42.9000,10.3000,8.5000,50.1000,14.9000,4.8900,40.5000
257.0000,246.2000,10.8000,4.2000,104.8000,42.8000,10.2000,8.5000,50.2000,15.3000,4.8800,39.1000
259.3000,249.0000,10.3000,4.0000,104.9000,42.6000,10.2000,8.5000,50.0000,15.4000,4.9300,40.7000
265.9000,254.1000,11.8000,4.4000,106.1000,42.8000,10.2000,8.8000,49.7000,15.8000,4.9700,40.9000
259.9000,249.9000,10.0000,3.8000,103.7000,41.0000,9.9000,8.8000,49.6000,16.2000,5.0400,40.6000
261.8000,252.6000,9.2000,3.5000,104.9000,42.0000,10.1000,8.6000,49.9000,16.5000,5.1700,40.6000
264.6000,255.0000,9.6000,3.6000,104.4000,41.7000,10.1000,8.6000,50.1000,16.5000,5.1600,40.8000
264.4000,254.5000,9.9000,3.7000,104.3000,41.5000,10.0000,8.6000,50.0000,16.5000,5.2000,40.9000
262.7000,252.2000,10.5000,4.0000,103.6000,41.0000,9.7000,8.6000,50.6000,16.2000,5.2800,41.4000
263.9000,253.4000,10.5000,4.0000,106.8000,42.5000,9.7000,8.7000,53.0000,16.6000,5.3500,41.3000
The first row lists 12 column labels (alphabetic). The number of alphabetic labels tells the TSA app the number of data columns to find.
The next 24 rows illustrated contain the data for the first 24 months, with 12 column entries each, no embedded blanks or alphabetic characters, and no blanks or commas at the ends of rows.
Monthly data rows 25 through 72 have been omitted from this illustration. The last 12 rows illustrated contain the data in rows 73 through 84 (the seventh year of monthly data).
When data are entered into NOTEPAD or WORDPAD as illustrated and following these instructions, the data should be saved in the folder containing TSA.exe to a file name with extension .DAT. The file illustrated here, DATA1.DAT, is included as a default sample along with TSA.exe, but any other data file that is properly prepared may be opened instead of the default file.
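Because the .DAT file is an ordinary CSV file, it can be checked with general-purpose tools before it is opened in TSA. A minimal sketch in Python, assuming the default file DATA1.DAT described above:

    import pandas as pd

    df = pd.read_csv("DATA1.DAT")
    print(df.shape)                         # expect (84, 12) for the default file
    print(df.columns.tolist())              # the 12 alphabetic labels from the first row
    print((df.dtypes != "object").all())    # False indicates stray text or blanks in a data row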
Runtime Errors
Please note that if a "runtime error" occurs upon opening a data file in the app, the user should reload the data file in EXCEL or SHEETS, or in NOTEPAD or WORDPAD, and make sure that each data row contains the same number of entries as there are column labels in the alphabetic first row of the data file. The user should also check for any inadvertent alphabetic characters or embedded blanks within or at the ends of data rows.
ARIMA Modeling with TSA
ARIMA modeling combines an AUTOREGRESSION on a time series of observations with a MOVING AVERAGE of that time series to forecast values within or beyond the end of that series. The conventional designation of an ARIMA forecasting model is p-d-q, i.e., autoregression of order p with a moving average of q elements and d-period differencing of an object series. Autoregression is warranted if the analyst suspects that observations in the object series are influenced by previous observations of the same series. A moving average may be included if the analyst suspects that there are disturbances in the object series that may be diminished or 'smoothed' with a moving average of the series.
If the analyst judges that the object series is not sufficiently stationary (i.e., it exhibits upward or downward trends over ranges of data), he or she may choose to substitute a series of d-period differences of the object series in place of the object series. For example, an ARIMA(3,1,3) forecasting model would entail an autoregression of order 3 and a 3-element moving average applied to an object series that has been differenced by 1 period (each observation less the previous observation). An ARIMA(2,0,3) model would involve an autoregression of order 2 and a 3-element moving average on an undifferenced (i.e., original) object series.
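The same p-d-q notation can be expressed directly in general-purpose software, which may serve as a cross-check on a model built interactively in TSA. A minimal sketch using Python's statsmodels, assuming the default data file DATA1.DAT and an arbitrary object column (all names are illustrative):

    import pandas as pd
    from statsmodels.tsa.arima.model import ARIMA

    y = pd.read_csv("DATA1.DAT")["TU"]           # any column of the default file
    fit_313 = ARIMA(y, order=(3, 1, 3)).fit()    # ARIMA(3,1,3): AR order 3, 1-period differencing, MA 3 elements
    fit_203 = ARIMA(y, order=(2, 0, 3)).fit()    # ARIMA(2,0,3): AR order 2, undifferenced series, MA 3 elements
    print(fit_313.summary())
    print(fit_203.forecast(steps=6))             # six periods beyond the end of the series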
In the TSA system, data are arrayed in a matrix by columns (variables) and rows (observations) that are saved in a data file that may be opened and accessed for analysis. Once the data file is opened and data are accessible, an ARIMA forecasting model may be implemented in the TSA system in three steps. For illustrative purposes, a default data file, DATA1.DAT, is included with TSA.exe. DATA1.DAT contains an array of 12 columns (variables) and 84 rows (monthly observations) of employment data for a metropolitan statistical area.
1. Choose Option 4 and implement a moving average (M.A.) of the selected object series (a column of data in the data file). The M.A. is automatically saved in the next unused column (up to 20) in the data array beginning at an indicated row after allowing for missing initial observations. In the data array of default data file DATA1.DAT, a 3-element M.A. would have no observations in the first 3 rows, so the M.A. series would begin in row 4 of column 13.
2. Choose Data Management (Option 1), Transformations (sub-option 6), and Differencing (sub-sub-option 6), and specify a period gap (typically 1 period). The differenced series is automatically saved in the next unused column (up to 20) in the data array. In the data array of the default data file DATA1.DAT, a 1-period differenced series of the object series would miss an observation in row 1 and so begin in row 2 of column 14. (Steps 1 and 2 are sketched in general-purpose code at the end of this section.)
Upon examining the reported autoregression results and correlograms, the user may choose to respecify the number of moving average elements (step 1), whether differencing is needed and the number of periods by which the object series is differenced (step 2), and the autoregression order (step 4). Once an ARIMA model has been specified to the satisfaction of the user, the user may implement forecasts of values within or beyond the end of the object series.
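A minimal sketch of steps 1 and 2 done by hand with Python's pandas, assuming the default data file DATA1.DAT and an arbitrary object column (note that pandas leaves the first window-1 rows of a moving average blank, while TSA's row convention described above may differ slightly):

    import pandas as pd

    df = pd.read_csv("DATA1.DAT")
    df["TU_MA3"] = df["TU"].rolling(window=3).mean()   # step 1: 3-element moving average, stored as a new column
    df["TU_D1"] = df["TU"].diff(periods=1)             # step 2: 1-period differencing, stored as a new column
    print(df[["TU", "TU_MA3", "TU_D1"]].head(6))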
<>
TSA is a menu-driven app that enables analysis of multiple columns of time series data.
TSA is programmed to read a standard CSV (comma separated values) file that may have been created in Microsoft EXCEL, Google SHEETS, or a text editor and saved in CSV format as a UNICODE TEXT DOCUMENT with file type specified as .DAT, e.g., filename.DAT.
TSA is dimensioned for a maximum of 20 columns and 960 rows of data. In a file readable by TSA, alphabetic column headings are separated by commas in the first row and numeric data are separated by commas in subsequent rows:
ROW 1: HEADER1,HEADER2,HEADER3
ROW 2: 1000.0,2000.0,3000.0
ROW 3: 4000.0,5000.0,6000.0
.
.
ROW LAST: 9799.9,9899.9,9999.9
TSA Features:
- data matrix maximum 20 columns, 960 rows (80 years of monthly data)
- ability to read EXCEL or SHEETS files saved in CSV format
  (data may be downloaded from private or government sources into EXCEL files)
- automatic detection of column headings and numbers of columns and rows
- ability to transform columns
- ability to select a column range for analysis
- ability to lag data in a selected column relative to data in other columns
- ability to construct and analyze time series models
- ability to decompose a monthly TxCxSxI time series into components
- menu options keyed to Time Series Forecasting Tools (TSFT) by R. Stanford
  (parentheses following TSA menu items refer to TSFT chapters)
- app and default data file available from rstanford@furman.edu
(must be downloaded to a folder on the user's computer where the user's data files will be saved)