Spatiotemporal Regression

Data do not arise ex nihilo, but occur at a specific location and time. Because observations generated close together in space and time tend to resemble one another, one can exploit these similarities to improve estimation.

Spatiotemporal data often exhibit other structure. For example, collection and aggregation of the data may result in reporting averages at fixed time intervals and at fixed locations. Examples of such data include the US Census data and temperature monitoring station data. Longitudinal or panel techniques provide the canonical way of modeling such data.

For data that do not occur at fixed locations and at regular time intervals, more general spatiotemporal techniques exist. Examples of such data include house transaction data (locations are not the same each period and sales can happen at any time), data accumulated from multiple observers (e.g., ore samples collected by different parties at different times using different technologies), and fish catch data from different vessels (logs showing the time and location of each catch).

At least three ways exist of modeling spatiotemporal data. First, one can use spatial statistical techniques and incorporate time through inclusion of temporal dichotomous variables. Second, one can directly model the correlation among observations as a function of time and location. Third, one can include enough spatially, spatiotemporally, and temporally lagged dependent and independent variables that the resulting residuals do not exhibit gross spatiotemporal dependence (i.e., a general autoregressive specification).
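As a rough illustration of the temporal half of the first approach, the sketch below builds temporal dichotomous variables and fits them by ordinary least squares with NumPy. The function names (temporal_dummies, ols_with_time_dummies) and the choice of the first period as the baseline are assumptions made for this example only. The spatial component is deliberately left out; in practice the same dummy columns would be passed to a spatial estimator rather than to plain OLS.

```python
import numpy as np

def temporal_dummies(period, drop_first=True):
    """Return an n-by-k 0/1 matrix with one column per distinct period.

    `period` is any array of period labels (e.g. quarter indices).
    With drop_first=True the earliest period is the baseline, which
    avoids collinearity with an intercept column.
    """
    period = np.asarray(period)
    levels = np.unique(period)          # sorted distinct period labels
    if drop_first:
        levels = levels[1:]
    return (period[:, None] == levels[None, :]).astype(float)

def ols_with_time_dummies(X, y, period):
    """OLS of y on an intercept, the columns of X, and temporal dummies.

    Hypothetical helper for illustration: returns the coefficient
    vector (intercept, X coefficients, period effects) and residuals.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    D = temporal_dummies(period)
    Z = np.column_stack([np.ones(len(y)), X, D])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return beta, y - Z @ beta
```

The residuals returned here are the quantity one would then inspect for remaining spatial dependence before choosing a spatial specification.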

For real estate data, I have had success using general autoregressive specifications (Pace, Barry, Gilley, and Sirmans, 2000). Let a, b, c, and d represent scalar parameters, and let z be an n-element random vector. Let T represent an n by n matrix such that Tz yields a temporally averaged (past values only) version of z, and let S represent an n by n matrix such that Sz yields a spatially averaged version of z (also using past values only, which makes it spatiotemporal). I examined the specification for the dependent variable of [(I-aS)(I-bT)Y+(I-cT)(I-dS)Y]. This filters first for time and then for space, as well as first for space and then for time. Multiplying out the products gives 2Y-(a+d)SY-(b+c)TY+abSTY+cdTSY, so the lagged terms involve SY, TY, STY, and TSY. Applying the same filters to the independent variables and moving all terms but Y to the right-hand side of the equation yields an autoregressive specification. For real estate data, T had the effect of averaging over the few months of prior data, while S had the effect of averaging over the 15 nearest neighbors of each observation (with weights declining with neighbor order). Hence, this specification modeled the market-wide temporal forces, the local spatial forces, and their interactions.
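To make the roles of T and S concrete, here is a minimal NumPy sketch that builds dense, row-normalized versions of both matrices and evaluates the filtered dependent variable. Several details are assumptions made for illustration and are not taken from the paper: the function names, the fixed time window used for T, the geometric weight decay used for S, and the use of dense matrices (a sample of any real size would need sparse storage).

```python
import numpy as np

def row_normalize(W):
    """Scale each row to sum to 1; rows with no neighbors stay all zero."""
    s = W.sum(axis=1, keepdims=True)
    return np.divide(W, s, out=np.zeros_like(W), where=s > 0)

def temporal_matrix(t, window):
    """T[i, j] > 0 only if j occurred strictly before i, within `window`."""
    t = np.asarray(t, dtype=float)
    lag = t[:, None] - t[None, :]                 # lag[i, j] = t_i - t_j
    W = ((lag > 0) & (lag <= window)).astype(float)
    return row_normalize(W)                       # equal-weight average (assumed)

def spatial_matrix(xy, t, m=15, decay=0.8):
    """S[i, j] > 0 only for the m nearest neighbors of i that occurred earlier.

    The k-th nearest neighbor gets weight proportional to decay**k,
    an assumed form of "weight declining with order".
    """
    xy = np.asarray(xy, dtype=float)
    t = np.asarray(t, dtype=float)
    n = len(t)
    S = np.zeros((n, n))
    for i in range(n):
        past = np.flatnonzero(t < t[i])           # past observations only
        if past.size == 0:
            continue
        dist = np.linalg.norm(xy[past] - xy[i], axis=1)
        nearest = past[np.argsort(dist, kind="stable")[:m]]
        w = decay ** np.arange(nearest.size)
        S[i, nearest] = w / w.sum()
    return S

def filtered_dependent(y, S, T, a, b, c, d):
    """Evaluate (I-aS)(I-bT)y + (I-cT)(I-dS)y for given scalars."""
    I = np.eye(len(y))
    return (I - a * S) @ (I - b * T) @ y + (I - c * T) @ (I - d * S) @ y
```

Because both matrices use only earlier observations, sorting the data by time makes them strictly lower triangular, and the earliest observations have empty rows since they have no past to average over.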

In terms of the results, the spatiotemporal autoregression with 14 variables produced an 8% lower sum of squared errors than an OLS regression using 211 variables (housing characteristics, temporal dichotomous variables, and spatial dichotomous variables).

I have not yet placed the software for this technique on the web, but I do have software that makes spatial autoregressions easy to estimate at http://www.spatial-statistics.com (along with spatial statistics manuscripts, data, and links). These estimators, coupled with temporal dichotomous variables, make it simple to estimate some form of spatiotemporal model. I have also applied a restricted version of this model to over 70,000 observations from Fairfax, Virginia. This and other spatial real estate manuscripts reside at my LSU site. James LeSage's Econometric Toolbox at http://www.spatial-econometrics.com also has some spatial statistical functions that could help with spatiotemporal estimation.

References:

Pace, R. Kelley, Ronald Barry, O.W. Gilley, and C.F. Sirmans, "A Method for Spatial-temporal Forecasting with an Application to Real Estate Prices," International Journal of Forecasting, Volume 16, Number 2, April-June 2000, pp. 229-246.