Explosive growth in the size of spatial databases has highlighted the
need for spatial data mining techniques to mine the interesting but
implicit spatial patterns within these large databases. This book
explores the computational structure of the exact and approximate spatial
autoregression (SAR) model solutions. Estimation of the parameters of
the SAR model using Maximum Likelihood (ML) theory is computationally
very expensive because of the need to compute the logarithm of the
determinant (log-det) of a large matrix in the log-likelihood function.
The second part of the book introduces theory on SAR model solutions.
The third part of the book applies parallel processing techniques to the
exact SAR model solutions. Parallel formulations of the SAR model
parameter estimation procedure based on ML theory are probed using data
parallelism with load-balancing techniques. Although this parallel
implementation showed scalability up to eight processors, the exact SAR
model solution still suffers from high computational complexity and
memory requirements. These limitations motivate the investigation of
serial and parallel approximate solutions for SAR model parameter
estimation. In the fourth and fifth parts of the book, two candidate
approximate semi-sparse solutions of the SAR model based on Taylor
series expansion and Chebyshev polynomials are presented. Experiments
show that the differences between exact and approximate SAR parameter
estimates have no significant effect on the prediction accuracy. In the
last part of the book, we develop a new ML-based approximate SAR model
solution and its variants. The new
approximate SAR model solution is called the Gauss-Lanczos approximated
SAR model solution. We algebraically rank the error of the Chebyshev
Polynomial approximation, Taylor's Series approximation and the
Gauss-Lanczos approximation to the solution of the SAR model and its
variants. In other words, we establish a novel relationship between
the error in the log-det term, which is the approximated term in the
concentrated log-likelihood function, and the error in estimating the SAR
parameter for all of the approximate SAR model solutions.
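
To make the log-det bottleneck concrete, the following is a minimal sketch, not the book's implementation. It compares an exact log-determinant of I - rho*W with a truncated Taylor series, using the standard identity ln|I - rho*W| = -sum_{k>=1} rho^k tr(W^k) / k, which holds when |rho| times the spectral radius of W is below 1. The ring-shaped neighborhood matrix, the function names, and the values of n, rho, and the number of terms are all assumptions chosen for illustration.

```python
import numpy as np


def row_normalized_ring(n):
    """Hypothetical neighborhood matrix W: each site has two ring neighbors.

    Rows sum to 1, so the spectral radius of W is 1.
    """
    W = np.zeros((n, n))
    for i in range(n):
        W[i, (i - 1) % n] = 0.5
        W[i, (i + 1) % n] = 0.5
    return W


def exact_logdet(W, rho):
    """Exact ln|I - rho*W| via a dense factorization (cubic cost in n)."""
    n = W.shape[0]
    _sign, logdet = np.linalg.slogdet(np.eye(n) - rho * W)
    return logdet


def taylor_logdet(W, rho, terms):
    """Truncated Taylor series: -sum_{k=1..terms} rho^k * tr(W^k) / k."""
    total = 0.0
    Wk = np.eye(W.shape[0])
    for k in range(1, terms + 1):
        Wk = Wk @ W  # Wk now holds W^k
        total -= (rho ** k) * np.trace(Wk) / k
    return total


W = row_normalized_ring(8)
rho = 0.5  # assumed value of the SAR parameter, with |rho| < 1
exact = exact_logdet(W, rho)
err_short = abs(exact - taylor_logdet(W, rho, terms=2))
err_long = abs(exact - taylor_logdet(W, rho, terms=30))
print(exact, err_short, err_long)
```

The dense matrix powers here are only for readability. For large sparse W the traces would have to be computed or estimated without forming W^k explicitly, which is where approximate solutions of the kind described above become relevant.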