Sengupta, S. K. and J. Boyle, 1995: Nonlinear
principal component analysis of climate data. PCMDI Report 29, Program
for Climate Model Diagnosis and Intercomparison, Lawrence Livermore National
Laboratory, 26 pp.
In traditional principal component analysis (PCA), a few significant
linear combinations of the original variables are extracted to arrive at
a parsimonious description of a complex data set obtained from climate
observations, analyses, or GCM output. These are uncorrelated variables
which are used in practice to understand the principal modes of variation
in the climatological process under study. If we drop the requirements of
linearity and uncorrelatedness, greater data reduction is possible, allowing
us to work with fewer modes of variation. The corresponding nonlinear
functions can be obtained from a series of auto-associative feed-forward
neural networks, in which the residuals from each network serve as both
the input and the target output of the next. It can be shown
that in special cases such networks provide ordinary principal components.
We have applied this methodology to gain a better understanding of
precipitation data observed over the US land area and bordering oceans
for the decade 1979 to 1988. A careful comparison with the linear counterpart
has been made. The improvement in the data reduction is noticeable but
not overwhelming. Certain details in the modes of variation are more pronounced
in the nonlinear representation. The leading nonlinear mode captures the
seasonal cycles more clearly than the leading linear mode. In the latter,
the seasonal cycle is shared by subsequent modes of the PCA. The principal
linear and nonlinear modes of the observational data have been compared
with the corresponding modes of the data obtained from a GCM simulation.
We conclude by observing that nonlinear principal component analysis (NLPCA)
based on auto-associative neural networks is potentially a more effective
data reduction tool than conventional PCA. In addition, the principal modes of
variation of the precipitation data of the continental US are better differentiated
by NLPCA than by ordinary PCA. NLPCA should be tried as an alternative method,
especially when linear PCA fails to show meaningful patterns in climatological
data analysis.
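
The sequential scheme described above can be illustrated with a minimal sketch. It is not the report's implementation. It assumes the simplest special case, a network with linear units and a single bottleneck node, in which a trained auto-associative network is known to recover the leading principal component direction. The function names, the synthetic data, the learning rate and the step count are all hypothetical choices made for illustration. A nonlinear variant would add hidden layers with nonlinear activations around the bottleneck, which this sketch does not attempt.

```python
import numpy as np


def fit_linear_autoassociative(X, n_steps=5000, lr=0.01, seed=0):
    """Train x -> z = x.w -> z*v (one linear bottleneck node) by gradient descent.

    The target output equals the input, so the loss is the mean squared
    reconstruction error. Returns the bottleneck scores, the decoder
    weights and the reconstruction.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    w = rng.normal(scale=0.1, size=p)  # encoder weights (input -> bottleneck)
    v = rng.normal(scale=0.1, size=p)  # decoder weights (bottleneck -> output)
    for _ in range(n_steps):
        z = X @ w
        err = np.outer(z, v) - X       # network output minus target
        grad_v = err.T @ z / n
        grad_w = X.T @ (err @ v) / n
        v -= lr * grad_v
        w -= lr * grad_w
    z = X @ w
    return z, v, np.outer(z, v)


def sequential_modes(X, n_modes=2, **kwargs):
    """Fit networks in sequence; each is trained on the previous residuals."""
    residual = X - X.mean(axis=0)
    directions = []
    for _ in range(n_modes):
        _, v, recon = fit_linear_autoassociative(residual, **kwargs)
        directions.append(v / np.linalg.norm(v))
        residual = residual - recon    # passed on as input and target
    return np.array(directions), residual


def synthetic_data(n=500, seed=1):
    """Hypothetical stand-in for a data matrix (rows = times, columns = grid points)."""
    rng = np.random.default_rng(seed)
    raw = rng.normal(size=(n, 3)) * np.array([3.0, 1.0, 0.5])
    rotation, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    return raw @ rotation.T


if __name__ == "__main__":
    X = synthetic_data()
    directions, _ = sequential_modes(X)
    # Ordinary PCA via SVD of the centered data, for comparison.
    _, _, vt = np.linalg.svd(X - X.mean(axis=0), full_matrices=False)
    # Absolute cosines between network directions and the first two PCs.
    print(np.abs(directions @ vt[:2].T))
```

In this linear setting the directions found by successive networks should line up with the ordinary principal components, which is the sense in which linear PCA appears as a special case. Any gain in data reduction would have to come from the nonlinear layers omitted here.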