eof [ parameter=value ... ] [ inputfile outputfile ]
eof [ parameter=value ... ] [ inputfile ... directory ]
Parameters are: include_vars, across_vars, new_var_name, eof_dim, modes, burst_modes, max_iter.
eof computes empirical orthogonal functions (EOFs) of multi-dimensional data, where different instances of the variable to be decomposed occur along the dimension eof_dim (see laminate). The EOFs are computed directly via Singular Value Decomposition. This method is useful in cases where there are more points in a given instance of the data field than there are instances of the data field, since it does not require the calculation of the covariance matrix. modes specifies the number of EOFs to retain in the output dataset. eof reports the eigenvalues and percent of variance accounted for by each of the retained EOFs.
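The SVD approach described above can be sketched in a few lines. This is a minimal illustration, not the eof source code: it assumes NumPy, a mean-removed input already reshaped to two dimensions (rows are instances along eof_dim, columns are the remaining points), and a scaling chosen so that the eigenvalues sum to the mean-square value of the data. The function name eof_svd and the exact scaling factors are assumptions made for this sketch.

```python
import numpy as np

def eof_svd(data, modes):
    """Sketch of EOFs via SVD.

    data  : 2-D array, rows = instances along eof_dim, columns = points.
            The mean is assumed to have been removed already.
    modes : number of EOFs to retain.
    """
    n_inst, n_pts = data.shape
    # Economy-size SVD: the (n_pts x n_pts) covariance matrix is never formed.
    u, s, vt = np.linalg.svd(data, full_matrices=False)
    # Assumed scaling: eigenvalues sum to the mean-square value of the data.
    eigvals = s**2 / (n_inst * n_pts)
    # Assumed normalization: each expansion coefficient series has unit
    # mean square; each vector has mean square equal to its eigenvalue.
    amp = u[:, :modes].T * np.sqrt(n_inst)               # (modes, n_inst)
    vec = s[:modes, None] * vt[:modes] / np.sqrt(n_inst)  # (modes, n_pts)
    percent = 100.0 * eigvals[:modes] / eigvals.sum()
    return vec, amp, eigvals[:modes], percent
```

With this scaling, the product of the transposed coefficient array and the vector array reconstructs the input when all modes are retained, which is a convenient check on any implementation.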
For each input variable "X", eof writes the EOF vectors, expansion coefficients and eigenvalues as output variables named "X_vec", "X_amp", and "X_var", respectively. In addition, the percentage of variance accounted for by each of the eigenvalues is written to a variable named "X_perc". The dimensions of "X_vec" are the same as "X", except that the eof_dim is replaced by a dimension named "mode" having size modes. Additionally, eof writes the total data variance, the total variance captured by all the modes, the variance in the retained modes, and the percent of variance in the retained modes in attributes named "X_d_var", "X_m_var", "X_r_var", and "X_p_var", respectively.
The vectors are normalized to have variance equal to the associated eigenvalue and thus have the same units as the input data. The expansion coefficient series are unit normalized. The output variables are all written to the output dataset as type GP_FLOAT (see include/gp.h). Eigenvalues that are smaller than 1.0E-5 times the largest eigenvalue are set equal to zero. If bad values occur in the input variable, they are set equal to zero before the computation of the eigenstructure. In this case, it is assumed that the "mean" of the data has been removed (see anomaly); thus the bad values take on the "mean" value.
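The eigenvalue threshold and the percent-of-variance columns can be illustrated with a short sketch. The function name variance_report is hypothetical, and the choice of denominator (the total variance captured by all modes, i.e. the value stored in "X_m_var") is an assumption. The input values are the rounded sst eigenvalues and the sst_m_var attribute from the example on this page, so the printed percentages agree with the eof report only to within rounding.

```python
def variance_report(eigvals, total_var, threshold=1.0e-5):
    """Return (eigenvalue, percent, summed percent) rows, one per mode."""
    largest = max(eigvals)
    rows = []
    summed = 0.0
    for ev in eigvals:
        # Eigenvalues smaller than threshold * largest are set to zero.
        if ev < threshold * largest:
            ev = 0.0
        pct = 100.0 * ev / total_var
        summed += pct
        rows.append((ev, pct, summed))
    return rows

# Rounded sst eigenvalues and sst_m_var taken from the example on this page.
sst_eigvals = [0.1387, 0.0368, 0.0200, 0.0114, 0.0104,
               0.0079, 0.0064, 0.0063, 0.0059, 0.0051]
rows = variance_report(sst_eigvals, total_var=0.306198)
for number, (ev, pct, summed) in enumerate(rows, start=1):
    print(f"{number:2d} {ev:8.4f} {pct:8.3f} {summed:8.3f}")
```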
If the user selects across_vars=yes, the input variables are effectively laminated together to create a single new input variable. For this to work, all the input variables must have the same dimensions. After this lamination, the original input variables are ignored, and EOFs are computed only for the new input variable. This new input variable has one additional dimension, whose length is equal to the number of original input variables. The name of this added dimension is set via the eof_dim parameter.
The across_vars=yes option is useful when computing EOFs across different sets of sensor channels.
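The effect of across_vars=yes can be pictured as stacking the input variables along one added dimension. The sketch below uses plain nested lists. The function name laminate_vars is hypothetical, and placing the added dimension first is an assumption made for illustration; the actual storage order used by eof is not stated here.

```python
def laminate_vars(variables):
    """Stack same-shaped variables along a new leading dimension.

    variables : dict mapping variable name -> non-empty nested list.
    Returns (names, stacked), where stacked[i] is the data of names[i].
    """
    def shape(array):
        dims = []
        while isinstance(array, list):
            dims.append(len(array))
            array = array[0]
        return tuple(dims)

    names = list(variables)
    # All input variables must have the same dimensions.
    if len({shape(variables[name]) for name in names}) != 1:
        raise ValueError("all input variables must have the same dimensions")
    return names, [variables[name] for name in names]
```

For two hypothetical 2 x 2 channels "ch1" and "ch2", the stacked result has shape 2 x 2 x 2, and the added dimension has length 2, the number of original input variables.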
For each input variable X, the actual modes (eigenvectors) are stored in an output variable X_vec. If there are M modes, and if the dimensions of a given input variable X are N1, N2, ..., Nj, ..., Nm, where Nj is the eof_dim, then the modes (eigenvectors) are stored in an output variable with dimensions M, N1, N2, ..., Nj-1, Nj+1, ..., Nm.
If the user selects burst_modes=yes, then for each input variable X, the individual modes (eigenvectors) are also stored in separate variables. The names of these separate eigenvector variables are X_vec_##, where ## is the number of the mode. Using the dimensions from the previous paragraph's example, each eigenvector variable has dimensions N1, N2, ..., Nj-1, Nj+1, ..., Nm.
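The dimension bookkeeping described in the two paragraphs above can be written out as a small helper. This is an illustrative sketch only: the function name eof_output_dims is hypothetical, and the dimensions of the expansion coefficients ("mode" followed by eof_dim) are taken from the contents listing in the example on this page.

```python
def eof_output_dims(input_dims, eof_dim, modes):
    """Return the dimensions of the EOF outputs for one input variable.

    input_dims : ordered list of (name, size) pairs for variable X.
    eof_dim    : name of the dimension along which instances occur.
    modes      : number of EOFs retained.
    """
    sizes = dict(input_dims)
    if eof_dim not in sizes:
        raise ValueError(f"variable has no dimension named {eof_dim!r}")
    if not 0 < modes <= sizes[eof_dim]:
        raise ValueError("modes must be > 0 and <= size of eof_dim")
    # All dimensions of X except eof_dim, in their original order.
    others = [(name, size) for name, size in input_dims if name != eof_dim]
    return {
        "vec": [("mode", modes)] + others,               # X_vec
        "amp": [("mode", modes), (eof_dim, sizes[eof_dim])],  # X_amp
        "burst_vec": others,                             # each X_vec_##
    }
```

For an input with dimensions month (78), line (13), and sample (20), eof_dim=month and modes=10, this gives X_vec the dimensions mode (10), line (13), sample (20), and each burst variable the dimensions line (13), sample (20).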
include_vars
Specifies the variables for which to compute EOFs. Each variable must have the eof_dim specified below.
The default is all variables in the input dataset(s).
across_vars
Specifies whether or not all the input variables are to be effectively laminated together into a single new input variable. The default is no, meaning that input variables are not combined.
new_var_name
If across_vars=yes, this parameter sets the name of the resulting single new input variable. There is no default.
eof_dim
Specifies the dimension about which to compute the EOFs. The different instances of the variable are assumed to occur along this dimension.
There is no default.
modes
Specifies the number of EOFs to retain and write to the output dataset.
Valid responses are [> 0 and <= size of eof_dim]. The default is 10.
max_iter
[OPTIONAL] Specifies the maximum number of iterations to be performed by the SVD computation routine. If convergence is not achieved within the number of iterations specified, the program aborts with an error message.
Valid range is [ > 0 ]. The default is 30.
burst_modes
Specifies whether or not variables corresponding to individual eigenvectors are to be created in the output datasets. The default is yes.
The dataset sstsw.tdf contains two three-dimensional variables, sea surface temperature (sst) and surface shortwave (sw). The variables have dimensions month (=78), line (=13), and sample (=20), which represent time, latitude, and longitude. The mean annual cycle of the data has been removed using anomaly. The following example shows how to compute the ten principal EOFs and store them in the dataset named sstsweof.tdf. Note: since the two variables contain "bad" values, e.g. no sst over land, the bad value statistics are reported.
% eof sstsw.tdf sstsweof.tdf
include_vars : char(255) ? [] sst sw
across_vars : char( 3) ? [no]
eof_dim : char( 31) ? month
modes : int ? [10]
burst_modes : char( 3) ? [yes] no
Processing sstsw.tdf
sst mean: 0.0935385 variance: 0.30618
Number of bad values found are: 468
Number per sample : 6
Fraction of total data : 0.0230769
These bad values set to zero.
Eigen Number Eigen Value Percent Variance Summed Variance
1 0.1387 45.288 45.288
2 0.0368 12.019 57.307
3 0.0200 6.517 63.824
4 0.0114 3.716 67.540
5 0.0104 3.395 70.935
6 0.0079 2.596 73.530
7 0.0064 2.091 75.622
8 0.0063 2.046 77.668
9 0.0059 1.920 79.588
10 0.0051 1.666 81.255
sw mean: -1.2678e-07 variance: 61.4444
Number of bad values found are: 1404
Number per sample : 18
Fraction of total data : 0.0692308
These bad values set to zero.
Eigen Number Eigen Value Percent Variance Summed Variance
1 11.8119 19.225 19.225
2 5.9475 9.680 28.905
3 4.5616 7.424 36.329
4 4.1016 6.676 43.005
5 3.6781 5.986 48.991
6 3.0665 4.991 53.982
7 2.5657 4.176 58.158
8 2.4878 4.049 62.207
9 2.0233 3.293 65.500
10 1.7262 2.809 68.310
The contents of the output dataset are:
% contents sstsweof.tdf
printout : char( 3) ? [no]
Contents of File: sstsweof.tdf Page 1
Dimension Size Coord Scale Offset
mode 10 ? 1 0
month 78 time 1 0
line 13 y 1 0
sample 20 x 1 0
Attribute Type Units Value
sst_d_var double (Celsius)^2 0.30618
sst_m_var double (Celsius)^2 0.306198
sst_r_var double (Celsius)^2 0.2488
sst_p_var double percent 81.2546
sw_d_var double (W/m2)^2 61.4444
sw_m_var double (W/m2)^2 61.4413
sw_r_var double (W/m2)^2 41.9703
sw_p_var double percent 68.3095
history byte
Variable Type Units
sst_amp float
sst_vec float Celsius
sst_var float (Celsius)^2
sst_perc float percent
sw_amp float
sw_vec float W/m2
sw_var float (W/m2)^2
sw_perc float percent
Variable Dimension Size
sst_amp mode 10
sst_amp month 78
sst_vec mode 10
sst_vec line 13
sst_vec sample 20
sst_var mode 10
sst_perc mode 10
sw_amp mode 10
sw_amp month 78
sw_vec mode 10
sw_vec line 13
sw_vec sample 20
sw_var mode 10
sw_perc mode 10
Variable BadValue ValidMin ValidMax Scale Offset
sst_amp -3.4028e+38 -3.4028e+38 3.4028e+38 1 0
sst_vec -3.4028e+38 -3.4028e+38 3.4028e+38 1 0
sst_var -3.4028e+38 -3.4028e+38 3.4028e+38 1 0
sst_perc -3.4028e+38 -3.4028e+38 3.4028e+38 1 0
sw_amp -3.4028e+38 -3.4028e+38 3.4028e+38 1 0
sw_vec -3.4028e+38 -3.4028e+38 3.4028e+38 1 0
sw_var -3.4028e+38 -3.4028e+38 3.4028e+38 1 0
sw_perc -3.4028e+38 -3.4028e+38 3.4028e+38 1 0
datasets, eof_overview, spectral, eoffilt, eofproj, svd, cca, linfit, emath, laminate, dimavg, magnify, xcorrel, anomaly.
Memory allocation errors generally mean that the variable's dimension sizes (the sizes not associated with eof_dim) produce a matrix decomposition that is too large to compute. In this case the dataspace has to be reduced (see subset or magnify).
Last Update: $Date: 2002/05/07 23:51:20 $