impasc [ parameter=value ... ] [ inputfile outputfile ]
impasc [ parameter=value ... ] [ inputfile ... directory ]
impbin [ parameter=value ... ] [ inputfile outputfile ]
impbin [ parameter=value ... ] [ inputfile ... directory ]
impbin1 [ parameter=value ... ] [ inputfile outputfile ]
impbin1 [ parameter=value ... ] [ inputfile ... directory ]
There are three basic functions for importing data into TeraScan: impasc, impbin, and impbin1.
impasc and impbin can import several variables at once, provided that all the variables have the same dimensions. This restriction is easy to work around. Suppose file A contains data for an N x M variable P and an I x J variable Q. Both can be imported as follows:
Import variable P to dataset X using impbin.
Import variable Q to dataset Y using impbin a second time.
Combine datasets X and Y into dataset Z using the assemble function.
impbin1 is a slightly more general version of impbin, but it can import only one variable at a time. Again, assemble can be used to work around this restriction.
impasc reads text data, while impbin and impbin1 read binary data.
Importing ASCII data
impasc requires that data be separated into fields by blanks or tabs, and that each input line contain the same number of fields for each variable. Also, for a given variable, the number of fields per line must divide the total number of variable elements, i.e., the product of its dimensions.
The following three examples show text input meeting the requirements of impasc. In the third example, impasc would have to be run twice to import all the variables.
Example 1: 3 Variables (V1, V2, V3), with 1 dimension (9)
11 21 31
12 22 32
13 23 33
: : :
19 29 39
Value of Vk[j] = k*10 + j
Example 2: 1 Variable (Y) with 2 dimensions (9 x 4)
-11 -12 -13 -14
-21 -22 -23 -24
-31 -32 -33 -34
: : : :
-91 -92 -93 -94
Value of Y[i,j] = -(i*10 + j)
Example 3: Combination of examples 1 and 2.
11 21 31 -11 -12 -13 -14
12 22 32 -21 -22 -23 -24
13 23 33 -31 -32 -33 -34
: : : : : : :
19 29 39 -91 -92 -93 -94
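The field rules above can be illustrated with a short Python sketch. It is not how impasc is implemented; the names split_fields and fields_divide_elements are hypothetical, and the input text is generated from the value formulas of examples 1 and 2 rather than read from a file.

```python
from math import prod


def fields_divide_elements(fields_per_line, dims):
    """True if the per-line field count divides the variable's element count."""
    return prod(dims) % fields_per_line == 0


def split_fields(lines, widths):
    """Split blank/tab separated lines into one flat value list per variable.

    widths[k] is the number of fields each line carries for variable k.
    """
    columns = [[] for _ in widths]
    for line in lines:
        fields = line.split()
        if len(fields) != sum(widths):
            raise ValueError(f"expected {sum(widths)} fields, got {len(fields)}")
        start = 0
        for k, width in enumerate(widths):
            columns[k].extend(int(f) for f in fields[start:start + width])
            start += width
    return columns


# Build nine lines in the layout of Example 3: V1 V2 V3 followed by one row of Y.
lines = []
for i in range(1, 10):
    v_part = " ".join(str(k * 10 + i) for k in (1, 2, 3))
    y_part = " ".join(str(-(i * 10 + j)) for j in range(1, 5))
    lines.append(f"{v_part} {y_part}")

v1, v2, v3, y = split_fields(lines, [1, 1, 1, 4])
```

Here V1, V2 and V3 each take one field per line (1 divides 9), and Y takes four fields per line (4 divides 9 x 4 = 36).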
Importing binary data
impbin and impbin1 also require structure in the input data. Both functions assume the input contains a header, followed by data for the variables being imported. impbin and impbin1 both require that the offset from
Vk[I1,...,Ij,...,In] to Vk[I1,...,Ij +1,...,In]
be a function of only j and k, hereafter called dimdist(k,j).
impbin further requires either:
that the data for each variable Vk is contiguous (i.e., dimdist(k,j) is minimal for each variable Vk and dimension Dj), or
that dimdist(k,1) is constant across all variables Vk, and that data for Vk[i,*,...,*] is contiguous for each i (i.e., dimdist(k,j) is minimal for all variables Vk and dimensions Dj with j > 1).
In case 1, the data is said to be sorted by variable. In case 2, the data is said to be sorted by record, with record size equal to dimdist(*,1).
impbin1 does not restrict dimdist() like impbin, but can only import one variable at a time.
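As a rough illustration of what dimdist describes, the following Python sketch computes the byte offset of one element from a list of per-dimension distances, and the minimal (contiguous) distances for a variable whose last dimension varies fastest. The function names and the header length are assumptions made for the example; they are not TeraScan parameters.

```python
def contiguous_dimdist(dims, elem_size):
    """Minimal dimdist values for a contiguous variable (last dimension fastest)."""
    dist = []
    step = elem_size
    for n in reversed(dims):
        dist.append(step)
        step *= n
    return tuple(reversed(dist))


def element_offset(header_len, dimdist, index):
    """Byte offset from the start of the file to element `index`.

    `index` is 1-based, as in the text; dimdist[j-1] is the distance between
    consecutive elements along dimension j. Assumes the variable's data
    starts immediately after the header.
    """
    if len(dimdist) != len(index):
        raise ValueError("need one dimdist value per dimension")
    return header_len + sum((i - 1) * d for i, d in zip(index, dimdist))


# A 4 x 3 variable of 2-byte elements stored contiguously after a 100-byte header.
dist = contiguous_dimdist((4, 3), elem_size=2)
offset = element_offset(100, dist, (2, 3))
```

For this contiguous case the distances are (6, 2), so element [2,3] sits 6 + 4 = 10 bytes past the header.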
The following three examples show binary data that can be imported into TeraScan. impbin cannot import the variable described in example 6. impbin1 can import the variables in all three examples, but only one variable at a time. sizeof() refers to the number of bytes in a single element.
Example 4: Four (3 x 3) variables (V1, V2, V3, V4)
Sorted by variable
<end of header>
x x x x
111 112 113 121 122 123 131 132 133
y y y y y y y y y
211 212 213 221 222 223 231 232 233
311 312 313 321 322 323 331 332 333
z z z
411 412 413 421 422 423 431 432 433
:
Value of Vk[i,j] = k*100 + i*10 + j
dimdist(k,1) = 3*sizeof(Vk), dimdist(k,2) = sizeof(Vk).
Example 5: Three (4 x 3) variables (V1, V2, V3)
Sorted by record.
<end of header>
x x 111 112 113 211 212 213 y y y 311 312 313 z
x x 121 122 123 221 222 223 y y y 321 322 323 z
x x 131 132 133 231 232 233 y y y 331 332 333 z
:
Value of Vk[i,j] = k*100 + i*10 + j
dimdist(*,1) = 2*sizeof(x) + 3*sizeof(y) + 1*sizeof(z) + 3*(sizeof(V1) + sizeof(V2) + sizeof(V3))
dimdist(k,2) = sizeof(Vk)
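The record layout of example 5 can be mimicked in Python with the struct module. This is only a sketch under assumed sizes: x, y and z are taken to be 1-byte filler, V1 to V3 are taken to be 2-byte big-endian integers, and the 4-byte header is made up. With those assumptions the record size, i.e. dimdist(*,1), comes to 24 bytes.

```python
import struct

# One record: 2 x bytes, 3 elements of V1, 3 of V2, 3 y bytes, 3 of V3, 1 z byte.
# All sizes are assumptions for illustration only.
RECORD = struct.Struct(">2x 3h 3h 3x 3h 1x")


def build_file(header, n_records):
    """Create bytes laid out like example 5, with Vk[i,j] = k*100 + i*10 + j."""
    body = bytearray(header)
    for i in range(1, n_records + 1):
        values = [k * 100 + i * 10 + j for k in (1, 2, 3) for j in (1, 2, 3)]
        body += RECORD.pack(*values)
    return bytes(body)


def read_variables(blob, header_len, n_records):
    """Return [V1, V2, V3], each a list of rows, read record by record."""
    variables = [[], [], []]
    for i in range(n_records):
        fields = RECORD.unpack_from(blob, header_len + i * RECORD.size)
        for k in range(3):
            variables[k].append(list(fields[3 * k:3 * k + 3]))
    return variables


blob = build_file(b"HDR!", 4)
v1, v2, v3 = read_variables(blob, header_len=4, n_records=4)
```

Because every variable advances by the same record size from one row to the next, all three can be picked up in a single pass over the records.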
Example 6: One (3 x 4) variable V with two dimensions
<end of header>
y 11 x x 12 x x 13 x x 14 z z z
y 21 x x 22 x x 23 x x 24 z z z
y 31 x x 32 x x 33 x x 34 z z z
:
Value of V[i,j] = i*10 + j
dimdist(*,1) = sizeof(y) + 6*sizeof(x) +
3*sizeof(z) + 4*sizeof(V)
dimdist(*,2) = 2*sizeof(x) + sizeof(V)
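Example 6 needs a separate distance for each dimension, which is the case only impbin1 covers. The Python sketch below reads such a layout by stepping through it with both distances. The sizes are assumptions: y, x and z are 1 byte each, V is a 2-byte big-endian integer, and the header is 8 bytes, giving dimdist(*,1) = 18 and dimdist(*,2) = 4.

```python
import struct

ELEM = struct.Struct(">h")   # assumed element type of V
DIMDIST = (18, 4)            # (1 + 6 + 3 + 4*2, 2 + 2) under the assumed sizes


def build_example6(header):
    """Create bytes laid out like example 6, with V[i,j] = i*10 + j."""
    blob = bytearray(header)
    for i in range(1, 4):
        blob += b"y"
        for j in range(1, 5):
            blob += ELEM.pack(i * 10 + j)
            if j < 4:
                blob += b"xx"
        blob += b"zzz"
    return bytes(blob)


def read_strided(blob, start, dims, dimdist):
    """Read a 2-D variable whose elements are dimdist bytes apart per dimension."""
    rows = []
    for i in range(dims[0]):
        row = []
        for j in range(dims[1]):
            (value,) = ELEM.unpack_from(blob, start + i * dimdist[0] + j * dimdist[1])
            row.append(value)
        rows.append(row)
    return rows


header = b"\0" * 8
blob = build_example6(header)
# The leading y byte is skipped by starting one byte past the header.
v = read_strided(blob, start=len(header) + 1, dims=(3, 4), dimdist=DIMDIST)
```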
The relative offset for the i-th variable, hereafter called reloff(i), is defined to be the offset from where the previous variable's data stops to where the i-th variable's data starts. The relative offset of the first variable is the offset from the end of the input file's header to the start of the first variable's data.
In example 4, reloff(1) = 4*sizeof(x)
reloff(2) = 9*sizeof(y)
reloff(3) = 0
reloff(4) = 3*sizeof(z)
In example 5, reloff(1) = 2*sizeof(x)
reloff(2) = 0
reloff(3) = 3*sizeof(y)
reloff(4) = 3*sizeof(z)
Header length and relative offsets are used to describe input structure to impbin. Relative offsets are not used with impbin1 because impbin1 can only import one variable at a time, and the relative offset for that variable can be absorbed into the header. In example 6, this would mean adding sizeof(y) to the header length.
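For data sorted by variable, the header length and the relative offsets together fix where each variable starts. The Python sketch below works this out for the four relative offsets listed for example 4. The byte sizes are assumed (a 16-byte header, 1-byte x, y and z, and 9 elements of 2 bytes per variable), and variable_starts is a hypothetical helper, not a TeraScan function.

```python
def variable_starts(header_len, reloffs, var_nbytes):
    """Absolute start offset of each variable for data sorted by variable.

    reloffs[k] is the gap before variable k+1; var_nbytes[k] is the total
    number of bytes occupied by that variable's data.
    """
    starts = []
    pos = header_len
    for reloff, nbytes in zip(reloffs, var_nbytes):
        pos += reloff          # skip the gap before this variable
        starts.append(pos)
        pos += nbytes          # skip the variable's own data
    return starts


header_len = 16
reloffs = [4, 9, 0, 3]         # 4*sizeof(x), 9*sizeof(y), 0, 3*sizeof(z)
var_nbytes = [18, 18, 18, 18]  # 9 elements of 2 bytes each

starts = variable_starts(header_len, reloffs, var_nbytes)

# For a one-variable import, the first gap can be folded into the header length.
absorbed_header_len = header_len + reloffs[0]
```

The last line mirrors the remark about impbin1: the start of the first variable is the same whether it is described as a header plus a relative offset or as one longer header.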
Both impbin and impbin1 support import without instantiation; i.e., without making a copy of the variable data. Datasets created without instantiation simply point to where the data is located and how it is structured. This feature is very useful in dealing with very large data files, such as those containing high-resolution SAR or SPOT imagery. It is the same feature used by the subset and assemble commands. The only drawback is that removing or renaming the original data file can leave link datasets pointing at nothing.
The assemble function also has an instantiate option.
Date and time data
All the above import functions understand the following input date and time formats:
yymmdd: yy indicates year, mm indicates month, and dd indicates day of the month.
yyddd: yy indicates year, and ddd indicates day of the year.
hhmmss: hh indicates hours, mm indicates minutes, and ss indicates seconds.
In addition to these formats, impasc understands yy/mm/dd, yy.ddd, and hh:mm:ss.
Date and time formats are specified using the var_units parameter.
Date and time data is converted into TeraScan internal format. Dates are represented as days since January 1, 1990, inclusive. Time data is represented as seconds of the day. Date and time variables should be defined with type long.
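The time conversion amounts to simple arithmetic. The Python sketch below turns a six-digit hours, minutes, seconds value, with or without colons, into seconds of the day. It is an illustration only: seconds_of_day is a hypothetical name, and the validation rules (two digits per field, hours below 24) are assumptions rather than documented import behavior.

```python
def seconds_of_day(text):
    """Convert 'hhmmss' or 'hh:mm:ss' to seconds since the start of the day."""
    if ":" in text:
        parts = text.split(":")
    elif len(text) == 6:
        parts = [text[0:2], text[2:4], text[4:6]]
    else:
        parts = []
    if len(parts) != 3 or not all(len(p) == 2 and p.isdigit() for p in parts):
        raise ValueError(f"not a valid time: {text!r}")
    hh, mm, ss = (int(p) for p in parts)
    if hh > 23 or mm > 59 or ss > 59:
        raise ValueError(f"time out of range: {text!r}")
    return hh * 3600 + mm * 60 + ss


compact = seconds_of_day("010355")
with_colons = seconds_of_day("23:59:59")
```

The largest result, 86399, fits comfortably in a long, in line with the type recommended above.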
Earth location for imported data
The most common type of ASCII data imported into TeraScan is point data, where each point has an explicitly stated latitude and longitude. However, many of the binary files imported into TeraScan are images without any per-pixel latitude and longitude coordinates.
The best way to add earth location to a dataset created by one of the import functions is to create a master dataset using either master, master2, master4 or gcpmaster, and then to copy the earth transform from this master dataset to the imported dataset using copyxfm.
In the best of circumstances, the import data conforms identically to one of the projections supported by TeraScan: rectangular, mercator, utm, polyconic, oblique stereographic, and polar stereographic. For simplicity, all projections except the polar stereographic are geocentric. In such ideal cases, either master or master2 could be used to create the master dataset.
In other cases, master4 or gcpmaster should be used to create the master dataset. Master datasets created by these two functions are still based on the standard projections supported by TeraScan, listed above. However, these projections are warped to better fit the imported data using ground control points (GCPs), i.e., known (line, sample, latitude, longitude) points. In master4, latitude and longitude must be supplied for each of the four corners. gcpmaster is more general, requiring a reference dataset containing an arbitrary number of GCPs.
The warping works best for images less than 1000 km on a side, with pixels that are roughly the same size and shape everywhere. The base projection should be chosen to minimize local distortion. The oblique stereographic projection is a reasonable choice for images located anywhere in the world. Other projections may be superior in specific cases.
For the above procedure to work, the row and column dimensions of imported image data should be explicitly named "line" and "sample", respectively. Dimension names are defined when running one of the import functions.
Getting data ready for import; byte-swapping
Import functions operate on standard UNIX files containing either ASCII or binary data. In the case of binary data, all data values must be stored in formats compatible with the host machine. The binary import functions do not accommodate foreign floating point formats; for example, impbin running on a SUN SPARCstation will not translate floating point data from a DEC VAX system. Byte order is the one exception: byte-swapped data can be handled with the reverse_bytes parameter or with the byteswap function.
Binary import functions impbin and impbin1 have a hidden parameter reverse_bytes to handle byte-swapping. The reverse_bytes parameter tells the import functions whether or not the data to be imported is byte-swapped. On Sun Solaris, the assumption is that input data is not byte-swapped, so the default setting is reverse_bytes=no. On PC Linux, the assumption is that input data is byte-swapped, so the default setting is reverse_bytes=yes. If these assumptions are incorrect, the user must set the reverse_bytes parameter accordingly. For example, to import non-byte-swapped data on PC Linux, the user must set reverse_bytes=no. To import byte-swapped data on Sun Solaris, the user must set reverse_bytes=yes.
Note: if reverse_bytes=yes, instantiate=yes is forced.
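Whether data counts as byte-swapped depends on the byte order of the data relative to the host. The Python sketch below shows the comparison and what happens when the same two bytes are read in each order. It does not reproduce the platform defaults of reverse_bytes; needs_byte_reversal is a hypothetical helper, and it assumes the byte order of the input data is known.

```python
import struct
import sys


def needs_byte_reversal(data_byte_order, host_byte_order=sys.byteorder):
    """True if data written in data_byte_order must be reversed on this host."""
    for order in (data_byte_order, host_byte_order):
        if order not in ("big", "little"):
            raise ValueError(f"unknown byte order: {order!r}")
    return data_byte_order != host_byte_order


# The same two bytes give different 16-bit integers depending on byte order.
raw = b"\x01\x02"
as_big = struct.unpack(">h", raw)[0]
as_little = struct.unpack("<h", raw)[0]
```

Reading with the wrong order does not fail; it silently yields different numbers, which is why the setting has to be checked whenever data moves between machine types.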
There is an alternative to using the reverse_bytes parameter. A convenience function, byteswap, is provided for byte-swapping data in UNIX binary files. This function operates in place (i.e., overwrites the input) and is its own inverse. Running byteswap twice on the same file leaves the file in its original state.
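The self-inverse property is easy to see in a small Python sketch that reverses the bytes of each fixed-width element. This is not the byteswap function itself: swap_bytes is a hypothetical name, it works on an in-memory copy rather than in place on a file, and the element width argument is an assumption.

```python
def swap_bytes(data, width):
    """Return a copy of data with the bytes of every width-byte element reversed."""
    if width < 1 or len(data) % width != 0:
        raise ValueError("data length must be a multiple of the element width")
    out = bytearray(len(data))
    for start in range(0, len(data), width):
        out[start:start + width] = data[start:start + width][::-1]
    return bytes(out)


original = bytes([1, 2, 3, 4, 5, 6, 7, 8])
swapped_once = swap_bytes(original, 2)
swapped_twice = swap_bytes(swapped_once, 2)
```

Applying the swap a second time with the same width restores the original bytes.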
When data to be imported is stored on tape in simple, non-archive format, the UNIX dd function should be used to copy the data to disk prior to import. dd has provisions for skipping tape blocks and/or limiting the number of blocks read. dd also supports byte-swapping if needed.
See also: impasc, impbin, impbin1, assemble, master, master2, master4, copyxfm, byteswap.
Last Update: $Date: 2002/05/08 01:03:55 $