zipvars, offcat - Compress datasets by compressing selected variable data only

SYNOPSIS

zipvars  [ parameter=value ]  [ inputfile outputfile ]
zipvars  [ parameter=value ]  [ inputfile ... directory ]

offcat  offset  length  [ file ]

Parameters are: num_schemes, schemeN_vars, schemeN_proc, schemeN_unproc, schemeN_offset, other_vars, work_dir.

DESCRIPTION

zipvars compresses input datasets, creating corresponding output datasets. zipvars compresses selected variable data only, leaving dataset header/trailer information uncompressed. This has three major advantages over using a standard file compressor (e.g., UNIX compress or GNU gzip) to compress a whole dataset.

  1. Datasets created with zipvars can be opened without any uncompression taking place; uncompression is required only when it is time to read compressed variable data.

  2. Different compression schemes can be used for different variables. Super efficient image compressors (e.g., JPEG or Wavelet) can be used to compress image variables.

  3. The space required to uncompress the data for a given variable is known exactly from that variable's datatype and dimensions.
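The third point can be illustrated with a short sketch. It is written in Python only for illustration, and the datatype names and byte sizes in the table are assumptions, not values taken from TeraScan.

```python
from math import prod
from typing import Sequence

# Hypothetical datatype names and sizes in bytes (assumed, not from TeraScan).
TYPE_SIZES = {"byte": 1, "short": 2, "long": 4, "float": 4, "double": 8}


def uncompressed_size(datatype: str, dims: Sequence[int]) -> int:
    """Return the bytes needed to hold one uncompressed variable."""
    return TYPE_SIZES[datatype] * prod(dims)


if __name__ == "__main__":
    # A byte-scaled image variable with 1024 lines of 2048 samples.
    print(uncompressed_size("byte", (1024, 2048)))
```

Because the size depends only on the datatype and dimensions, the free space in the cache directory can be checked before any uncompression starts.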

Datasets output by zipvars can be read by any TeraScan function requiring input datasets, provided the user has defined a directory $UNCOMPRESSDIR in which uncompressed variable data can be cached. The available space in this directory is checked before uncompressing a given variable. See NOTES.

By default, zipvars uses the directory $UNCOMPRESSDIR as scratch space. If this environment variable is undefined, the current directory is used. The work_dir parameter overrides this default.

Each compression scheme consists of two UNIX pipe processes: a compress pipe and an uncompress pipe. A pipe is a process that reads from UNIX stdin and writes to UNIX stdout. The following is a list of recommended compression scheme pairs:


           compression                  uncompression

           compress -c                  zcat
           gzip -c                      gunzip
           rawtopgm nx ny | cjpeg       djpeg
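
The pairing of a compress pipe with an uncompress pipe can be sketched as a round trip through two shell commands. This is a minimal Python illustration, not how zipvars itself is implemented; the helper names run_pipe and round_trip are hypothetical, and gzip and gunzip are assumed to be on the PATH.

```python
import subprocess


def run_pipe(command: str, data: bytes) -> bytes:
    """Run a shell pipeline, feeding data on stdin and returning its stdout."""
    result = subprocess.run(command, shell=True, input=data,
                            stdout=subprocess.PIPE, check=True)
    return result.stdout


def round_trip(proc: str, unproc: str, data: bytes) -> bytes:
    """Compress data with proc, then uncompress the result with unproc."""
    return run_pipe(unproc, run_pipe(proc, data))


if __name__ == "__main__":
    sample = bytes(range(256)) * 64
    # gzip is lossless, so the round trip should reproduce the input exactly.
    print(round_trip("gzip -c", "gunzip", sample) == sample)
```

A lossy scheme such as JPEG would not reproduce the input byte for byte, and its uncompress pipe also emits an image header ahead of the data.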

The input to super efficient image compressors and output from corresponding uncompressors must be some common image format (e.g., pgm, sunraster). Each such image format has a header in which the image size is stored. In some cases (e.g., pgm) the header length may vary, depending on the number of characters required to describe the image size. Regardless, the length of this header must be known to zipvars in order to find the start of variable data after uncompression.

The offset to the start of variable data after uncompression is specified using the schemeN_offset parameter.
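
As a worked illustration, the offset for a binary PGM image can be computed from its dimensions. The Python sketch below assumes the header consists of exactly three lines ("P5", the width and height separated by one space, and the maximum gray value), which is consistent with the arithmetic used in EXAMPLES; the function name is hypothetical.

```python
def pgm_header_length(width: int, height: int, maxval: int = 255) -> int:
    """Return the byte length of an assumed three-line binary PGM header."""
    header = f"P5\n{width} {height}\n{maxval}\n".encode("ascii")
    return len(header)


if __name__ == "__main__":
    # Candidate value for schemeN_offset for a 2048 x 1024 image.
    print(pgm_header_length(2048, 1024))
```

With the default maxval of 255, the "P5" and "255" lines contribute 7 bytes, and the remainder is the length of the dimension line including its newline.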

zipvars uses the utility function offcat to pipe variable data into any given compressor. offcat is similar to the UNIX cat command, except that it extracts a specified number of bytes from the input, starting at a specified offset.
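
The behavior described for offcat can be sketched in a few lines of Python. This is only an approximation of the described semantics, not the actual offcat source, and it assumes a buffered binary stream.

```python
import io
from typing import BinaryIO


def offcat(stream: BinaryIO, offset: int, length: int) -> bytes:
    """Skip offset bytes of the stream, then return up to length bytes."""
    # Read and discard rather than seek, so this also works on pipes.
    stream.read(offset)
    return stream.read(length)


if __name__ == "__main__":
    data = io.BytesIO(b"0123456789")
    print(offcat(data, 3, 4))
```

If the input ends before offset + length bytes, this sketch simply returns the bytes that are available.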

PARAMETERS

num_schemes
The number of different compression schemes that will be used. The default is 1. The valid range is [1, 10].
schemeN_vars
The list of variables to compress using scheme N. There is no default. Wildcards * and ? can be used.
schemeN_proc
String containing the compress pipe process for scheme N. The default is compress -c.
schemeN_unproc
String containing the uncompress pipe process for scheme N. The default is zcat.
schemeN_offset
Offset, in bytes, to the start of variable data after uncompression for scheme N. The default is 0. Valid values are non-negative integers.
other_vars
List of uncompressed variables to be included in the output. Wildcards * and ? can be used. The default is not to include any uncompressed variables in the output datasets.
work_dir
The directory to use for scratch space for compressing variable data. If $UNCOMPRESSDIR is defined, it is the default. Otherwise the current directory is the default.

EXAMPLES

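The following example, taken from a C-shell script, uses JPEG to compress an AVHRR dataset. The dataset contains byte-scaled calibrated AVHRR data. Note how the size of the PGM header is computed: the constant 7 accounts for the "P5" and "255" header lines, each including its newline, and wc -c counts the characters in the line that holds the image dimensions.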

  # Dimensions of the variable to be compressed
  set SIZE = ( `varinfo var_name=avhrr_ch4 attr_name=size $OUTPUT.avhrr` )

  # Length of the PGM header that precedes the data after uncompression
  @ OFF = 7 + `echo $SIZE[2] $SIZE[1] | wc -c`

  zipvars num_schemes=1 scheme1_vars='av*' \
        scheme1_proc="rawtopgm $SIZE[2] $SIZE[1] | cjpeg" \
        scheme1_unproc=djpeg scheme1_offset=$OFF \
        other_vars= work_dir=/tmp $OUTPUT.avhrr $OUTPUT.zipvars

NOTES

The JPEG compression package containing cjpeg and djpeg is distributed by the Independent JPEG Group (IJG) and can be found on the Internet.

cjpeg has options that control the compression ratio. The higher the compression ratio, the lower the image quality after uncompression. Using -quality 75 on a number of test earth scenes resulted in compression ratios of around 8:1, with fairly good uncompressed image quality.

On the IAF system installed in July 1996, instant is the only TeraScan function that automatically uncompresses datasets compressed via zipvars.


Last Update: $Date: 2000/12/07 20:03:26 $