DCS methods development overview

DESCRIPTION

The DCS System Developer's Kit is a tool for writing customized functions that convert raw DCS platform data to engineering units. It includes the DCS library, the DCS header files and the man pages. The developer creates several text files, which are then used to generate the dcsproc function with the dcsmake command (see below). This function is typically applied as the third step of DCS data processing, after the data has been ingested with dcsin and earth located with dcsloc TeraScan functions.

The DCS system consists of the following components, which have to be created and updated by the user:

DCS database:

the dcs_data database text file with the date-dependent platform information that can be used by conversion functions and also for reporting.

Configuration files:

methods.list - the list of methods (i.e. the conversion algorithms) supplied by the user.

outfields.list - the list of the output variables, computed by conversion algorithms and saved to the output dataset.

Method files:

a variable number of *.c files containing the conversion algorithms, with the same names as the ones specified in the methods.list file.

Each of these three elements is described in more detail below.

DCS database

The DCS database is an ASCII file with the historical (i.e. changing with time) data for each ARGOS platform of interest. Each record has a unique double key (platform_ID, date), by which the entire database has to be sorted in ascending order. The date key indicates the date when the information stored in this record became effective for the given platform_ID. The following fields are specified in each record in the order described below:

platform - the ARGOS platform ID.

date - effective date in a format yyyymmdd.hhmmss.

method - the name of the conversion algorithm. It has to match one entry in the "methods.list" file.

platform_type - the platform type (e.g. Drifting, Moored, Reference, Land, etc.). Only the first letter of the platform type is saved into the output, so each platform type should start with a unique letter.

site_name - any word, designating the site.

number_of_repetitions - the number of historical blocks in a single ARGOS message. For a platform that transmits no historical data, this value is 1.

group - the group name the platform belongs to. Used for reporting purpose only.

message_length - message length in bytes.

WMO_number - WMO platform number.

GTS_message_number - GTS message number.

constants_list - the list of platform-specific and date-specific parameters required in computations. They can be parsed out of the record at run time by the conversion functions.

NOTE: no spaces are allowed in the string fields. Underscores should be used to separate the words in field names.

Below is an example of a dcs_data file record:

7350 19880101.1230 method_1 Land Sofia 2 Antarctic 12 95724 3 -30 -5 900 8.52 .0252

where platform_id=7350, effective_date/time=1988/01/01 12:30:00, method=method_1, platform_type=Land, site_name=Sofia, number_of_repetitions=2, group=Antarctic, message_length=12 bytes, WMO_number=95724, GTS_message_number=3, platform_parameters=-30,-5,900,8.52,0.0252.
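The record layout above can be illustrated with a small stand-alone parser. This is only a sketch: the DcsRecord structure and the functions parse_dcs_record(), constant_as_real() and record_less() are hypothetical names introduced here and are not part of the DCS library. The sketch assumes whitespace-separated fields and keeps the date as text.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical in-memory form of one dcs_data record (not a DCS library type).
struct DcsRecord {
    int platform = 0;
    std::string date;                    // effective date, kept as text
    std::string method;
    std::string platform_type;
    std::string site_name;
    int repetitions = 0;
    std::string group;
    int message_length = 0;              // bytes
    long wmo_number = 0;
    int gts_message_number = 0;
    std::vector<std::string> constants;  // platform constants, still unparsed
};

// Parse one whitespace-separated record line. Returns false if any of the
// ten fixed fields is missing or malformed.
bool parse_dcs_record(const std::string& line, DcsRecord& rec) {
    std::istringstream in(line);
    if (!(in >> rec.platform >> rec.date >> rec.method >> rec.platform_type
             >> rec.site_name >> rec.repetitions >> rec.group
             >> rec.message_length >> rec.wmo_number
             >> rec.gts_message_number)) {
        return false;
    }
    rec.constants.clear();
    for (std::string token; in >> token; ) {
        rec.constants.push_back(token);
    }
    return true;
}

// Convert one platform constant (0-relative index) to a real value.
double constant_as_real(const DcsRecord& rec, std::size_t index) {
    assert(index < rec.constants.size());
    return std::stod(rec.constants[index]);
}

// Ordering by the (platform, date) double key. Assumption: dates are written
// with a fixed width, so text comparison matches chronological order.
bool record_less(const DcsRecord& a, const DcsRecord& b) {
    if (a.platform != b.platform) return a.platform < b.platform;
    return a.date < b.date;
}
```

Applied to the example record, such a parser would yield platform 7350, group Antarctic and five constants; record_less() could be passed to std::is_sorted to check the required ascending order.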

Configuration Files

The methods.list file contains a variable number of entries of the form

METHOD( "<DBase Method Name>" , <Conversion Function Name>)

Each entry maps the method field in the dcs_data file record into the name of the corresponding conversion algorithm written by the developer.

The entry

METHOD("Ocean01", ocean01)

for example, indicates that a database entry with the Ocean01 method name shall be processed by the user-supplied conversion algorithm that is named ocean01 and is stored in one of the *.c files. METHOD is a reserved word.
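One common C/C++ technique for turning such a list into a lookup table is to let a macro expand every entry into a table row. The sketch below shows the idea only; it is not necessarily how dcsmake works. The MethodEntry type, the find_method() function, the simplified function signature and the second method Land02 are assumptions made for illustration, as is case-sensitive name matching.

```cpp
#include <cassert>
#include <cstring>

// Stand-ins for real conversion functions. Real methods take dcsin/dcsout
// arguments; these simplified signatures are an assumption of the sketch.
using MethodFn = int (*)();
int ocean01() { return 1; }
int land02()  { return 2; }   // hypothetical second method

struct MethodEntry {
    const char* db_name;   // name used in the dcs_data "method" field
    MethodFn    function;  // conversion function to call
};

// Each METHOD(...) line becomes one row of the table.
#define METHOD(name, fn) { name, fn },
static const MethodEntry kMethods[] = {
    METHOD("Ocean01", ocean01)
    METHOD("Land02",  land02)
};
#undef METHOD

// Returns the function mapped to db_name, or nullptr when no entry matches.
MethodFn find_method(const char* db_name) {
    assert(db_name != nullptr);
    for (const MethodEntry& entry : kMethods) {
        if (std::strcmp(entry.db_name, db_name) == 0) return entry.function;
    }
    return nullptr;
}
```

A caller would treat a nullptr result as "no method listed" and skip the platform.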

The records in the outfields.list file have the following format:

FIELD(<field_name>, <output_name>, <units>, <type>, <status_name>)

For example, the following entry:

FIELD(air_temp, "air_temperature", "celsius", GP_FLOAT,"s_air_temp")

specifies that the value to be computed from the ARGOS data will be referred to in the conversion algorithm as air_temp; it will be saved to the output TeraScan dataset as "air_temperature"; its units are "celsius"; its type is float (other possible types are GP_BYTE, GP_SHORT, GP_INT and GP_DOUBLE); and the status of this computed value (its meaning will be explained later) will be saved as another variable with the name "s_air_temp". If no status for this value is to be saved, its name has to be specified as "". The type of the status variable is always BYTE and its units are "status".

The values of platform_ID, date, time, method, and group are saved automatically and, therefore, they should NOT be mentioned in the outfields.list file.

It is important to understand that the variables in the outfields.list file should be a SUPERSET of the variables that any algorithm specified in the methods.list file may generate. If some variables are irrelevant to a particular ARGOS message and are therefore not computed by the conversion algorithm, they will automatically be assigned Bad values by the DCS system. This is easily seen when the computed output values are printed out using the TeraScan printvar function.
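The superset rule can be pictured as an output record in which every declared field starts out Bad and a method overwrites only the fields it computes. The following is a minimal sketch of that idea, not the DCS implementation: OutputRecord, make_output(), set_field() and is_bad() are hypothetical names, NaN is used as a stand-in for the Bad value, and the field names sea_temp and wind are examples.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <map>
#include <string>
#include <vector>

// NaN stands in for the DCS Bad value (an assumption of this sketch).
const double kBad = std::numeric_limits<double>::quiet_NaN();

using OutputRecord = std::map<std::string, double>;

// Create a record in which every declared output field is preset to Bad.
OutputRecord make_output(const std::vector<std::string>& field_names) {
    OutputRecord out;
    for (const std::string& name : field_names) out[name] = kBad;
    return out;
}

// Store a computed value; the field must have been declared beforehand.
void set_field(OutputRecord& out, const std::string& name, double value) {
    assert(out.count(name) == 1);
    out[name] = value;
}

// True when the named field still holds (or was assigned) the Bad value.
bool is_bad(const OutputRecord& out, const std::string& name) {
    return std::isnan(out.at(name));
}
```

A method that only computes air_temp would leave the other declared fields Bad, which is what a later printout would show.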

Method files

The number of method *.c files written by the user has to match the number of METHOD(_,_) entries in the methods.list file.

The name of the function in the method file has to match the second parameter in the corresponding METHOD(_,_) entry in the methods.list file.

If the method specified in dcs_data is not in the methods.list file, the platform will be ignored at run time. However, all methods in the methods.list file have to actually exist, or dcsproc will not compile.

The proposed DCS data processing scheme allows ARGOS messages to be processed in the Batch and Single modes.

Batch Mode

In the batch mode all messages for a given platform_ID collected in one pass are processed by the conversion function at once, producing only one set of the output values. In the batch mode any extracted sensor value is treated as a so-called sensor object, that has the following components:

sensor vector - vector of values extracted/computed from each input message.

sensor value - single value computed from the sensor vector values by applying some statistical function to them. If the sensor value is saved into the output dataset, so is the sensor status.

sensor status - the number of components in the sensor vector used for the computation of the sensor value. It can be less than the number of messages in the batch if the statistical function rejects "bad" data. The sensor status of an output sensor variable can also be saved to the output if its name was specified as a non-blank string in the outfields.list file.
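The three components can be pictured with a miniature sensor object. The sketch below is only an illustration under stated assumptions: the Sensor structure and the average() function are hypothetical, NaN stands in for the Bad value, and the real geoval class is certainly richer.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

// NaN stands in for the DCS Bad value (an assumption of this sketch).
const double kBad = std::numeric_limits<double>::quiet_NaN();

// Hypothetical miniature of a sensor object.
struct Sensor {
    std::vector<double> vec;   // sensor vector: one entry per message
    double value = kBad;       // sensor value: statistic over the vector
    int status = 0;            // sensor status: entries used for the value
};

// Plain average over the good (non-NaN) entries of the sensor vector.
void average(Sensor& s) {
    double sum = 0.0;
    int used = 0;
    for (double x : s.vec) {
        if (!std::isnan(x)) { sum += x; ++used; }
    }
    s.status = used;
    s.value = used > 0 ? sum / used : kBad;
    // The status never exceeds the number of messages in the batch.
    assert(s.status <= static_cast<int>(s.vec.size()));
}
```

For a batch of three messages with one bad entry, the status would be 2 and the value the mean of the two good entries.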

The DCS Command Language (DCSCL) introduces a simple and elegant way of treating sensor objects as simple scalar values, avoiding any kind of explicit vector notation in the code. Here are the basic features of DCSCL.

a) The special geoval class has been developed to represent sensor objects.

b) The sensor extraction functions extract the sensor vector values from every batch message in just one assignment operation.

c) The comprehensive collection of statistics functions makes it possible to compute the sensor value from the sensor vector components and automatically set the sensor status value.

d) In computing mathematical expressions, the DCSCL will set the sensor status of the sensor object on the left-hand side of an assignment to the minimal status of the sensor objects on the right-hand side. It means that the quality of the output is equal to the worst quality of the input terms used in the computation. Therefore, even one bad value on the right-hand side of an expression will cause the left-hand variable to be marked as Bad as well.

e) DCSCL has mathematical library functions, defined as the ones in the standard C library. However, all exceptions are handled internally, assigning the sensor status of the left-hand side sensor object to zero and its sensor value to Bad value if illegal input parameters were passed.
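Features d) and e) can be sketched with a tiny value-plus-status type. This is an illustration only: the Val structure, its operator+ and safe_log() are hypothetical, NaN stands in for the Bad value, and a status of 0 is assumed to mean Bad.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>

// NaN stands in for the DCS Bad value (an assumption of this sketch).
const double kBad = std::numeric_limits<double>::quiet_NaN();

// Hypothetical scalar view of a sensor object: a value plus its status.
struct Val {
    double value;
    int status;   // assumed: 0 means Bad
};

// Sum of two sensor objects: the result takes the smaller status, and a Bad
// operand makes the result Bad.
Val operator+(const Val& a, const Val& b) {
    assert(a.status >= 0 && b.status >= 0);
    int status = std::min(a.status, b.status);
    if (status == 0) return {kBad, 0};
    return {a.value + b.value, status};
}

// Logarithm that reports illegal input through the status instead of
// raising a math error.
Val safe_log(const Val& a) {
    if (a.status == 0 || !(a.value > 0.0)) return {kBad, 0};
    return {std::log(a.value), a.status};
}
```

With these definitions, adding a good value to a Bad one gives a Bad result, and safe_log() of a negative number gives status 0.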

Single Mode

Any conversion algorithm written for the Batch Mode can also be used in the Single Mode, as long as the statistics functions still make sense for an input vector with size=1. In this mode the conversion algorithm will be called for each platform once for every message it sent to the satellite in one pass.

EXAMPLE OF CONVERSION ALGORITHM

Let's consider an example of a conversion algorithm to demonstrate the capabilities of the DCSCL. The method is named Ocean01 in the method field of the dcs_data file. The conversion function is called ocean01, as defined by the entry METHOD("Ocean01", ocean01) in the methods.list file.

File: ocean01.c
---------------------------------------------------------

// Section #1: include the "method.h" file
// The comments start with "//"
#include "method.h"

// Section #2: declare the method "ocean01"
method ocean01(dcsin i, dcsout r)
{
  // Section #3: Declare "scratch" variables
  geoval press_lo, air_temp, sea_temp, press_hi, press, dummy;
  float ATo, STo, Po, PSTo, TCa, TCb;
  int C = 32767;
  char *SenStat;

  // Section #4: Extract sensor counts
  if (i.count == 1) {   // Multi-hour data - time == "now"
    press_lo = i.xu( 0, 8);	// extract low bits of pressure
    air_temp = i.xs( 8, 8);	// extract air temperature
    sea_temp = i.xs(16, 8);	// extract sea temperature
    press_hi = i.xu(24, 8);	// extract high bits of pressure
  } else if (i.count == 2) {	// time == "now" - 1 hour
    press_lo = i.xu(32, 8);	// The same .......
    air_temp = i.xs(40, 8);	// ................
    sea_temp = i.xs(48, 8);	// ................
    press_hi = i.xu(56, 8);	// ................
  } else {			// invalid repetition #
    fmtout("ocean01: invalid repetition# %d\n", i.count);
    return;
  }

  // Section #5: Generate new sensor "press"
  press = 256*press_hi + press_lo;
  // Logic: if (press < 4000) press += C;
  press.outrange(4000, BadVal, C);

  // Section #6: Extract platform constants, index 0-relative.
  SenStat = i.pgets(0);	// Parameter #1 is a string
  ATo     = i.pgetr(1);  // Parameter #2 is a real value
  STo     = i.pgetr(2);  // ..........
  Po      = i.pgetr(3);
  PSTo    = i.pgetr(4);
  TCa     = i.pgetr(5);
  TCb     = i.pgetr(6);

  // Section #7: Validate input sensors, index 0-relative
  Validate(air_temp, SenStat, 0);
  Validate(sea_temp, SenStat, 1);
  Validate(press   , SenStat, 2);

  // Section #8: Average sensor data (arg == Stand. Deviation.)
  air_temp.ave();
  sea_temp.ave(2.);
  press.ave(100.);

  // Section #9: Convert averaged sensor values to physical units
  r.air_temp = ATo + .25*air_temp;
  r.sea_temp = STo + .125*sea_temp;
  dummy = r.sea_temp - PSTo;
  r.stat_press = Po + 1.25*press/256
                  + TCa * dummy
                  + TCb*dummy*dummy;

  // Section #10: Compute output position & time
  r.lat = i.lat.lst();
  r.lon = i.lon.lst();
  r.time = i.time.ave() - (i.count-1)*3600; // multi-hour data

  return;
}
---------------------------------------------------------------
End of file ocean01.c

Comments

Section #1: include the "method.h" file

Every method file should include the "method.h" file.

Section #2: declare the method "ocean01"

dcsin and dcsout are reserved words for the structures carrying the input and the output DCS data. Some fields of dcsin structure are exposed in the code. Their names are fixed and can be found in "dcsin.h" header.

Since the input parameter to the function was declared as dcsin i, its fields are to be accessed from the code as i.<field_name>, e.g. r.lat = i.lat.ave();

Since the second parameter to the function was declared as dcsout r, the output field air_temp, declared in the outfields.list file, needs to be referred to as r.air_temp. The same is true for other parameters in the outfields.list file as well as for the platform, date, and time output fields, which are hard-coded into the dcsout r structure and saved to the output dataset automatically.

Section #3: Declare "scratch" variables

geoval is the class (in C++ sense), developed to represent the sensor objects as was explained above.

Section #4: Extract sensor counts.

The following is a list of the bit-extraction functions:

geoval xu(int, int)   - eXtract Unsigned;
geoval xur(int, int)  - ... Unsigned bit-Reversed;
geoval xs(int, int)   - ... Signed, MSB=0 for (+);
geoval xsr(int, int)  - ... Signed bit-Reversed, MSB=0 for (+);
geoval xs1(int, int)  - ... Signed, MSB=1 for (+);
geoval xsr1(int, int) - ... Signed bit-Reversed, MSB=1 for (+);
geoval xub(int, int)  - ... Unsigned Byte-Reversed;
geoval xsb(int, int)  - ... Signed Byte-Reversed, MSB=0 for (+).

For every bit-extraction function the first parameter is a start bit number. The second parameter is a number of bits. The exceptions are xub() and xsb() where the second parameter is the number of 16-bit groups (the valid numbers are 1 and 2).
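The idea of "start bit plus number of bits" can be sketched for a single message as follows. This is not the library code: extract_unsigned() and extract_signed() are hypothetical helpers, bit 0 is assumed to be the most significant bit of the first message byte, and the signed variant assumes two's complement, which may differ from the sign conventions of the real xs() family.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Read `nbits` bits starting at bit `start` as an unsigned number.
// Assumption: bit 0 is the most significant bit of the first message byte.
std::uint32_t extract_unsigned(const std::vector<std::uint8_t>& msg,
                               int start, int nbits) {
    assert(nbits >= 1 && nbits <= 32);
    assert(start >= 0 &&
           static_cast<std::size_t>(start + nbits) <= msg.size() * 8);
    std::uint32_t result = 0;
    for (int k = 0; k < nbits; ++k) {
        int bit = start + k;
        int byte_value = msg[bit / 8];
        int bit_value = (byte_value >> (7 - bit % 8)) & 1;
        result = (result << 1) | static_cast<std::uint32_t>(bit_value);
    }
    return result;
}

// Same field read as a two's complement signed number (an assumption; the
// signed variants of the real library may encode the sign differently).
std::int32_t extract_signed(const std::vector<std::uint8_t>& msg,
                            int start, int nbits) {
    std::int64_t raw = extract_unsigned(msg, start, nbits);
    if (raw & (std::int64_t{1} << (nbits - 1))) {
        raw -= std::int64_t{1} << nbits;   // top bit set: negative value
    }
    return static_cast<std::int32_t>(raw);
}
```

In the DCSCL the same extraction is applied to every message of the batch in one assignment; the sketch handles one message only.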

The count field of the dcsin i input structure is used for ARGOS messages with historical data. As it was mentioned earlier, such messages contain more than one block of data with the same sensor values but for different times. Since the conversion algorithm is the same, the developer needs to know the count number to extract the sensor values from the appropriate part of ARGOS message. This parameter changes between 1 and the value of the number_of_repetitions field specified in the dcs_data file. Obviously, the number of output records per one platform is equal to the number_of_repetitions value.
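In the ocean01 example each block occupies 32 bits and consecutive blocks are one hour apart, so both the start bit and the time of a block follow from the count number. A minimal sketch of that arithmetic is shown below; field_start_bit() and block_time() are hypothetical helper names, and the sketch does not handle a time that would fall before midnight.

```cpp
#include <cassert>

// Start bit of a field inside the block selected by `count` (1-relative).
int field_start_bit(int count, int block_bits, int field_offset_bits) {
    assert(count >= 1 && block_bits > 0 && field_offset_bits >= 0);
    return (count - 1) * block_bits + field_offset_bits;
}

// Time (seconds of the day) of the block selected by `count`, given the time
// of the newest block and the interval between consecutive blocks.
double block_time(double newest_time_s, int count, double block_interval_s) {
    assert(count >= 1);
    return newest_time_s - (count - 1) * block_interval_s;
}
```

For count 2, a 32-bit block and a field offset of 8 bits, field_start_bit() gives 40, which matches the air_temp extraction in the second branch of ocean01.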

Section #5: compute "secondary" sensor values

Based on "primary" sensors (i.e. sensor values extracted directly from ARGOS messages), the "secondary" sensors can be computed. There are two "range-shifting" functions, outrange(min, max, shift) and inrange(min, max, shift), that implement the following logic:

for outrange():

for every component: if value is outside of [min, max] range, then add the shift value.

for inrange(): the same, but if the value is inside the range.

The reserved word BadVal, standing for bad value, can be specified instead of the min or max parameter to indicate minus or plus infinity, respectively. For example, the statement

pressure.inrange(BadVal, 1000, 500) is equivalent to the logic

if (pressure < 1000) pressure = pressure + 500;

which is performed for each component of pressure vector.

If the shift value is specified as BadVal, the inrange() and outrange() functions will set the components of the vector to the bad value rather than shifting them. For example,

pressure.outrange(500, 1500, BadVal);

is equivalent to:

if (pressure < 500 OR pressure > 1500) pressure = BadVal;

which is performed for each component of pressure vector.
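The range-shifting logic can be written out for a plain vector of values. The sketch below is a hedged illustration, not the library code: the free functions outrange() and inrange(), the helpers inside() and range_shift(), and the use of NaN as a stand-in for BadVal are assumptions. The range is treated as closed at both ends.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

// NaN stands in for BadVal (an assumption of this sketch).
const double kBad = std::numeric_limits<double>::quiet_NaN();

// True when x lies in [lo, hi]; a Bad bound means "no limit on that side".
bool inside(double x, double lo, double hi) {
    bool above_lo = std::isnan(lo) || x >= lo;
    bool below_hi = std::isnan(hi) || x <= hi;
    return above_lo && below_hi;
}

// Shift every good component that is inside (want_inside == true) or
// outside (want_inside == false) the range; a Bad shift marks it Bad.
void range_shift(std::vector<double>& vec, double lo, double hi,
                 double shift, bool want_inside) {
    assert(std::isnan(lo) || std::isnan(hi) || lo <= hi);
    for (double& x : vec) {
        if (std::isnan(x)) continue;                    // already Bad
        if (inside(x, lo, hi) != want_inside) continue; // not selected
        x = std::isnan(shift) ? kBad : x + shift;
    }
}

void outrange(std::vector<double>& vec, double lo, double hi, double shift) {
    range_shift(vec, lo, hi, shift, false);
}

void inrange(std::vector<double>& vec, double lo, double hi, double shift) {
    range_shift(vec, lo, hi, shift, true);
}
```

With this sketch, outrange(p, 4000, kBad, 32767) adds 32767 to every component below 4000, which mirrors the statement in Section #5 of ocean01.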

Section #6: Extract platform constants, index 0-relative.

Platform parameters can be extracted in any order and at any place in the code, but before their first use. The first parameter in the platform parameters list is a word of 0's and 1's indicating whether the telemetry sensors are valid: 1 means "valid", 0 means "invalid". All other parameters are real values.

The following is a list of platform parameters-extracting functions:


int    pgeti(int index);
double pgetr(int index);
char   pgetc(int index);
char  *pgets(int index);

where index is a sequential, 0-relative number of, respectively, integer, real, character or string parameter in the platform parameters list.

Section #7: Validate sensors

Suppose some sensors are known to have been non-operational during some period of time. The computations based on these sensors are not valid and the corresponding output values should be marked as Bad. This can be achieved by providing the "sensor status" string of '0's and '1's as yet another parameter in the database platform parameters list and, later, extracting the specified character (0 or 1) and assigning the Bad value to the sensor if the character is 0. Suppose that the sensor status is the first word in the platform parameters list. Suppose also that the second character in this word indicates the status of the "pressure" sensor. The following code will do this job (please notice that all offset indexes are 0-relative):

geoval pressure;	// pressure sensor variable
char *SensorStat;	// will point at the "Sensor Status" word
pressure = i.xu(5,24);	// extract pressure sensor value
SensorStat = i.pgets(0); // point at "sensor status" parameter
Validate(pressure, SensorStat, 1);  // Verify if pressure sensor is working
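The effect of such a validation step can be sketched on a plain vector of sensor values. The lowercase validate() below is a hypothetical stand-in written for this illustration, not the library's Validate(); NaN is used in place of the Bad value.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstring>
#include <limits>
#include <vector>

// NaN stands in for the DCS Bad value (an assumption of this sketch).
const double kBad = std::numeric_limits<double>::quiet_NaN();

// Mark every component Bad when the status word flags the sensor as invalid.
// `status_word` is a string of '0'/'1' characters; `index` is 0-relative.
void validate(std::vector<double>& sensor, const char* status_word, int index) {
    assert(status_word != nullptr);
    assert(index >= 0 &&
           static_cast<std::size_t>(index) < std::strlen(status_word));
    if (status_word[index] == '0') {
        for (double& x : sensor) x = kBad;
    }
}
```

With the status word "101", the sensor at index 1 would be marked Bad while the sensors at indexes 0 and 2 would be left untouched.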

Section #8: Apply stats. functions to sensor vectors to generate scalar values

The following is a list of stat. functions:

ave() - plain average
ave(double) - average with the bad data rejection
min() - minimal good value
max() - maximal good value
fst() - first good value
lst() - last good value
med() - median good value
fst2() - first two equal sequential good values
lst2() - last  two equal sequential good values
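Two of these statistics, the median and the "first two equal sequential good values", can be sketched on a plain vector. The functions good_values(), median_good() and first_repeated_good() are hypothetical names; the sketch assumes that Bad entries (NaN here) are skipped before looking for equal neighbours and that an even count yields the mean of the two middle values. The real med() and fst2() may behave differently in these corner cases.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// NaN stands in for the DCS Bad value (an assumption of this sketch).
const double kBad = std::numeric_limits<double>::quiet_NaN();

// Copy of the good (non-NaN) components, in message order.
std::vector<double> good_values(const std::vector<double>& vec) {
    std::vector<double> good;
    for (double x : vec) {
        if (!std::isnan(x)) good.push_back(x);
    }
    return good;
}

// Median of the good values; for an even count, the mean of the two middle
// values (an assumption). Returns Bad when there are no good values.
double median_good(const std::vector<double>& vec) {
    std::vector<double> good = good_values(vec);
    if (good.empty()) return kBad;
    std::sort(good.begin(), good.end());
    std::size_t mid = good.size() / 2;
    return good.size() % 2 == 1 ? good[mid]
                                : 0.5 * (good[mid - 1] + good[mid]);
}

// First value that occurs twice in a row among the good values, or Bad.
double first_repeated_good(const std::vector<double>& vec) {
    std::vector<double> good = good_values(vec);
    for (std::size_t k = 1; k < good.size(); ++k) {
        if (good[k] == good[k - 1]) return good[k];
    }
    return kBad;
}
```

Requiring two equal consecutive values is a simple guard against a single corrupted message.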

Section #9: Compute output values

The following is a list of valid operations used in the DCSCL language to form expressions:

+, -, *, /, =, (, ), ==, <, >, <=, >=.

The logical expressions can have three values: True, False and Unknown. Unknown value is returned if one of the terms forming the logical expression has Bad value. It's up to the developer whether to write the code like this:

if (pressure < 1000) { // invoked if True or Unknown
  <Do something>
} else {               // only if False
  <Do something else>
}

or the more careful version:

boolean dummy;
dummy = (pressure < 1000);
if (dummy == True) {
  < Do smth #1>
} else if (dummy == False) {
  <Do smth #2>
} else {                 // Unknown, i.e. pressure=="bad"
  <Do smth #3>
}
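The three-valued comparison can be sketched with an enumeration. This is an illustration only: the Tri type and the less_than() and branch_for() functions are hypothetical names, not the DCSCL boolean type, and NaN stands in for the Bad value.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical three-valued logical type.
enum class Tri { False, True, Unknown };

// Compare a sensor value to a limit; a Bad (NaN here) operand gives Unknown.
Tri less_than(double value, double limit) {
    if (std::isnan(value) || std::isnan(limit)) return Tri::Unknown;
    return value < limit ? Tri::True : Tri::False;
}

// Number of the branch (#1, #2 or #3) that the careful version would take.
int branch_for(Tri result) {
    if (result == Tri::True) return 1;
    if (result == Tri::False) return 2;
    return 3;   // Unknown
}
```

Handling the Unknown case in its own branch keeps a Bad input from silently following the True or the False path.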

Please notice that all arithmetic comparison operations use the sensor value of a geoval object, so a statistics function needs to have been applied by the time of the comparison. That is why the logic of the following two statements

if (temperature < 20) temperature += 270;

and

temperature.inrange(BadVal, 20, 270);

is different. In the first statement the sensor value is compared to 20 to decide whether to shift the value of every vector component by 270. In the second statement each component of the sensor vector is compared to 20 individually to decide whether or not to shift it by 270.

The following is a list of available math. functions:


geoval acos(geoval)
geoval asin(geoval)
geoval atan(geoval)
geoval atan2(geoval, geoval)
geoval cos(geoval)
geoval sin(geoval)
geoval tan(geoval)
geoval fabs(geoval)
geoval fmod(geoval, double)
geoval remainder(geoval, double)
geoval pow(geoval, double)
geoval pow(geoval, geoval)
geoval sqrt(geoval)
geoval hypot(geoval, geoval)
geoval exp(geoval)
geoval exp10(geoval)
geoval log(geoval)
geoval log10(geoval)

Each math. function uses the sensor value(s) of input parameter(s) to compute the sensor value of the output. The output sensor status is inherited from the input sensor object (or the minimal of the two input parameters, like in atan2() and hypot()). The output sensor value is set to Bad if the input sensor value is Bad or illegal (e.g. log() of the negative number).

An example of using mathematical functions:

geoval wind_east, wind_north;
wind_east  = i.xs(15, 8);	// extract east wind
wind_north = i.xs(23, 8);	// extract north wind
wind_east.med();		// find median
wind_north.med();		// find median
r.wind = hypot(wind_east, wind_north); // find abs. value
r.wind_dir = atan2(wind_north, wind_east); // find direction
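The arithmetic behind this example can be checked with plain doubles and the standard C++ math functions. The sketch below is not DCSCL code: the Wind structure and wind_from_components() are hypothetical names, and the conversion to degrees is an addition for readability. std::atan2 returns the angle in radians, measured counter-clockwise from the east axis, which is not the meteorological "direction the wind blows from" convention.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical result type: speed and mathematical direction in degrees.
struct Wind {
    double speed;
    double direction_deg;   // counter-clockwise from the east axis
};

// Combine east and north wind components into speed and direction.
Wind wind_from_components(double east, double north) {
    const double kPi = std::acos(-1.0);
    Wind w;
    w.speed = std::hypot(east, north);
    w.direction_deg = std::atan2(north, east) * 180.0 / kPi;
    return w;
}
```

An east component of 3 and a north component of 4 give a speed of 5; equal positive components give a direction of 45 degrees.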

Section #10: Assigning position and time values.

By default, the date and time of the last message in the batch are saved to the output, which is equivalent to

r.date = i.date.lst();
r.time = i.time.lst();

If the platform latitude, longitude and elevation are also to be saved to the output, it can be done using the lat, lon, and elev fields of the dcsin i input structure:

r.lat = i.lat.lst();
r.lon = i.lon.lst();
r.elev = i.elev.lst();

Below is a complete list of fields of the dcsin i structure that can be accessed from the conversion algorithm as i.<field_name>:

platform - platform number,
nmsg - number of messages in the batch,
time - GMT time in seconds of the day,
lat, lon, elev - latitude, longitude and elevation
        of platform, computed/assigned by dcsloc.
count - the call number for multi-hour method.

The following fields are loaded from the database record and stored in the dbrec sub-structure of the dcsin structure:

method - method name string,
count - number of hours of data per one message,
group - group name string,
length - message length in bytes,
wmo - WMO platform number,
gtsmsg - GTS message number,
num_const - number of platform constants found in
                  the dcs_data file.

The above fields can be accessed as

i.dbrec->(field_name),

e.g. r.wmo_platf_num = i.dbrec->wmo;

BUILDING THE SYSTEM

Below are the steps for building your own version of the dcsproc function based on your personal configuration and method files:

1. Do your development in a private directory, that has only configuration and method files in it.

2. Create methods.list file, specifying the methods you'd like to implement.

3. Create outfields.list file, specifying the superset of variables, computed with your methods.

4. Create as many method files as you specified in your methods.list file. The names of the method functions should match the second parameter in the METHOD(*,*) entries in the methods.list file. The output fields in your method files should be in the outfields.list file as the first parameter of FIELD(*,...) entries.

5. After methods.list, outfields.list or any method file is modified, the DCS system needs to be rebuilt with the command:

% dcsmake dcsproc

which will generate the dcsproc executable (any other name can be given in place of dcsproc).

ENVIRONMENT VARIABLES

The TSCANROOT and DCSDBASE environment variables need to be defined.

All TeraScan include files are expected to be under $TSCANROOT/include, all TeraScan libraries under $TSCANROOT/lib, and the dcs_data file under the $DCSDBASE directory.
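A program that wants to check these settings before running could build the expected paths from the environment. The sketch below uses only the standard std::getenv call; env_path(), dcs_data_path() and tscan_include_dir() are hypothetical helper names and are not part of TeraScan.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Returns "<value of var><suffix>", or an empty string when the variable is
// unset or empty.
std::string env_path(const char* var, const std::string& suffix) {
    assert(var != nullptr);
    const char* dir = std::getenv(var);
    if (dir == nullptr || *dir == '\0') return std::string();
    return std::string(dir) + suffix;
}

// Expected location of the dcs_data file.
std::string dcs_data_path() {
    return env_path("DCSDBASE", "/dcs_data");
}

// Expected location of the TeraScan include files.
std::string tscan_include_dir() {
    return env_path("TSCANROOT", "/include");
}
```

An empty result signals that the corresponding variable still has to be defined.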

SEE ALSO

dcs, dcsin, dcsloc, dcsproc, dcsmix, findcases


Last Update: $Date: 1999/05/10 21:16:27 $