count - Count variable values meeting various criteria

SYNOPSIS

count  [ parameter=value ]  [ inputfile ]

Parameters are: variable, output, criteria, number, limits

DESCRIPTION

count determines the number of variable values meeting a given criterion, and prints that number to UNIX stdout. The number can be reported in one of three ways: as a count, as a percent of the total number of values, or as a percent of the number of good values. Percents are rounded to the nearest integer.

The following criteria are supported:

total      - total number of values
good       - number of good (non-missing) values
bad        - number of bad (missing) values

equal      - number of good values == N
neq        - number of good values != N

lt         - number of good values <  N
le         - number of good values <= N

gt         - number of good values >  N
ge         - number of good values >= N

[]         - number of good values inside [N1, N2]
()         - number of good values inside (N1, N2)

[)         - number of good values inside [N1, N2)
(]         - number of good values inside (N1, N2]

outside[]  - number of good values outside [N1, N2]
outside()  - number of good values outside (N1, N2)

outside[)  - number of good values outside [N1, N2)
outside(]  - number of good values outside (N1, N2]

[ means closed on the left; i.e., the left endpoint is included in the interval. ] means closed on the right; i.e., the right endpoint is included in the interval.

( means open on the left; i.e., the left endpoint is not included in the interval. ) means open on the right; i.e., the right endpoint is not included in the interval.
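The bracket notation above can be read as a pair of comparisons, one per endpoint. The following Python sketch is an illustration only, not part of count; the helper name matches and the two lookup tables are hypothetical. It treats "outside" as the complement of the interval among good values.

```python
import operator

# Hypothetical lookup tables: each bracket character selects the comparison
# applied at that endpoint.
LEFT = {"[": operator.ge, "(": operator.gt}    # value >= N1  or  value > N1
RIGHT = {"]": operator.le, ")": operator.lt}   # value <= N2  or  value < N2

def matches(criteria, value, n1, n2):
    """Return True if a good value satisfies an interval criterion."""
    outside = criteria.startswith("outside")
    brackets = criteria[len("outside"):] if outside else criteria
    inside = LEFT[brackets[0]](value, n1) and RIGHT[brackets[1]](value, n2)
    return not inside if outside else inside

values = [0, 1, 2, 3, 4]
for criteria in ("[]", "()", "[)", "(]", "outside[]", "outside()"):
    # List the sample values that each criterion would count with limits 1, 3.
    print(criteria, [v for v in values if matches(criteria, v, 1, 3)])
```

With limits 1 and 3, [] selects 1, 2, and 3, while () selects only 2; the endpoints are what distinguish the four bracket forms.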

PARAMETERS

variable
Name of the single variable to investigate. Wildcards * and ? are allowed. There is no default.
output
One of the following options: count, total_percent, good_percent. The default is count.
criteria
One of the criteria listed above. The default is total.
number
Number to compare against when criteria is equal, neq, lt, le, gt, or ge. The default is 0.
limits
Pair of numbers defining the interval when criteria is [], (), [), (], outside[], outside(), outside[), or outside(]. There is no default interval.
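
The three output options differ only in the denominator used for the percent. The following Python sketch shows one way to read them; the function name report and its arguments are hypothetical, and it assumes that exact halves round up, since this page says only "nearest integer". It does not handle a zero denominator, because this page does not say what count prints in that case.

```python
def report(matching, good, total, output="count"):
    """Sketch of the output parameter: raw count or rounded percent."""
    if output == "count":
        return matching
    if output == "total_percent":
        denominator = total      # percent of all values, missing included
    elif output == "good_percent":
        denominator = good       # percent of non-missing values only
    else:
        raise ValueError("unknown output option: " + output)
    # Integer form of round-half-up for 100 * matching / denominator.
    # Assumption: ties round up; the real tie rule is not documented here.
    return (200 * matching + denominator) // (2 * denominator)

# Hypothetical data: 9 values in total, 7 of them good, 3 meeting the criteria.
print(report(3, 7, 9, "count"))
print(report(3, 7, 9, "total_percent"))
print(report(3, 7, 9, "good_percent"))
```

For these hypothetical numbers, total_percent is 3 of 9 and good_percent is 3 of 7, so the same count gives two different percents.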

EXAMPLES

The following example shows two ways to compute the percentage of non-missing values in a C-shell script:

set PERCENT_GOOD = `count variable=mcsst output=total_percent criteria=good testdata`

set TOTAL = `count variable=mcsst output=count criteria=total testdata`
set GOOD  = `count variable=mcsst output=count criteria=good  testdata`
@ PERCENT_GOOD = ( $GOOD * 100 ) / $TOTAL

The following example shows two ways to compute the percentage of positive data values, not counting missing data, in a C-shell script:

set PERCENT_PLUS = `count variable=mcsst output=good_percent criteria=gt number=0 testdata`

set GOOD = `count variable=mcsst output=count criteria=good testdata`
set PLUS = `count variable=mcsst output=count criteria=gt number=0 testdata`
@ PERCENT_PLUS = ( $PLUS * 100 ) / $GOOD

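Note that C-shell integer arithmetic truncates the percent, whereas count rounds it to the nearest integer, so the second method in each example can give a result one lower than the first.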
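
The difference between the two methods can be seen with a small worked case. The Python sketch below uses hypothetical counts (2 good values out of 3) and mimics the integer division performed by the @ command alongside rounding to the nearest integer.

```python
good, total = 2, 3   # hypothetical counts standing in for $GOOD and $TOTAL

# Integer division, as in "@ PERCENT_GOOD = ( $GOOD * 100 ) / $TOTAL".
# For non-negative counts, Python's floor division matches truncation.
truncated = (good * 100) // total

# Nearest integer, as count reports percents. Python's round() sends exact
# halves to the even neighbour, which does not matter for this case.
nearest = round(good * 100 / total)

print(truncated, nearest)   # prints: 66 67
```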

SEE ALSO

partcount, stats

NOTES

The following relationships must be true when output=count:

#total = #good + #bad
#good  = #equal + #neq

#good  = #lt + #ge
#good  = #le + #gt

#good  = #[] + #outside[]
#good  = #() + #outside()

#good  = #[) + #outside[)
#good  = #(] + #outside(]

#le    = #lt + #equal
#ge    = #gt + #equal
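
These relationships can be checked on a small data set. The Python sketch below is an illustration only; the sample values, the use of None for a missing value, and the helper n are all hypothetical. It counts equal and neq over good values, as the criteria list defines them.

```python
values = [3, None, 0, -2, 5, None, 0, 7]   # None stands for a missing value
N, N1, N2 = 0, 0, 5                        # hypothetical number and limits

good = [v for v in values if v is not None]
bad = [v for v in values if v is None]

def n(pred):
    """Count the good values satisfying a predicate."""
    return sum(1 for v in good if pred(v))

assert len(values) == len(good) + len(bad)
assert len(good) == n(lambda v: v == N) + n(lambda v: v != N)
assert len(good) == n(lambda v: v < N) + n(lambda v: v >= N)
assert len(good) == n(lambda v: v <= N) + n(lambda v: v > N)
# Interval criteria: inside plus outside covers every good value.
assert len(good) == n(lambda v: N1 <= v <= N2) + n(lambda v: not N1 <= v <= N2)
assert len(good) == n(lambda v: N1 <= v < N2) + n(lambda v: not N1 <= v < N2)
assert n(lambda v: v <= N) == n(lambda v: v < N) + n(lambda v: v == N)
assert n(lambda v: v >= N) == n(lambda v: v > N) + n(lambda v: v == N)
print("all relationships hold")
```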

Last Update: $Date: 1999/05/10 20:13:16 $