Filters partition an input file into two output files: one with data
conforming to the specification, the other with data containing
errors. Filters apply to data formats that contain an optional header
followed by a sequence of records.
21.1 Template Program
Because generating a filter from a Pads description is routine,
Pads provides a template program that automates the task for common
data formats. In particular, the template applies to data that can be
viewed as an optional header followed by a sequence of records.
When instantiated, the template program takes an optional command-line
argument specifying the path to the data source. If no argument is
given, it uses the default location specified by the template user.
The locations of the clean and error output files can be set in the
template program.
The template first reads the optional header, echoing it to either the
clean or the error file, depending upon the resulting parse
descriptor. It then reads each record, echoing it to the appropriate
file until the data source is exhausted.
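The control flow described above can be sketched in plain C. This is
only an illustration, not the Pads template itself: the functions
header_is_clean, record_is_clean, and filter_stream are hypothetical,
and the two checks stand in for the read functions and parse
descriptors that Pads would generate from a description.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical header check: a header line starts with '#' and has
   at least one more character before the newline. */
static int header_is_clean(const char *line)
{
    return line[0] == '#' && strcspn(line, "\n") > 1;
}

/* Hypothetical record check: exactly three non-empty,
   comma-separated fields. */
static int record_is_clean(const char *line)
{
    int fields = 1;
    size_t len = 0; /* length of the field currently being scanned */
    for (const char *p = line; *p != '\0' && *p != '\n'; p++) {
        if (*p == ',') {
            if (len == 0)
                return 0; /* empty field */
            fields++;
            len = 0;
        } else {
            len++;
        }
    }
    return len > 0 && fields == 3;
}

/* Echo the optional header, then every record, to `clean` or `err`.
   Returns the number of lines written to `err`. Lines longer than
   the buffer are treated as several records in this sketch. */
static long filter_stream(FILE *in, FILE *clean, FILE *err, int has_header)
{
    char line[4096];
    long bad = 0;

    assert(in != NULL && clean != NULL && err != NULL);

    if (has_header && fgets(line, sizeof line, in) != NULL) {
        if (header_is_clean(line)) {
            fputs(line, clean);
        } else {
            fputs(line, err);
            bad++;
        }
    }
    while (fgets(line, sizeof line, in) != NULL) {
        if (record_is_clean(line)) {
            fputs(line, clean);
        } else {
            fputs(line, err);
            bad++;
        }
    }
    return bad;
}
```

A caller would open the path given on the command line, or a default
path when none is given, and pass the resulting stream to
filter_stream together with the two output streams.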
Like the accumulator template, the filter template is a C header
file parameterized by a number of macros; the user customizes the
template by defining appropriate values for them.
The following list describes the macros used by the
filter template:
- DATE_IN_FMT
- If defined, this macro sets the default
input format for dates described by Pdate. See
Section 15.1.12 for more
information.
- DATE_OUT_FMT
- If defined, this macro sets the default
output format for Pdate and Pdate_explicit. See
Section 15.1.13 for more information.
- DEF_INPUT_FILE
- If defined, this macro specifies a
string representation of the path to the default data source. If no
path to the data is supplied on the command line, this is the
location used for input data.
- EXTRA_BAD_READ_CODE
- If defined, this macro points to a C
statement that will be executed after any body record containing an
error.
- EXTRA_BEGIN_CODE
- If defined, this macro points to a C
statement that will be executed after all initialization code is
performed, but before the optional header is read.
- EXTRA_DECLS
- This optional macro defines additional C
declarations that precede all template code.
- EXTRA_DONE_CODE
- If defined, this macro points to a C
statement that will be executed after the filter has processed all records.
- EXTRA_GOOD_READ_CODE
- If defined, this macro points to a C
statement that will be executed after any body record not containing an
error.
- EXTRA_HEADER_READ_ARGS
- If the type of the header record
was parameterized, this macro allows the user to supply
corresponding parameters.
- EXTRA_READ_ARGS
- If the type of the repeated record was
parameterized, this macro allows the user to supply corresponding
parameters.
- IN_TIME_ZONE
- If set, this macro specifies the input time
zone of date types that do not include time zone information.
See Section 15.1.10 for more detail.
- IO_DISC_MK
- If defined, this macro specifies the
interpretation of Precord by indicating which IO discipline the
system should install. It specifies the discipline by naming the
function to create the discipline. Section 15.2
describes the available IO discipline creation functions. If the
user does not define this macro, the system installs the IO
discipline corresponding to new-line terminated ASCII records.
- MAX_RECS
- If defined, this macro specifies an integer that
limits the number of repeated records that the filter program
should read.
- OUT_TIME_ZONE
- If set, this macro specifies the output
time zone of date types.
See Section 15.1.11 for more detail.
- PADS_HDR_TY
- Intuitively, this macro defines the
type of the header record in the data source. This macro need only
be defined if the data source has a header record.
It defines a function used by the template
program to generate the various function and type names derived from
the name of the header record type, i.e., the type of the associated
in-memory representation, mask, parse descriptor, read function,
etc.
- PADS_TY
- Intuitively, this macro defines the
type of the repeated record in the data source, i.e., the type of
the value to be filtered. This macro must be defined to use the
filter template. It defines a function used by the template
program to generate the various function and type names derived from
the name of the record type, i.e., the type of the associated
in-memory representation, mask, parse descriptor, read function,
etc.
- READ_MASK
- This macro specifies the mask to use in reading
the repeated record. If not defined by the user, the template uses
the value P_CheckAndSet.
- TIME_IN_FMT
- If defined, this macro sets the default
input format for Ptime. See
Section 15.1.12 for more
information.
- TIME_OUT_FMT
- If defined, this macro sets the default
output format for Ptime and Ptime_explicit. See
Section 15.1.13 for more information.
- TIMESTAMP_IN_FMT
- If defined, this macro sets the default
input format for Ptimestamp. See
Section 15.1.12 for more
information.
- TIMESTAMP_OUT_FMT
- If defined, this macro sets the default
output format for the Pads types Ptimestamp and Ptimestamp_explicit. See
Section 15.1.13 for more information.
- WSPACE_OK
- If defined, this macro indicates that leading
white space is permitted for variable-width ASCII integers, as are
leading and trailing white space for fixed-width ASCII integers.
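
The way a header file can be parameterized by such macros may be
easier to see in a small, self-contained sketch. Everything in it is
an assumption made for illustration: the record type entry_t, its
parse descriptor entry_t_pd, the read function entry_t_read, and the
helper count_bad are hypothetical stand-ins, and the token-pasting
definition of PADS_TY is one plausible way for a macro to "define a
function" from suffixes to derived names. It is not taken from the
Pads template source.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins for names derived from a record type
   called entry_t. */
typedef struct { int id; } entry_t;      /* in-memory representation */
typedef struct { int nerr; } entry_t_pd; /* parse descriptor */

/* Hypothetical read function: parses one integer, records an error
   in the parse descriptor on failure, and returns the error count. */
static int entry_t_read(const char *text, entry_t_pd *pd, entry_t *rep)
{
    pd->nerr = (sscanf(text, "%d", &rep->id) == 1) ? 0 : 1;
    return pd->nerr;
}

/* User side: settings given before the template would be included. */
#define PADS_TY(suf) entry_t ## suf /* PADS_TY(_pd) becomes entry_t_pd */
#define MAX_RECS 2
#define EXTRA_BAD_READ_CODE fprintf(stderr, "bad record: %s\n", recs[i])

/* Template side: a default for an optional macro the user left unset. */
#ifndef DEF_INPUT_FILE
#define DEF_INPUT_FILE "input.dat" /* hypothetical default path */
#endif

/* Template-style code: every type and function name is obtained
   through PADS_TY, so the same text works for any record type. */
static int count_bad(const char **recs, int n)
{
    PADS_TY() rep;   /* empty suffix (C99): expands to entry_t */
    PADS_TY(_pd) pd; /* expands to entry_t_pd */
    int bad = 0;

    assert(recs != NULL);
#ifdef MAX_RECS
    if (n > MAX_RECS)
        n = MAX_RECS; /* read at most MAX_RECS records */
#endif
    for (int i = 0; i < n; i++) {
        if (PADS_TY(_read)(recs[i], &pd, &rep) != 0) {
            bad++;
#ifdef EXTRA_BAD_READ_CODE
            EXTRA_BAD_READ_CODE; /* user hook for error records */
#endif
        }
    }
    return bad;
}
```

In this sketch the optional macros are tested with #ifdef or given a
default with #ifndef, which matches the "if defined" wording used in
the list above; how the real template handles each macro is not shown
here.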