Chapter 14 Using the generated library
In this chapter, we describe various functions provided by the core
Pads library for manipulating Pads handles.
Figure 2.6 shows a simple example of library use:
the call to P_open initializes the Pads handle, in this case
with a default Pads discipline and a default IO discipline;
the call to P_io_fopen indicates that the input data may be
found in the file located at path "data/sirius";
the function P_io_at_eof tests if the file has been exhausted;
the call to P_io_close closes the input file; and finally,
the call to P_close closes the Pads handle.
- Perror_t P_open (P_t **pads_out, Pdisc_t *disc, Pio_disc_t *io_disc)
-
This function creates a Pads handle, which is returned in the space
supplied by the first parameter. The second and third arguments are
input parameters pointing to a Pads discipline and an IO discipline
to use in the handle. Either or both of these may be NULL, in
which case default disciplines are used.
- Perror_t P_close (P_t *pads)
-
This function deallocates a Pads handle, freeing all associated resources.
If there is an installed IO discipline,
it is unmade; after this point it should NOT be used any more.
- Perror_t P_close_keep_io_disc(P_t *pads, int keep_io_disc)
-
This function is like P_close, except that it takes an extra argument, keep_io_disc, which
if non-zero indicates that the installed IO discipline (if any) should not be unmade;
in this case it CAN be used again, e.g., in a future P_open
call.
- Pdisc_t * P_get_disc(P_t *pads)
-
This function returns NULL on error; otherwise, it returns a
pointer to the installed discipline.
- Perror_t P_set_disc(P_t *pads, Pdisc_t *new_disc, int xfer_io)
-
This function installs a different discipline handle. If the
parameter xfer_io is non-zero, then the IO discipline from the
old handle is moved to the new handle.
- Perror_t P_set_io_disc(P_t* pads, Pio_disc_t* new_io_disc)
-
This function installs a different IO discipline into the
main discipline. If there is an open SFIO stream,
it is transferred to the new IO discipline after closing the old IO
discipline in a way that returns all bytes beyond the current IO cursor to
the stream. The old IO discipline (if any) is unmade. After this
point the old IO discipline should NOT be reused.
- Perror_t P_set_io_disc_keep_old(P_t* pads, Pio_disc_t* new_io_disc, int keep_old)
-
This function is like P_set_io_disc, except it takes an extra argument, keep_old,
which if non-zero indicates that the old IO discipline
should not be unmade; in this case it CAN be used again, e.g., in a future
P_set_io_disc call.
- Tm_zone_t *P_cstr2timezone(const char *tzone_str)
-
Utility function for converting a C string to a time zone pointer.
It returns NULL if the string is an invalid time zone string.
The following table lists the time zone names understood by the
P_cstr2timezone function (columns two and three), together with their
numeric representations (columns four and five).
A dash marks a blank entry.
Country | Standard | Savings | Minutes West of UTC | Saving Time Minutes Offset |
- | GMT | - | 0 | 0 |
- | UCT | - | 0 | 0 |
- | UTC | - | 0 | 0 |
- | CUT | - | 0 | 0 |
USA | HST | - | 600 | 0 |
USA | YST | YDT | 540 | -60 |
USA | PST | PDT | 480 | -60 |
USA | PST | PPET | 480 | -60 |
USA | MST | MDT | 420 | -60 |
USA | CST | CDT | 360 | -60 |
USA | EST | EDT | 300 | -60 |
CAN | AST | ADT | 240 | -60 |
CAN | NST | - | 210 | 0 |
GBR | - | BST | 0 | -60 |
EUR | WET | - | 0 | -60 |
EUR | CET | - | -60 | -60 |
EUR | MET | - | -60 | -60 |
EUR | EET | - | -120 | -60 |
ISR | IST | IDT | -180 | -60 |
IND | IST | - | -330 | 0 |
CHN | HKT | - | -480 | 0 |
KOR | KST | KDT | -480 | -60 |
SNG | SST | - | -480 | 0 |
JPN | JST | - | -540 | 0 |
AUS | AWST | - | -480 | 0 |
AUS | WST | - | -480 | 0 |
AUS | ACST | - | -570 | -60 |
AUS | CST | - | -570 | -60 |
AUS | AEST | - | -600 | -60 |
AUS | EST | - | -600 | -60 |
NZL | NZST | NZDT | -720 | -60 |
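The numeric columns use the opposite sign from the familiar UTC+hh:mm notation: positive values lie west of Greenwich. The sketch below converts a table row into that notation. The helper name format_utc_offset is hypothetical and not part of the Pads library, and the sketch assumes that the saving-time offset is simply added to the minutes-west value while saving time is in effect.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper, not part of the Pads library.
 * Formats a zone given as "minutes west of UTC" plus a saving-time offset
 * (both as listed in the table) in "+hh:mm" / "-hh:mm" notation, where
 * zones east of UTC are positive.
 * Assumption: the two table columns are additive. */
void format_utc_offset(int minutes_west, int saving_offset,
                       char *buf, size_t buf_len)
{
    int east = -(minutes_west + saving_offset);  /* flip sign: west -> east */
    int magnitude = abs(east);

    snprintf(buf, buf_len, "%c%02d:%02d",
             east < 0 ? '-' : '+', magnitude / 60, magnitude % 60);
}
```

Under these assumptions, the EST row (300, 0) formats as -05:00, the same row with its saving offset applied (300, -60) as -04:00, and the Indian IST row (-330, 0) as +05:30.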
- Perror_t P_set_in_time_zone(P_t *pads, const char *new_in_time_zone)
-
- Perror_t P_set_out_time_zone(P_t *pads, const char *new_out_time_zone)
-
These functions set the input and output time zones, respectively.
See Section 15.1.10 and
Section 15.1.11 for more information.
- Perror_t P_io_set (P_t *pads, Sfio_t *io)
-
This function initializes or changes the current SFIO stream used for input.
If there is already an installed SFIO stream, P_io_close is
implicitly called first.
- Perror_t P_io_fopen(P_t *pads, const char *path)
-
This function opens a file for reading (a higher-level alternative to P_io_set).
It uses pads->disc->fopen_fn if that value is non-null;
otherwise, it uses P_fopen. It always opens files with mode
"r". The function returns P_OK on success, and
P_ERR on error.
- Perror_t P_io_close(P_t *pads)
-
This function cleans up the IO discipline state. It attempts to return bytes that were
read from the underlying SFIO stream but not consumed by the parse
back to the stream.
If the underlying SFIO stream arose from a file opened via P_io_fopen,
the file is closed. If the underlying SFIO stream was installed via
P_io_set, it is not closed. In this case, it is up to the
program that opened the installed SFIO stream to close it
(after calling P_io_close).
- Perror_t P_io_next_rec (P_t *pads, size_t *skipped_bytes_out)
-
This function advances the current IO position to the start of the next record, if any.
It returns P_OK on success and P_ERR on failure, which
includes hitting EOF before EOR.
For the P_OK case, the function sets *skipped_bytes_out to the number of
data bytes that were passed over while searching for EOR.
- Perror_t P_io_skip_bytes(P_t *pads, size_t width, size_t *skipped_bytes_out)
-
This function advances the current IO position by the specified number of bytes, or if that many
bytes cannot be skipped, then by as many bytes as are available.
It sets *skipped_bytes_out to the number of bytes skipped.
It returns P_OK if the requested bytes were skipped, P_ERR if fewer
than the requested bytes were skipped. For record-based
disciplines, the function does NOT advance the IO position beyond the
current record.
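The contract of P_io_skip_bytes described above can be modelled in a few lines. The following is a toy model only, not the library implementation: the Toy_ names are hypothetical stand-ins, and the input is reduced to a cursor within a single record.

```c
#include <stddef.h>

/* Stand-in result codes for this sketch only; the real P_OK and P_ERR
 * values come from the Pads headers. */
typedef enum { TOY_OK, TOY_ERR } Toy_error_t;

/* Toy model of record-based input: a cursor inside the current record. */
typedef struct {
    size_t rec_len;  /* number of data bytes in the current record */
    size_t cursor;   /* current position within the record */
} Toy_record_t;

/* Skip up to width bytes without leaving the current record.
 * Reports how many bytes were skipped and returns TOY_ERR when fewer
 * than the requested number could be skipped. */
Toy_error_t toy_skip_bytes(Toy_record_t *rec, size_t width,
                           size_t *skipped_bytes_out)
{
    size_t available = rec->rec_len - rec->cursor;
    size_t skipped = width < available ? width : available;

    rec->cursor += skipped;
    *skipped_bytes_out = skipped;
    return skipped == width ? TOY_OK : TOY_ERR;
}
```

Note that in the error case the cursor still advances and the out-parameter still reports the partial count, so a caller can tell how far the skip got.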
- int P_io_at_eor(P_t *pads)
-
This function returns 1 if the current IO position is
at EOR; otherwise it returns 0.
- int P_io_at_eof(P_t *pads)
-
This function returns 1 if the current IO position is
at EOF; otherwise it returns 0.
- int P_io_at_eor_or_eof(P_t *pads)
-
This function returns 1 if the current IO position is
at EOR or EOF; otherwise it returns 0.
- const char * P_io_read_unit(P_t *pads)
-
This function provides a description of the read unit used in Ppos_t
(e.g., "line", "1K block",
etc.). It returns NULL on error (if there is no installed IO discipline).
- Perror_t P_io_getPos(P_t *pads, Ppos_t *pos, int offset)
-
This function fills in *pos with the current IO position.
If offset is zero, the current IO position is
used; otherwise, the position used is offset bytes from the
current IO position.
The current IO position does not change. P_ERR is returned if
information about the specified position cannot be determined.
EOR marker bytes (if any) are ignored when moving forward or back
based on offset: offset only refers to data bytes.
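The rule that offset counts only data bytes can be illustrated with a toy model. The sketch below is not the library's algorithm: toy_pos_at_offset is a hypothetical function, the input is a plain in-memory buffer, and a single newline character is assumed to be the EOR marker.

```c
#include <stddef.h>

/* Toy model: find the raw buffer index that lies offset data bytes away
 * from cursor, stepping over EOR markers without counting them.
 * Assumptions: '\n' is the only EOR marker and cursor is on a data byte.
 * Returns -1 when the requested position falls outside the buffer. */
long toy_pos_at_offset(const char *buf, size_t len, size_t cursor, int offset)
{
    long pos = (long)cursor;
    int step = offset < 0 ? -1 : 1;
    int remaining = offset < 0 ? -offset : offset;

    while (remaining > 0) {
        pos += step;
        if (pos < 0 || pos >= (long)len) {
            return -1;              /* position cannot be determined */
        }
        if (buf[pos] != '\n') {
            remaining--;            /* only data bytes count */
        }
    }
    return pos;
}
```

With the buffer "ab\ncd\n" and the cursor on b (index 1), an offset of 1 lands on c (index 3): the newline at index 2 is passed over without being counted.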
- Perror_t P_io_getLocB(P_t *pads, Ploc_t *loc, int offset)
-
This function fills in loc->b with the IO position.
See the description of P_io_getPos for a description of the
offset parameter.
- Perror_t P_io_getLocE(P_t *pads, Ploc_t *loc, int offset)
-
This function fills in loc->e with the IO position.
See the description of P_io_getPos for a description of the
offset parameter.
- Perror_t P_io_getLoc(P_t *pads, Ploc_t *loc, int offset)
-
This function fills in both loc->b and loc->e with the IO
position. See the description of P_io_getPos for a description of the
offset parameter.
14.1 Compiled regular expressions
The scan and read functions that take regular expressions as arguments
require pointers to compiled regular expressions of type
Pregexp_t*.
A Pregexp_t contains two things:
- a boolean, valid, which indicates whether the Pregexp_t
contains a valid compiled regular expression.
- some private state (an internal representation of the compiled regular expression)
which should be ignored by the users of the library.
typedef struct Pregexp_s {
int valid;
P_REGEXP_T_PRIVATE_STATE;
} Pregexp_t;
If my_regexp.valid is non-zero, then my_regexp requires cleanup when no longer needed.
Upon declaring a Pregexp_t, one should set valid to 0.
You can do this directly, as in:
Pregexp_t my_regexp = { 0 };
or you can use the preferred method, which is to use the following
macro:
P_REGEXP_DECL_NULL(my_regexp);
When finished with a Pregexp_t, one should call Pregexp_cleanup, as in:
Pregexp_cleanup(pads, &my_regexp);
to clean up any private state that may have been allocated.
The following functions are used to compile a string into a Pregexp_t
and to clean up a Pregexp_t when it is no longer needed. They should be
passed a pointer to a properly initialized (null or valid) Pregexp_t.
- Perror_t Pregexp_compile(P_t *pads, const Pstring *regexp_str, Pregexp_t *regexp)
-
If regexp_str is a string containing a valid regular
expression, this function fills in (*regexp) and returns P_OK.
If the string is not a valid regular expression, it returns P_ERR.
- Perror_t Pregexp_compile_cstr(P_t *pads, const char *regexp_str, Pregexp_t *regexp)
-
This function is like Pregexp_compile, but it takes a
const char* argument rather than a const
Pstring* argument.
- Perror_t Pregexp_cleanup(P_t *pads, Pregexp_t *regexp)
-
This function deallocates resources associated with regexp.
Both compile functions will perform a cleanup action if regexp->valid is
non-zero prior to doing the compilation, and they both set regexp->valid
to 0 if the compilation fails and to 1 if it succeeds.
Note that if you use a Pregexp_t to hold more than one compiled
regular expression over time, you only need to call Pregexp_cleanup
after the final use.
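The valid-flag protocol described above (compile cleans up first, then sets valid to 1 or 0) can be sketched without the Pads library. The following is a minimal model under stated assumptions: the Toy_ names are hypothetical, POSIX regcomp and regfree stand in for the private compiled state, and the cleanup function is assumed to reset valid to 0.

```c
#include <regex.h>

/* Hypothetical stand-in for Pregexp_t: a validity flag plus private state. */
typedef struct {
    int     valid;
    regex_t compiled;   /* private state; here a POSIX regex_t */
} Toy_regexp_t;

/* Release the private state if there is any.
 * Safe to call on a null (valid == 0) value. */
void toy_regexp_cleanup(Toy_regexp_t *re)
{
    if (re->valid) {
        regfree(&re->compiled);
        re->valid = 0;
    }
}

/* Clean up any previously compiled expression, then compile pattern.
 * Returns 0 on success (valid == 1) and -1 on failure (valid == 0). */
int toy_regexp_compile(const char *pattern, Toy_regexp_t *re)
{
    toy_regexp_cleanup(re);
    if (regcomp(&re->compiled, pattern, REG_EXTENDED | REG_NOSUB) != 0) {
        return -1;      /* valid is already 0 after the cleanup above */
    }
    re->valid = 1;
    return 0;
}
```

Because the compile step cleans up on entry, one variable can hold a series of expressions over time, and a single cleanup call after the final use releases whatever is still held.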
14.1.1 Regular expression macros
The P_RE_STRING_FROM macros convert their character or string arguments into
strings containing regular expressions that match exactly the
specified character or string. The resulting string is in temporary
storage, so it should be used immediately (e.g., in a
Pregexp_compile_cstr call).
- const char* P_RE_STRING_FROM_CHAR(P_t *pads, Pchar char_expr)
-
This macro produces a regular expression string that matches a single character.
Example: P_RE_STRING_FROM_CHAR(pads, 'a') returns the string
"/[a]/".
- const char* P_RE_STRING_FROM_CSTR(P_t *pads, const char *str_expr)
-
This macro produces a regular expression string that matches a string.
Example: P_RE_STRING_FROM_CSTR(pads, "abc") returns
string "/abc/l".
- const char* P_RE_STRING_FROM_STR(P_t *pads, Pstring *str_expr)
-
This macro is like P_RE_STRING_FROM_CSTR, but it takes a Pstring* rather than a const
char*.
- void P_REGEXP_FROM_CHAR(P_t *pads, Pregexp_t my_regexp, Pchar char_expr)
-
- void P_REGEXP_FROM_CSTR(P_t *pads, Pregexp_t my_regexp, const char *str_expr)
-
- void P_REGEXP_FROM_STR(P_t *pads, Pregexp_t my_regexp, Pstring *str_expr)
-
The P_REGEXP_FROM macros perform the same conversions as the P_RE_STRING_FROM
macros, and then take the added step of compiling the result into the
Pregexp_t my_regexp. In each case, one can check my_regexp.valid after the
macro call to determine whether the result is a valid compiled regular expression.
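The advice to use a P_RE_STRING_FROM result immediately follows from its living in temporary storage. The sketch below shows one way such storage can behave. It is an assumption for illustration, not a description of how the Pads macros are implemented: toy_re_string_from_char is a hypothetical function that reuses a single static buffer on every call.

```c
#include <stdio.h>

/* Hypothetical model of a result kept in temporary storage:
 * every call writes into, and returns, the same static buffer. */
const char *toy_re_string_from_char(char c)
{
    static char scratch[8];

    /* Build a pattern of the form "/[c]/" for the given character. */
    snprintf(scratch, sizeof scratch, "/[%c]/", c);
    return scratch;
}
```

In this model, a pointer saved from a first call with 'a' reads "/[b]/" after a second call with 'b', because both calls return the same buffer. Compiling or copying the string before making another call avoids the problem.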