
Chapter 14  Using the generated library

In this chapter, we describe various functions provided by the core Pads library for manipulating Pads handles. Figure 2.6 shows a simple example of library use: the call to P_open initializes the Pads handle, in this case with a default Pads discipline and a default IO discipline; the call to P_io_fopen indicates that the input data may be found in the file located at path "data/sirius"; the function P_io_at_eof tests if the file has been exhausted; the call to P_io_close closes the input file; and finally, the call to P_close closes the Pads handle.

Perror_t P_open (P_t **pads_out, Pdisc_t *disc, Pio_disc_t *io_disc)
This function creates a Pads handle, which is returned in the space supplied by the first parameter. The second and third arguments are input parameters pointing to a Pads discipline and an IO discipline to use in the handle. Either or both of these may be NULL, in which case default disciplines are used.
Perror_t P_close (P_t *pads)
This function deallocates a Pads handle, freeing all associated resources. If there is an installed IO discipline, it is unmade; after this point it should NOT be used any more.
Perror_t P_close_keep_io_disc(P_t *pads, int keep_io_disc)
This function is like P_close, except that it takes an extra argument, keep_io_disc, which if non-zero indicates that the installed IO discipline (if any) should not be unmade; in this case it CAN be used again, e.g., in a future P_open call.
Pdisc_t * P_get_disc(P_t *pads)
This function returns NULL on error; otherwise, it returns a pointer to the installed discipline.
Perror_t P_set_disc(P_t *pads, Pdisc_t *new_disc, int xfer_io)
This function installs a different discipline handle. If the parameter xfer_io is non-zero, then the IO discipline from the old handle is moved to the new handle.
Perror_t P_set_io_disc(P_t* pads, Pio_disc_t* new_io_disc)
This function installs a different IO discipline into the main discipline. If there is an open SFIO stream, it is transferred to the new IO discipline after closing the old IO discipline in a way that returns all bytes beyond the current IO cursor to the stream. The old IO discipline (if any) is unmade. After this point the old IO discipline should NOT be re-used.
Perror_t P_set_io_disc_keep_old(P_t* pads, Pio_disc_t* new_io_disc, int keep_old)
This function is like P_set_io_disc, except it takes an extra argument, keep_old, which if non-zero indicates that the old IO discipline should not be unmade; in this case it CAN be used again, e.g., in a future P_set_io_disc call.
Tm_zone_t *P_cstr2timezone(const char *tzone_str)
Utility function for converting a C string to a time zone pointer. It returns NULL if the string is an invalid time zone string. The following table describes the time zone names understood by the P_cstr2timezone function (columns two and three), as well as the numeric representations of such (columns four and five). Blank entries are represented by a dash.
Country  Standard  Savings  Minutes West of UTC  Saving Time Minutes Offset
-        GMT       -        0                    0
-        UCT       -        0                    0
-        UTC       -        0                    0
-        CUT       -        0                    0
USA      HST       -        600                  0
USA      YST       YDT      540                  -60
USA      PST       PDT      480                  -60
USA      PST       PPET     480                  -60
USA      MST       MDT      420                  -60
USA      CST       CDT      360                  -60
USA      EST       EDT      300                  -60
CAN      AST       ADT      240                  -60
CAN      NST       -        210                  0
GBR      -         BST      0                    -60
EUR      WET       -        0                    -60
EUR      CET       -        -60                  -60
EUR      MET       -        -60                  -60
EUR      EET       -        -120                 -60
ISR      IST       IDT      -180                 -60
IND      IST       -        -330                 0
CHN      HKT       -        -480                 0
KOR      KST       KDT      -480                 -60
SNG      SST       -        -480                 0
JPN      JST       -        -540                 0
AUS      AWST      -        -480                 0
AUS      WST       -        -480                 0
AUS      ACST      -        -570                 -60
AUS      CST       -        -570                 -60
AUS      AEST      -        -600                 -60
AUS      EST       -        -600                 -60
NZL      NZST      NZDT     -720                 -60
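To make the two numeric columns concrete, here is a minimal sketch in plain C that encodes a few rows of the table and converts a zone name into an offset from UTC. It is not part of the Pads library: tz_row, tz_rows, and tz_minutes_east are hypothetical names, and the sketch assumes that during saving time a zone lies (minutes west + saving offset) minutes west of UTC, which matches familiar cases such as PDT being 420 minutes west.

```c
#include <stddef.h>
#include <string.h>

/* One row of the time zone table. All names in this sketch are
 * illustrative and are not part of the Pads API. */
typedef struct {
    const char *country;   /* NULL where the table shows a dash */
    const char *standard;  /* standard-time name, or NULL */
    const char *saving;    /* saving-time name, or NULL */
    int west;              /* minutes west of UTC */
    int dst;               /* saving time minutes offset */
} tz_row;

/* A small excerpt of the table, not the full list. */
static const tz_row tz_rows[] = {
    { NULL,  "UTC",  NULL,      0,   0 },
    { "USA", "PST",  "PDT",   480, -60 },
    { "USA", "EST",  "EDT",   300, -60 },
    { "GBR", NULL,   "BST",     0, -60 },
    { "JPN", "JST",  NULL,   -540,   0 },
    { "NZL", "NZST", "NZDT", -720, -60 },
};

/* Look up a zone name and store its offset from UTC, in minutes,
 * in *east_out (positive means east of UTC).
 * Returns 0 on success, -1 if the name is not in the excerpt. */
int tz_minutes_east(const char *name, int *east_out)
{
    size_t i;
    for (i = 0; i < sizeof tz_rows / sizeof tz_rows[0]; i++) {
        const tz_row *r = &tz_rows[i];
        if (r->standard && strcmp(name, r->standard) == 0) {
            *east_out = -r->west;
            return 0;
        }
        if (r->saving && strcmp(name, r->saving) == 0) {
            /* Assumption: saving time is (west + dst) minutes west. */
            *east_out = -(r->west + r->dst);
            return 0;
        }
    }
    return -1;
}
```

Under that assumption, "PST" maps to 480 minutes behind UTC and "PDT" to 420 minutes behind, while "NZDT" maps to 780 minutes ahead.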
Perror_t P_set_in_time_zone(P_t *pads, const char *new_in_time_zone)
Perror_t P_set_out_time_zone(P_t *pads, const char *new_out_time_zone)
These functions set the input and output time zones, respectively. See Section 15.1.10 and Section 15.1.11 for more information.
Perror_t P_io_set (P_t *pads, Sfio_t *io)
This function initializes or changes the current SFIO stream used for input. If there is already an installed SFIO stream, P_io_close is implicitly called first.
Perror_t P_io_fopen(P_t *pads, const char *path)
This function opens a file for reading (a higher-level alternative to P_io_set). It uses pads->disc->fopen_fn if that value is non-null; otherwise, it uses P_fopen. It always opens files with mode "r". The function returns P_OK on success, and P_ERR on error.
Perror_t P_io_close(P_t *pads)
This function cleans up the IO discipline state. It attempts to return bytes that were read from the underlying SFIO stream but not consumed by the parse back to the stream.

If the underlying SFIO stream arose from a file opened via P_io_fopen, the file is closed. If the underlying SFIO stream was installed via P_io_set, it is not closed. In this case, it is up to the program that opened the installed SFIO stream to close it (after calling P_io_close).

Perror_t P_io_next_rec (P_t *pads, size_t *skipped_bytes_out)
This function advances the current IO position to the start of the next record, if any. It returns P_OK on success and P_ERR on failure, which includes hitting EOF before EOR. In the P_OK case, the function sets *skipped_bytes_out to the number of data bytes that were passed over while searching for EOR.
Perror_t P_io_skip_bytes(P_t *pads, size_t width, size_t *skipped_bytes_out)
This function advances the current IO position by the specified number of bytes, or if that many bytes cannot be skipped, then by as many bytes as are available. It sets *skipped_bytes_out to the number of bytes skipped. It returns P_OK if the requested bytes were skipped, P_ERR if fewer than the requested bytes were skipped. For record-based disciplines, the function does NOT advance the IO position beyond the current record.
int P_io_at_eor(P_t *pads)
This function returns 1 if the current IO position is at EOR; otherwise it returns 0.
int P_io_at_eof(P_t *pads)
This function returns 1 if the current IO position is at EOF; otherwise it returns 0.
int P_io_at_eor_or_eof(P_t *pads)
This function returns 1 if the current IO position is at EOR or EOF; otherwise it returns 0.
const char * P_io_read_unit(P_t *pads)
This function provides a description of the read unit used in Ppos_t (e.g., "line", "1K block", etc.). Returns NULL on error (if there is no installed IO discipline).
Perror_t P_io_getPos(P_t *pads, Ppos_t *pos, int offset)
This function fills in *pos with the current IO position. If offset is zero, the current IO position is used, otherwise the position used is offset bytes from the current IO position. The current IO position does not change. P_ERR is returned if information about the specified position cannot be determined. EOR marker bytes (if any) are ignored when moving forward or back based on offset: offset only refers to data bytes.
Perror_t P_io_getLocB(P_t *pads, Ploc_t *loc, int offset)
This function fills in loc->b with the IO position. See the description of P_io_getPos for a description of the offset parameter.
Perror_t P_io_getLocE(P_t *pads, Ploc_t *loc, int offset)
This function fills in loc->e with the IO position. See the description of P_io_getPos for a description of the offset parameter.
Perror_t P_io_getLoc(P_t *pads, Ploc_t *loc, int offset)
This function fills in both loc->b and loc->e with the IO position. See the description of P_io_getPos for a description of the offset parameter.
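The return convention described for P_io_skip_bytes (skip as much as possible, report the count, and signal an error on a short skip) can be modeled with a toy cursor in plain C. This is only a sketch of the documented contract, not the Pads implementation: toy_cursor, toy_skip_bytes, TOY_OK, and TOY_ERR are hypothetical stand-ins, and the record is reduced to a byte count.

```c
#include <stddef.h>

/* Toy stand-in for an input cursor; not a Pads type. */
typedef struct {
    size_t pos;      /* current position within the record */
    size_t rec_len;  /* number of data bytes in the current record */
} toy_cursor;

/* Hypothetical status codes standing in for P_OK and P_ERR. */
enum { TOY_OK = 0, TOY_ERR = -1 };

/* Skip up to width bytes without leaving the current record.
 * Stores the number of bytes actually skipped in *skipped_out and
 * returns TOY_OK only if all width bytes were skipped. */
int toy_skip_bytes(toy_cursor *c, size_t width, size_t *skipped_out)
{
    size_t avail = c->rec_len - c->pos;
    size_t n = width < avail ? width : avail;

    c->pos += n;
    *skipped_out = n;
    return n == width ? TOY_OK : TOY_ERR;
}
```

A caller that needs exactly width bytes checks the return value; a caller that only wants to know how far the cursor moved reads the count, which is set in both cases.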

14.1  Compiled regular expressions

The scan and read functions that take regular expressions as arguments require pointers to compiled regular expressions of type Pregexp_t*.

A Pregexp_t contains two things:

  1. a boolean, valid, which indicates whether the Pregexp_t contains a valid compiled regular expression.
  2. some private state (an internal representation of the compiled regular expression), which users of the library should ignore.

typedef struct Pregexp_s {
  int                  valid;
  P_REGEXP_T_PRIVATE_STATE;
} Pregexp_t;

If my_regexp.valid is non-zero, then my_regexp requires cleanup when no longer needed.

Upon declaring a Pregexp_t, one should set valid to 0. You can do this directly, as in:

Pregexp_t my_regexp = { 0 };

or you can use the preferred method, which is to use the following macro:

P_REGEXP_DECL_NULL(my_regexp);

When through with a Pregexp_t, one should call Pregexp_cleanup, as in:

Pregexp_cleanup(pads, &my_regexp);

to clean up any private state that may have been allocated.

The following functions are used to compile a string into a Pregexp_t and to clean up a Pregexp_t when it is no longer needed. They should be passed a pointer to a properly initialized (null or valid) Pregexp_t.

Perror_t Pregexp_compile(P_t *pads, const Pstring *regexp_str, Pregexp_t *regexp)
If regexp_str is a string containing a valid regular expression, this function fills in (*regexp) and returns P_OK. If the string is not a valid regular expression, it returns P_ERR.
Perror_t Pregexp_compile_cstr(P_t *pads, const char *regexp_str, Pregexp_t *regexp)
This function is like Pregexp_compile, but it takes a const char* argument rather than a const Pstring* argument.
Perror_t Pregexp_cleanup(P_t *pads, Pregexp_t *regexp)
This function deallocates resources associated with regexp.

Both compile functions will perform a cleanup action if regexp->valid is non-zero prior to doing the compilation, and they both set regexp->valid to 0 if the compilation fails and to 1 if it succeeds.

Note that if you use a Pregexp_t to hold more than one compiled regular expression over time, you only need to call Pregexp_cleanup after the final use.
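The lifecycle described above (initialize valid to 0, let each compile clean up the previous expression, and call cleanup once at the end) can be illustrated with an analog built on POSIX regcomp and regfree. This is a sketch of the pattern only: toy_regexp, toy_compile, and toy_cleanup are hypothetical names, and nothing here reflects how Pads stores its private state.

```c
#include <regex.h>

/* Toy analog of Pregexp_t built on POSIX regex; not the Pads type. */
typedef struct {
    int valid;    /* non-zero while re holds a compiled expression */
    regex_t re;   /* private state */
} toy_regexp;

/* Release the compiled expression, if any, and mark the slot empty. */
void toy_cleanup(toy_regexp *r)
{
    if (r->valid) {
        regfree(&r->re);
        r->valid = 0;
    }
}

/* Compile pattern into *r, cleaning up any previous expression first.
 * Sets r->valid to 1 on success and leaves it at 0 on failure.
 * Returns 0 on success, -1 if the pattern is invalid. */
int toy_compile(toy_regexp *r, const char *pattern)
{
    toy_cleanup(r);
    if (regcomp(&r->re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    r->valid = 1;
    return 0;
}
```

Because the compile step releases the old expression itself, one toy_regexp can be reused for several patterns in a row, and a single toy_cleanup after the last use is enough.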

14.1.1  Regular expression macros

The P_RE_STRING_FROM macros convert their char or string args into strings containing regular expressions that match exactly the specified character or string. The string result is in temporary storage, so it should be used immediately (e.g., in a Pregexp_compile_cstr call).

const char* P_RE_STRING_FROM_CHAR(P_t *pads, Pchar char_expr)
This function produces a regular expression string that matches a single character. Example: P_RE_STRING_FROM_CHAR(pads, 'a') returns string "/[a]/".
const char* P_RE_STRING_FROM_CSTR(P_t *pads, const char * str_expr)
Produces a regular expression string that matches a string. Example: P_RE_STRING_FROM_CSTR(pads, "abc") returns string "/abc/l".
const char* P_RE_STRING_FROM_STR(P_t *pads, Pstring *str_expr)
Same as above, but takes a Pstring* rather than a const char*.
void P_REGEXP_FROM_CHAR(P_t *pads, Pregexp_t my_regexp, Pchar char_expr)
void P_REGEXP_FROM_CSTR(P_t *pads, Pregexp_t my_regexp, const char * str_expr)
void P_REGEXP_FROM_STR(P_t *pads, Pregexp_t my_regexp, Pstring * str_expr)

The P_REGEXP_FROM macros do the above conversions, and then do the added step of compiling the result into the Pregexp_t my_regexp. In each case, one can check my_regexp.valid after the macro call to check whether the result is a valid compiled regular expression.
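The following sketch in plain C reproduces the two documented P_RE_STRING_FROM results ("/[a]/" and "/abc/l") and shows why a result held in temporary storage should be consumed before the next conversion. The function names and the single shared buffer are assumptions made for illustration; the source does not say how Pads manages its temporary storage or how it treats characters that are special inside a regular expression.

```c
#include <stdio.h>

/* Shared scratch buffer: every call overwrites the previous result.
 * Assumption for illustration only; not the Pads implementation. */
static char toy_re_buf[256];

/* Build "/[c]/" for a single character c. */
const char *toy_re_string_from_char(char c)
{
    snprintf(toy_re_buf, sizeof toy_re_buf, "/[%c]/", c);
    return toy_re_buf;
}

/* Build "/str/l" for a string str (truncated if it does not fit). */
const char *toy_re_string_from_cstr(const char *str)
{
    snprintf(toy_re_buf, sizeof toy_re_buf, "/%s/l", str);
    return toy_re_buf;
}
```

In this model, a pointer returned by one call refers to different text after the next call, so the safe pattern is to pass each result straight to the consumer rather than storing it.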

