Pstructs are used to describe sequences of values with potentially unrelated types. Intuitively, they correspond to record-like structures externally and C-structs in memory.
The syntax for Pstructs is given by the following BNF grammar fragment:
qualifier | ::= | Pomit ∣ Pendian |
qualifiers | ::= | qualifier ∣ qualifier qualifiers |
constraint | ::= | : predicate |
ty | ::= | c_ty ∣ p_ty |
full_field | ::= | [qualifiers] p_ty identifier [constraint] ; [p_comment] |
comp_field | ::= | Pcompute [Pomit] ty identifier = expression [constraint] ; |
literal_field | ::= | p_coreliteral; |
array_field | ::= | [qualifiers] p_ty ‘[’p_size_spec‘]’ identifier [: p_array_constraints] ; [p_comment] |
opt_field | ::= | [qualifiers] Popt p_ty identifier [: opt_predicates] ; [p_comment] |
field | ::= | full_field ∣ comp_field ∣ literal_field ∣ array_field ∣ opt_field |
fields | ::= | field ∣ field fields |
struct_ty | ::= | Pstruct identifier [p_formals] { fields } [ Pwhere { predicate }] ; |
We explain the meaning of this syntax in the remainder of this chapter. All non-terminals not defined in this grammar fragment are defined elsewhere. Predicates (predicate) are described in Section 3.3. Pads types (p_ty) and formal parameters (p_formals) are described in Section 3.6. Pads comments (p_comment) are described in Section 3.2. Core literals (p_coreliteral) are described in Section 3.4. For in-line arrays, size specifications (p_size_spec) and array constraints appear in Chapter 7. Option constraints (opt_predicates) are defined in Section 9.1. Expressions (expression) represent any C expression, while c_ty denotes any C type.
The following Pstruct describes the request portion of a common-log format web-server log, an example of which is:
GET /research.att.com/projects/PADS/index.html HTTP/1.0
Pstruct http_request_t {
  '\"'; http_method_t meth;       /- Method used during request
  ' ';  Pstring(:' ':) req_uri;   /- Requested uri.
  ' ';  http_v_t version : checkVersion(version, meth);
                                  /- HTTP version
  '\"';
};
The Pstruct http_request_t has full fields meth, req_uri, and version that use the (omitted) auxiliary types http_method_t, Pstring, and http_v_t to describe the HTTP method, URI, and version formats, respectively. It has literal fields '\"' and ' ' to describe the quotations and spaces in the external representation. The version field uses the C function checkVersion:
int checkVersion(http_v_t version, http_method_t meth) {
  if ((version.major == 1) && (version.minor == 0)) return 1;
  if ((meth == LINK) || (meth == UNLINK)) return 0;
  return 1;
}
to ensure that the obsolete HTTP methods LINK and UNLINK are only used with HTTP version 1.0.
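The behavior of this predicate can be checked outside of Pads with a small C sketch. The definitions of http_method_t and http_v_t below are stand-ins invented for illustration, since the real declarations are omitted from this chapter; make_version and checkVersion_examples are likewise hypothetical helpers. Only the body of checkVersion is taken from the text.

```c
#include <assert.h>

/* Stand-in types: the real http_method_t and http_v_t are generated from
   Pads descriptions that this chapter omits. */
typedef enum { GET, PUT, POST, HEAD, LINK, UNLINK } http_method_t;

typedef struct {
    unsigned major;
    unsigned minor;
} http_v_t;

/* Same logic as the predicate shown in the text. */
int checkVersion(http_v_t version, http_method_t meth) {
    if ((version.major == 1) && (version.minor == 0)) return 1;
    if ((meth == LINK) || (meth == UNLINK)) return 0;
    return 1;
}

/* Hypothetical helper that builds a version value. */
http_v_t make_version(unsigned major, unsigned minor) {
    http_v_t v;
    v.major = major;
    v.minor = minor;
    return v;
}

/* Usage: three cases that follow directly from the predicate. */
void checkVersion_examples(void) {
    assert(checkVersion(make_version(1, 0), LINK) == 1); /* LINK with 1.0 is legal   */
    assert(checkVersion(make_version(1, 1), LINK) == 0); /* LINK with 1.1 is illegal */
    assert(checkVersion(make_version(1, 1), GET) == 1);  /* GET is legal with 1.1    */
}
```

Every method is accepted with version 1.0, so the second test in the predicate only ever runs for other versions.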
Each full field in a Pstruct must include the name of the field and its type. The name serves to document the data and to permit later reference. The type determines how that piece of the Pstruct will be processed. Optionally, each full field may be preceded by a qualifier sequence (cf. Section 5.1.6).
Each full field may be followed by a constraint (cf. Section 3.3). Such a constraint is used to express the conditions under which a properly parsed value of the field type is a legal value for the field. The field itself and all earlier fields in the Pstruct are in scope in the constraint, as are any parameters to the Pstruct. In the example, the checkVersion predicate on the version field uses the values of the meth and version fields to determine if the version value is legal. If the constraint associated with a field evaluates to false (i.e., zero) after parsing, then the parse descriptor returned with the in-memory representation will indicate a user-constraint violation has occurred for the field.
Each full field in a Pstruct may optionally be followed by a Pads comment. Such comments are reflected by the Pads compiler into the output library as comments.
Instead of being read from an external source, the value of a computed field is set from an initializing expression. Such fields are marked by the Pcompute keyword. Each such field gives its name and the type to be included in the in-memory representation. If the given type is a Pads type, the field will behave exactly as if it were read from the external source. With a C type, some services may not be available in the generated library, such as automatic accumulation and printing. Each computed field also gives a C expression to initialize the field. This expression must have the type declared for the field. Previously read fields in the Pstruct and any parameters to the Pstruct are in scope in this expression. Like full fields, computed fields admit the Pomit qualifier and may have an associated constraint.
The computeExample Pstruct sets the value of its computed field index from the full field base and the offset parameter.
Pstruct computeExample(:int offset:) {
  Pint32 base;
  Pcompute int index = base + offset;
};
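To make the evaluation order concrete, the following C sketch mimics by hand what a reader for computeExample has to do: read base from the input, then evaluate the initializing expression with base and offset in scope. It is not the code that the Pads compiler generates. The names computeExample_rep and computeExample_fill are hypothetical, and the use of strtol to read a decimal integer is an assumption of the sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for the in-memory representation of computeExample.
   Only the field names base and index come from the description. */
typedef struct {
    int32_t base;   /* full field: read from the input         */
    int     index;  /* computed field: never read from input   */
} computeExample_rep;

/* Hypothetical reader: parses a decimal value for base, then evaluates
   the initializing expression base + offset.
   Returns 0 on success, -1 if no digits were found. */
int computeExample_fill(const char *text, int offset, computeExample_rep *rep) {
    char *end = NULL;
    long value;

    assert(text != NULL && rep != NULL);
    value = strtol(text, &end, 10);
    if (end == text) return -1;        /* nothing parsed for base */

    rep->base  = (int32_t)value;
    rep->index = rep->base + offset;   /* earlier field and parameter in scope */
    return 0;
}
```

Calling computeExample_fill("40", 2, &rep) leaves rep.base at 40 and rep.index at 42; the index value never appears in the input.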
Literal fields can be character, string, or regular expression literals. They are written using the notation described in Section 3.4.
In addition to specifying literals to consume from the external representation, literal fields also play a role in error recovery. If the generated parser encounters a syntactic error while parsing a full field, causing it to enter panic mode (cf. Section 3.10), the parser will scan to find the next literal, marking all intervening fields as errors in the associated parse descriptor. The library discipline has parameters that allow the library user to tune the extent of such scanning (cf. Section 15.1.5).
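The scanning step can be pictured with a short C sketch. The function scan_to_literal and its max_scan parameter are hypothetical and are not part of the Pads library; max_scan only stands in for the kind of tunable limit the discipline provides. The sketch handles a single-character literal and ignores string and regular-expression literals.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: starting at pos, look for the literal character lit,
   examining at most max_scan bytes of buf.
   Returns the index of the literal, or -1 if it is not found in range. */
long scan_to_literal(const char *buf, size_t len, size_t pos,
                     char lit, size_t max_scan) {
    size_t i;
    size_t limit;

    assert(buf != NULL);
    if (pos >= len) return -1;

    /* Stop at the end of the buffer or after max_scan bytes,
       whichever comes first. */
    limit = (max_scan < len - pos) ? pos + max_scan : len;
    for (i = pos; i < limit; i++) {
        if (buf[i] == lit) return (long)i;
    }
    return -1;
}
```

With the input "12x4|99" and the literal '|', a scan from position 0 returns index 4 when max_scan is at least 5, and -1 when max_scan is 3. A smaller limit bounds the cost of recovery but gives up sooner.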
For conciseness, Pads allows anonymous option and array types to be declared within Pstruct field declarations.
In-line array declarations include a size specification after the type of the field. For example, the following Pstruct matches a resolved IP address (of the form 135.27.24.12) and an integer recording a number of bytes, separated by a vertical bar:
Pstruct log_t {
  Puint8 [4] ip : Psep('.') && Pterm(Pnosep);  /- resolved ip address
  '|';
  Puint32 numBytes;
};
After the field name, Pads permits an optional colon followed by array constraints. Details about size specifications, array constraints, and the in-memory representation of arrays may be found in Chapter 7.
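As a rough illustration of the input that log_t accepts, the hand-written C sketch below parses four dot-separated values in the range 0 to 255, a vertical bar, and a 32-bit unsigned integer. It is not the parser that Pads generates. The names log_rep, parse_uint, and log_parse are hypothetical, and the sketch reports only success or failure rather than filling in a parse descriptor.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for the in-memory form; only the field names come from log_t. */
typedef struct {
    uint8_t  ip[4];
    uint32_t numBytes;
} log_rep;

/* Parse an unsigned decimal no larger than max and advance *p past it.
   Returns 0 on success, -1 otherwise. */
static int parse_uint(const char **p, unsigned long max, unsigned long *out) {
    char *end = NULL;
    unsigned long v;

    if (**p < '0' || **p > '9') return -1;   /* reject signs and whitespace */
    errno = 0;
    v = strtoul(*p, &end, 10);
    if (errno == ERANGE || v > max) return -1;
    *out = v;
    *p = end;
    return 0;
}

/* Returns 0 if text starts with a well-formed log_t value, -1 otherwise. */
int log_parse(const char *text, log_rep *rep) {
    const char *p = text;
    unsigned long v;
    int i;

    assert(text != NULL && rep != NULL);
    for (i = 0; i < 4; i++) {                 /* size specification [4] */
        if (i > 0) {
            if (*p != '.') return -1;         /* separator between elements */
            p++;
        }
        if (parse_uint(&p, 255UL, &v) != 0) return -1;
        rep->ip[i] = (uint8_t)v;
    }
    if (*p != '|') return -1;                 /* literal field */
    p++;
    if (parse_uint(&p, 4294967295UL, &v) != 0) return -1;
    rep->numBytes = (uint32_t)v;
    return 0;
}
```

For the input "135.27.24.12|1024" the sketch fills ip with 135, 27, 24, 12 and sets numBytes to 1024. An input with only three address components fails at the point where the fourth separator is expected.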
In-line options are marked by the keyword Popt. For example, the following Pstruct matches two optional integers separated by a vertical bar and terminated by a newline.
Precord Pstruct entry2 {
  Popt Puint32 f;
  '|';
  Popt Puint32 g;
};
This declaration is equivalent to the entry1 type defined in Section 9.1.1. Fields with in-line option declarations admit the option form of constraints, which are described in Section 9.1.2. As an example, the Pstruct entry4
Precord Pstruct entry4 {
  Popt Puint32 x1 : Psome i => { i % 2 == 0};
  Popt Puint32 x2 : Psome i => { i % 2 != 0};
  '|';
  Popt Puint32 y1 : Psome i => { i % 2 == 0};
  Popt Puint32 y : Psome i => { i % 2 != 0};
  '|';
};
uses option constraints to specify when the option should match. Type entry4 is equivalent to the type entry3 defined in Section 9.1.1.
Details about the in-memory representation of options appear in Section 9.2.2.
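The following C sketch shows one way to think about an optional integer guarded by a predicate. It assumes that a value rejected by the predicate means the option does not match and that no input is consumed in that case, which is how the description above reads. The opt_uint32 type is a hypothetical stand-in and is not the representation defined in Section 9.2.2; parse_opt_uint32, is_even, and is_odd are likewise invented for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an optional Puint32. */
typedef struct {
    int      present;  /* 1 if a value matched, 0 otherwise     */
    uint32_t value;    /* meaningful only when present is 1     */
} opt_uint32;

typedef int (*uint32_pred)(uint32_t);

int is_even(uint32_t i) { return i % 2 == 0; }
int is_odd(uint32_t i)  { return i % 2 != 0; }

/* Try to match a decimal integer at *p. If pred is non-NULL and rejects
   the value, the option does not match and *p is left unchanged. */
opt_uint32 parse_opt_uint32(const char **p, uint32_pred pred) {
    opt_uint32 result = {0, 0};
    const char *q;
    uint32_t v = 0;

    assert(p != NULL && *p != NULL);
    q = *p;
    if (!isdigit((unsigned char)*q)) return result;  /* no digits: absent */
    while (isdigit((unsigned char)*q)) {
        v = v * 10u + (uint32_t)(*q - '0');  /* overflow not checked here */
        q++;
    }
    if (pred != NULL && !pred(v)) return result;     /* rejected: absent */

    result.present = 1;
    result.value = v;
    *p = q;                                          /* consume the digits */
    return result;
}
```

On the input "12|", an option guarded by is_odd is absent and leaves the cursor on the 1, so a following option guarded by is_even can still match 12. An unguarded option at the '|' is absent because there are no digits to read.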
Non-literal fields can take one or more qualifiers.
If given, a Pwhere clause expresses constraints over the entirety of a Pstruct value. The values of all previous fields and any parameters to the Pstruct are in scope. Within the context of a Pparsecheck clause, the constants begin and end, each of type Ppos_t, are available. Constant begin is bound to the input position of the beginning of the Pstruct; end is bound to its end. If the predicate given in the Pwhere clause evaluates to false (i.e., zero), the error code in the associated parse descriptor will indicate that a user-constraint error has occurred.
The Pwhere clause in the whereExample Pstruct ensures that the sum of the first two fields is less than the given limit.
Pstruct whereExample(:int limit:) {
  Pint32 first;
  Pint32 second;
} Pwhere {first + second < limit;};
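The check that this Pwhere clause expresses can be sketched in plain C as a function applied once both fields have values. The names whereExample_rep, demo_where_result, and whereExample_where are hypothetical, and the two result codes are placeholders rather than Pads error codes. The sketch widens the sum to 64 bits so that it cannot overflow; that is a choice made here, not something the Pads description states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder result codes, invented for this sketch. */
typedef enum {
    DEMO_WHERE_OK = 0,
    DEMO_WHERE_VIOLATED
} demo_where_result;

/* Stand-in for the in-memory form; field names come from whereExample. */
typedef struct {
    int32_t first;
    int32_t second;
} whereExample_rep;

/* Evaluate the whole-struct predicate after both fields are filled in.
   The parameter limit is in scope, just as in the Pwhere clause. */
demo_where_result whereExample_where(const whereExample_rep *rep, int limit) {
    int64_t sum;

    assert(rep != NULL);
    sum = (int64_t)rep->first + (int64_t)rep->second;  /* no overflow */
    return (sum < (int64_t)limit) ? DEMO_WHERE_OK : DEMO_WHERE_VIOLATED;
}
```

With limit set to 10, the values 3 and 4 satisfy the predicate, while 6 and 4 violate it because the comparison is strict.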
The in-memory representation of a Pstruct is a C struct of the same name. Each field of the C struct corresponds to a full or computed field of the Pstruct. The type of each full field in the C struct is the in-memory representation of the Pads type associated with the field. The type of each computed field is the given C type.
The C type http_request_t is the in-memory representation of the Pads type of the same name.
The type Pstring is the in-memory representation of the base type Pstring (cf. Chapter 4). Note that literal fields do not appear in the in-memory representation.
The mask of a Pstruct with name myStruct is a C struct with name myStruct_m. For each full field in myStruct, there is a corresponding field in the mask struct, the type of which is the mask type for the field. In addition, there is a structLevel field, which has the base mask type. This field allows library users to toggle operations at the level of the structure as a whole.
For example, the mask type http_request_t_m has the following structure:
typedef struct http_request_t_m_s http_request_t_m;
struct http_request_t_m_s {
  Pbase_m structLevel;
  http_method_t_m meth;     /* nested constraints */
  Pbase_m req_uri;          /* nested constraints */
  http_v_t_m version;       /* nested constraints */
  Pbase_m version_con;      /* struct constraints */
};
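The idea of initializing a mask uniformly and then adjusting individual entries can be sketched with stand-in types. Everything prefixed with demo_ or DEMO_ below is hypothetical: the flag values are invented, every entry is flattened to a single base mask, and the Pads handle argument that the generated http_request_t_m_init takes is dropped. Only the field names are taken from the mask type above.

```c
#include <assert.h>
#include <stddef.h>

/* Invented flag values; the real base mask flags are defined by Pads. */
typedef unsigned long demo_base_m;
#define DEMO_SET   0x1UL   /* fill in the in-memory representation */
#define DEMO_CHECK 0x2UL   /* check constraints                    */

/* Simplified mask: nested mask types are flattened to demo_base_m. */
typedef struct {
    demo_base_m structLevel;
    demo_base_m meth;
    demo_base_m req_uri;
    demo_base_m version;
    demo_base_m version_con;
} demo_request_m;

/* Set every entry of the mask to the same base mask. */
void demo_request_m_init(demo_request_m *mask, demo_base_m baseMask) {
    assert(mask != NULL);
    mask->structLevel = baseMask;
    mask->meth        = baseMask;
    mask->req_uri     = baseMask;
    mask->version     = baseMask;
    mask->version_con = baseMask;
}

/* Usage: start from "set and check everything", then stop checking
   the constraint attached to the version field. */
void demo_request_m_example(demo_request_m *mask) {
    demo_request_m_init(mask, DEMO_SET | DEMO_CHECK);
    mask->version_con &= ~DEMO_CHECK;
}
```

Starting from a uniform mask and clearing individual bits keeps the common case short while still allowing per-field control.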
The parse descriptor of a Pstruct with name myStruct is a C struct with name myStruct_pd. This struct has the fields described in Section 3.13. In addition, for each full field in myStruct, there is a corresponding field in the parse descriptor struct, the type of which is the parse descriptor type for the field.
For example, the parse descriptor type http_request_t_pd has the following structure:
typedef struct http_request_t_pd_s http_request_t_pd;
struct http_request_t_pd_s {
  Pflags_t pstate;
  Puint32 nerr;
  PerrCode_t errCode;
  Ploc_t loc;
  http_method_t_pd meth;
  Pbase_pd req_uri;
  http_v_t_pd version;
};
The operations generated by the Pads compiler for a Pstruct are those described in Chapter 3. For the Pstruct http_request_t, the prototypes of the generated functions appear in Figure 5.1.
Perror_t http_request_t_init       (P_t *pads, http_request_t *rep);
Perror_t http_request_t_pd_init    (P_t *pads, http_request_t_pd *pd);
Perror_t http_request_t_cleanup    (P_t *pads, http_request_t *rep);
Perror_t http_request_t_pd_cleanup (P_t *pads, http_request_t_pd *pd);
Perror_t http_request_t_copy       (P_t *pads, http_request_t *rep_dst,
                                    http_request_t *rep_src);
Perror_t http_request_t_pd_copy    (P_t *pads, http_request_t_pd *pd_dst,
                                    http_request_t_pd *pd_src);
void     http_request_t_m_init     (P_t *pads, http_request_t_m *mask,
                                    Pbase_m baseMask);
Perror_t http_request_t_read       (P_t *pads, http_request_t_m *m,
                                    http_request_t_pd *pd, http_request_t *rep);
ssize_t  http_request_t_write2buf  (P_t *pads, Pbyte *buf, size_t buf_len, int *buf_full,
                                    http_request_t_pd *pd, http_request_t *rep);
ssize_t  http_request_t_write2io   (P_t *pads, Sfio_t *io,
                                    http_request_t_pd *pd, http_request_t *rep);
int      http_request_t_verify     (http_request_t *rep);
int      http_request_t_genPD      (P_t *pads, http_request_t *rep,
                                    http_request_t_pd *pd);
The error codes for Pstructs are:
Code | Meaning |
P_NO_ERR | Indicates no error occurred. |
P_STRUCT_FIELD_ERR | Indicates that an error occurred during parsing one of the fields of the Pstruct. The parse descriptor for each full field with an error will contain more information describing the precise nature of the error. |
P_STRUCT_EXTRA_BEFORE_SEP | Indicates that there were unexpected data before a literal field in the Pstruct. |
P_MISSING_LITERAL | Indicates that the read function failed to find a literal field. |
If multiple errors occur during the parsing of a Pstruct, the errCode field will reflect the first detected error. The parse descriptors for nested pieces will describe any errors detected while reading those pieces.
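The first-error rule can be sketched as a small bookkeeping routine. The types and names below are hypothetical stand-ins whose enumerators merely echo the codes in the table above, and the sketch assumes that nerr counts every detected error while errCode is written only once.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in codes that echo the table above; values are invented. */
typedef enum {
    DEMO_NO_ERR = 0,
    DEMO_STRUCT_FIELD_ERR,
    DEMO_STRUCT_EXTRA_BEFORE_SEP,
    DEMO_MISSING_LITERAL
} demo_err_code;

/* Stand-in for the struct-level part of a parse descriptor. */
typedef struct {
    unsigned      nerr;     /* assumed: number of errors detected */
    demo_err_code errCode;  /* first error detected, if any       */
} demo_struct_pd;

void demo_pd_init(demo_struct_pd *pd) {
    assert(pd != NULL);
    pd->nerr = 0;
    pd->errCode = DEMO_NO_ERR;
}

/* Record one error: count it, but keep only the first code. */
void demo_pd_record(demo_struct_pd *pd, demo_err_code code) {
    assert(pd != NULL && code != DEMO_NO_ERR);
    if (pd->nerr == 0) pd->errCode = code;
    pd->nerr++;
}
```

If a missing literal is recorded first and a field error second, nerr ends at 2 and errCode still reports the missing literal; details of the later error would have to be read from the nested descriptors.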
Warning: At the moment, read functions do not check that all referenced data in constraint expressions are meaningful before checking the constraint. Referenced data might be meaningless either because there was an error parsing earlier data or because the supplied mask directed the read function to skip the field.
Accumulator functions for Pstructs are described in Chapter 16.
Histogram functions for Pstructs are described in Chapter 17.
Clustering functions for Pstructs are described in Chapter 18.