mirror of
https://github.com/php/web-php.git
synced 2026-03-24 07:12:16 +01:00
148 lines
6.7 KiB
Plaintext
Migrating functionality:

Adding new internal functions to the parser involves the following:
1. Writing the function code (described below). The code shouldn't go
   into internal_functions.c (which contains the internal function API
   implementation), but rather into basic_functions.c or some new .c file.

2. Registering the function in the internal_functions[] array, located in
   internal_functions.c. The first field of an internal_function entry is
   the function name, which should be a lower-case string. The second is a
   function pointer of type (void (*)(INTERNAL_FUNCTION_PARAMETERS)) to the
   C function that handles this function. The third is whether or not the
   function is compiled in this compilation.

3. Adding an extern declaration for the added function in
   internal_functions.h or some .h file that is included from it.

4. Updating the Makefile with the proper dependencies, if any (for
   instance, if a new .h file is included from internal_functions.h, this
   file should be added to internal_functions.h's dependency list).

No changes to the lexical and syntactic scanners are required (they
wouldn't be recompiled either if dependencies are properly kept).

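The registration steps above can be sketched roughly as follows. This is
only an illustration: the HashTable stand-in, the field names (name,
handler, compiled_in), the NULL terminator and find_internal_function()
are assumptions made for the example, not the parser's real definitions.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; the real definitions live in the parser's headers. */
typedef struct { int unused; } HashTable;
#define INTERNAL_FUNCTION_PARAMETERS HashTable *ht

/* One registration entry; the field names are assumptions for this sketch. */
typedef struct {
    const char *name;                               /* lower-case function name */
    void (*handler)(INTERNAL_FUNCTION_PARAMETERS);  /* C function that handles it */
    int compiled_in;                                /* non-zero if built in */
} internal_function;

int foo_calls = 0;  /* lets the example observe that the handler ran */

/* Step 1: the function code, kept outside internal_functions.c. */
void php3_foo(INTERNAL_FUNCTION_PARAMETERS)
{
    (void) ht;  /* a real function would fetch its arguments from ht */
    foo_calls++;
}

/* Step 2: the registration table (the NULL terminator is an assumption). */
internal_function internal_functions[] = {
    { "foo", php3_foo, 1 },
    { NULL,  NULL,     0 }
};

/* Illustrative lookup: return the compiled-in entry for name, or NULL. */
internal_function *find_internal_function(const char *name)
{
    internal_function *f;

    for (f = internal_functions; f->name != NULL; f++) {
        if (f->compiled_in && strcmp(f->name, name) == 0) {
            return f;
        }
    }
    return NULL;
}
```

A caller would look the name up and, if an entry is found, invoke
f->handler(ht) with the argument hash table.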
The API itself:

We've already implemented a few functions, so one way to understand it
would be to look at basic_functions.c. A function basically does three
things: it accepts arguments, executes the actual function code, and
returns a result. All arguments in our parser are of type YYSTYPE *.
YYSTYPE has a .value member, which is a union of all possible values of
the variable. It also contains a .type member, which denotes the
currently active type of the variable.

Variables are passed to internal functions using the HashTable structure.
However, one should not mess with the HashTable directly in order to obtain
these arguments, but use the getParameters() function instead.
getParameters() accepts the hash table, the number of expected arguments
and a list of YYSTYPE **, and sets each of these YYSTYPE pointers to the
corresponding argument. For example, if one expects two arguments, the
beginning of the function would look similar to this (the hashtable pointer
is called ht):

void php3_foo(INTERNAL_FUNCTION_PARAMETERS)
{
    YYSTYPE *arg1, *arg2;
    if (getParameters(ht,2,&arg1,&arg2)==FAILURE) {
        WRONG_PARAM_COUNT;
    }
    ...
}

getParameters() accepts any number of arguments, but the number of
pointers passed MUST match the argument_count that's supplied (e.g., if one
writes getParameters(ht,2,&arg1); this would break the program!).
In addition, the macro ARG_COUNT(ht) returns the number of arguments
supplied, which can be used for functions that accept a variable number of
arguments. These kinds of functions can benefit from the
getParametersArray() function. It's similar to getParameters(), only it
accepts an array of YYSTYPE * as an argument, instead of a list of
YYSTYPE ** pointers. E.g., if one calls
getParametersArray(ht,7,yystype_array), the first argument to the function
would be placed in yystype_array[0], the second in yystype_array[1], etc.
Again, the array must be big enough to hold the supplied argument_count.
This can be used to implement functions that accept an arbitrary number of
arguments.

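As a rough sketch of the variable-argument pattern, the following mock sums
however many long arguments it is given. Everything here is a simplified
stand-in written for the example: the HashTable layout, MAX_ARGS, the
php3_sum() name, and the behaviour of getParametersArray() (assumed to fail
when the requested count differs from the supplied count) are assumptions,
not the parser's real code.

```c
#include <stddef.h>

#define SUCCESS  0
#define FAILURE  (-1)
#define IS_LONG  1
#define MAX_ARGS 8   /* hypothetical upper bound used by this sketch */

/* Simplified stand-in for YYSTYPE: a type tag plus a value union. */
typedef struct {
    int type;
    union { long lval; double dval; } value;
} YYSTYPE;

/* Simplified stand-in for HashTable: an ordered list of arguments. */
typedef struct {
    int count;
    YYSTYPE *args[MAX_ARGS];
} HashTable;

#define ARG_COUNT(ht) ((ht)->count)

/* Mock: fills argument_array[0..param_count-1] from the hash table. */
int getParametersArray(HashTable *ht, int param_count, YYSTYPE **argument_array)
{
    int i;

    if (param_count != ht->count) {
        return FAILURE;  /* assumed behaviour on a count mismatch */
    }
    for (i = 0; i < param_count; i++) {
        argument_array[i] = ht->args[i];
    }
    return SUCCESS;
}

YYSTYPE return_value;  /* stands in for the 'return_value' global */

/* Hypothetical sum(a, b, ...): adds up all of its long arguments. */
void php3_sum(HashTable *ht)
{
    YYSTYPE *args[MAX_ARGS];  /* must be able to hold argc entries */
    int i, argc = ARG_COUNT(ht);
    long total = 0;

    if (argc < 1 || argc > MAX_ARGS
        || getParametersArray(ht, argc, args) == FAILURE) {
        return;  /* real code would use WRONG_PARAM_COUNT here */
    }
    for (i = 0; i < argc; i++) {
        total += args[i]->value.lval;  /* the sketch assumes IS_LONG arguments */
    }
    return_value.value.lval = total;
    return_value.type = IS_LONG;
}
```

The bound check against MAX_ARGS comes before the call because the array
passed to getParametersArray() has a fixed size in this sketch.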
Executing the function code will usually involve the argument values.
Using them is easy, but one must remember that only one type is valid for
each argument at any given time.
The long integer value is stored at arg->value.lval, with arg->type set to IS_LONG.
The double value is stored at arg->value.dval, with arg->type set to IS_DOUBLE.
The string value is stored at arg->value.strval, with arg->strlen set to the
length of the string, and arg->type set to IS_STRING.
The array value is stored at arg->value.ht, with arg->type set to IS_ARRAY.
Internal functions can be sent array values (this is kind of alpha, as we
wrote the code while we were writing this line :).

To be sure the arguments are in the expected format, one can use
convert_to_long(arg), convert_to_double(arg) and convert_to_string(arg)
(there are a few other functions, such as convert_double_to_long(), which
would convert a double to a long, but wouldn't convert a string).

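To make the "one valid type at a time" rule concrete, here is a minimal
sketch of what a conversion such as convert_to_long() has to do. It is a
simplified mock written for this example, not the parser's implementation:
the YYSTYPE layout, the numeric values of the IS_* constants, the use of
strtol(), and the assumption that string values are malloc()ed are all
assumptions.

```c
#include <stdlib.h>

#define IS_LONG   1
#define IS_DOUBLE 2
#define IS_STRING 3

/* Simplified stand-in for YYSTYPE: only one union member is valid at a time. */
typedef struct {
    int type;             /* which member of value is currently active */
    unsigned int strlen;  /* string length, meaningful only for IS_STRING */
    union { long lval; double dval; char *strval; } value;
} YYSTYPE;

/* Mock conversion: rewrites arg in place so that value.lval is the active member. */
void convert_to_long(YYSTYPE *arg)
{
    long result;

    switch (arg->type) {
    case IS_LONG:
        return;                            /* already in the expected format */
    case IS_DOUBLE:
        result = (long) arg->value.dval;   /* truncates toward zero */
        break;
    case IS_STRING:
        result = strtol(arg->value.strval, NULL, 10);
        free(arg->value.strval);           /* assumes malloc()ed strings */
        arg->strlen = 0;
        break;
    default:
        result = 0;
        break;
    }
    arg->value.lval = result;
    arg->type = IS_LONG;                   /* switch the active member */
}
```

After the call, only arg->value.lval may be read; the previous double or
string value is gone.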
Return values should be assigned to the 'return_value' global variable, and
its type should be properly set as well. If nothing is assigned to
return_value, the default is the empty string "", which evaluates to FALSE.
Here's a simple example of how to implement a concat() function as
an internal function (completely useless, as it's supported at the parser
level, but a good example):

void php3_concat(INTERNAL_FUNCTION_PARAMETERS)
{
    YYSTYPE *arg1, *arg2;

    if (getParameters(ht,2,&arg1,&arg2)==FAILURE) {
        WRONG_PARAM_COUNT;
    }
    /* make sure both arguments are strings before using .strval */
    convert_to_string(arg1);
    convert_to_string(arg2);
    /* room for both strings plus the terminating NUL */
    return_value.strlen = arg1->strlen+arg2->strlen;
    return_value.value.strval = (char *) malloc (return_value.strlen+1);
    if (!return_value.value.strval) {
        var_reset(&return_value);  /* resets return_value to the empty string */
        return;
    }
    strcpy(return_value.value.strval,arg1->value.strval);
    strcat(return_value.value.strval,arg2->value.strval);
    return_value.type = IS_STRING;
}

Another important difference is that no matter how many arguments are sent
to a given function, the function doesn't have to accept them all, or even
any of them, in order to maintain program flow (there's no stack to be
ruined). The hash table that's used by the internal function is cleaned
automatically at the end of the function call by the internal function call
handler.

That about covers the internal function API.

Another thing that you'd personally have to use is direct access to the
global symbol table, so that you'd be able to add in the POST/GET variables
(with magic quotes) and environment variables at startup. As mentioned,
this symbol table is implemented using a HashTable, and is named
(shockingly) 'symbol_table'.

Adding a new entry to it is simple. Here's a sample function that adds (or
updates) a (variable,value) pair in the global symbol table:
int add_pair(char *varname, char *value)
{
    YYSTYPE var;

    var.value.strval = value;
    var.strlen = strlen(value);
    var.type = IS_STRING;

    /* assumes hash_update() returns SUCCESS or FAILURE */
    return hash_update(&symbol_table, varname, strlen(varname), &var, sizeof(YYSTYPE));
}

Note that the char *value is inserted (indirectly) into the hash, and thus
must not be free()d or changed after the hash_update() call (one can run
yystype_copy_constructor(&var) before calling hash_update(), which
duplicates all of the dynamic memory in the yystype, thus allowing the free
use of char *value after the hash_update() call). The char *varname
(which is used as the symbol table key) is copied inside the hash, and can
be changed and free()d later.

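The ownership rule can be illustrated with a small mock. This is a sketch
under stated assumptions, not the real hash implementation: the one-slot
HashTable, the return conventions (0 on success, -1 on failure), the body
of yystype_copy_constructor() and the add_pair_copy() name are all invented
for the example. It shows the key being copied by the table while the value
string is only duplicated if the copy constructor runs first.

```c
#include <stdlib.h>
#include <string.h>

#define IS_STRING 3

/* Simplified stand-in for YYSTYPE. */
typedef struct {
    int type;
    unsigned int strlen;
    union { long lval; double dval; char *strval; } value;
} YYSTYPE;

/* Mock symbol table with a single slot: enough to show who owns what. */
typedef struct {
    char *key;     /* private copy of the key */
    YYSTYPE data;  /* shallow copy of the caller's YYSTYPE */
} HashTable;

HashTable symbol_table;

/* Mock: copies the key and the YYSTYPE bytes, but not the string they point to. */
int hash_update(HashTable *ht, const char *key, unsigned int key_len,
                void *pData, unsigned int data_size)
{
    char *key_copy;

    if (data_size != sizeof(YYSTYPE)) {
        return -1;
    }
    key_copy = (char *) malloc(key_len + 1);
    if (!key_copy) {
        return -1;
    }
    memcpy(key_copy, key, key_len);
    key_copy[key_len] = '\0';
    free(ht->key);
    ht->key = key_copy;
    memcpy(&ht->data, pData, data_size);
    return 0;
}

/* Mock: gives var its own copy of any dynamic memory it refers to. */
int yystype_copy_constructor(YYSTYPE *var)
{
    if (var->type == IS_STRING) {
        char *dup = (char *) malloc(var->strlen + 1);
        if (!dup) {
            return -1;
        }
        memcpy(dup, var->value.strval, var->strlen + 1);
        var->value.strval = dup;
    }
    return 0;
}

/* Variant of add_pair() that leaves both varname and value owned by the caller. */
int add_pair_copy(char *varname, char *value)
{
    YYSTYPE var;

    var.value.strval = value;
    var.strlen = (unsigned int) strlen(value);
    var.type = IS_STRING;
    if (yystype_copy_constructor(&var) != 0) {
        return -1;
    }
    if (hash_update(&symbol_table, varname, (unsigned int) strlen(varname),
                    &var, sizeof(YYSTYPE)) != 0) {
        free(var.value.strval);  /* the table never took the copy */
        return -1;
    }
    return 0;
}
```

With this variant the caller may overwrite or free() both buffers right
after the call; with the plain add_pair() only varname is safe to reuse.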
Getting the hang of it may take some time, but once you get used to the few
simple rules mentioned, adding new functionality is a breeze.