Extending PHP

Adding functions to PHP

Function Prototype

All functions look like this:

Even if your function doesn't take any arguments, this is how it is called.

Function Arguments

Arguments are always of type pval. This type contains a union which holds the actual value of the argument. So, if your function takes two arguments, you would do something like the following at the top of your function:

Fetching function arguments

NOTE: Arguments can be passed either by value or by reference. In both cases you will need to pass &(pval *) to getParameters. If you want to check whether the n'th parameter was sent to you by reference or not, you can use the function ParameterPassedByReference(ht,n). It will return either 1 or 0.

When you change any of the passed parameters, whether they are sent by reference or by value, you can either start over with the parameter by calling pval_destructor on it, or, if it's an ARRAY you want to add to, you can use functions similar to the ones in internal_functions.h which manipulate return_value as an ARRAY.

Also, if you change a parameter to IS_STRING, make sure you first assign the new estrdup()'ed string and the string length, and only afterwards change the type to IS_STRING. If you change the string of a parameter which is already IS_STRING or IS_ARRAY, you should run pval_destructor on it first.

Variable Function Arguments

A function can take a variable number of arguments. If your function can take either 2 or 3 arguments, use the following:

Variable function arguments

    if (arg_count < 2 || arg_count > 3 ||
        getParameters(ht,arg_count,&arg1,&arg2,&arg3)==FAILURE) {
        WRONG_PARAM_COUNT;
    }

Using the Function Arguments

The type of each argument is stored in the pval type field. This type can be any of the following:

PHP Internal Types

  IS_STRING             String
  IS_DOUBLE             Double-precision floating point
  IS_LONG               Long integer
  IS_ARRAY              Array
  IS_EMPTY              None
  IS_USER_FUNCTION      ??
  IS_INTERNAL_FUNCTION  ??
  IS_CLASS              ??
  IS_OBJECT             ??

(if some of these cannot be passed to a function - delete)
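The type field and the rule about assigning the string and its length before switching the type to IS_STRING can be illustrated with a small, self-contained sketch. Everything here is a hypothetical stand-in: sketch_val, the SK_IS_* constants and the sketch_* functions only mimic the shape described above and are not PHP's real pval definition or API, and plain malloc()/free() take the place of PHP's allocator.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins that mimic the layout described in the text.
 * These are NOT PHP's real definitions. */
enum sketch_type { SK_IS_EMPTY, SK_IS_LONG, SK_IS_DOUBLE, SK_IS_STRING };

typedef struct {
    enum sketch_type type;          /* tag: which union member is valid */
    union {
        long lval;
        double dval;
        struct { char *val; int len; } str;
    } value;
} sketch_val;

/* Release whatever the value owns and reset it to "empty"
 * (plays the role the text gives to pval_destructor). */
void sketch_destroy(sketch_val *v)
{
    assert(v != NULL);
    if (v->type == SK_IS_STRING) {
        free(v->value.str.val);
    }
    v->type = SK_IS_EMPTY;
}

/* Turn the value into a string holding a private copy of `s`.
 * Returns 0 on success, -1 if the copy could not be allocated. */
int sketch_set_string(sketch_val *v, const char *s)
{
    size_t len;
    char *copy;

    assert(v != NULL && s != NULL);
    len = strlen(s);
    copy = malloc(len + 1);
    if (copy == NULL) {
        return -1;                  /* value is left untouched */
    }
    memcpy(copy, s, len + 1);

    sketch_destroy(v);              /* drop any old string first */
    v->value.str.val = copy;        /* 1. assign the string... */
    v->value.str.len = (int)len;    /* 2. ...and its length... */
    v->type = SK_IS_STRING;         /* 3. ...and only then the type */
    return 0;
}

/* Dispatch on the type tag, as a function inspecting its arguments would. */
const char *sketch_type_name(const sketch_val *v)
{
    assert(v != NULL);
    switch (v->type) {
    case SK_IS_LONG:   return "long";
    case SK_IS_DOUBLE: return "double";
    case SK_IS_STRING: return "string";
    case SK_IS_EMPTY:  return "empty";
    }
    return "unknown";
}
```

Changing the tag last means the value never claims to be a string while its pointer and length still hold leftover data from the previous type.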

If you get an argument of one type and would like to use it as another, or if you just want to force the argument to be of a certain type, you can use one of the following conversion functions:

These functions all do in-place conversion. They do not return anything.

The actual argument is stored in a union; the members are:

  IS_STRING: arg1->value.str.val
  IS_LONG:   arg1->value.lval
  IS_DOUBLE: arg1->value.dval

Memory Management in Functions

Any memory needed by a function should be allocated with either emalloc() or estrdup(). These are memory handling abstraction functions that look and smell like the normal malloc() and strdup() functions. Memory should be freed with efree().

There are two kinds of memory in this program: memory which is returned to the parser in a variable, and memory which you need for temporary storage in your internal function. When you assign a string to a variable which is returned to the parser, you need to make sure you first allocate the memory with either emalloc() or estrdup(). This memory should NEVER be freed by you, unless you later overwrite your original assignment in the same function (this kind of programming practice is not good, though).

For any temporary/permanent memory you need in your functions/library you should use the three emalloc(), estrdup(), and efree() functions. They behave EXACTLY like their counterpart functions. Anything you emalloc() or estrdup() you have to efree() at some point or another, unless it's supposed to stick around until the end of the program; otherwise, there will be a memory leak. The meaning of "the functions behave exactly like their counterparts" is: if you efree() something which was neither emalloc()'ed nor estrdup()'ed, you might get a segmentation fault. So please take care and free all of your wasted memory.
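The split between temporary memory and memory handed back to the parser can be sketched as follows. This is only an illustration under stated assumptions: sk_alloc(), sk_strdup() and sk_free() are hypothetical stand-ins for emalloc(), estrdup() and efree(), implemented with the standard C allocator, and sketch_greeting() is an invented function, not part of PHP.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for emalloc()/estrdup()/efree(). The text says
 * the real functions behave like their C library counterparts, so the
 * sketch simply forwards to those. */
static void *sk_alloc(size_t n) { return malloc(n); }
static void sk_free(void *p) { free(p); }

static char *sk_strdup(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = sk_alloc(n);
    if (copy != NULL) {
        memcpy(copy, s, n);
    }
    return copy;
}

/* Builds "Hello, <name>!" and returns it in newly allocated memory.
 * The returned string plays the role of memory handed to the parser:
 * this function must not free it. The scratch buffer is temporary
 * storage and is released before returning. Returns NULL if an
 * allocation fails. */
char *sketch_greeting(const char *name)
{
    size_t cap;
    char *scratch;
    char *result;

    assert(name != NULL);
    cap = strlen(name) + sizeof("Hello, !");   /* sizeof counts the NUL */
    scratch = sk_alloc(cap);                   /* temporary storage */
    if (scratch == NULL) {
        return NULL;
    }
    snprintf(scratch, cap, "Hello, %s!", name);

    result = sk_strdup(scratch);               /* memory that is returned */
    sk_free(scratch);                          /* temporary memory is freed here */
    return result;                             /* NOT freed by this function */
}
```

In this stand-alone sketch the caller releases the result with free(); inside PHP that responsibility would lie with the parser, as the text above describes.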
If you compile with "-DDEBUG", PHP will print out a list of all memory that was allocated using emalloc() and estrdup() but never freed with efree() when it is done running the specified script.

Setting Variables in the Symbol Table

A number of macros are available which make it easier to set a variable in the symbol table:

  SET_VAR_STRING(name,value)
  SET_VAR_DOUBLE(name,value)
  SET_VAR_LONG(name,value)

Be careful with SET_VAR_STRING. The value part must be malloc'ed manually because the memory management code will try to free this pointer later. Do not pass statically allocated memory into a SET_VAR_STRING.

Symbol tables in PHP are implemented as hash tables. At any given time, &symbol_table is a pointer to the 'main' symbol table, and active_symbol_table points to the currently active symbol table (these may be identical, as at startup, or different, if you're inside a function).

The following examples use 'active_symbol_table'. You should replace it with &symbol_table if you specifically want to work with the 'main' symbol table. Also, the same functions may be applied to arrays, as explained below.

Checking whether $foo exists in a symbol table

Finding a variable's size in a symbol table

Arrays in PHP are implemented using the same hashtables as symbol tables. This means the two above functions can also be used to check variables inside arrays.

If you want to define a new array in a symbol table, you should do the following. First, you may want to check whether it exists and abort appropriately, using hash_exists() or hash_find(). Next, initialize the array:

Initializing a new array

This code declares a new array, named $foo, in the active symbol table. This array is empty. Here's how to add new entries to it:

Adding entries to a new array

If you'd like to modify a value that you inserted to a hash, you must first retrieve it from the hash.
To prevent that overhead, you can supply a pval ** to the hash add function, and it'll be updated with the pval * address of the inserted element inside the hash. If that value is NULL (like in all of the above examples), that parameter is ignored.

hash_next_index_insert() uses more or less the same logic as "$foo[] = bar;" in PHP 2.0.

If you are building an array to return from a function, you can initialize the array just like above by doing:

...and then adding values with the helper functions:

Of course, if the adding isn't done right after the array initialization, you'd probably have to look for the array first:

    value.ht... }

Note that hash_find receives a pointer to a pval pointer, and not a pval pointer.

Just about any hash function returns SUCCESS or FAILURE (except for hash_exists(), which returns a boolean truth value).

Returning simple values

A number of macros are available to make returning values from a function easier.

The RETURN_* macros all set the return value and return from the function:

  RETURN
  RETURN_FALSE
  RETURN_TRUE
  RETURN_LONG(l)
  RETURN_STRING(s,dup) - If dup is TRUE, duplicates the string
  RETURN_STRINGL(s,l,dup) - Return string (s) specifying length (l).
  RETURN_DOUBLE(d)

The RETVAL_* macros set the return value, but do not return.

  RETVAL_FALSE
  RETVAL_TRUE
  RETVAL_LONG(l)
  RETVAL_STRING(s,dup) - If dup is TRUE, duplicates the string
  RETVAL_STRINGL(s,l,dup) - Return string (s) specifying length (l).
  RETVAL_DOUBLE(d)

The string macros above will all estrdup() the passed 's' argument, so you can safely free the argument after calling the macro, or alternatively use statically allocated memory.

If your function returns boolean success/error responses, always use RETURN_TRUE and RETURN_FALSE respectively.

Returning complex values

Your function can also return a complex data type such as an object or an array.

Returning an object:

  Call object_init(return_value).
  Fill it up with values. The functions available for this purpose are listed below.
  Possibly, register functions for this object. In order to obtain values from the object, the function would have to fetch "this" from the active_symbol_table. Its type should be IS_OBJECT, and it's basically a regular hash table (i.e., you can use regular hash functions on .value.ht). The actual registration of the function can be done using:

The functions used to populate an object are:

  add_property_long( return_value, property_name, l ) - Add a property named 'property_name', of type long, equal to 'l'
  add_property_double( return_value, property_name, d ) - Same, only adds a double
  add_property_string( return_value, property_name, str ) - Same, only adds a string
  add_property_stringl( return_value, property_name, str, l ) - Same, only adds a string of length 'l'

Returning an array:

  Call array_init(return_value).
  Fill it up with values. The functions available for this purpose are listed below.

The functions used to populate an array are:

  add_assoc_long(return_value,key,l) - add associative entry with key 'key' and long value 'l'
  add_assoc_double(return_value,key,d)
  add_assoc_string(return_value,key,str,duplicate)
  add_assoc_stringl(return_value,key,str,length,duplicate) - specify the string length
  add_index_long(return_value,index,l) - add entry in index 'index' with long value 'l'
  add_index_double(return_value,index,d)
  add_index_string(return_value,index,str)
  add_index_stringl(return_value,index,str,length) - specify the string length
  add_next_index_long(return_value,l) - add an array entry in the next free offset with long value 'l'
  add_next_index_double(return_value,d)
  add_next_index_string(return_value,str)
  add_next_index_stringl(return_value,str,length) - specify the string length

Using the resource list

PHP has a standard way of dealing with various types of resources. This replaces all of the local linked lists in PHP 2.0.
Available functions:

  php3_list_insert(ptr, type) - returns the 'id' of the newly inserted resource
  php3_list_delete(id) - delete the resource with the specified id
  php3_list_find(id,*type) - returns the pointer of the resource with the specified id, updates 'type' to the resource's type

Typically, these functions are used for SQL drivers but they can be used for anything else; for instance, maintaining file descriptors.

Typical list code would look like this:

Adding a new resource

    return_value->value.lval = php3_list_insert((void *) resource, LE_RESOURCE_TYPE);
    return_value->type = IS_LONG;

Using an existing resource

    resource = php3_list_find(resource_id->value.lval, &type);
    if (type != LE_RESOURCE_TYPE) {
        php3_error(E_WARNING,"resource index %d has the wrong type",resource_id->value.lval);
        RETURN_FALSE;
    }
    /* ...use resource... */

Deleting an existing resource

    php3_list_delete(resource_id->value.lval);

The resource types should be registered in php3_list.h, in enum list_entry_type. In addition, one should add shutdown code for any new resource type defined, in list.c's list_entry_destructor() (even if you don't have anything to do on shutdown, you must add an empty case).

Using the persistent resource table

PHP has a standard way of storing persistent resources (i.e., resources that are kept in between hits). The first module to use this feature was the MySQL module, and mSQL followed it, so one can get the general impression of how a persistent resource should be used by reading mysql.c. The functions you should look at are:

  php3_mysql_do_connect
  php3_mysql_connect()
  php3_mysql_pconnect()

The general idea of persistence modules is this:

  Code all of your module to work with the regular resource list described in the previous section.
  Code extra connect functions that check if the resource already exists in the persistent resource list. If it does, register it in the regular resource list as a pointer to the persistent resource list (because of the first step, the rest of the code should work immediately).
  If it doesn't, then create it, add it to the persistent resource list AND add a pointer to it from the regular resource list, so all of the code would work since it's in the regular resource list, but on the next connect, the resource would be found in the persistent resource list and be used without having to recreate it.
  You should register these resources with a different type (e.g. LE_MYSQL_LINK for non-persistent link and LE_MYSQL_PLINK for a persistent link).

If you read mysql.c, you'll notice that except for the more complex connect function, nothing in the rest of the module has to be changed.

The very same interface exists for the regular resource list and the persistent resource list, only 'list' is replaced with 'plist':

  php3_plist_insert(ptr, type) - returns the 'id' of the newly inserted resource
  php3_plist_delete(id) - delete the resource with the specified id
  php3_plist_find(id,*type) - returns the pointer of the resource with the specified id, updates 'type' to the resource's type

However, it's more than likely that these functions would prove to be useless for you when trying to implement a persistent module. Typically, one would want to use the fact that the persistent resource list is really a hash table. For instance, in the MySQL/mSQL modules, when there's a pconnect() call (persistent connect), the function builds a string out of the host/user/passwd that were passed to the function, and hashes the SQL link with this string as a key. The next time someone calls a pconnect() with the same host/user/passwd, the same key would be generated, and the function would find the SQL link in the persistent list.

Until further documented, you should look at mysql.c or msql.c to see how one should use the plist's hash table abilities.

One important thing to note: resources going into the persistent resource list must *NOT* be allocated with PHP's memory manager, i.e., they should NOT be created with emalloc(), estrdup(), etc.
Rather, one should use the regular malloc(), strdup(), etc. The reason for this is simple: at the end of the request (end of the hit), every memory chunk that was allocated using PHP's memory manager is deleted. Since the persistent list isn't supposed to be erased at the end of a request, one mustn't use PHP's memory manager for allocating resources that go to it.

When you register a resource that's going to be in the persistent list, you should add destructors to it both in the non-persistent list and in the persistent list. The destructor in the non-persistent list shouldn't do anything. The one in the persistent list should properly free any resources obtained by that type (e.g. memory, SQL links, etc). Just like with the non-persistent resources, you *MUST* add destructors for every resource, even if it requires no destruction and the destructor would be empty. Remember, since emalloc() and friends aren't to be used in conjunction with the persistent list, you mustn't use efree() here either.

Adding runtime configuration directives

Many of the features of PHP can be configured at runtime. These configuration directives can appear in either the designated php3.ini file or, in the case of the Apache module version, in the Apache .conf files. The advantage of having them in the Apache .conf files is that they can be configured on a per-directory basis. This means that one directory may have a certain safemodeexecdir, for example, while another directory may have another. This configuration granularity is especially handy when a server supports multiple virtual hosts.

The steps required to add a new directive:

  Add directive to php3_ini_structure struct in mod_php3.h.
  In main.c, edit the php3_module_startup function and add the appropriate cfg_get_string() or cfg_get_long() call.
  Add the directive, restrictions and a comment to the php3_commands structure in mod_php3.c. Note the restrictions part.
  RSRC_CONF directives can only be present in the actual Apache .conf files. Any OR_OPTIONS directives can be present anywhere, including normal .htaccess files.
  In either php3take1handler() or php3flaghandler() add the appropriate entry for your directive.
  In the configuration section of the _php3_info() function in functions/info.c you need to add your new directive.
  And last, you of course have to use your new directive somewhere. It will be addressable as php3_ini.directive.

Calling User Functions

To call user functions from an internal function, you should use the call_user_function function.

call_user_function returns SUCCESS on success, and FAILURE in case the function cannot be found. You should check that return value! If it returns SUCCESS, you are responsible for destroying the retval pval yourself (or returning it as the return value of your function). If it returns FAILURE, the value of retval is undefined, and you mustn't touch it.

All internal functions that call user functions must be reentrant. Among other things, this means they must not use globals or static variables.

call_user_function takes six arguments:

HashTable *function_table
  This is the hash table in which the function is to be looked up.

pval *object
  This is a pointer to an object on which the function is invoked. This should be NULL if a global function is called. If it's not NULL (i.e. it points to an object), the function_table argument is ignored, and the function table is instead taken from the object's hash. The object *may* be modified by the function that is invoked on it (that function will have access to it via $this). If for some reason you don't want that to happen, send a copy of the object instead.

pval *function_name
  The name of the function to call. Must be a pval of type IS_STRING with function_name.str.val and function_name.str.len set to the appropriate values. The function_name is modified by call_user_function(): it's converted to lowercase.
  If you need to preserve the case, send a copy of the function name instead.

pval *retval
  A pointer to a pval structure, into which the return value of the invoked function is saved. The structure must be previously allocated; call_user_function does NOT allocate it by itself.

int param_count
  The number of parameters being passed to the function.

pval *params[]
  An array of pointers to values that will be passed as arguments to the function, the first argument being in offset 0, the second in offset 1, etc. The array is an array of pointers to pval's; the pointers are sent as-is to the function, which means if the function modifies its arguments, the original values are changed (passing by reference). If you don't want that behavior, pass a copy instead.

Reporting Errors

To report errors from an internal function, you should call the php3_error function. This takes at least two parameters: the first is the level of the error, the second is the format string for the error message (as in a standard printf call), and any following arguments are the parameters for the format string. The error levels are:

E_NOTICE
  Notices are not printed by default, and indicate that the script encountered something that could indicate an error, but could also happen in the normal course of running a script. For example, trying to access the value of a variable which has not been set, or calling stat on a file that doesn't exist.

E_WARNING
  Warnings are printed by default, but do not interrupt script execution. These indicate a problem that should have been trapped by the script before the call was made. For example, calling ereg with an invalid regular expression.

E_ERROR
  Errors are also printed by default, and execution of the script is halted after the function returns. These indicate errors that cannot be recovered from, such as a memory allocation problem.

E_PARSE
  Parse errors should only be generated by the parser. The code is listed here only for the sake of completeness.

E_CORE_ERROR
  This is like an E_ERROR, except it is generated by the core of PHP. Functions should not generate this type of error.

E_CORE_WARNING
  This is like an E_WARNING, except it is generated by the core of PHP. Functions should not generate this type of error.

E_COMPILE_ERROR
  This is like an E_ERROR, except it is generated by the Zend Scripting Engine. Functions should not generate this type of error.

E_COMPILE_WARNING
  This is like an E_WARNING, except it is generated by the Zend Scripting Engine. Functions should not generate this type of error.

E_USER_ERROR
  This is like an E_ERROR, except it is generated in PHP code by using the PHP function trigger_error. Functions should not generate this type of error.

E_USER_WARNING
  This is like an E_WARNING, except it is generated by using the PHP function trigger_error. Functions should not generate this type of error.

E_USER_NOTICE
  This is like an E_NOTICE, except it is generated by using the PHP function trigger_error. Functions should not generate this type of error.

E_ALL
  All of the above. Using this error_reporting level will show all error types.
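
The idea of a level plus a printf-style format string, filtered by a reporting level, can be sketched in stand-alone C. This is a minimal illustration, not php3_error itself: the SK_E_* constants, their bit values, the message prefixes and the sketch_error() function are all assumptions made for the example, and only four of the levels above are modelled.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical level constants, one bit each so that they can be
 * combined into a reporting mask. The values are chosen for this
 * sketch and are not taken from PHP's headers. */
enum {
    SK_E_ERROR   = 1 << 0,
    SK_E_WARNING = 1 << 1,
    SK_E_PARSE   = 1 << 2,
    SK_E_NOTICE  = 1 << 3,
    SK_E_ALL     = SK_E_ERROR | SK_E_WARNING | SK_E_PARSE | SK_E_NOTICE
};

/* Prefix used for each level in this sketch. */
static const char *sk_level_name(int level)
{
    switch (level) {
    case SK_E_ERROR:   return "Fatal error";
    case SK_E_WARNING: return "Warning";
    case SK_E_PARSE:   return "Parse error";
    case SK_E_NOTICE:  return "Notice";
    default:           return "Unknown error";
    }
}

/* Formats "<level>: <message>" into buf if `level` is enabled in
 * `reporting_mask`; otherwise leaves buf empty. Returns the number of
 * characters stored in buf (0 when the message is suppressed). */
int sketch_error(char *buf, size_t size, int reporting_mask,
                 int level, const char *fmt, ...)
{
    va_list args;
    int prefix_len;

    assert(buf != NULL && size > 0 && fmt != NULL);
    buf[0] = '\0';
    if ((reporting_mask & level) == 0) {
        return 0;                       /* this level is not reported */
    }

    prefix_len = snprintf(buf, size, "%s: ", sk_level_name(level));
    if (prefix_len < 0 || (size_t)prefix_len >= size) {
        return (int)strlen(buf);        /* no room left for the message */
    }

    va_start(args, fmt);
    vsnprintf(buf + prefix_len, size - (size_t)prefix_len, fmt, args);
    va_end(args);
    return (int)strlen(buf);
}
```

Passing SK_E_ALL & ~SK_E_NOTICE as the mask mirrors the default described above, where warnings and errors are printed but notices are not. The mask is a parameter rather than a global so that the sketch stays reentrant.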