mirror of
https://github.com/php/pecl-caching-apc.git
synced 2026-03-23 22:42:11 +01:00
APC Quick-Start Braindump

This is a rapidly written braindump of how APC currently works in the
form of a quick-start guide to start hacking on APC.

1. Install and use APC a bit so you know what it does from the end-user's
perspective.
user-space functions are all explained here:

2. Grab the current APC code from CVS:

    cvs -d:pserver:cvsread@cvs.php.net:/repository login
    Password: phpfi
    cvs -d:pserver:cvsread@cvs.php.net:/repository co pecl/apc

apc/php_apc.c has most of the code for the user-visible stuff. It is
also a regular PHP extension in the sense that there are MINIT, MINFO,
MSHUTDOWN, RSHUTDOWN, etc. functions.

3. Build it.

    cd pecl/apc
    phpize
    ./configure --enable-apc --enable-mmap
    make
    cp modules/apc.so /usr/local/lib/php
    apachectl restart

4. Debugging Hints

    apachectl stop
    gdb /usr/bin/httpd
    break ??
    run -X

Grab the .gdbinit from the PHP source tree and have a look at the macros.

5. Look through apc/apc_sma.c
It is a pretty standard memory allocator.

apc_sma_malloc, apc_sma_realloc, apc_sma_strdup and apc_sma_free behave to the
caller just like malloc, realloc, strdup and free.

On server startup the MINIT hook in php_apc.c calls apc_module_init() in
apc_main.c which in turn calls apc_sma_init(). apc_sma_init() calls into
apc_mmap.c to mmap a segment of the specified size (I tend to just use a single
segment). apc_mmap.c should be self-explanatory. It mmaps a temp file and
then unlinks that file right after the mmap to provide automatic shared memory
cleanup in case the process dies.

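The mmap-then-unlink trick can be sketched as follows. This is a minimal
illustration, not the code in apc_mmap.c: the function name
map_unlinked_file() and the /tmp template are made up for the example, and
error handling is reduced to returning NULL.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: map `size` bytes of shared memory backed by a temp
 * file that is unlinked as soon as the mapping exists. Returns NULL on error. */
void *map_unlinked_file(size_t size)
{
    char path[] = "/tmp/apc-sketch.XXXXXX";   /* illustrative template */
    void *addr;
    int fd = mkstemp(path);

    if (fd < 0)
        return NULL;
    if (ftruncate(fd, (off_t)size) != 0) {    /* give the file its size */
        unlink(path);
        close(fd);
        return NULL;
    }
    addr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    /* Drop the name right after the mmap. The kernel keeps the file's
     * storage only while something still has it open or mapped, so nothing
     * is left behind once the last process using it goes away. */
    unlink(path);
    close(fd);
    return addr == MAP_FAILED ? NULL : addr;
}
```

A caller would treat the returned pointer as the start of the segment and
release it with munmap() when done.
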
Once the region has been initialized we stick a header_t at the beginning
of the region. It contains the total size in header->segsize and the number
of bytes available in header->avail.

After the header comes a bit of a hack. A zero-sized block is inserted just
to make things easier later on. And then a huge block that is basically
the size of the entire segment minus the two (for the 0-sized block, and this one)
block headers.

The code for this is:

    header = (header_t*) shmaddr;
    header->segsize = sma_segsize;
    header->avail = sma_segsize - sizeof(header_t) - sizeof(block_t) - alignword(sizeof(int));
    memset(&header->lock,0,sizeof(header->lock));
    sma_lock = &header->lock;
    block = BLOCKAT(sizeof(header_t));
    block->size = 0;
    block->next = sizeof(header_t) + sizeof(block_t);
    block = BLOCKAT(block->next);
    block->size = header->avail;
    block->next = 0;

So the shared memory looks like this:

    +--------+-------+---------------------------------+
    | header | block |              block              |
    +--------+-------+---------------------------------+

sma_shmaddrs[0] gives you the address of header

The blocks are just a simple offset-based linked list (so no pointers):

    typedef struct block_t block_t;
    struct block_t {
        size_t size;       /* size of this block */
        size_t next;       /* offset in segment of next free block */
        size_t canary;     /* canary to check for memory overwrites */
    #ifdef __APC_SMA_DEBUG__
        int id;            /* identifier for the memory block */
    #endif
    };

The BLOCKAT macro turns an offset into an actual address for you:

    #define BLOCKAT(offset) ((block_t*)((char *)shmaddr + offset))

where shmaddr = sma_shmaddrs[0]

And the OFFSET macro goes the other way:

    #define OFFSET(block) ((int)(((char*)block) - (char*)shmaddr))

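To see why offsets are used instead of pointers, here is a small
self-contained sketch of the same layout. It is a simplification under
stated assumptions: seg_header_t and blk_t are cut-down stand-ins for
header_t and block_t (no lock, no canary), the macros take the base address
as a parameter instead of using a global shmaddr, and avail is computed as
the segment size minus the header and the two block headers.

```c
#include <stddef.h>
#include <string.h>

typedef struct { size_t segsize; size_t avail; } seg_header_t; /* stand-in for header_t */
typedef struct { size_t size; size_t next; } blk_t;            /* stand-in for block_t */

#define BLK_AT(base, off)  ((blk_t *)((char *)(base) + (off)))
#define BLK_OFF(base, blk) ((size_t)((char *)(blk) - (char *)(base)))

/* Lay out: header, zero-sized dummy block, one big free block. */
void seg_init(void *base, size_t segsize)
{
    seg_header_t *h = base;
    blk_t *blk;

    memset(base, 0, segsize);
    h->segsize = segsize;
    h->avail = segsize - sizeof(seg_header_t) - 2 * sizeof(blk_t);

    blk = BLK_AT(base, sizeof(seg_header_t));      /* the 0-sized block */
    blk->size = 0;
    blk->next = sizeof(seg_header_t) + sizeof(blk_t);

    blk = BLK_AT(base, blk->next);                 /* the huge block */
    blk->size = h->avail;
    blk->next = 0;                                 /* offset 0 ends the list */
}

/* Walk the list by offset and count the blocks on it. */
size_t seg_count_free(void *base)
{
    size_t count = 0;
    size_t off = sizeof(seg_header_t);

    do {
        count++;
        off = BLK_AT(base, off)->next;
    } while (off != 0);
    return count;
}
```

Because every link is an offset from the start of the segment, the list
stays valid no matter where the segment is mapped. Copying an initialized
segment to a different address and walking the copy gives the same result,
which would not hold if the links were raw pointers.
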
Allocating a block with a call to apc_sma_allocate() walks through the
linked list of blocks until it finds one that is >= the requested size.
The first call to apc_sma_allocate() will hit the second block. We then
chop up that block so it looks like this:

    +--------+-------+-------+-------------------------+
    | header | block | block |          block          |
    +--------+-------+-------+-------------------------+

Then we unlink that block from the linked list so it won't show up
as an available block on the next allocate. So we actually have:

    +--------+-------+       +-------------------------+
    | header | block |------>|          block          |
    +--------+-------+       +-------------------------+

And header->avail along with block->size of the remaining large
block are updated accordingly. The arrow there represents the
link, which now points to a block at an offset further along in
the segment.

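The walk-chop-unlink sequence can be sketched like this. It is a toy
first-fit allocator written for illustration, not apc_sma_allocate() itself:
there is no segment header (the 0-sized dummy block sits at offset 0), a
block's size is assumed to include its own header, and pool_init(),
pool_alloc() and ALIGN8 are hypothetical names.

```c
#include <stddef.h>

typedef struct { size_t size; size_t next; } blk_t;  /* size includes this header */

#define BLK_AT(base, off) ((blk_t *)((char *)(base) + (off)))
#define ALIGN8(n)         (((n) + 7u) & ~(size_t)7u)

/* Dummy 0-sized block at offset 0, then one free block covering the rest. */
void pool_init(void *base, size_t total)
{
    blk_t *dummy = BLK_AT(base, 0);
    blk_t *big;

    dummy->size = 0;
    dummy->next = sizeof(blk_t);
    big = BLK_AT(base, dummy->next);
    big->size = total - sizeof(blk_t);
    big->next = 0;
}

/* First fit. Returns the offset of the usable bytes, or 0 if nothing fits. */
size_t pool_alloc(void *base, size_t n)
{
    size_t need = ALIGN8(n) + sizeof(blk_t);
    blk_t *prev = BLK_AT(base, 0);

    while (prev->next != 0) {
        size_t cur_off = prev->next;
        blk_t *cur = BLK_AT(base, cur_off);

        if (cur->size >= need) {
            if (cur->size - need > sizeof(blk_t)) {
                /* Chop: the tail becomes a smaller free block that takes
                 * the place of the original one on the list. */
                size_t rest_off = cur_off + need;
                blk_t *rest = BLK_AT(base, rest_off);

                rest->size = cur->size - need;
                rest->next = cur->next;
                cur->size = need;
                prev->next = rest_off;
            } else {
                prev->next = cur->next;   /* too small to split: take it whole */
            }
            cur->next = 0;                /* unlinked: no longer on the free list */
            return cur_off + sizeof(blk_t);
        }
        prev = cur;
    }
    return 0;
}
```

The dummy block is what keeps the loop simple: there is always a "previous"
block whose next field can be rewritten, even when the very first free block
is the one being handed out.
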
When the block is freed using apc_sma_deallocate() the steps are
basically just reversed. The block is put back and then the deallocate
code checks whether the blocks immediately before and after it are free,
and if so the blocks are combined. So you never
have 2 free blocks next to each other, apart from at the front with that
0-sized dummy block. This mostly prevents fragmentation. I have been
toying with the idea of always allocating blocks at 2^n boundaries to make
it more likely that they will be re-used to cut down on fragmentation further.
That's what the POWER_OF_TWO_BLOCKSIZE you see in apc_sma.c is all about.

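Here is a sketch of the put-back-and-combine step. It assumes the free list
is kept in address order, that a block's size includes its header, and that
the 0-sized dummy block sits at offset 0. pool_free() is a hypothetical name
and the real apc_sma_deallocate() may differ in detail.

```c
#include <stddef.h>

typedef struct { size_t size; size_t next; } blk_t;  /* size includes this header */

#define BLK_AT(base, off) ((blk_t *)((char *)(base) + (off)))

/* Put the block at offset `off` back on the free list and merge it with
 * any free neighbour that touches it. */
void pool_free(void *base, size_t off)
{
    blk_t *blk = BLK_AT(base, off);
    size_t prev_off = 0;
    blk_t *prev = BLK_AT(base, 0);

    /* Find the last free block that lies before `off`. */
    while (prev->next != 0 && prev->next < off) {
        prev_off = prev->next;
        prev = BLK_AT(base, prev_off);
    }

    /* Link the block back in between prev and prev's old successor. */
    blk->next = prev->next;
    prev->next = off;

    /* Combine with the following block if it starts where this one ends. */
    if (blk->next != 0 && off + blk->size == blk->next) {
        blk_t *nxt = BLK_AT(base, blk->next);

        blk->size += nxt->size;
        blk->next = nxt->next;
    }

    /* Combine with the preceding block if it ends where this one starts.
     * The dummy block has size 0 at offset 0, so it never matches. */
    if (prev_off + prev->size == off) {
        prev->size += blk->size;
        prev->next = blk->next;
    }
}
```

Freeing three adjacent blocks in the order first, last, middle ends with a
single free block spanning all three, which is the "never two free blocks
next to each other" property described above.
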
Of course, anytime we fiddle with our shared memory segment we lock using
the locking macros, LOCK() and UNLOCK().

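These notes do not say which primitive sits behind LOCK() and UNLOCK(). As
one possible shape, here is a sketch built on POSIX fcntl() advisory locks,
which work across unrelated processes. Everything named sketch_* or SKETCH_*
is hypothetical and is not APC's actual macro set.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Create an unlinked temp file whose only job is to carry the lock.
 * Returns a file descriptor, or -1 on failure. */
int sketch_lock_create(void)
{
    char path[] = "/tmp/apc-lock-sketch.XXXXXX";  /* illustrative template */
    int fd = mkstemp(path);

    if (fd >= 0)
        unlink(path);
    return fd;
}

/* Apply a whole-file lock operation, blocking until it succeeds. */
int sketch_lock_op(int fd, short type)
{
    struct flock fl;

    memset(&fl, 0, sizeof fl);
    fl.l_type = type;          /* F_WRLCK to lock, F_UNLCK to unlock */
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;              /* 0 means "through end of file" */
    return fcntl(fd, F_SETLKW, &fl);
}

#define SKETCH_LOCK(fd)   sketch_lock_op((fd), F_WRLCK)
#define SKETCH_UNLOCK(fd) sketch_lock_op((fd), F_UNLCK)
```

Note that fcntl() locks are per process, so they serialize separate server
processes but not threads inside one process.
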
That should mostly take care of the low-level shared memory handling.

6. Next up is apc_main.c and apc_cache.c which implement the meat of the
cache logic.

The apc_main.c file mostly calls functions in apc_sma.c to allocate memory
and apc_cache.c for actual cache manipulation.

After the shared memory segment is created and the caches are initialized,
apc_module_init() installs the my_compile_file() function overriding Zend's
version. I'll talk about my_compile_file() and the rest of apc_compile.c
in the next section. For now I will stick with apc_main.c and apc_cache.c
and talk about the actual caches. A cache consists of a block of shared
memory returned by apc_sma_allocate() via apc_sma_malloc(). You will
notice references to apc_emalloc(). apc_emalloc() is just a thin wrapper
around PHP's own emalloc() function which allocates per-process memory from
PHP's pool-based memory allocator. Don't confuse apc_emalloc() and
apc_sma_malloc() as the first is per-process and the second is shared memory.

The cache is stored in/described by this struct allocated locally using
emalloc():

    struct apc_cache_t {
        void* shmaddr;      /* process (local) address of shared cache */
        header_t* header;   /* cache header (stored in SHM) */
        slot_t** slots;     /* array of cache slots (stored in SHM) */
        int num_slots;      /* number of slots in cache */
        int gc_ttl;         /* maximum time on GC list for a slot */
        int ttl;            /* if slot is needed and entry's access time is older than this ttl, remove it */
    };

Whenever you see functions that take a 'cache' argument, this is what they
take. And apc_cache_create() returns a pointer to this populated struct.

At the beginning of the cache we have a header. Remember, we are down a level now
from the sma stuff. The sma stuff is the low-level shared-memory allocator which
has its own header which is completely separate and invisible to apc_cache.c.
As far as apc_cache.c is concerned the block of memory it is working with could
have come from a call to malloc().

The header looks like this:

    typedef struct header_t header_t;
    struct header_t {
        int num_hits;           /* total successful hits in cache */
        int num_misses;         /* total unsuccessful hits in cache */
        slot_t* deleted_list;   /* linked list of to-be-deleted slots */
    };

Since this is at the start of the shared memory segment, these values are accessible
across all the Apache processes and hence access to them has to be locked.

After the header we have an array of slots. The number of slots is user-defined
through the apc.num_slots ini hint. Each slot is described by:

    typedef struct slot_t slot_t;
    struct slot_t {
        apc_cache_key_t key;        /* slot key */
        apc_cache_entry_t* value;   /* slot value */
        slot_t* next;               /* next slot in linked list */
        int num_hits;               /* number of hits to this bucket */
        time_t creation_time;       /* time slot was initialized */
        time_t deletion_time;       /* time slot was removed from cache */
        time_t access_time;         /* time slot was last accessed */
    };

The slot_t *next there is a linked list to other slots that happened to hash to the
same array position.

apc_cache_insert() shows what happens on a new cache insert.

    slot = &cache->slots[hash(key) % cache->num_slots];

cache->slots is our array of slots in the segment. hash() is simply:

    static unsigned int hash(apc_cache_key_t key)
    {
        return key.data.file.device + key.data.file.inode;
    }

That is, we use the file's device and inode to uniquely identify it. Initially
we had used the file's full path, but getting that requires a realpath() call which
is amazingly expensive since it has to stat each component of the path to resolve
symlinks and get rid of relative path components. By using the device+inode we
can uniquely identify a file with a single stat.

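A sketch of building such a key with one stat() call follows. The struct and
function names (file_key_t, make_file_key, key_hash) are invented for this
example, and the field types are widened to unsigned long so the addition in
the hash cannot overflow a signed type.

```c
#include <sys/types.h>
#include <sys/stat.h>

typedef struct {
    unsigned long device;   /* st_dev of the file */
    unsigned long inode;    /* st_ino of the file */
    long mtime;             /* st_mtime, used later to spot stale entries */
} file_key_t;

/* Fill in `key` for `path` with a single stat(). Returns 0 on success,
 * -1 if the file cannot be stat'ed. */
int make_file_key(const char *path, file_key_t *key)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    key->device = (unsigned long)st.st_dev;
    key->inode = (unsigned long)st.st_ino;
    key->mtime = (long)st.st_mtime;
    return 0;
}

/* Same idea as hash() above: device plus inode. */
unsigned int key_hash(const file_key_t *key)
{
    return (unsigned int)(key->device + key->inode);
}
```

Two different spellings of the same file, for example "/" and "/.", produce
the same device and inode, so they land on the same key without any path
normalization.
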
So, on an insert we find the array position in the slots array by hashing the device+inode.
If there are currently no other slots there, we just create the slot and stick it into
the array:

    *slot = make_slot(key, value, *slot, t);

If there are other slots already at this position we walk the linked list to get to
the end. Here is the loop:

    while (*slot) {
        if (key_equals((*slot)->key.data.file, key.data.file)) {
            /* If existing slot for the same device+inode is different, remove it and insert the new version */
            if ((*slot)->key.mtime != key.mtime) {
                remove_slot(cache, slot);
                break;
            }
            UNLOCK(cache);
            return 0;
        } else if(cache->ttl && (*slot)->access_time < (t - cache->ttl)) {
            remove_slot(cache, slot);
            continue;
        }
        slot = &(*slot)->next;
    }

That first key_equals() check sees if we have an exact match meaning the file
is already in the cache. Since we try to find the file in the cache before doing
an insert, this will generally only happen if another process managed to beat us
to inserting it. If we have a newer version of the file at this point we remove
the existing slot and insert the new version. If our version is not newer we just
return without doing anything.

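The same control flow can be tried out on ordinary heap memory. The sketch
below is a stand-in, not apc_cache_insert(): it stores an int instead of a
cache entry, takes no lock, and its remove_slot() frees the slot right away,
whereas the real header keeps a deleted_list for slots awaiting deletion.
All type and function names here are made up.

```c
#include <stdlib.h>
#include <time.h>

typedef struct { unsigned long device, inode; long mtime; } skey_t;

typedef struct sslot {
    skey_t key;
    int value;
    time_t access_time;
    struct sslot *next;
} sslot_t;

typedef struct { sslot_t **slots; unsigned int num_slots; long ttl; } scache_t;

static void remove_slot(sslot_t **slot)
{
    sslot_t *dead = *slot;

    *slot = dead->next;     /* splice it out of the chain */
    free(dead);
}

/* Returns 1 if inserted, 0 if an up-to-date entry already exists,
 * -1 if memory ran out. */
int scache_insert(scache_t *c, skey_t key, int value, time_t t)
{
    sslot_t **slot = &c->slots[(key.device + key.inode) % c->num_slots];
    sslot_t *fresh;

    while (*slot) {
        if ((*slot)->key.device == key.device && (*slot)->key.inode == key.inode) {
            if ((*slot)->key.mtime != key.mtime) {
                remove_slot(slot);      /* outdated copy: replace it */
                break;
            }
            return 0;                   /* someone beat us to it */
        } else if (c->ttl && (*slot)->access_time < t - c->ttl) {
            remove_slot(slot);          /* expired neighbour: drop it */
            continue;
        }
        slot = &(*slot)->next;
    }

    fresh = malloc(sizeof *fresh);
    if (!fresh)
        return -1;
    fresh->key = key;
    fresh->value = value;
    fresh->access_time = t;
    fresh->next = *slot;                /* keep whatever followed */
    *slot = fresh;
    return 1;
}
```

Walking with a pointer to the link (sslot_t **) rather than a pointer to the
slot is what lets one loop handle removal and insertion at the head, in the
middle, and at the tail of a chain without special cases.
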
While walking the linked list we also check to see if the cache has a TTL defined.
If while walking the linked list we see a slot that has expired, we remove it
since we are right there looking at it. This is the only place we remove stale
entries unless the shared memory segment fills up and we force a full expunge via
apc_cache_expunge(). apc_cache_expunge() walks the entire slots array and walks
down every linked list removing stale slots to free up room. This is obviously
slow and thus only happens when we have run out of room.

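A full expunge as described here amounts to two nested loops. The sketch
below follows the description in this paragraph only. It runs on heap
memory, the names eslot_t, epush() and expunge() are hypothetical, and the
real apc_cache_expunge() may do more than this.

```c
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

typedef struct eslot {
    time_t access_time;
    struct eslot *next;
} eslot_t;

/* Test helper: push a slot onto the front of one chain. 0 on success. */
int epush(eslot_t **head, time_t access_time)
{
    eslot_t *s = malloc(sizeof *s);

    if (!s)
        return -1;
    s->access_time = access_time;
    s->next = *head;
    *head = s;
    return 0;
}

/* Visit every array position and every chain, dropping slots whose last
 * access is older than `ttl` seconds. Returns how many were removed. */
size_t expunge(eslot_t **slots, size_t num_slots, time_t now, long ttl)
{
    size_t removed = 0;
    size_t i;

    for (i = 0; i < num_slots; i++) {
        eslot_t **p = &slots[i];

        while (*p) {
            if ((*p)->access_time < now - ttl) {
                eslot_t *dead = *p;

                *p = dead->next;
                free(dead);
                removed++;
            } else {
                p = &(*p)->next;
            }
        }
    }
    return removed;
}
```

The cost is proportional to the number of array positions plus the number of
cached slots, which is why it is reserved for the out-of-room case.
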
apc_cache_find() simply hashes and returns the entry if it is there. If it is there
but older than the mtime in the entry we are looking for, we delete the one that is
there and return indicating we didn't find it.

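A lookup with that stale-entry rule might look like the following sketch.
It is illustrative only: fslot_t and find_slot() are invented names, the
slot carries an int instead of a cache entry, and no locking is shown.

```c
#include <stddef.h>
#include <stdlib.h>

typedef struct fslot {
    unsigned long device, inode;
    long mtime;
    int value;
    int num_hits;
    struct fslot *next;
} fslot_t;

/* Returns the matching slot, or NULL on a miss. A cached slot that is older
 * than the mtime being asked for is deleted and reported as a miss. */
fslot_t *find_slot(fslot_t **slots, unsigned int num_slots,
                   unsigned long device, unsigned long inode, long mtime)
{
    fslot_t **p = &slots[(device + inode) % num_slots];

    for (; *p; p = &(*p)->next) {
        if ((*p)->device == device && (*p)->inode == inode) {
            if ((*p)->mtime < mtime) {
                fslot_t *dead = *p;     /* file changed since it was cached */

                *p = dead->next;
                free(dead);
                return NULL;
            }
            (*p)->num_hits++;
            return *p;
        }
    }
    return NULL;
}
```

Reporting the stale case as a plain miss means the caller needs no special
handling: it recompiles and inserts, exactly as it would for a file it has
never seen.
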
Next we need to understand what an actual cache entry looks like. Have a look at
apc_cache.h for the structs. I sort of glossed over the key part earlier saying
that we just used the device+inode to find a hash slot. It is actually a bit more
complex than that because we have two kinds of caches. We have the standard file
cache containing opcode arrays, but we also have a user-controlled cache that the
user can insert whatever they want into via apc_store(). For the user cache we
obviously don't have a device+inode. The actual identifier is provided by the user
as a char *. So the key is actually a union that looks like this:

    typedef union _apc_cache_key_data_t {
        struct {
            int device;             /* the filesystem device */
            int inode;              /* the filesystem inode */
        } file;
        struct {
            char *identifier;
        } user;
    } apc_cache_key_data_t;

    struct apc_cache_key_t {
        apc_cache_key_data_t data;
        int mtime;                  /* the mtime of this cached entry */
    };

And we have two sets of functions to do inserts and finds. apc_cache_user_find()
and apc_cache_user_insert() operate on the user cache.

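One consequence of this layout is that the key carries no tag saying which
union member is in use, so each set of functions has to know which kind of
cache it is working on. The sketch below illustrates that with cut-down
types. The string hash in user_hash() is an arbitrary choice made for the
example, since these notes do not say how user identifiers are hashed, and
all names are hypothetical.

```c
#include <string.h>

typedef union {
    struct { int device; int inode; } file;
    struct { char *identifier; } user;
} sk_key_data_t;

typedef struct {
    sk_key_data_t data;
    int mtime;
} sk_key_t;

/* File cache: only data.file is meaningful. */
unsigned int file_hash(sk_key_t k)
{
    return (unsigned int)k.data.file.device + (unsigned int)k.data.file.inode;
}

/* User cache: only data.user is meaningful. The multiply-by-33 string hash
 * is an assumption for illustration, not taken from APC. */
unsigned int user_hash(sk_key_t k)
{
    unsigned int h = 5381;
    const char *p;

    for (p = k.data.user.identifier; *p; p++)
        h = h * 33u + (unsigned char)*p;
    return h;
}

/* User keys compare by string contents, not by pointer. */
int user_key_equals(sk_key_t a, sk_key_t b)
{
    return strcmp(a.data.user.identifier, b.data.user.identifier) == 0;
}
```

Passing a file key to user_hash() would read an int pair as a pointer, which
is why the file and user code paths are kept in separate functions.
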
Ok, on to the actual cache entry. Again, because we have two kinds of caches, we
also have the corresponding two kinds of cache entries described by this union:

    typedef union _apc_cache_entry_value_t {
        struct {
            char *filename;             /* absolute path to source file */
            zend_op_array* op_array;    /* op_array allocated in shared memory */
            apc_function_t* functions;  /* array of apc_function_t's */
            apc_class_t* classes;       /* array of apc_class_t's */
        } file;
        struct {
            char *info;
            zval *val;
            unsigned int ttl;
        } user;
    } apc_cache_entry_value_t;

And then the actual cache entry:

    struct apc_cache_entry_t {
        apc_cache_entry_value_t data;
        unsigned char type;
        int ref_count;
    };

The user entry is pretty simple and not all that important for now. I will
concentrate on the file entries since that is what holds the actual compiled
opcode arrays along with the functions and classes required by the executor.

apc_cache_make_file_entry() in apc_cache.c shows how an entry is constructed.
The main thing to understand here is that we need more than just the opcode
array, we also need the functions and classes created by the compiler when it
created the opcode array. As far as the executor is concerned, it doesn't know
that it isn't operating in normal mode being called right after the parse/compile
phase, so we need to recreate everything so it looks exactly like it would at
that point.

7. my_compile_file() and apc_compile.c

my_compile_file() in apc_main.c controls where we get the opcodes from. If
the user-specified filters exclude the file from being cached, then we just
call the original compile function and return. Otherwise we fetch the request
time from Apache to avoid an extra syscall and create the key so we can look up
the file in the cache. If we find it we stick it on a local stack, which we
use at cleanup time to make sure we return everything back to normal after a
request. We then call cached_compile(), which installs the functions and classes
associated with the op_array in this entry and then copies the op_array down
into our memory space for execution.

If we didn't find the file in the cache, we need to compile it and insert it.
To compile it we simply call the original compile function:

    op_array = old_compile_file(h, type TSRMLS_CC);

To do the insert we need to copy the functions, classes and the opcode array
the compile phase created into shared memory. This all happens in apc_compile.c
in the apc_copy_op_array(), apc_copy_new_functions() and apc_copy_new_classes()
functions. Then we make the file entry and do the insert. Both of these
operations were described in the previous section.

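The overall hook pattern (save the original compile function, install a
replacement, fall back to the original on a miss) can be shown in miniature.
This is a toy model, not the Zend API: op_array_t, compile_hook,
engine_compile() and the one-entry cache are all stand-ins invented for the
example, and filters, the cleanup stack and the copy into shared memory are
left out.

```c
#include <stddef.h>
#include <string.h>

typedef struct { const char *source; } op_array_t;          /* stand-in for zend_op_array */
typedef op_array_t *(*compile_fn)(const char *filename);

static op_array_t the_ops = { "compiled" };
static int compile_calls;

/* Stand-in for the engine's own compiler. */
static op_array_t *engine_compile(const char *filename)
{
    (void)filename;
    compile_calls++;
    return &the_ops;
}

static compile_fn compile_hook = engine_compile;  /* the pointer the engine calls through */
static compile_fn old_compile;                    /* saved original */

/* One-entry cache. It keeps the caller's filename pointer, so the string
 * must outlive the cache (true for the literals used here). */
static const char *cached_name;
static op_array_t *cached_ops;

static op_array_t *my_compile(const char *filename)
{
    op_array_t *ops;

    if (cached_name && strcmp(cached_name, filename) == 0)
        return cached_ops;              /* hit: skip compilation */

    ops = old_compile(filename);        /* miss: use the original compiler */
    cached_name = filename;             /* then remember the result */
    cached_ops = ops;
    return ops;
}

void install_hook(void)
{
    old_compile = compile_hook;
    compile_hook = my_compile;
}

op_array_t *run_compile(const char *filename) { return compile_hook(filename); }
int get_compile_calls(void) { return compile_calls; }
```

After install_hook(), compiling the same file twice through run_compile()
reaches the original compiler only once.
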
8. The Optimizer

The optimizer has been deprecated.

If you made it to the end of this, you should have a pretty good idea of where things are in
the code. I skimmed over a lot of things, so plan on spending some time reading through the code.