Copyright © 2000 by Zach Brown
Copyright © 2001 by David C. Merrill
This document is copyright 2000 by Zach Brown, and 2001 by David C. Merrill. It is released under the GNU Free Documentation License, which is hereby incorporated by reference.
phhttpd is an HTTP accelerator. It serves fast static HTTP fetches from a local file-system and passes slower dynamic requests back to a waiting server. It features a lean networking I/O core and an aggressive content cache that help it perform its job efficiently.
phhttpd features a very slim I/O core. It does all its networking work using non-blocking system calls driven by whatever event model is most appropriate for the host operating system. This allows a single execution context to handle as many client connections as the event model dictates.
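To make the idea of one execution context driving many non-blocking connections more concrete, here is a minimal Python sketch. It is only an illustration and not phhttpd's code: the helper names `watch` and `run_once` are hypothetical, and Python's `selectors.DefaultSelector` stands in for "whatever event model is most appropriate for the host operating system".

```python
import selectors
import socket

# DefaultSelector picks the most efficient readiness API the host offers
# (for example epoll, kqueue, poll, or select).
sel = selectors.DefaultSelector()

def watch(conn):
    """Hypothetical helper: track a connection without ever blocking on it."""
    conn.setblocking(False)
    # Each connection carries its own receive buffer as selector data.
    sel.register(conn, selectors.EVENT_READ, data=bytearray())

def run_once(timeout=1.0):
    """Service every connection that is readable now; return bytes received."""
    handled = 0
    for key, _events in sel.select(timeout):
        conn, buf = key.fileobj, key.data
        try:
            chunk = conn.recv(4096)
        except BlockingIOError:
            continue                  # spurious wakeup, try again next round
        if chunk:
            buf.extend(chunk)
            handled += len(chunk)
        else:
            sel.unregister(conn)      # peer closed the connection
            conn.close()
    return handled
```

A real server would call something like `run_once` in a loop and would also register the listening socket, so that accepting new clients is just another readiness event.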
phhttpd's job is to serve static content as quickly as it possibly can. To do this it maintains a cache of content in memory. When a request is serviced, phhttpd saves a reference to the on-disk content and whatever HTTP headers depend on the content. The next time a request for this content is received, phhttpd can service it very quickly. This cache can be prepopulated (populated at run time), or can be built dynamically as requests come in. Its size may also be capped by the administrator so that it doesn't overwhelm a system.
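The following Python sketch shows one way a capped cache of this kind could behave. It is a toy model under stated assumptions, not phhttpd's implementation: this document does not say how the cap is measured or which entries are dropped first, so the sketch assumes a cap on the number of entries and least-recently-used eviction. The class and method names are hypothetical.

```python
from collections import OrderedDict

class ContentCache:
    """Toy cache: request path -> (on-disk path, content-dependent headers)."""

    def __init__(self, max_entries):
        self.max_entries = max_entries      # assumed cap: a count of entries
        self._entries = OrderedDict()

    def get(self, url_path):
        """Return the cached entry, or None on a miss."""
        entry = self._entries.get(url_path)
        if entry is not None:
            self._entries.move_to_end(url_path)   # mark as recently used
        return entry

    def put(self, url_path, disk_path, headers):
        """Remember where the content lives and the headers that go with it."""
        self._entries[url_path] = (disk_path, headers)
        self._entries.move_to_end(url_path)
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)     # assumed policy: evict LRU

    def __len__(self):
        return len(self._entries)
```

Prepopulating such a cache amounts to calling `put` for each file in the document tree before the first request arrives; building it dynamically means calling `put` after each miss.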
phhttpd is a threaded, standalone daemon. The number of threads is currently statically defined at run time. Incoming connections are evenly balanced among the running threads, regardless of what content they may be serving. Connections are served by the thread that accepted them until the transfer is done.
phhttpd is currently only expected to build and run on Linux systems using glibc2.1 under a kernel that supports passing POLL* information over real-time SIGIO signals. This means later 2.3.x kernels or a 2.2.x kernel that has been patched.
I badly want this to change. If you're interested in doing porting work to other operating systems, please do let me know.
phhttpd uses an XML config file format to express how it should behave while running. More information on XML may be found at http://www.w3.org/XML/.
phhttpd's configuration centers around the concept of virtual servers. For us, a virtual server may be thought of as the merging of a document tree and the actions phhttpd takes while serving that content.
phhttpd.conf may be thought of as having two main sections: the global section, which defines properties that are consistent across the entire running phhttpd server, and multiple virtual sections, each of which describes properties that apply only to a single virtual server. There is only one global section, while multiple virtual sections are allowed.
The global section defines properties of the running server that don't apply to a single virtual server. It should be enclosed in
Global config entities
A virtual server can be thought of as the abstraction serving up a content tree ("docroot" in Apache speak). A set of attributes defines a virtual server. Some of these attributes are used to decide which virtual server will process a client's request; others define how the content is served.
A virtual server must have a docroot. The virtual tag in the config file has a docroot attribute that must be set.
Global Config Entities
"All kids love log!"
phhttpd maintains log buffers for each log it writes to. Logged events are put in these buffers at reporting time rather than being immediately written to disk. These logs are written as they are filled during normal operation, or at regular intervals. This greatly reduces the performance impact of keeping detailed logs.
phhttpd keeps its logs at virtual server granularity. You can specify which logs you wish to keep by including an entity in the log section of a virtual server for each source you wish to log. There is an entity for each source of logging, and the attributes of that entity define where it is logged. It looks something like this:
mode is the octal permissions mode of the file that is to be opened. As it is parsed by dumb routines, a leading 0 is highly recommended. file is the file to which the logged events will be written. The LOG_SOURCE is one of:
Each phhttpd log entry is contained within a single line in a text file. It contains the time the log entry was written, an opaque token that is associated with the connection that caused the log entry, and then the actual entry.
The contents of the 'referer' and 'agent' log entries are simply the string that was given with the header. The contents of the 'access' log are a little more interesting. Each entry has the decoded relative URL that was asked for, followed by the total bytes that were transferred, and the time in seconds that it took to transfer.
The first field is the time in seconds since the Unix epoch, a.k.a. time_t. The second field is associated with the client connection that caused the log entry. It is constant for the duration of the connection, and is written to all the log entries, of whatever type, that are generated. This allows a log parser to do more complete connection granularity analysis. As it happens, this opaque token is currently built up of the time the client was connected, its remote and local network address, etc., but these values must _not_ be parsed as they may change in the future.
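A log parser built on this layout might start from something like the Python sketch below. It rests on assumptions this document does not confirm: that the fields are separated by whitespace, that the time is written as a decimal integer, and that the opaque token contains no whitespace. The names `LogEntry` and `parse_log_line` are hypothetical, and the sample line in the usage note is invented for illustration.

```python
from typing import NamedTuple

class LogEntry(NamedTuple):
    time: int     # seconds since the Unix epoch
    token: str    # opaque per-connection token; compare it, never parse it
    entry: str    # the rest of the line, whose meaning depends on the log

def parse_log_line(line):
    """Split one log line into its time, connection token, and entry."""
    # Assumption: whitespace-separated fields, at most two splits so that
    # the entry itself may contain spaces.
    parts = line.rstrip("\n").split(None, 2)
    if len(parts) < 2:
        raise ValueError("expected at least a time and a token: %r" % line)
    entry = parts[2] if len(parts) > 2 else ""
    return LogEntry(int(parts[0]), parts[1], entry)
```

For example, `parse_log_line("968812800 tok123 /index.html 1024 0")` yields a time of 968812800, the token `tok123`, and the entry `/index.html 1024 0`. Grouping entries from several logs by identical token is what makes the connection granularity analysis possible.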
Entries generated by a thread will be written in chronological order. If, however, multiple threads are sharing an output file, the resulting entries may not be written in chronological order. It is up to the parsing programs to sort by the 'time' field if they care about chronological order.
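A parsing program that cares about order could restore it along the lines of this short Python sketch. It assumes the time is the first whitespace-separated field on each line and is a decimal integer; the function name is hypothetical. Because Python's `sorted` is stable, entries that share the same second keep the relative order in which they appeared in the file.

```python
def sort_log_lines(lines):
    """Return log lines ordered by their leading time field."""
    def time_of(line):
        # Assumption: the first whitespace-separated field is the time_t value.
        return int(line.split(None, 1)[0])
    return sorted(lines, key=time_of)
```

Sorting only by the time field, rather than by the whole line, avoids reordering entries based on the opaque token.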
While phhttpd is running it listens to a 'control' socket for messages from the administrator. The currently provided phhttpd_ctl program allows the administrator to minimally interact with phhttpd. This provides both control and status reporting.
phhttpd_ctl always wants a --control argument that specifies the control socket of the running phhttpd daemon. This should match the <control> tag specified in the config file.
phhttpd can be told to rotate its logs so that existing logs may be processed.
The --rotate argument to phhttpd_ctl tells phhttpd to rename the existing files to a unique name, open new files with the previously used names, then close the renamed logs and start using the newly created files. phhttpd_ctl will output the names of the newly created files which will be safe to use once the command exits.
The --reopen argument to phhttpd_ctl tells phhttpd to close the existing file logs and reopen the files with the filenames that were configured. This implies that an external entity has moved the files to new names and wants phhttpd to stop using them.
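The external workflow implied by --reopen can be sketched in Python as follows: move the log files aside, then ask phhttpd to reopen the configured names. This is a sketch under assumptions, not a documented procedure. In particular, it assumes that --control takes the socket path as the next argument, which this document does not spell out, and the function names and the timestamp suffix are hypothetical.

```python
import os
import subprocess
import time

def move_logs_aside(log_paths, suffix=None):
    """Rename each log file to '<name>.<suffix>' and return the new names."""
    suffix = suffix or time.strftime("%Y%m%d-%H%M%S")
    moved = []
    for path in log_paths:
        target = "%s.%s" % (path, suffix)
        os.rename(path, target)   # a process with the file open keeps writing to it
        moved.append(target)
    return moved

def reopen_command(control_socket, ctl="phhttpd_ctl"):
    """Build the argv that asks phhttpd to reopen its configured log files."""
    # Assumption: '--control <path>' is the accepted argument form.
    return [ctl, "--control", control_socket, "--reopen"]

def rotate_externally(log_paths, control_socket):
    """Move logs aside, then tell phhttpd to stop using the moved files."""
    moved = move_logs_aside(log_paths)
    subprocess.run(reopen_command(control_socket), check=True)
    return moved
```

The renamed files should only be processed after the reopen command has returned, since phhttpd may keep writing to them until then.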
The --status argument to phhttpd_ctl tells phhttpd to return a quick status blurb about the server. It contains miscellaneous information about the running state of the server.