thttpd Web Server Patches (evanjones.ca)

[ 2004-March-20 15:57 ]

I have written a few patches for thttpd, a lightweight, high performance web server for unix. Some improve performance and others improve security.

MD5 Passwords

Most web servers support a simple username and password authentication protocol called Basic Authentication. Normally, these usernames and passwords are stored in a text file, typically called htpasswd. To provide some security, the passwords are not stored in this file. Instead, a one way hash of the password is computed, and is stored instead. Unfortunately, thttpd uses the standard UNIX crypt function. It is widely known that this function is insecure. Many password crackers can try thousands of keys a second, making it possible to quickly brute force the password file to recover the password.

Unix operating systems are transitioning to using the MD5 one way hash instead. Many systems ship with a crypt function that support both. My patch to thttpd modifies the password generator to detect if MD5 passwords are supported at runtime. If they are, it will use them instead, which dramatically increases the security of the passwords. One remaining item to do is to use FreeBSD's crypt implementation if the system's does not support MD5 passwords, for example, on Mac OS X. You can read my mailing list post and get the patch from there.

Stat Cache

thttpd makes a lot of calls to stat. This system call is rather fast, but the sheer volume of calls can still make a performance difference. I implemented and benchmarked a cache for thttpd, based on the file caching code. In my quick benchmark, it improved performance by 1-2%. Unfortunately, the cache introduces problems when the file is modified. However, if the file is replaced by moving a new file over top, the stat cache works fine. You can read my mailing list post and get the patch from there.

File Cache Hashing Performance

When I was teaching myself network programming, I decided to investigate the inner workings of thttpd. I ran it under a profiler and found a few places where the code could be optimized, so I spent some time making the changes, testing and benchmarking them. In all, they reduce the amount of CPU time consumed by the server by anywhere from 10 - 50%. These changes have been incorporated into thttpd version 2.21.

thttpd serves requests from memory mapped files, allowing the software and operating system to efficiently share the memory used for caching the files. Under a Unix operating system, memory maps are created with the mmap system call. thttpd stores the mappings in a cache to avoid needless system calls. The cache, implemented with a hash table, was the focus of my optimizations.

The first optimization was trivial. I noticed that when a file was unmapped with the mmc_unmap function, the linked list which stores the active file mappings was traversed to find the corresponding entry, which is a O( n ) operation. It was trivial to modify this function to use the hash table instead, a O( 1 ) operation. I profiled thttpd before and after this change and mmc_unmap went from being the function which used the most CPU time to somewhere in the middle. This change was included in thttpd 2.20.

The second optimization involved a fair amount of experimentation. I tried four different hash functions and tested them for execution time and for how well they distributed the hash keys. The contestants were the original XOR hash, HashPJW, ELF hash, and the hash used by Daniel Bernstien's CDB database. The winner, in execution time as well as key distribution, was the CDB hash. The old hash used a hash table that was a prime number size, but the new function worked well with a power of two sized table. This meant that the slow modulus operators (%) could be replaced with binary and (&) operations, which accelerated other functions as well.

Further Work

The timers used in the web server also use a hash table. A similar optimization could also be performed there, either with the CDB hash or potentially a different hash. Some sort of priority queue or sorted list based on the trigger time may also be more efficient. Only experimentation can answer which methods would be faster. Of course, most of the time spent by the web server is in system calls, so perhaps a more effective optimization would modifying thttpd to take advantage of alternate input/output routines. Caching the results of the URL to resource expansion may also be effective.