Dynamic Web Content is Bad



[ 2006-September-06 22:23 ]

I've always hated the design of most web applications. The way they work is simple: a request comes in, the server executes some code to apply a template, and the result is returned to the user. This makes it easy to build templated web sites and to do things like let users edit the content through the web. That is fine for your little personal site, but it is horrible for something like Wikipedia, because it is very, very inefficient.
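To make the model concrete, here is a minimal sketch of that request cycle using only Python's standard library. The names (load_article, PAGE) are hypothetical stand-ins, not any particular framework's API:

```python
from string import Template
from wsgiref.simple_server import make_server

PAGE = Template("<html><body><h1>$title</h1><p>$body</p></body></html>")

def load_article(path):
    # Stand-in for the slow part: reading files or querying a database.
    return {"title": path.strip("/") or "home", "body": "article text"}

def app(environ, start_response):
    # All of this work is repeated on every single request.
    article = load_article(environ["PATH_INFO"])
    html = PAGE.substitute(article).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/html")])
    return [html]

if __name__ == "__main__":
    make_server("", 8000, app).serve_forever()
```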

The problem with this approach is the "execute some code" part. Typically this is PHP or Python or Ruby code, and the execution is relatively slow. It might read a bunch of files off a disk, or make a request to a database. On today's systems, all of this work typically takes somewhere under a hundred milliseconds. That is not a problem if you get a handful of requests a second. For popular web sites, however, it is a terrible system. This is why many web sites become unavailable as soon as Slashdot or Reddit posts about them: they are simply too slow to handle the sudden load.

The solution, in my opinion, is simple: generate the HTML content once, store it on disk, then use a plain old web server to dish it out to clients. On modern hardware, web servers can serve static files at over 10 000 requests per second. This is fast enough to handle all but the busiest web sites on the Internet. For example, if Wikipedia were designed this way, they wouldn't need a huge pile of Squid cache servers; they could scale just by adding more web servers.
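Here is a sketch of that pre-generation approach, reusing the hypothetical names from the example above. The content store (ARTICLES) and output directory (htdocs) are illustrative assumptions; the point is that the expensive templating runs once per page instead of once per request:

```python
import os
from string import Template

PAGE = Template("<html><body><h1>$title</h1><p>$body</p></body></html>")

ARTICLES = {  # stand-in for the real content store
    "index": {"title": "home", "body": "article text"},
    "about": {"title": "about", "body": "more text"},
}

def generate_site(out_dir="htdocs"):
    os.makedirs(out_dir, exist_ok=True)
    for name, article in ARTICLES.items():
        # Templating happens exactly once per page; a plain web server
        # (Apache, nginx, ...) then serves the files from out_dir.
        with open(os.path.join(out_dir, name + ".html"), "w") as f:
            f.write(PAGE.substitute(article))

if __name__ == "__main__":
    generate_site()
```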

In an ideal world, web application frameworks would make this easy, and I would love to see one that does. Even without framework support, though, serving static output from a web application is a very powerful technique that developers should keep in mind when they need to make things go faster.
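As one illustration of how little machinery this needs, a framework could simply regenerate the affected file whenever content is edited, so the web server always has a fresh copy. This is only a sketch under the same assumptions as above; save_article and the Template-based page are hypothetical:

```python
import os
from string import Template

PAGE = Template("<html><body><h1>$title</h1><p>$body</p></body></html>")

def save_article(name, article, out_dir="htdocs"):
    # Persist the edit however the application likes, then immediately
    # regenerate the one static file that changed.
    os.makedirs(out_dir, exist_ok=True)
    with open(os.path.join(out_dir, name + ".html"), "w") as f:
        f.write(PAGE.substitute(article))

save_article("index", {"title": "home", "body": "edited text"})
```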