Welcome to my unobtrusive blog!
By default I blog in English, but there is also German content.

Lighttpd as a reverse proxy might be a big problem

August 31st, 2008 Posted in english

Hi,

today I simply want to show you how to set up a reverse proxy with lighty listening on port 80 and proxying to a local Apache listening on port 81. But keep in mind that you shouldn’t spend too much time on a project like this: it might turn out that lighty is absolutely useless for you as a reverse proxy, as it is for me. I only found that out when it was too late and I had already wasted a lot of time.

The point of a reverse proxy is to be as flexible as possible. Take my needs, for example. I need a webserver that can serve PHP scripts (over FastCGI or mod_php), one for Perl over mod_perl2, and one that serves Pike scripts, such as Caudium or Roxen, and I want all of them to be reachable on port 80, which is the standard. So I need one “master” webserver that proxies each request, depending on defined criteria (the vhost name, for example, but it can be almost anything if you generate your configuration files dynamically with Perl or whatever), to the appropriate webserver. The backend webservers bind to the local IP 127.0.0.1 so they can’t be accessed from the outside.

I used the latest Lighttpd 1.5.x from here: http://opensu.se/~darix/lighttpd/

Here is the important snippet from the lighty config file:

$HTTP["host"] == "www.as.tl" {
    proxy-core.protocol = "http"
    proxy-core.balancer = "round-robin"
    proxy-core.allow-x-sendfile = "enable"
    proxy-core.rewrite-response = ("Location" => ( "^http://www.as.tl/(.*)" => "http://www.as.tl:81/" ) )
    proxy-core.rewrite-request = ( ("^/(.*)$" => "http://www.as.tl/$1") )
    proxy-core.backends = ( "127.0.0.1:81" )
}

Keep in mind that you’ll at least need “mod_proxy_core” and “mod_proxy_backend_http” in your server.modules definition.

Once you have everything up and running you’ll be glad about the result, because it simply works without much hassle. Congrats!

This configuration proxies all requests made on port 80 to port 81, which is Apache. Of course the virtual host has to exist in the Apache configuration. I used the “VirtualDocumentRoot /var/www/%0” directive to be very dynamic, but how you configure your Apache is up to you and doesn’t really matter here. Just make sure that you get something back when requesting, for example, “http://www.as.tl:81”.

Now it’s time to put a big file into the webroot of www.as.tl (it’s still an example!).

Then try to download this file by requesting http://www.as.tl/thebigfile.iso. This file will be served by lighty because it’s the proxy, right? Exactly! Lighty will ask Apache for this file and Apache will nicely hand it over to lighty :P.

If you download it with “wget” you’ll notice a slight delay between the request and the actual start of the download.

So what’s the problem here? The problem is that lighty’s mod_proxy or mod_proxy_core module by default buffers (caches) every response completely in RAM. This is practical and useful, gives good performance on dynamic content, and especially with slow clients it really takes load off Apache. The idea is simple: lighty makes a request to Apache and Apache can respond in a very short time, because the path between lighty and Apache is short and fast, as both are running on localhost. Lighty can then serve the slow clients while Apache is already free to serve other requests, which also come in through lighty. That’s the theory, and it works out quite well in practice. But it heavily depends on the content you are going to serve.

Again, it’s not lighty itself that holds the response in RAM but the proxy module. This difference will play an important role when it comes to x-sendfile.
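To make the difference between “hold the whole response in RAM first” and “pass it on piece by piece” concrete, here is a minimal Python sketch. It is not lighty’s actual code; the function names and the 64 KiB chunk size are assumptions made up for this illustration, and in-memory streams stand in for the Apache and client connections.

```python
import io

CHUNK = 64 * 1024  # bytes read per step; an arbitrary illustrative value


def relay_buffered(backend, client):
    """Read the complete backend response, then send it to the client."""
    body = backend.read()      # the entire response sits in memory here
    client.write(body)
    return len(body)           # peak number of bytes held at once


def relay_streamed(backend, client, chunk=CHUNK):
    """Forward the response chunk by chunk; memory is bounded by `chunk`."""
    peak = 0
    while True:
        block = backend.read(chunk)
        if not block:          # empty read means the backend is done
            break
        peak = max(peak, len(block))
        client.write(block)
    return peak


if __name__ == "__main__":
    payload = b"x" * (1024 * 1024)  # 1 MiB stand-in for a big download
    print(relay_buffered(io.BytesIO(payload), io.BytesIO()))
    print(relay_streamed(io.BytesIO(payload), io.BytesIO()))
```

The buffered variant is what frees the backend quickly, because the backend is done as soon as the proxy has read everything. The streamed variant keeps memory small but ties the backend to the speed of the client. That trade-off is exactly why the default helps with dynamic content and hurts with big files.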

In practice it could be a disaster for your business. When a user downloads a 700 MB file, for example, this increases your RAM usage by 700 MB. Now guess what happens when more than one user downloads a file that big. First your machine responds slowly, because it runs out of RAM and starts to swap. Once swap is full you’ll get other serious problems ;).
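The arithmetic is trivial, but it shows how quickly this adds up. A small Python sketch, assuming every download is buffered completely and simultaneously; the 2 GB of free RAM is a made-up figure for illustration, not a number from my server.

```python
MB = 1024 * 1024


def buffered_ram_bytes(file_size_bytes, concurrent_downloads):
    """RAM needed if every download is held completely in memory."""
    return file_size_bytes * concurrent_downloads


def downloads_until_swap(file_size_bytes, free_ram_bytes):
    """How many fully buffered downloads fit before RAM is exhausted."""
    return free_ram_bytes // file_size_bytes


if __name__ == "__main__":
    size = 700 * MB
    for n in (1, 3, 10):
        print(n, "downloads:", buffered_ram_bytes(size, n) // MB, "MB")
    # Assumption: 2 GB of free RAM, purely as an example value
    print("fits into 2 GB:", downloads_until_swap(size, 2048 * MB))
```

With these example numbers the third parallel download already pushes the machine into swap.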

If you read through the bug tracker, there is no plan to buffer to the hard drive (which would still give good performance imho), so there must be another solution…

I talked to icy and stbuehler, who hang out in #lighttpd on irc.freenode.net, and icy gave me the hint to work with “x-sendfile” headers, which worked very well… but you have to activate this in the lighty configuration file, as I have already shown above (proxy-core.allow-x-sendfile = "enable").

This tells lighty to look for an “x-sendfile” entry in the HTTP response header it gets back from the backend.

But what the fuck is “x-sendfile” you might ask yourself.

X-Sendfile tells lighty to serve the file by itself instead of passing it through mod_proxy or mod_proxy_core. The difference is that the big file doesn’t get completely buffered in RAM.

The x-sendfile header is simple. It just contains the local path of the requested file, so lighty can read the file locally and serve it to the client. It’s never a security risk because the client never sees this local path: only lighty sees it, and it doesn’t pass it on to the client.
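Roughly, the proxy’s decision can be pictured like this. This is only a Python sketch of the idea, not lighty’s implementation; the function name and the return format are invented for the illustration.

```python
def handle_backend_response(headers, body):
    """Decide how to answer the client based on the backend's headers.

    Returns (headers_for_client, source), where source is either
    ("file", local_path) or ("body", body_bytes).
    """
    sendfile_path = None
    headers_for_client = {}
    for name, value in headers.items():
        if name.lower() == "x-sendfile":
            sendfile_path = value               # remember the local path ...
        else:
            headers_for_client[name] = value    # ... but never forward it
    if sendfile_path is not None:
        return headers_for_client, ("file", sendfile_path)
    return headers_for_client, ("body", body)


if __name__ == "__main__":
    backend_headers = {
        "Content-Type": "application/octet-stream",
        "X-Sendfile": "/var/www/www.as.tl/thebigfile.iso",
    }
    print(handle_backend_response(backend_headers, b""))
```

The header is consumed by the proxy and dropped from what goes out, which is why the local path stays invisible to the client.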

To make this work you have to do configuration work on apache’s side.

Something like this….

<FilesMatch "\.(iso)$">
    Header add X-Sendfile "/var/www/www.as.tl/slackware-10.0-install-d1.iso"
</FilesMatch>

This tells Apache to add the x-sendfile header to every response for a request that matches *.iso. It’s totally wrong as it stands, because every .iso request gets the same hard-coded file, but it will work. The real task is to set the path dynamically based on the request, and this is almost impossible, because how can Apache guess the local file path for a request?
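To make clear what “dynamically” would have to mean, this is roughly the mapping that something on the backend side would need to do for every request. It is a Python sketch under assumptions: the /var/www/<host> layout is taken from the VirtualDocumentRoot example above, the function name is made up, and URL decoding and query strings are ignored. It does not solve the problem of getting Apache’s static file handling to emit the header; it only shows the mapping.

```python
import posixpath

DOCROOT_BASE = "/var/www"  # assumption: layout of "VirtualDocumentRoot /var/www/%0"


def sendfile_path(host, uri_path, base=DOCROOT_BASE):
    """Map (host, request path) to a local file path, or None if unsafe."""
    # A host name must not be able to point outside the base directory
    if not host or "/" in host or host in (".", ".."):
        return None
    root = posixpath.join(base, host)
    # Resolve "." and ".." segments before checking where the path ends up
    candidate = posixpath.normpath(posixpath.join(root, uri_path.lstrip("/")))
    if candidate != root and not candidate.startswith(root + "/"):
        return None  # path escaped the vhost's document root
    return candidate


if __name__ == "__main__":
    print(sendfile_path("www.as.tl", "/thebigfile.iso"))
    print(sendfile_path("www.as.tl", "/../../etc/passwd"))
```

The check against the document root matters here: whatever sets the header must make sure the path stays inside the vhost’s directory, because lighty will read whatever local path it is handed.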

And that’s the whole catch! This makes lighty useless for me. At least as a reverse proxy.

I hope you haven’t wasted your time trying this :). One simple solution would be to put all big files into a directory that is served purely by lighty. That would probably be fine for a new setup and new projects, but I have existing stuff that I don’t want to restructure.

I hope this text was useful for you.

Kind regards,

Andreas Schipplock.

my name is still Andreas Schipplock and I’m still a friend of lighttpd, a reliable and fast http daemon but today lighttpd really fucked me in the back.
