Varnish-ing 2G Raja Scam

Written by Team Srijan | Dec 9, 2010 8:00:00 AM

Preface

Recently we had a fire-fighting situation at Srijan, caused by having to scale a website without any planning.

www.openthemagazine.com is well equipped to handle around 50K requests over a period of 7-10 days. However, during the 2G Raja Scam episode, OPEN spilled the beans on BarkhaDuttGate and all hell broke loose. The server crashed many times a day on 19 November, when the first tape was released. People agonized over the poor server performance.

How we did it earlier

As we all know, Drupal can be pretty heavy on resources, especially with a lot of modules enabled, and there is only so much you can do about it within Drupal. Of course, you can use modules like Boost and caching tools like memcached and APC, which help take load off your MySQL database and CPU. But that won't prevent Apache from firing up each time a request comes in, eating valuable and limited memory each time. Under high load (spikes, the 'Slashdot effect') this will most likely grind your server to a halt.

The server structure has two HAProxy instances sharing the load of requests to the website equally: each request is equally likely to be sent to either proxy server. The request is then passed on to the common NFS server. All files and resources, such as images and documents, are uploaded to and retrieved from this server only.
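To make the "equal probability" idea concrete, here is a minimal Python sketch that simulates spreading requests across two proxies at random. It is not HAProxy configuration, and the names `proxy-a` and `proxy-b` are placeholders we made up for illustration.

```python
import random
from collections import Counter

# Hypothetical names for the two proxy servers; the real hosts are not named here.
PROXIES = ["proxy-a", "proxy-b"]


def pick_proxy(rng: random.Random) -> str:
    """Pick one of the proxies uniformly at random for an incoming request."""
    return rng.choice(PROXIES)


def simulate(num_requests: int, seed: int = 0) -> Counter:
    """Count how many of num_requests simulated requests land on each proxy."""
    rng = random.Random(seed)
    return Counter(pick_proxy(rng) for _ in range(num_requests))


if __name__ == "__main__":
    counts = simulate(50_000)
    for name in PROXIES:
        print(name, counts[name])
```

Over a large number of requests, each proxy ends up with roughly half the traffic, which is all the setup described above relies on.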

Similarly, a common database ensures that all requests go to one DB, which keeps resource use efficient. All backups are kept on the common storage device.

What have we done now?

After the 2G scam posts went live, the number of requests increased unexpectedly to 215K. We had to buy two new servers overnight so that the site would not have to gasp for breath! But this was a reactive measure. What ensued was a long debate about how we could prevent such collapses. The answer lay in changing the underlying technology. The site came into existence as a Drupal product; however, the need to implement Varnish drove us to Pressflow 6 (more on Varnish a little later). The servers could surely take more load (how much is yet to be determined), but the problem of optimizing the number of requests remained.

This is where Varnish comes in

What is Varnish

Varnish is an HTTP accelerator designed for content-heavy dynamic web sites. In contrast to other HTTP accelerators, many of which began life as client-side proxies or origin servers, Varnish was designed from the ground up as an HTTP accelerator. 

How does it work?

Your Drupal site contains a combination of static and dynamic content. Native caching support gives your site a huge performance boost when application and infrastructure caching work together seamlessly. If the requested content is available in the static cache, the cache returns the response immediately. When dynamic content is requested, this layer is tuned for high performance and handles HTTP-level processing, including SSL and file store requests for anonymous visitors, before passing connections immediately into the stack. We manage the front-end servers, including DNS, load balancing, and fail-over across the machines for optimum site responsiveness.

Varnish also supports Edge Side Includes (ESI), which can improve a website's performance drastically. Essentially, Varnish stores copies of a page in an in-memory cache, and if that exact page is requested again, it is served straight back to the user without going through the whole request on your web server. The results are amazing: not only are pages served blazingly fast, but server load also decreases significantly. We noticed an immediate and structural drop in server load.
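The store-and-serve-again behaviour can be sketched in a few lines of Python. This is only an illustration of the idea, not how Varnish is implemented: `PageCache` and `slow_backend` are names invented for this example, and real caches also handle expiry, eviction and headers, which are left out here.

```python
from typing import Callable, Dict


class PageCache:
    """Tiny in-memory page cache sitting in front of a backend callable."""

    def __init__(self, backend: Callable[[str], str]) -> None:
        self._backend = backend
        self._store: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> str:
        # Cache hit: answer from memory, the backend is never touched.
        if url in self._store:
            self.hits += 1
            return self._store[url]
        # Cache miss: ask the backend once and remember the answer.
        self.misses += 1
        body = self._backend(url)
        self._store[url] = body
        return body


def slow_backend(url: str) -> str:
    """Stand-in for Apache and Drupal rendering a page (hypothetical)."""
    return f"<html>rendered {url}</html>"


if __name__ == "__main__":
    cache = PageCache(slow_backend)
    for _ in range(3):
        cache.get("/article/1")
    print(cache.hits, cache.misses)
```

Requesting the same URL three times reaches the backend only once; the other two requests are answered from memory, which is where the drop in server load comes from.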

So what's the downside?

Basically, this mostly works for anonymous users. Any time a user is logged in (and a session is set), it negates the use of Varnish. In fact, this is also the reason you need Pressflow or Drupal 7 to be able to put a reverse proxy like Varnish in front of the site: they get rid of the standard session cookie that Drupal 6 always sets, even for anonymous users. But you will most likely want to use Pressflow anyway if you need high performance (see comparison chart for full report). Screenshot below.

You also need to know exactly which of the modules you use can break Varnish caching by setting a cookie.
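Why a single cookie matters can be shown with a small Python sketch of the decision a cache makes per request. It assumes the conservative rule that any request carrying a Cookie header goes to the backend; `should_bypass_cache` and the cookie values in the example are hypothetical.

```python
from typing import Dict


def should_bypass_cache(method: str, headers: Dict[str, str]) -> bool:
    """Return True when the request has to be handled by the backend.

    Assumed rule for this sketch: only GET and HEAD are cacheable, and any
    Cookie header, whoever set it, sends the request to the backend.
    """
    if method.upper() not in ("GET", "HEAD"):
        return True
    normalized = {name.lower(): value for name, value in headers.items()}
    return bool(normalized.get("cookie", "").strip())


if __name__ == "__main__":
    # Anonymous visitor without cookies: can be served from the cache.
    print(should_bypass_cache("GET", {}))
    # Logged-in user with a session cookie (made-up value): goes to the backend.
    print(should_bypass_cache("GET", {"Cookie": "SESSabc123=xyz"}))
    # Anonymous visitor, but some module set a cookie: also goes to the backend.
    print(should_bypass_cache("GET", {"Cookie": "has_js=1"}))
```

The third case is the one to watch for: the visitor is anonymous, yet a cookie set by a module is enough to make every one of their requests miss the cache.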

Some stats for Varnish:

  1. Speed : The core C engine is extremely fast and efficient. The Varnish team has reportedly achieved about 275,000 HTTP requests per second!
  2. Configurable : Varnish really stands apart from its peers with its rich configuration language (VCL). With VCL, you can specify rules for every part of the pipeline based on any HTTP header. We've used VCL to cache distinct pages based on browser type, to implement edge side includes (ESI), to combine caches across multiple URLs, and to ignore specific cookies. To make this fast, Varnish translates VCL into C which it then compiles and executes. Even complex rule sets don't significantly slow response time.
  3. Supported, Open Source : It's open source, which we love. Plus, there is a strong dev and support team behind it, which makes our job easier when we run into the inevitable snafus. Varnish Software is an Acquia Technology Partner.
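The kind of rules mentioned under "Configurable" (ignoring specific cookies, caching distinct pages per browser type) can be sketched as follows. This is Python rather than VCL, and it is not our production rule set: the cookie names in `IGNORED_COOKIES`, the mobile/desktop split and all function names are assumptions made for illustration.

```python
import re
from typing import Dict, Optional, Tuple

# Cookies assumed to be irrelevant to page content (illustrative list only).
IGNORED_COOKIES = {"has_js", "__utma", "__utmz"}


def strip_ignored_cookies(cookie_header: str) -> str:
    """Drop cookies on the ignore list and return the remaining header value."""
    kept = []
    for part in cookie_header.split(";"):
        part = part.strip()
        if not part:
            continue
        name = part.partition("=")[0].strip()
        if name not in IGNORED_COOKIES:
            kept.append(part)
    return "; ".join(kept)


def browser_class(user_agent: str) -> str:
    """Very rough browser bucket; the pattern is a simplification."""
    if re.search(r"mobile|android|iphone", user_agent, re.IGNORECASE):
        return "mobile"
    return "desktop"


def cache_key(url: str, headers: Dict[str, str]) -> Optional[Tuple[str, str]]:
    """Return a cache key, or None when the request must go to the backend."""
    if strip_ignored_cookies(headers.get("Cookie", "")):
        # A cookie we do not ignore (e.g. a session) is present: do not cache.
        return None
    return (url, browser_class(headers.get("User-Agent", "")))


if __name__ == "__main__":
    print(cache_key("/story", {"Cookie": "has_js=1", "User-Agent": "Mozilla/5.0 (iPhone)"}))
    print(cache_key("/story", {"Cookie": "SESSabc=1", "User-Agent": "Mozilla/5.0"}))
```

With rules like these, a harmless cookie no longer forces a cache miss, and mobile and desktop visitors each get their own cached copy of the same URL.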