
What is *Your* Web Framework Doing Under the Hood?

Pylons, Django, Grok, and repoze.bfg profiling output tells the tale.

NOTE : The numbers present in this blog post have been removed. They just weren't indicative of reality. New numbers are available from . However, I've retained the narrative here for context.

It can be a bit useless to benchmark web application frameworks. When you're committed to a particular framework, either it works or it doesn't for your particular application; often raw speed is not really a concern. You're probably not going to switch web frameworks in the middle of a project to get a 15% or even a 50% or 100% speed increase: you've got too much investment in the code that works under the current framework to consider it. In my experience, very few people truly understand more than one web framework, and they tend to use that framework for everything even if it's slightly less optimal for any specific task; this is because the "switching cost" to move to another one is so high. So benchmarks aren't really all that interesting in the "real" web world; it all depends on context.

But if you haven't chosen a web framework yet (is there anyone?), or if you're falling out of love with your current web framework and you're considering using a different one, you might be able to learn something from profiling an application running under various frameworks nonetheless, even if you ignore the raw speed of the framework itself.

One measure of what you're going to be faced with when your web application framework doesn't work as advertised is the complexity of what it does to render a very simple page. If a framework does a lot of work to render a very simple page, you might need to understand a lot in order to fix it or extend it. If it does very little work, it's likely to be easier to fix and/or extend than one that does a lot of work. Additionally, as a corollary, the less work an application server does to render a response, the faster it will usually render that response. But that's not the point here; we're only concerned with the work done.
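For scale, here is the baseline all of these frameworks build on: a bare WSGI callable that renders the same "hello world" page with essentially no machinery at all. Everything a framework adds shows up in the profiler on top of this.

```python
# The minimal WSGI "hello world": every function a framework calls to
# render the same page is work layered on top of this one callable.
def hello_world(environ, start_response):
    body = b'Hello world'
    start_response('200 OK', [
        ('Content-Type', 'text/plain'),
        ('Content-Length', str(len(body))),
    ])
    return [body]
```

Served directly, this app would produce almost no profiler output at all, which is what makes the per-framework function counts meaningful.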

I am the primary author of one of the frameworks shown here (repoze.bfg). As a result, "lies and benchmarks" applies, of course, more than it otherwise would. But I've done my level best to make each framework do as little work as possible to render the page. If I've not, I'm sure the various framework authors will tell me how to improve things.

I've done the following:

  • I've created four "hello world" applications (one for each framework tested). These applications are available at in the directories named after their respective framework. The four frameworks I wrote applications for were repoze.bfg, Grok, Pylons, and Django.
  • I ensured I could run each using a WSGI server in order to be able to use
  • I placed the WSGI middleware into the pipeline in various ways within each application I tested.
  • I ran "ab" against the "hello world" page of a running instance of each application (via "ab -n1000 -c4") with the profiling middleware turned on within the WSGI pipeline.
  • I scraped the output of the "__profile__" page provided by repoze.profile after the "ab" run for each framework was completed. This output gives a rough indication of how much work was done during the run of "ab". repoze.profile uses the Python "profile" module to peek in and see what an application is doing under the hood.
  • I counted the number of lines output by the profiler (not including header information). This is a rough estimate of how "broad" the software is. Each line represents a specific function called. So the more lines, the more functions called, and (to some extent) the more you'll need to understand when it doesn't work right, when you need to change what it does, or just to plain-old understand what it does. By this measure, fewer lines is probably better. Of course, some frameworks might defer "one-time" work until the first request while others do it at startup, so this isn't a perfect metric. That said, why would they? Why not get it over with at startup time?

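The steps above can be sketched in miniature. The following is not the actual repoze.profile middleware, just an illustrative stand-in built on the stdlib cProfile and pstats modules; it accumulates profile data across requests and reports the same metric counted above, the number of distinct functions called.

```python
import cProfile
import pstats

class ProfilingMiddleware:
    """Illustrative stand-in for repoze.profile: wrap a WSGI app,
    accumulate profile data across requests, and report how "broad"
    the app is after a benchmark run."""

    def __init__(self, app):
        self.app = app
        self.profiler = cProfile.Profile()

    def __call__(self, environ, start_response):
        # Profile every request; the data accumulates across calls.
        return self.profiler.runcall(self.app, environ, start_response)

    def distinct_functions(self):
        # pstats records one entry per function called -- the rough
        # "lines of profiler output" metric used in this post.
        return len(pstats.Stats(self.profiler).stats)
```

After an "ab" run against the wrapped application, a smaller `distinct_functions()` count roughly means less machinery to understand when something breaks.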
Here are the results:

EDITED (see amended numbers at )

Each link above shows how the application was configured, what software versions were involved, how the application was invoked, the SVN link to the source code for the application, and any config tweaks attempted to make it faster. You can run these yourself if you want to; the results files and code taken together should contain all the information needed to replicate the results.

My blog signup is broken, and people without accounts can't comment, so if you don't have an account, please mail me if you'd like one in order to respond here.

EDIT: shameless self-promotion: I'll be giving a repoze.bfg tutorial this year at PyCon. Please sign up if you're interested.

Created by chrism
Last modified 2009-02-08 11:16 AM