Inversion of Control In Web Frameworks

You want control? Of course you do! Control is great! But I'll argue that sometimes getting *too much* control can mean creating more work for yourself when you write a web application.

This is going to be a painful blog entry to write, because I already know it won't go down very well with anybody who hasn't already drunk the type of Kool-Aid served up by the likes of Zope. I want to speak to developers and users of common Python "MVC" (or "MTV") web frameworks (Pylons, TurboGears, Django, etc). Specifically, I want to talk about inversion of control.

One of the common complaints about Zope as a web framework is that its level of control inversion tends to be very high. People hate control inversion. You damn well know you're developing an application using Zope, because it tends to want to do a lot of stuff for you that you'd otherwise do "by hand" if you were using a different framework. For instance, it badly wants to help you enforce a security policy in its own idiosyncratic way. It also assumes you're willing to do some things its way that you might possibly do different ways in other frameworks. For instance, it assumes you're comfortable consulting your application's "model" objects using an object graph (typically within a ZODB database).

You should feel free to blast away on the Zope concepts where you believe they're wrong with respect to inversion of control, because you're going to be right much of the time; jamming your application logic into its constraints is often tiring and sometimes it's just plain makework. Wisdom of crowds indicates that control inversion isn't very popular these days, because it imposes constraints.

Part of the reason I was compelled to start writing another web framework (repoze.bfg) was to create a framework with reduced control inversion. BFG is a framework that is unashamedly of a Zope heritage. But it drastically reduces some of the sillier manifestations of control inversion perpetrated by Zope 2 and 3 in what I consider the most important places (for instance, it has no dependency on ZODB, nor do you need to understand interfaces, or adapters, etc to use it). But, that said, it also selectively retains some of the same control inversion patterns that Zope pioneered. Such retention might be heretical these days, but I strongly believe some control inversion helps to provide a better partitioning of concerns when writing web applications. Had other frameworks encouraged (or at least made possible without "swimming upstream" from the masses) the kind of control inversion I carried over from Zope into BFG, I would not have written BFG at all.

One of the most obvious of these explicitly retained control inversion cases is how BFG authorization (security) defaults work. For applications that require security, BFG provides a more or less complete set of tools that allow you to make declarative security assertions using ACLs composed of permissions, group names, and user names. The set of assumptions made by the default implementation can be swapped out for a different implementation by a user of the framework in a controlled way. But for the most part, nobody using the framework is going to do that because security code is really no fun to write; everybody is going to use the defaults, or they'll use a different framework. Anyway, one of the core concepts of the default security policy implementation is that ACLs are attached to model objects. Of course, this means there needs to be something to attach these assertions to (usually the notional "model object"), so they can later be found and processed when a request reaches the application.
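To make "ACLs attached to model objects" a bit more concrete, here is a minimal sketch in plain Python. It is not the BFG API: the `acl` attribute, the `permits` function, and the principal names are all hypothetical, and the "first matching entry wins, otherwise deny" rule is an assumption made for the example.

```python
# Hypothetical names throughout; this is an illustration, not the repoze.bfg API.
ALLOW = "Allow"
DENY = "Deny"


class BlogEntry:
    """A model object that carries its own security assertions."""

    def __init__(self, title, owner):
        self.title = title
        # Each ACL entry is (action, principal, permission).
        self.acl = [
            (ALLOW, owner, "edit"),
            (ALLOW, "group:editors", "edit"),
            (ALLOW, "everyone", "view"),
        ]


def permits(model, principals, permission):
    """Return True if the first matching ACL entry allows the permission."""
    for action, principal, perm in getattr(model, "acl", []):
        if principal in principals and perm == permission:
            return action == ALLOW
    return False  # nothing matched: deny by default (an assumption)


entry = BlogEntry("Inversion of Control", owner="bob")
bob = {"bob", "everyone"}
alice = {"alice", "everyone"}
```

With this in place, `permits(entry, bob, "edit")` is allowed while `permits(entry, alice, "edit")` is not, and nothing about that decision lives in controller code.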

When a request reaches the application within BFG, unlike most other frameworks I've seen (other than Zope, of course) the first step is not to find the controller responsible for "doing all the work". Instead, there is an intermediate step that resolves the request (usually via the PATH_INFO present in the WSGI environment) into a "model" object. Typical BFG applications use graph traversal for this step, but it'd also be quite possible to use a different strategy (for example, using Routes for this purpose; there's no reason a route couldn't find/generate a model instead of just finding a controller; it just finds code based on URL patterns). In BFG, once the model object is found, the view (aka controller) is subsequently found based on the type of the model object and data in the request. The model object is then passed to the "view" (aka controller) along with a request object. The view is never responsible for composing a model object; this is handled by another subsystem. I can hear heads exploding now, but don't stop reading. ;-)
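The "find the model first, then find the view" flow can be sketched roughly like this. Everything here is made up for illustration (`Folder`, `traverse`, `VIEWS`, `handle`); the sketch looks up the view by model type only, whereas the real lookup also considers data in the request.

```python
class Folder(dict):
    """A container node in the model graph; children are looked up by name."""


class BlogEntry:
    def __init__(self, title):
        self.title = title


def traverse(root, path_info):
    """Walk the graph one path segment at a time and return the model found."""
    node = root
    for segment in filter(None, path_info.split("/")):
        node = node[segment]  # a KeyError here would become a 404
    return node


def folder_view(model, request):
    return "Folder listing: " + ", ".join(sorted(model))


def entry_view(model, request):
    return "Entry: " + model.title


# Hypothetical view registry keyed on the model's type.
VIEWS = {Folder: folder_view, BlogEntry: entry_view}


def handle(root, environ):
    """Resolve PATH_INFO to a model, then pick the view for that model."""
    model = traverse(root, environ.get("PATH_INFO", "/"))
    view = VIEWS[type(model)]
    return view(model, environ)


root = Folder(blog=Folder(first=BlogEntry("Hello")))
```

Note that neither view function constructs its own model; each is handed one that already exists.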

Embracing the "find the model first" step appears to be a big part of what differentiates BFG (and Zope) from most all of the existing crop of Python web frameworks. You might consider it egregious, but retaining some control inversion, where the framework is responsible for more than finding and calling controller code, has a demonstrable benefit in a good number of cases. If you partition your framework so that it creates the model object(s) first, the framework itself has a shot at enforcing a normalized security policy rather than needing to rely on ad-hoc application-space decorators attached to controller methods or imperative code in your application. BFG does just that: it examines the model object for an ACL, compares it to the credentials in the request, and allows or denies the continuation of the call into view code based on that decision. If the user doesn't have permission, no application code is called at all. It's all handled by the framework itself.
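As a rough sketch of that last point, here is what "the framework checks the model's ACL before any view code runs" might look like. The names (`dispatch`, `Forbidden`, the `acl` attribute, the `principals` key on the request) are assumptions for the example and not the BFG implementation.

```python
class Forbidden(Exception):
    """Raised by the framework when the ACL check fails."""


class BlogEntry:
    def __init__(self, title, acl):
        self.title = title
        self.acl = acl  # list of (principal, permission) pairs that are allowed


def edit_view(model, request):
    request["called"] = True  # lets us observe whether view code ran
    return "editing " + model.title


def dispatch(model, request, view, permission):
    """Framework-side step: consult the model's ACL, then call the view."""
    principals = request["principals"]
    allowed = any(
        principal in principals and perm == permission
        for principal, perm in model.acl
    )
    if not allowed:
        raise Forbidden(permission)  # the view is never invoked
    return view(model, request)


entry = BlogEntry("Hello", acl=[("bob", "edit")])
```

The view itself contains no security code at all; if the check fails, it simply never runs.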

In most Python MVC frameworks, there is an explicit reversion of control that tends to make this sort of pattern uncommon. ("It's my application, dammit, stop getting in my way! I just want to shove some HTML out to the screen from database tables!") In most Python web frameworks I've reviewed (e.g. Pylons and Django), the primary job of the framework is to find and invoke controller code. There is no step that finds model data first. Subsequently, it is the job of the controller (aka "view") to manufacture a model object "on the fly". In other words, there's no real persistent manifestation of a model object in most applications in these systems. Instead, model objects are composed directly by controller code (usually using some ORM query). So "model objects" don't actually exist before the controller code is run. There's just nowhere to hang a hat.

From a security perspective, what this tends to lead to is a proliferation of decorators (or plain old imperative code jammed into controller logic) that actually perform authorization. If you want a controller method to be protected with some sort of regularized security policy, you wrap it with a decorator that does some sort of work to figure out whether the calling user actually has the permission to execute the controller code you're protecting. That's straightforward enough, and works for a good number of applications. And it can impose a fairly straightforward declarative security policy for any given application (with the caveat that you need to remember the decorator).
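For readers who haven't seen the pattern, a decorator of this kind looks something like the following. It's a generic sketch, not taken from any particular framework: `require_permission`, `Forbidden`, and the dict-shaped request are invented for the example.

```python
import functools


class Forbidden(Exception):
    pass


def require_permission(permission):
    """Protect a controller with a global (not per-object) permission check."""
    def decorator(controller):
        @functools.wraps(controller)
        def wrapper(request, *args, **kwargs):
            # Only the request is available here, not the object being acted on.
            if permission not in request.get("permissions", ()):
                raise Forbidden(permission)
            return controller(request, *args, **kwargs)
        return wrapper
    return decorator


@require_permission("edit_entries")
def edit_entry(request, entry_id):
    return "editing entry %d" % entry_id
```

The check answers "may this user edit entries?" but has no way of asking about the particular entry named by `entry_id`.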

But there's one problem with using decorators on controller methods to do security enforcement: sometimes they can't know what to do! At the time the decorator code is executed, you might not have enough information to enforce a context-sensitive security policy. Sure, you might know that "Bob is allowed to edit blog entries" (based on security information attached to the incoming request that indicates the user is Bob and he has permission to edit blog entries). But what if you want to know if Bob can edit this blog entry (the one implied by the URL)? If you haven't engineered your application carefully enough, only the controller code itself can know which model objects it's interacting with, because it creates them. But sometimes introspecting the model data is required to enforce any declarative security policy, especially a "context sensitive" one.

IMO, this is why it's so uncommon to see "row level security" in frameworks that don't have some level of control inversion for a security subsystem. Folks who actually want "row level security" (aka context-sensitive declarative security) often just give up and do security checks imperatively within the controller code itself. This is because the controller code has enough information to allow the developer to make a decision. The really smart folks using these frameworks probably execute their own style of control inversion: they probably cause the security decorator to manufacture the model objects in order to get enough context to make some sort of decision. I can't know for sure, because I've just not read enough code or used these frameworks enough to know. Maybe you can tell me.
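A guess at what that homegrown control inversion might look like: the decorator is handed a loader, builds the model itself, checks it, and only then passes it to the controller. All names here (`require_owner`, `load_entry`, the in-memory `ENTRIES` table standing in for an ORM query) are hypothetical.

```python
import functools


class Forbidden(Exception):
    pass


# Stand-in for a database table; a real application would use an ORM query.
ENTRIES = {
    1: {"title": "Bob's post", "owner": "bob"},
    2: {"title": "Alice's post", "owner": "alice"},
}


def load_entry(entry_id):
    return ENTRIES[entry_id]


def require_owner(loader):
    """Decorator that manufactures the model so it can make a per-object check."""
    def decorator(controller):
        @functools.wraps(controller)
        def wrapper(request, object_id):
            model = loader(object_id)  # the decorator, not the controller, loads it
            if model["owner"] != request["user"]:
                raise Forbidden(object_id)
            return controller(request, model)
        return wrapper
    return decorator


@require_owner(load_entry)
def edit_entry(request, entry):
    return "editing " + entry["title"]
```

At this point the controller no longer composes its own model, which is the same "find the model first" step again, just written in application space.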

In any case, this is an example of how control inversion can actually help when creating a web application: you can write less code, and be reasonably confident that you can create a truly declarative context-sensitive security policy that will be enforced properly. Code which enforces a security policy is hard to get right, and no fun at all to write. Why not just write it once and potentially let the framework handle it for you? Especially if you can get out of jail by swapping out your own security policy if the default one doesn't work out for you?

Note that this blog has been attacked by spammers recently so registration doesn't actually work. If you'd like to make a comment here, just drop me a mail and I'll create an account for you here. I'd love to hear opinions about this topic, it's bugging the hell out of me for various reasons.

Created by chrism
Last modified 2009-01-07 02:57 AM

Nicely put

I really like this explanation. :-)

I think there are other good reasons for using IoC. For one thing, decent IoC makes components easier to unit test. If a component is responsible for instantiating or looking up the services it uses to do its work, then it's often harder to provide mock implementations of those services in a unit testing scenario.
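A small sketch of the testing point, with made-up class names: because `SignupService` is handed its mailer rather than looking one up or instantiating it, a test can pass in a fake.

```python
class SmtpMailer:
    """The real service; not something a unit test should touch."""

    def send(self, to, body):
        raise RuntimeError("would talk to a real mail server")


class FakeMailer:
    """Test double that just records what would have been sent."""

    def __init__(self):
        self.sent = []

    def send(self, to, body):
        self.sent.append((to, body))


class SignupService:
    def __init__(self, mailer):
        # The collaborator is injected, not created here.
        self.mailer = mailer

    def register(self, email):
        self.mailer.send(email, "Welcome!")
        return email


fake = FakeMailer()
service = SignupService(fake)
service.register("bob@example.com")
```

After the call, `fake.sent` holds the one message, and no mail server was involved.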

This also applies at runtime - by making services more swappable, you often have a hope of providing alternate implementations of services that can either change at run time or make a choice upon startup. In a system we were building (in Java, using the Spring IoC container) we had an external system that would sometimes go down. We had code that listens for that system going down and then swaps in a dummy/failsafe service for the service normally responsible for talking to that system.

The main problem with IoC is that you lose some degree of code locality (http://blog.ianbicking.org/2008/11/06/where-next-for-plone-development has some interesting opinions on this). If you have to cross-reference an external XML file to figure out the paths through your code, the framework can get tricky to use. This is partly an experience thing: in Zope we now use Zope 3 adapters for all kinds of things, which makes our applications very flexible, testable and re-usable, but when over-used they can make code hard to follow. When Zope 3 adapters were a new hammer, everything looked a bit like a nail.

But it's also a framework thing. Grok, for instance, tries to use convention-over-configuration to lead you down a path that makes code more local, whilst using some of the IoC principles in Zope 3 under the hood. BFG makes different choices, but also tries to steer you down a path of balance between IoC concepts and code locality.

Martin

swappability

There's some correlation between swappability and IoC, but I'm sort of less on about that than I am about the responsibility design of Python web frameworks. It's not that the current crop of un-Zope Python MVC frameworks doesn't make context-sensitive declarative security swappable. They do, in a huge way! In fact, you *have* to swap something else in because there's just no default at all. It's maximally swappable, and totally optional. If you want maximal control, this is fantastic.

There are efforts now to try to create a generic WSGI "authorization layer" that pushes some of that responsibility up into middleware and library functions. But because none of the existing web frameworks have any control inversion for security, enforcing context-sensitive declarative security means that you have to write your own decorators for use in your framework that actually know about your application. *Then* you have to plug them in to the generic layer.

This sounds sort of like the worst of both worlds to me: you can't use the defaults (they're not context-sensitive), so you have to write your own authorization code. And then you *still* have to understand the framework you're plugging in to. In this circumstance, the framework's responsibility is very, very light. Literally: provide a way to call the predicate that I wrote! So why would I use the framework?

wsgi authz

Hi Chris, you asked for other approaches.

You say something along the lines of "In most Python MVC frameworks ... From a security perspective, what this tends to lead to is a proliferation of decorators (or plain old imperative code jammed into controller logic) that actually perform authorization."

Or, in any Python MVC framework that is WSGI compliant, you can use a WSGI filter, based on the URL. That's the approach we're following. Provided you have RESTful resources (I think that's a sufficient but not necessary condition), you can separate the logic for parsing the URL and handling the authn and authz out into a filter and an authz service.
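A minimal sketch of such a filter, assuming the authenticated user has already been placed in `REMOTE_USER` by some upstream authn layer. The rule format and the names `make_authz_filter` and `rules` are invented for the example; a real authz service would presumably be richer than a list of URL patterns.

```python
import re


def make_authz_filter(app, rules):
    """Wrap a WSGI app; rules is a list of (compiled URL pattern, allowed users)."""
    def middleware(environ, start_response):
        path = environ.get("PATH_INFO", "")
        user = environ.get("REMOTE_USER")
        for pattern, allowed in rules:
            if pattern.match(path) and user not in allowed:
                start_response("403 Forbidden", [("Content-Type", "text/plain")])
                return [b"Forbidden"]
        # No rule objected: pass the request through untouched.
        return app(environ, start_response)
    return middleware


def app(environ, start_response):
    """The protected application; it contains no security code."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"entry body"]


protected = make_authz_filter(
    app,
    [(re.compile(r"^/entries/\d+/edit$"), {"bob"})],
)
```

The wrapped application is never modified; the filter only needs to understand its URL structure.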

You go on to say "At the time the decorator code is executed, you might not have enough information to enforce a context-sensitive security policy" and I think the implicit assumption is that if you are down at the row level in database access you might not have the authz criteria without extracting the row level data to get at it, in which case you are correct, you are in model dependent land. However, just as you argue for BFG, I would argue that you ought to be able to extract that information out into an access service as a separate step (which means that you have built a service which has a controller which understands the model, but if you have row dependent authz you're doing this logic ... somewhere).

The WSGI filter approach has the significant advantage that you can then develop an authz structure which is independent of the service which it is protecting, which one might hope to have much greater utility.

I wrote the above before reading the comment from optilude and your reply, which is about exactly what I'm saying. However, you imply that my application needs to know about the security layer, well, no, the whole point is that I want to take any application, and provided I can understand the URL structure, I can write an access control module to plug into the wsgi authz layer. Yes, I have to do some work, but I don't have to touch my application code, which means that next week, when someone demands I support openid33 with saml56 authz (or whatever), I can do so, without mucking with my app.

I think the bottom line is that you can't hide the fact that one needs to make security assertions *somewhere*. Now I've reread what you've written, and what I've written, and I reckon my head is exploding as you said it might, but I think the URL pattern matching concept is the same. So why don't you want it in a WSGI layer?

@bryanl

Hi Bryan. No, you're absolutely right. It's not actually a need to find model data. It's just that we need to find a "security context". Zope happens to conflate the two, so I was using them interchangeably above, but they're logically independent.

I have written a WSGI authorization system a lot like the one you describe at http://www.plope.com/Members/chrism/decsec_proposal (proposal) and http://www.plope.com/Members/chrism/decsec_revisited (implementation notes). Pros: it handles the "global assertions" case fine (just make them at the root), but it can also handle the more granular assertions. Cons: it makes you have to consider the granular case when you read its documentation. I fear that folks just aren't willing to live with the level of control inversion that using something like it imposes.