Service Process Fascination

Why "service process" fascination is often a symptom of poor design.

I have an admission to make. I am not a fan of systems that are composed of an integration of heterogenous cooperating service processes.

When I say "service", I mean that the process provides some stovepiped service to a caller (perhaps a "web service" or a database network API). When I say "heterogenous", I mean that each process performs some task that it's "good at", implicitly leaving other processes to perform the other tasks that they are "good at."

When I say that these service processes are "cooperating", what I really mean is that some other process integrates the communications between all these service processes together to form some end-user-facing system. A corollary is that the end-user-facing system probably won't work (at least in any way that is explicable to the end user himself) if any of the service processes dies or gets wedged, even if the remainder of the service processes survive.
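To put a rough number on that corollary, here is a back-of-the-envelope sketch. It assumes (and this is an assumption, not a measurement) that each service process fails independently of the others and that the end-user-facing system is only up when every one of them is up. The function name and the availability figures are made up for illustration.

```python
import math


def composite_availability(availabilities):
    """Return the fraction of time a system is up if it needs *every*
    service process to be up, assuming the processes fail independently."""
    return math.prod(availabilities)


# Hypothetical figures: each process is individually up 99% of the time.
one_process = composite_availability([0.99])
four_processes = composite_availability([0.99] * 4)

print(f"1 process:   {one_process:.4f}")
print(f"4 processes: {four_processes:.4f}")
```

Under those assumptions, four processes that are each up 99% of the time yield a system that is up only about 96% of the time. Real failures are rarely independent, so treat this as an illustration of the trend rather than a prediction.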

Such a setup makes sense sometimes. It makes sense particularly when each service process is a stable, well-tested, widely-used, well-supported, well-defined, well-documented subsystem written in a style or language widely divergent from any other service process. For example, using a Postgres database as a "service process" in a system otherwise composed of Python is not often very controversial.

But even when you use non-controversial "service processes", the number of service processes in any integration matters. A lot. Each service process a system takes on implies the following:

  • Maintenance (e.g. Postgres "vacuum").
  • Crash monitoring / email notification.
  • Performance monitoring and amelioration.
  • Cognitive load related to setup and API.
  • Code integration.
  • Build automation.

It may be fun to think about tying Postgres, CouchDB, Solr, and some web service together via a frontend that integrates them all. Some folks may consider this highly practical because the result is termed an integration of "best of breed" components. The perceived benefit of such an integration is that the whole becomes better than the parts. The assumption is that a system that uses more off-the-shelf service components and less custom code will be easier to manage, or it will be faster, or it will be prettier, etc.

My experience says that such an assumption is usually just wrong. In any small or moderately-sized system, it's usually a mistake to rely on more than a single service process when the other potentially useful service processes overlap the functionality of the one you already have. When a system doesn't strictly need the functionality of extra running processes, I think a high number of service processes is a symptom of either optimism about other systems' capabilities or complexity fascination.

Sometimes adding a new service process is actually a net win: by introducing it, you remove more complexity from the system than you add. But often you can't. You've added some amount of complexity by adding the new service process, but for various practical reasons you find yourself unable to retire the code or process that it was meant to replace, so you don't remove any complexity from the system. Instead, you just add some.

Let's take a concrete example. Let's say you've developed a system that uses Postgres as a persistence mechanism. Now let's say you have a new requirement: you need to add full-text indexing to the system. Postgres has fairly good full-text indexing capabilities. But you see that Solr has some shiny feature that promises to make your life much better if you'd just be willing to take the time to learn and integrate it.

I say don't fall for it. Just use the Postgres full-text indexer, and fill in any missing functionality with custom code. It's almost guaranteed that your system will only gain complexity by adding Solr. This is because you're still going to need to manage Postgres; you can't replace it with Solr. Even though Solr's feature set might be better, and its promises may be shinier, the full-text indexing service which Postgres offers is probably good enough unless you're truly reaching to invent requirements.
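As a rough sketch of what "just use the Postgres full-text indexer" can look like from Python, the helper below builds a ranked search query from Postgres's built-in `to_tsvector`, `plainto_tsquery`, and `ts_rank` functions and the `@@` match operator. The `documents` table, its `id` and `body` columns, and the `build_search_query` name are hypothetical, and the `%s` placeholders assume a DB-API driver such as psycopg2.

```python
def build_search_query(terms, limit=10):
    """Build a parameterized Postgres full-text search query.

    Assumes a hypothetical table `documents(id, body)`. Returns the SQL
    string and its parameter tuple; nothing is executed here.
    """
    sql = (
        "SELECT id, ts_rank(to_tsvector('english', body), query) AS rank "
        "FROM documents, plainto_tsquery('english', %s) AS query "
        "WHERE to_tsvector('english', body) @@ query "
        "ORDER BY rank DESC "
        "LIMIT %s"
    )
    return sql, (terms, limit)


sql, params = build_search_query("service process")
print(sql)
print(params)
```

With an open cursor from such a driver, the result could be run with `cursor.execute(*build_search_query("service process"))`. On a table of any real size you would probably also want an index on the `to_tsvector` expression, but even then there is still only one service process to look after.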

The addition of a new service process should be a monumental event if you want your system to be as complexity-free and stable as possible. Even if it means the system is slightly slower, or some edge requirement becomes impossible to satisfy, a system that works all the time, every time, and which people can understand is usually more valuable than any individual outlying feature. A system with many moving parts is just naturally harder to understand and manage than one with fewer.

It's bad enough to pile existing "best-of-breed" services onto some integration. But there is also a degenerate case of adding service processes: writing service process code which is only useful in the context of a single integration. For example, some folks believe that composing an application out of many highly focused process-bounded services for every project is a good idea. They aren't reusing some existing service process; they're inventing each service process for the purpose of a single integration. Personally, I think this is just not sane. It may be easier to determine responsibilities by separating services between processes, but I think this is mostly best used as a mind game. Once you've figured out the responsibilities of each subsystem, if the chance that you're going to document some particular subsystem well enough for other people to use it is vanishingly small, just put all the subsystems into a single process. If it's not documented for use outside of a particular integration, it doesn't rate being its own service process.

Created by chrism
Last modified 2009-12-23 10:51 AM

Yes indeed.

Having spent most of 2008 (and a bit of 2007 and 2009) working *full time* on the code that runs, this really speaks to me. Back in 2007 for a while we had nearly all of the paid staff at TOPP hacking on that system. Heck, Ian even wrote a new build system to build the sucker (see if you missed it the first time around...) Implementing features took forever. Debugging was a nightmare.

Today, there's nearly universal agreement at TOPP that we would've had a much easier time if the whole stack had run in a single process, and that's how we tend to work on new projects (databases being the typical exception, and nowadays each of our current projects typically uses a single database).

It's an old truism: Systems with fewer moving parts are less fragile.