A Theory On Collaborative Development Using Zope

** Note: This is a reprint. The old revision got lost in an inadvertent deletion. **

Zope is very different from other applications that might be used for the same purpose, such as JBoss, ACS or Vignette. Its major differentiator is that the persistent data that users of your application interact with does not need to be kept in a relational database or on the filesystem. Although Zope can make use of these storage systems, in many Zope applications, the data is kept in Zope's object database, the ZODB. This is both a boon and a curse.

It is a boon because thinking of your data in terms of objects is arguably easier than thinking of it in terms of columns and rows or files on a filesystem, particularly when the data is hierarchical. Additionally, if you keep your data in objects within the object database, you can make use of a lot of Zope built-in features such as the ZCatalog, which allows you to index and retrieve information quickly. Also, you can interact with your Zope objects through the Zope Management Interface (the ZMI) to quickly and interactively test different development scenarios.

It is a curse, too, however. The ZODB is written in a format that only the Python language can understand natively. This means that it is more difficult to interact with your data "files" by using your normal development tools such as text editors that only know about operating system file systems. There are interfaces that allow you to edit Zope objects in these tools, such as Zope's FTP and WebDAV integration, and the ExternalEditor product, but it's still not quite as easy to work with the data as it would be were it on your filesystem.

Zope is a multiuser system. In a lot of ways, it is the web equivalent of an operating system. If you were willing to throw away your existing development tools and OS and embrace Zope as your new operating system, it might be an interesting test. But the lack of tools for Zope as an OS in comparison to the tools that exist for other operating systems, such as Windows or UNIX, is dramatic. CVS is one of these tools. Zope has no equivalent system of sufficient maturity to replace CVS or any of the other mature version control systems available for standard operating systems. There are many other examples of the paucity of mature and available development tools within the Zope-as-OS environment. Thus, it's in your best short-term interest to interface with normal OS tools to whatever degree possible when writing code for Zope, allowing Zope to do what it's good at, and allowing your existing tools to do what they're good at.

The key to making this somewhat confusing situation less of a problem is to encourage a development practice which does the following:

 - Allows developers to interface their existing toolchain with
   Zope.

 - Devalues the data that goes into any given ZODB storage.

The first point is fairly obvious. When you write Zope product code, you write it with normal tools on a normal operating system in a normal text editor. While doing so, you write Python classes which Zope understands, and when you put the product code into a place where Zope can find it, users of your Zope application may interact with instances of these classes. They create instances, they destroy instances, they modify instances, and so forth. These instances are stored in Zope's object database. However, the classes from which these instances are created are stored on your filesystem and may be managed and edited with normal operating system tools.

But, ideally, the actual instance data that gets stored in the ZODB doesn't have any real enduring value until your application goes into production. During development, that data is typically limited to instances of data objects like documents, and some instances of "tools", which are required for your application to work properly. If you buy this argument, during development, it should be possible to completely erase your Zope database and start over again from some known-working point, requiring only the Python code that exists on the filesystem to do so. This is the key to successfully doing collaborative development of a Zope application. If you follow this advice, you will be able to use your normal filesystem tools to manage all of your code, making it much easier to do things like version control and deployment.

The only real value in your application during development should exist in the Python filesystem code which makes up your Zope product. The data that exists in any given ZODB instance should be disposable. This allows your developers to work with their own copy of a ZODB database. As they perform work, they modify their own ZODB copy, but no other developer sees (or needs to see) the changes they make to that ZODB copy, as long as the developer provides a mechanism for recreating "his" ZODB instance data in a repeatable way that can be used by other developers.

To benefit from this model, you should prepare your development environment by:

 - creating a CVS repository to hold your Zope Python code and using
   it religiously.

 - writing a set of "buildout" scripts which put together all the
   right filesystem code such as Zope, third-party products, and
   your own application code in a repeatable way, so you can get
   people up and running quickly with a Zope environment as they
   join your development team.  This code should be put under
   version control and updated regularly as your system dependencies
   change.

 - writing and continually maintaining Python code to create a
   known-working baseline ZODB instance which any developer may run
   at any time after ditching an older copy.  This code may be
   referred to as "generator" code.  This code should also be put
   under version control and updated regularly as your own
   application code changes.

All of this is likely gibberish to you without a bit of context, so I will provide a few examples of what a particular set of buildout and generator scripts might do, and how different developers on the same team might interact with them.

Let's say we're embarking on a project to create a document management system in Zope. Bob, a developer, is in charge of creating the part of the application that ensures that document data is searchable, and that documents can be easily retrieved when a user performs a query through the application's web interface. Alice, another developer, is in charge of the part of the application which allows users to actually edit the documents that are stored in the system. Bob doesn't care (except on an empathic basis) about Alice's problems; nor does Alice care about Bob's problems. Although these folks are working together on the same team, and even on the same system, they have different roles and responsibilities. A problem is that they do need to share (and perhaps change) the same bits of code: the class or classes which represent documents.

Bob and Alice could work on the same machine against copies of the same files on the filesystem which holds their Zope product code. They also could work within a shared instance of Zope, meaning that they would be sharing the same object space and data for the course of the development project.

However, there's a bit of a problem. There is no equivalent to CVS in Zope itself. This means that the data that gets stored in Zope's ZODB cannot be versioned in the same controlled way as the filesystem code that Bob and Alice are sharing. If Bob and Alice do development within the same Zope instance (and thus in the same ZODB), they run the risk of stepping on each other's toes in a mighty way as a result. What to do?

Neither Bob nor Alice wants to step on the other's toes. So they set up a CVS repository to manage changes to the code they share. Now both Bob and Alice can change the Python filesystem code which comprises their application in a controlled way.

For convenience's sake, Bob writes a script which, when run, puts a Zope instance on his filesystem with all the right products in the right places, representing a runnable copy of Zope with all the right filesystem product dependencies. He calls that a "buildout" script and gives it to Alice, so she can run it as well without needing to do all of the same work. At that point, they can each repeatably build out a Zope setup on their own systems, using CVS to share code between them. Problem solved? Not quite.
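For illustration, a buildout script of the kind Bob writes might look something like the minimal sketch below. It is not a real tool: the function name `build_instance`, the directory names, and the assumption that every subdirectory of the checkout is a product are all made up for the example. A real script would also install Zope itself and any third-party products.

```python
import shutil
import tempfile
from pathlib import Path


def build_instance(checkout_dir, instance_dir):
    """Lay out a fresh instance directory from a source checkout.

    Hypothetical layout: every subdirectory of checkout_dir is a product
    and is copied into instance_dir/Products.
    """
    checkout = Path(checkout_dir)
    instance = Path(instance_dir)

    # Start from a clean slate so the result is repeatable.
    if instance.exists():
        shutil.rmtree(instance)

    products = instance / "Products"
    products.mkdir(parents=True)
    (instance / "var").mkdir()  # where the ZODB file would live

    installed = []
    for product in sorted(checkout.iterdir()):
        # Skip plain files and CVS's own bookkeeping directory.
        if not product.is_dir() or product.name == "CVS":
            continue
        shutil.copytree(product, products / product.name,
                        ignore=shutil.ignore_patterns("CVS"))
        installed.append(product.name)
    return installed


# Demonstration against throwaway directories.
with tempfile.TemporaryDirectory() as tmp:
    checkout = Path(tmp) / "checkout"
    (checkout / "DocumentManager").mkdir(parents=True)
    (checkout / "DocumentManager" / "__init__.py").write_text("# product code\n")

    instance = Path(tmp) / "instance"
    first = build_instance(checkout, instance)
    second = build_instance(checkout, instance)  # rerunning is safe
    product_file_exists = (
        instance / "Products" / "DocumentManager" / "__init__.py"
    ).exists()
    var_dir_exists = (instance / "var").is_dir()
```

The script deletes and rebuilds the whole instance directory on every run, which is what makes it safe to hand to a teammate: the result depends only on what is in version control.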

There is still the issue that both Bob and Alice are working on problems that, for better or worse, have ZODB dependencies, such as their "document tool" which performs utility actions against documents. This code is under heavy flux and both Bob and Alice need to change it from time to time. Worse, when they make changes to that code, because the underlying code performs actions against persistent objects which live in the ZODB, the changes made by Alice to the document tool may "break" Bob's environment because the code needs to make assumptions about the persistent objects it operates against. If the document or document tool classes change materially (also known as a schema change), these changes have the potential to "break" Bob's or Alice's Zope when either performs a CVS update to get the other's latest changes. Even seemingly innocuous changes can have this impact, so it's a real problem.
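To make the schema-change problem concrete, here is a small sketch in plain Python. The ZODB stores instance state by pickling it, and unpickling rebuilds an object without calling `__init__`. The helpers `save_state` and `load_state` are hypothetical stand-ins that mimic that behaviour so the example runs without Zope, and the `Document` class and its attributes are invented for illustration.

```python
class Document:
    """First revision of the class: documents only have a title."""

    def __init__(self, title):
        self.title = title


def save_state(obj):
    # Stand-in for what the database keeps: the instance's attributes only.
    return dict(obj.__dict__)


def load_state(cls, state):
    # Like unpickling, this rebuilds the instance WITHOUT calling __init__.
    obj = cls.__new__(cls)
    obj.__dict__.update(state)
    return obj


# Bob's database holds a document created under the first revision.
stored = save_state(Document("Quarterly report"))


class Document:  # Alice's revision, arriving via a CVS update
    def __init__(self, title):
        self.title = title
        self.keywords = []  # new attribute that old instances never got

    def index_terms(self):
        return [self.title.lower()] + self.keywords


old_doc = load_state(Document, stored)
try:
    old_doc.index_terms()
    broke = False
except AttributeError:
    broke = True  # Bob's existing data no longer matches the code

# A document created under the new code, as a regenerated database would hold.
fresh_doc = Document("Quarterly report")
```

The old instance fails only when the new method touches `keywords`, which is why a change that looks innocuous in a code review can still break someone else's sandbox.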

To avoid this situation, Bob and Alice resolve to create a script which, when run, generates a ZODB that reflects a nominal Zope "working environment". At any time, this script can be run, and it will generate a ZODB with the right bits of data and the right objects in it (such as the document tool and a demonstration set of documents), which represents their document management application in its current state. If a problem happens when Bob updates his sandbox with Alice's changes, he just runs the script and throws away his old database.
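The shape of such a generator script can be sketched as follows. To keep the example runnable without Zope, a JSON file stands in for the ZODB; that substitution is an assumption of the sketch, as are the function name `generate_baseline` and the sample content. A real generator would open the ZODB and create instances of the product's own classes instead.

```python
import json
import os
import tempfile


def generate_baseline(db_path):
    """Throw away any existing database and write a known-working baseline."""
    if os.path.exists(db_path):
        os.remove(db_path)  # the old data is disposable by design

    baseline = {
        "document_tool": {"indexes": ["title", "body"]},
        "documents": {
            "welcome": {"title": "Welcome", "body": "A demonstration document."},
            "howto": {"title": "How to search", "body": "Type a query."},
        },
    }
    with open(db_path, "w") as f:
        json.dump(baseline, f, indent=2, sort_keys=True)
    return baseline


with tempfile.TemporaryDirectory() as tmp:
    db_path = os.path.join(tmp, "Data.json")

    generate_baseline(db_path)
    # Simulate a sandbox that has drifted during development.
    with open(db_path, "w") as f:
        json.dump({"documents": {"scratch": {}}}, f)

    generate_baseline(db_path)  # Bob reruns the generator after an update
    with open(db_path) as f:
        regenerated = json.load(f)
```

Because the generator always starts by discarding the old file, it never has to reason about what state the previous database was in.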

As an added bonus, over the course of the project, both the buildout scripts and the generator scripts are maintained in such a way that at any time a new developer can join the team, and within a few minutes, he will have a working environment that represents the current state of the document management system by running the buildout first, then the generator. As a doubly added bonus, we can do the same thing when the site eventually needs to go into production; we just incorporate the generator script into the instructions on how to build the site.

Problem solved? Yep.

To make the most of Zope collaborative development, you should treat a particular ZODB as a throwaway entity, much like an old relational database instance that a developer has lying around after the project he's working on has decreed a number of schema changes which make his database instance incompatible with the larger application.

Another document by Kapil Thangavelu espouses the same view.

Created by chrism
Last modified 2004-02-18 02:34 PM

A few apples and oranges

First...kd Lang and Cowboy Junkies, nice choices. I've never listened much to Leonard Cohen,
despite you telling me to through the years.

I fully sympathize with the point you are making. At the same time, though, the ZODB drawback
applies to RDBMS-based systems as well. I don't think many people would consider a CMS that had
no database, which used the filesystem as the database.

With RDBMS-based systems, do developers practice the approach you suggest? Or, do you think these
systems have native facilities? Perhaps instead, people have learned to accept the limitation.

yep...

You didn't mention Pantera or Black Sabbath. You don't know what you're missing. ;-)

I think it's likely that developers on projects that use an RDBMS do the same thing. From time to time, they junk a local copy of the database and build another one out from a dump file or whatever; the dump file and the script to import the dump file are the equivalent of the "generator" script in my example. There is just some extra confusion when it comes to Zope: it's not always obvious that you should devalue the persistent objects in the ZODB, because a lot of the objects in any given ZODB seem like code instead of data. But I'm asking you to treat it as though it contains only unimportant application data.

Some animals are more equal than others

Your advice about the ZODB works pretty well during initial development; where it
falls down is after the site has gone live; at that point, developers aren't free
to trash persistent data, because it is "real", rather than faked.

I would therefore add to your dicta:

- as soon as you discover any "precious" ZODB data (e.g., business rules
represented as object configurations), write an export / import tool for
said data.

The "export" dumps from such a tool can then be used as part of your generator;
from time to time, or after making an important change in the ZODB, you rerun
the exporters, and check in the resultant files.

Developers will then be free to buildout and test with a sandbox which actually
reflects such business logic.
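An export / import tool of this kind could be sketched as below. The `WorkflowRule` class, its fields, and the choice of JSON as the dump format are all assumptions made for illustration; the point is only that the dump is plain text which can be checked in and replayed by the generator.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class WorkflowRule:
    """Hypothetical 'precious' configuration object edited through the web."""
    state: str
    transition: str
    required_role: str


def export_rules(rules):
    # Sorted, indented output keeps version-control diffs small and readable.
    return json.dumps([asdict(r) for r in rules], indent=2, sort_keys=True)


def import_rules(text):
    return [WorkflowRule(**item) for item in json.loads(text)]


rules = [
    WorkflowRule("draft", "submit", "Author"),
    WorkflowRule("pending", "publish", "Reviewer"),
]
dump = export_rules(rules)     # check this text into version control
restored = import_rules(dump)  # the generator would call this on a fresh database
```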

good point.

Absolutely. Where configuration that is changed through the web (TTW) can't be kept in files on the file system (historically, workflow definitions or other tool settings), these objects need to have an import/export mechanism. I have to admit that dealing with ongoing development on a Zope site that has gone into production is something I haven't needed to do a lot of (which is pretty sad, actually).

That gets me to thinking. Given the general amount of confusion caused by the historical overhyping of Zope *itself* as a collaborative development environment, at some level, I can empathize with AMK's "Why we don't use Zope" article (http://www.amk.ca/python/writing/why-not-zope.html). It's important that folks understand the limitations of the TTW development model; they risk becoming disillusioned with Zope if they don't. Ah well. At some point, the Zope-as-OS project will be completed. You'll know that day has come when we embed Emacs. Or vice versa. ;-)

Thank you for this!

Back in 2004, I relied on this document heavily to develop standards and scripts for the Plone app I built at my employer, ThoughtWorks. It worked well, so thank you very much for writing this! :-)

I solved the "but what happens when you go into production" problem by keeping everything in ZODB build-out and dump/import scripts as much as possible. Most of my important data was stored in Oracle, so updates to production involved one or two hand-written SQL ALTER statements from time to time when table schemas changed. I never quite got comfortable using Zope to do equivalent updates and alters to class instances in the ZODB: since the instance update process in the ZODB was opaque to me, I never really trusted it (or my ability to debug it). :-(

The only data that stayed in the ZODB was users and their passwords. On every production release, I wrote a script to dump the user information to CSV, then a script to reimport it when I deployed the new production version. If I had just used LDAP from the start, that would have saved me a few steps.

Anyway, thanks again, this really helped me figure out "the right way" to do Zope development.