Buildout Ghettoization

zc.buildout's ghettoization effect on the Python package installation process.

Although I have resigned myself to supporting it, I'm still unhappy with zc.buildout. This is mostly because it tends to ghettoize the packages that are written to use some of its core features. The core features that tend towards ghettoization fall into three areas: dependency info, testing info, and script generation. Here's what I don't like.

Dependency Information Balkanization

Critical dependency version information is kept outside of setup.py in buildout.cfg, making it more likely that a package won't work when 'easy_install'ed. Example: roughly 90% of the Zope3-related packages and any packages that depend upon them were not 'easy_install'able from PyPI for a few months due to two external dependency pins done in buildout.cfg files. Installation via buildout worked, of course, and the packagers never noticed that the packages couldn't be 'easy_install'ed, even though the packages are all published to PyPI, and thus presumably 'easy_install'able.
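To make the point concrete, here is a minimal sketch of a setup.py that carries its own version bounds, so that easy_install sees the same constraints a buildout would. The package name and every version bound below are hypothetical, chosen only for illustration; they are not taken from any real Zope release.

```python
# Sketch of the metadata a setup.py could carry so that version bounds
# travel with the package instead of living only in a buildout.cfg.
# All names and version numbers here are hypothetical.
SETUP_KWARGS = dict(
    name="example.package",
    version="0.1",
    install_requires=[
        "setuptools",
        # A bound that would otherwise exist only as a buildout.cfg pin:
        "zope.interface>=3.4,<3.5",
        "zope.component>=3.4",
    ],
)

# A real setup.py would finish with:
#     from setuptools import setup
#     setup(**SETUP_KWARGS)
```

With the bounds declared this way, any installer that reads install_requires gets the same information, whether or not a buildout is involved.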

Testing Info Balkanization

Because the de facto test regime is the zope testrunner invoked via bin/test, buildout-centric package developers are hostile to test invocation via setup.py test, although setup.py test has been adopted in the larger Python community as the expected way to run a single package's tests. Example: another committer "fixed" a checkin of mine by removing stuff from the "tests_require" line of setup.py in a package housed on svn.zope.org. This makes setup.py test not work in a fresh checkout. bin/test after a buildout continues to work, so the packagers just never notice.

Script Generation

Scripts of dependent packages aren't installed automatically by buildout, while easy_install does install them. So the result of installing an egg by putting it in a buildout eggs= line is not equivalent to installing using setup.py install. This means that packages which have post-setup routines (such as creating an instance) which expect dependent packages' scripts to be installed won't work. Example: the repoze.plone package depends on the repoze.zope2 package. The repoze.plone package is a meta-egg that names as its dependencies repoze.zope2 and all Plone product and library eggs. When repoze.plone is 'easy_install'ed, a script from its repoze.zope2 dependency named mkzope2instance is put into the bin directory. However, when it's named as an egg in a buildout.cfg, the script is not installed, making it impossible to create an instance. This is fixable by adding a zc.recipe.egg:scripts recipe and naming each egg I want to install scripts for in that recipe's section, so it's not intractable; it just violates the rule of least surprise.
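The workaround might look roughly like the buildout.cfg below. It is embedded in a Python string and read back with the standard library's configparser only so that the sketch can be run and checked; buildout itself does not need this wrapper. The part name "scripts" is an arbitrary choice, and the recipe is spelled zc.recipe.egg:scripts to match the zc.recipe.egg package name.

```python
import configparser

# Hypothetical buildout.cfg; the part name "scripts" is arbitrary.
BUILDOUT_CFG = """\
[buildout]
parts = scripts

[scripts]
recipe = zc.recipe.egg:scripts
eggs =
    repoze.plone
    repoze.zope2
"""


def script_eggs(cfg_text, part="scripts"):
    """Return (recipe, eggs) for one part of a buildout-style config."""
    # Only "=" is treated as a delimiter so the ":" in the recipe name
    # is kept as part of the value.
    parser = configparser.RawConfigParser(delimiters=("=",))
    parser.read_string(cfg_text)
    eggs = parser.get(part, "eggs").split()
    return parser.get(part, "recipe"), eggs


recipe, eggs = script_eggs(BUILDOUT_CFG)
print(recipe, eggs)
```

Naming repoze.zope2 explicitly is what asks the recipe to generate its scripts, including mkzope2instance; listing repoze.plone alone would not.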

Scripts are generated to have long PYTHONPATHs because eggs go to locations that aren't already on the PYTHONPATH. The environment generated by a buildout is roughly equivalent to a virtualenv, so I don't understand why we shouldn't just make a virtualenv, and have zc.recipe.egg put eggs in its site-packages, and manage a .pth file to activate and deactivate them as necessary. Scripts wouldn't need to contain any PYTHONPATH info then. This would be more in line with what easy_install already does, which would make it easier for people who haven't yet drunk the buildout Kool-Aid to understand.
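As a rough sketch of the .pth idea (not anything zc.recipe.egg actually does), a recipe could keep one path per line in a .pth file inside the virtualenv's site-packages, adding a line to activate an egg and dropping it to deactivate. The function names and the file name used in the usage note are made up for illustration.

```python
import os


def read_entries(pth_file):
    """Return the non-blank lines of a .pth file (empty list if absent)."""
    if not os.path.exists(pth_file):
        return []
    with open(pth_file) as f:
        return [line.strip() for line in f if line.strip()]


def write_entries(pth_file, entries):
    """Rewrite the .pth file with one path per line."""
    with open(pth_file, "w") as f:
        f.write("".join(entry + "\n" for entry in entries))


def activate(pth_file, egg_path):
    """Add egg_path to the .pth file unless it is already listed."""
    entries = read_entries(pth_file)
    if egg_path not in entries:
        entries.append(egg_path)
    write_entries(pth_file, entries)


def deactivate(pth_file, egg_path):
    """Remove egg_path from the .pth file if it is listed."""
    entries = [e for e in read_entries(pth_file) if e != egg_path]
    write_entries(pth_file, entries)
```

Calling activate("site-packages/buildout-eggs.pth", "./foo-1.0-py2.5.egg") would put the egg on sys.path at interpreter startup, since Python's site module appends each path listed in a .pth file, and generated scripts would then need no path manipulation of their own.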

In any case, I'm, as usual, bitter but dealing. ;-) One buildout for repoze.plone is on the burner.

Created by chrism
Last modified 2008-02-24 05:49 PM

Well..

I rather like buildout, as you know, because it's the only stable way we've managed to get an egg-based distribution of Plone to work. I think Zope 3 and Grok are experiencing a similar thing. If you've watched a few non-Python people royally break their Python installations (or slightly better, a virtualenv if they manage to stay inside it) with easy_install then you may feel that pain.

I completely agree with the goal that we should not have buildout be a dependency - it's a choice of convenience and consistency. But practice has shown that it works out.

Which is why we're really grateful that you're making a repoze.plone buildout. We need it. :-)

To take your points in turn:

1) Dependency Information Balkanization

This indeed sucks, and should probably be seen as a stop-gap measure owing to the fact that Zope 3's eggification process was messy and probably not fully thought through. However, no-one seems to have been able to come up with a proper way of managing a release process like Zope 3's where every package is its own egg and you need to track "known good" configurations without an external tracking of "known good sets" like what we currently do with buildout. If you have a practical way to do this, I'd love to hear it.

Of course, easy_install should work for a package. That it doesn't is definitely a bug that should be fixed.

2) Testing Balkanization

For Zope 2/Plone at least, this issue doesn't have anything to do with buildout - it has to do with the way that ZopeTestCase works and Zope's product machinery. No-one particularly likes running through zopectl to run tests. I don't know enough about the machinery to support "python setup.py test" to comment on how/whether that's a problem.

3) Script Generation

I was puzzled at this too. I think buildout should generate scripts by default, or at least have the option to behave that way. I can see cases where this isn't necessary, of course.

Having buildout use virtualenv as an implementation detail may be an interesting choice. Why not bring it up on distutils-sig and see what Jim thinks?

Cheers,
Martin

scripts

Automatically creating bin/* scripts when they are defined in an egg should Just Happen imho. I was quite surprised that it didn't work when I included a docutils egg: "where is my doc/rst2html???".

@martin...

> I rather like buildout, as you know, because it's the only stable way
> we've managed to get an egg-based distribution of Plone to work. I
> think Zope 3 and Grok are experiencing a similar thing. If you've
> watched a few non-Python people royally break their Python
> installations (or slightly better, a virtualenv if they manage to
> stay inside it) with easy_install then you may feel that pain.

I have indeed, and I sympathize. I obviously don't have an objection
to something that operates like buildout, and I recognize that it's
"good magic" for a lot of people who can't be bothered to figure out
how it all works.

> I completely agree with the goal that we should not have buildout be a
> dependency - it's a choice of convenience and consistency. But
> practice has shown that it works out.

Of course, you dance with who brung ya... it works, and it works
pretty well. But because of that, nobody's going to really care about
the larger balkanization issues, nor maybe should they. Although that
makes me sad, because the goal of eggifying Zope was to make it
easier for non-Zope people to use the components by themselves. It's
less useful to eggify them if the eggs have to be installed by a
particular build system to work.

Encouraging people to use TTW scripting in the face of TTW script
security in Zope worked out too, but it drove a lot of "normal" Python
people away from Zope (thus balkanizing the Python webdev community,
and later ghettoizing Zope). Until buildout picks up steam in the
non-Zope world, we will hopefully work hard to be good Python citizens
by ensuring that eggs we release can be installed in a non-buildout
way.

> Which is why we're really grateful that you're making a repoze.plone
> buildout. We need it. :-)

Heh. It's coming up, whether anybody needs it or not. ;-)

> 1) Dependency Information Balkanization
> This indeed sucks, and should probably be seen as a stop-gap measure
> owing to the fact that Zope 3's eggification process was messy and
> probably not fully thought through. However, no-one seems to have
> been able to come up with a proper way of managing a release process
> like Zope 3's where every package is its own egg and you need to
> track "known good" configurations without an external tracking of
> "known good sets" like what we currently do with buildout. If you
> have a practical way to do this, I'd love to hear it.

Of course! I've had hours of discussions about this on zope-dev and
in IRC. Make package indexes, each of which represents a given KGS
(representing a release, or an application, or a deployment, or a
customer project, or whatever), *do not include packages that are not
known to be good* (as the current Zope KGS does, mirroring all
non-Zope packages in PyPI in realtime), and instruct people that want
to set up a known-to-work project to easy_install into a virtualenv
only from one of them. If they want to install a package from some
other index into that virtualenv, they're free to do so, but, like
installing, say DAG or FreshRPMs as opposed to Fedora ones, it might
not always work out for them.

If it were easier to make a package index out of a working set, it
would be trivial for people to set up their own package indexes. Tres
did some work on this in "compoze":
http://svn.repoze.org/compoze/trunk/README.txt

It would be great to have tools that could work on *any* environment
to do installation and deinstallation. Lots of people have already
written egg tools that don't work well under a buildout-derived
system.

> 2) Testing Balkanization
>
> For Zope 2/Plone at least, this issue doesn't have anything to do with
> buildout - it has to do with the way that ZopeTestCase works and
> Zope's product machinery. No-one particularly likes running through
> zopectl to run tests. I don't know enough about the machinery to
> support "python setup.py test" to comment on how/whether that's a
> problem.

For Zope 3 packages, it indeed does have to do with buildout, as
people *depend* on buildout to give them "bin/test" (and never try to
invoke the tests in any other way; setup.py test just returns nothing,
and you don't get bin/test when you install via easy_install). You
can cause "setup.py test" to invoke any test runner you like (see the
'test_suite=' line in
http://svn.zope.org/ZODB/trunk/setup.py?rev=84132&view=markup ) and
setup.py's "tests_require" stuff will go get the packages you need to
run the tests without installing them first.
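A minimal sketch of that wiring, assuming a setuptools version that provides the test command: the package exposes a callable returning a unittest suite, and setup.py points test_suite at it by dotted name. The package name, dotted path and tests_require entry are hypothetical, and the test module and the setup.py metadata are shown in one block only for brevity.

```python
import unittest


# --- would live in example/package/tests.py (hypothetical path) ---
class ExampleTests(unittest.TestCase):
    def test_addition(self):
        self.assertEqual(1 + 1, 2)


def test_suite():
    """Return the suite that "setup.py test" should run."""
    return unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTests)


# --- would live in setup.py ---
SETUP_KWARGS = dict(
    name="example.package",
    version="0.1",
    # Dotted name of the callable above; any test runner's suite works.
    test_suite="example.package.tests.test_suite",
    # Fetched on demand for the test run, not installed permanently.
    tests_require=["zope.testing"],
)

# A real setup.py would finish with:
#     from setuptools import setup
#     setup(**SETUP_KWARGS)
```

Nothing here depends on buildout, so the same tests can still be run from bin/test in a buildout and from a plain checkout.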

It's also definitely possible to write Zope 2 products which can be
tested via "python setup.py test". But I don't use ZopeTestCase, so I
don't know what the issues are there. If ZopeTestCase doesn't run
tests in any other way than to be run under "zopectl test", it's
arguably broken. (aside: IMO, it's pretty broken anyway. I don't use
it because I don't need to initialize products to write unit tests. I
suspect the tests that people write with ZopeTestCase are not unit
tests, they are integration tests. IMO, integration tests shouldn't
be run when you run a set of unit tests, and should be shunted off to
another test invocation. IOW, I believe unit tests should test the
functionality of the package you're working on, not the interaction of
the package's components with components from other packages.) In any
case, however, the whole Plone unit testing situation seems totally
dependent on ZopeTestCase from the bottom-up currently, so if that
doesn't work under python setup.py test, we should probably fix that.

> 3) Script Generation
>
> I was puzzled at this too. I think buildout should generate scripts by
> default, or at least have the option to behave that way. I can see
> cases where this isn't necessary, of course.

Yup.

> Having buildout use virtualenv as an implementation detail may be
> an interesting choice. Why not bring it up on distutils-sig and see
> what Jim thinks?

Many of the core assumptions made by buildout aren't actually "core",
we can change them by changing the recipes, it's just a lot of work.
I talked with Jim about it at the PSPS and he's largely neutral on it.

Thanks for the thoughts!

Dependencies

I don't think dependency information can go in setup.py files. Dependency information grows over time, and can be complex; e.g., version 1.0 of X works, but for some feature you need version 1.1. What do you depend on then? You can fiddle around with extras, but it's all very awkward, and easy_install doesn't handle it very elegantly anyway. And you just can't know what the dependencies really are; there will be new versions of packages and you can't know whether those new versions will or won't be compatible. Pre-declaring them incompatible isn't good, but if you don't keep your versions pinned down you can't be sure that you won't break something. And editing setup.py every time you learn something new is not feasible either. Do you re-release each time? Blech.

This is why I think we need a formal notion of a set of packages that work together, separate from any single package. I think .NET has some notion of external dependency information that you can add to a package, and maybe that offers something interesting to consider (or maybe not). I've been pretty happy with the requirement file, a simple list of packages. It's also easy to maintain, copy, and version. A package index seems way too heavy in comparison. I suppose you could run an index off such a file, but why bother? We also have a script to take a working set of packages and create a requirement file that specifies exact versions of every package, complete with svn version pins when there isn't a proper tag; the known-good issue. I think buildout config files basically have the same effect.
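For illustration only, here is a rough modern equivalent of such a script; it is not the script mentioned above. It uses the standard library's importlib.metadata to list every installed distribution and writes one exact "name==version" pin per line. The function names are invented for this sketch, and it makes no attempt at the svn revision pins described for untagged packages.

```python
from importlib import metadata


def format_requirements(pins):
    """Turn (name, version) pairs into sorted 'name==version' lines."""
    lines = {"%s==%s" % (name, version) for name, version in pins}
    return sorted(lines, key=str.lower)


def working_set_pins():
    """Yield (name, version) for each distribution visible on sys.path."""
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip entries with unreadable metadata
            yield name, dist.version
```

Writing "\n".join(format_requirements(working_set_pins())) to a file gives a requirement list that is as easy to copy and keep under version control as the one described above.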