Setuptools dependency_links Considered Harmful

dependency_links foils repeatability

What's wrong with this bit of setuptools-enabled setup.py?

  setup(name='repoze.zope2',
        version=__version__,
        dependency_links=['http://dist.repoze.org'],
        description='Zope2 via WSGI and Paste',
        long_description=README + '\n\nCHANGES\n\n' + CHANGES,
        classifiers=[
          "Development Status :: 4 - Beta",
          "Intended Audience :: Developers",
          "Programming Language :: Python",
          "Topic :: Internet :: WWW/HTTP",
          "Topic :: Internet :: WWW/HTTP :: Dynamic Content",
          "Topic :: Internet :: WWW/HTTP :: WSGI",
          "Topic :: Internet :: WWW/HTTP :: WSGI :: Application",
          ],
        keywords='web application server wsgi zope',
        author="Agendaless Consulting",
        author_email="repoze-dev@lists.repoze.org",
        url="http://www.repoze.org",
        license="BSD-derived (http://www.repoze.org/LICENSE.txt)",
        packages=find_packages(),
        include_package_data=True,
        namespace_packages=['repoze'],
        zip_safe=False,
        .
        .
        .
        )

This is what's wrong with it:

        dependency_links=['http://dist.repoze.org'],

I used dependency_links heavily a few months ago in the setup scripts of various packages I maintain, because it "just made things work". If a package wasn't up on PyPI, I just threw it in the dist.repoze.org directory, added the dependency_links line to setup.py, and Bob became avuncular.

However, Bob was not avuncular for very long. After a while I noticed that I couldn't repeat a particular build to my satisfaction, even though I had created my own package index containing the package I wanted to install and all of its dependencies. easy_install kept visiting http://dist.repoze.org to look for packages that were already in my index! And it did find them... the wrong ones! I was stumped.

It turned out that this was the fault of dependency_links in my setup.py. Once I removed it, easy_install preferred the packages in my index and didn't try to look anywhere else for them.

As it turns out, adding dependency_links to any of your eggs basically tells easy_install that it's OK to check there for any dependency encountered anywhere else during the run of the installer. You might have put it into an egg meaning only for it to find a couple of immediate dependencies in a big directory full of eggs somewhere. But if someone else depends upon your dependency_links-wielding package, easy_install adds that big directory full of eggs to his search path during the installation process, and if his package shares other dependencies during that install run, he'll be finding things in that directory that neither he nor you really meant him to. Worse, since it's promiscuous, if you change that big directory full of eggs, the next time he installs he'll find different stuff than he did the first time. This is independent of whether he's maintaining his own index or installing from PyPI or whatever. It just happens.
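
To make that leak concrete, here is a toy model of it. It is not setuptools code and makes no claim about how easy_install is actually implemented; the package names, the URLs and the locations_searched function are all made up for illustration. The only thing it models is a single search path shared by the whole install run, which any package's dependency_links can extend.

```python
# Toy model only: this is NOT setuptools code, just an illustration of the
# behaviour described above. All package names and URLs are hypothetical.

# Each package declares its requirements and, optionally, dependency_links.
PACKAGES = {
    "your.app": {"requires": ["their.lib", "shared.dep"], "dependency_links": []},
    "their.lib": {
        "requires": ["shared.dep"],
        "dependency_links": ["http://example.org/eggs"],
    },
    "shared.dep": {"requires": [], "dependency_links": []},
}


def locations_searched(root, index_url):
    """Return, for each package, the locations consulted when it was resolved."""
    search_path = [index_url]  # one list shared by the whole install run
    searched = {}
    pending = [root]
    while pending:
        name = pending.pop(0)
        if name in searched:
            continue
        searched[name] = list(search_path)  # snapshot at resolution time
        meta = PACKAGES[name]
        # Links declared by this package are added to the shared list, so
        # they apply to every package resolved later in the same run.
        for link in meta["dependency_links"]:
            if link not in search_path:
                search_path.append(link)
        pending.extend(meta["requires"])
    return searched


result = locations_searched("your.app", "http://example.com/dist/index")
for name, places in result.items():
    print(name, "->", places)
```

In this model your.app never mentions http://example.org/eggs, yet its own dependency shared.dep ends up being looked for there, simply because their.lib was resolved first.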

For these reasons, I now consider a dependency_links entry in an egg's setup.py that points to a directory to be a packaging bug. It should not exist. Downstream consumers who care about installation repeatability almost always need to reroll such eggs.
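
If you want to know which eggs you'd have to reroll, something like the following rough sketch may help. It assumes that setuptools records the setting in a dependency_links.txt file inside each distribution's metadata directory, and it only sees unzipped eggs and egg-info directories, not zipped eggs. The function name is made up.

```python
import os


def find_dependency_links(root):
    """Map each dependency_links.txt found under root to its non-blank lines."""
    found = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        if "dependency_links.txt" in filenames:
            path = os.path.join(dirpath, "dependency_links.txt")
            with open(path) as handle:
                links = [line.strip() for line in handle if line.strip()]
            if links:  # files with no links in them are not reported
                found[path] = links
    return found
```

Point it at wherever your eggs were installed, for example find_dependency_links("lib/python/site-packages") (a hypothetical path), and every file it reports belongs to a distribution that can widen the search path.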

The main reason people use dependency_links is that they don't know how easy it is to make a package index instead of pointing dependency_links at a directory full of files. To make a package index, download this script and invoke it in a directory full of distutils/setuptools distribution files like so:

  python makeindex.py *.{gz,tgz,zip,egg} 

A directory named index will be created. This is a package index. If the directory containing index is served at http://example.com/dist, then anyone who wants to install stuff from your index (including you) runs easy_install like so:

  easy_install -i http://example.com/dist/index my.package
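
In case it isn't obvious what a package index built this way amounts to, here is a minimal sketch of one possible builder. It is not the makeindex.py script mentioned above, and it only guesses at the general layout: one subdirectory per project, each with a page of links to that project's files. The function names and the filename-parsing rule are assumptions made for illustration.

```python
# A guess at the general shape of an index builder, NOT the author's
# makeindex.py. The filename parsing below is a simplifying assumption.
import html
import os
import re
import shutil

# Assumes names like "my.package-1.0.tar.gz": a project name, a hyphen, then
# a version that starts with a digit.
FILENAME_RE = re.compile(r"^(?P<name>.+?)-\d")


def project_name(filename):
    """Guess the project name from a distribution filename, or return None."""
    match = FILENAME_RE.match(os.path.basename(filename))
    return match.group("name") if match else None


def make_index(dist_files, index_dir="index"):
    """Copy each file into index_dir/<project>/ and write a page of links."""
    projects = {}
    for path in dist_files:
        name = project_name(path)
        if name is None:
            continue  # skip files that do not look like distributions
        projects.setdefault(name, []).append(path)

    for name, paths in projects.items():
        project_dir = os.path.join(index_dir, name)
        os.makedirs(project_dir, exist_ok=True)
        links = []
        for path in sorted(paths):
            base = os.path.basename(path)
            shutil.copy(path, os.path.join(project_dir, base))
            escaped = html.escape(base)
            links.append('<a href="%s">%s</a><br/>' % (escaped, escaped))
        with open(os.path.join(project_dir, "index.html"), "w") as page:
            page.write("<html><body>\n%s\n</body></html>\n" % "\n".join(links))
    return sorted(projects)
```

Calling make_index(["my.package-1.0.tar.gz"]) in a directory containing that file would create index/my.package/index.html next to a copy of the tarball. The point is that the index only ever contains what you put into it.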

But you say "whoa, hold on there! I want to use python setup.py {install|test|develop} in a checkout!" No fear. Instead of using dependency_links in setup.py, put this in your distribution's setup.cfg:

  [easy_install]
  index_url = http://example.com/dist/index

Bob becomes avuncular, and you stop hosing people who want to use your software but want a completely repeatable build.

Created by chrism
Last modified 2008-06-30 07:42 PM

information

The problem is your index system is based on ignorance -- if you withhold almost all information from easy_install, you can coerce it into doing what you want. But if information leaks in, it breaks, as in the case of dependency_links.

I think you'll have quite an uphill battle trying to get people to remove dependency_links from their projects. And, in fact, I find them very useful -- poach-eggs reads that information to figure out how to check out code when freezing a set of requirements. And of course all the other traditional uses of dependency_links.

no.

With respect, that's not the problem. The problem is that there's no way to prevent easy_install from *using* dependency_links. I don't care that dependency_links *exists*, I'd just like a way to tell easy_install to ignore that information for a particular run. Without having that capability, it's impossible to know you're reproducing the same environment via any two successive invocations of easy_install, and that's a bug as far as I'm concerned. The setuptools code is not terribly easy to hack on either, or I would have just fixed it.