Department of Physics and Astronomy

The Forbes Group

Distributing Python Packages

I have reached a point at which I find myself duplicating certain bits of code in various projects, and I now want to refactor this code into a common package. This post documents the process and contains some links to useful information about how to cleanly distribute a Python package. This should culminate in the following clean packages for use in many of my subsequent projects:

Directory Structure

There are several possible directory structures. Presently I use the following structure or skeleton:

├─ packagename
│  ├─ __init__.py
│  └─ tests
│     └─ ...
├─ docs
│  └─ notebooks
│     ├─ README.ipynb
│     └─ ...
├─ README.rst        # Usually generated from README.ipynb
├─ .hg               # Mercurial repo
├─ .hgignore         # Ignore .pyc files etc.
├─ setup.py      
├─ setup.cfg         # Test configuration parameters
└─ requirements.txt
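
A skeleton like this is easy to create with a few lines of Python. The following is only a sketch: the helper name make_skeleton is hypothetical, it creates empty placeholder files, and it deliberately leaves out .hg (created by hg init) and the notebooks.

```python
import tempfile
from pathlib import Path

# Directories and empty files taken from the skeleton above.
# "{name}" is replaced by the package name.
DIRECTORIES = ["{name}/tests", "docs/notebooks"]
FILES = [
    "{name}/__init__.py",
    "README.rst",
    ".hgignore",
    "setup.py",
    "setup.cfg",
    "requirements.txt",
]


def make_skeleton(root, name):
    """Create the empty project skeleton under `root`; return the files made."""
    root = Path(root)
    for directory in DIRECTORIES:
        (root / directory.format(name=name)).mkdir(parents=True, exist_ok=True)
    created = []
    for filename in FILES:
        path = root / filename.format(name=name)
        path.touch(exist_ok=True)  # Never overwrites existing content.
        created.append(path)
    return created


# Try it out in a throwaway directory.
with tempfile.TemporaryDirectory() as scratch:
    created = make_skeleton(scratch, "packagename")
    names = sorted(p.relative_to(scratch).as_posix() for p in created)
    has_tests_dir = (Path(scratch) / "packagename" / "tests").is_dir()
```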

Ionel Cristian Mărieș gives a lucid argument that one should use a src directory:

├─ src
│  └─ packagename
│     ├─ __init__.py
│     └─ ...
├─ tests
│  └─ ...
└─ setup.py

but this seems better suited to released packages than to code under active development. The arguments are sound, but not completely relevant for a working code repository.

doc vs. docs

Here is a random collection of choices from software that I have installed from source. I guess that doc wins.

doc: matplotlib, numpy, scipy, pycuda, sympy, theano, sphinx.

Doc: python.

docs: flake8, pep8, ipython, nikola.

Version Numbers

A useful tool is bumpversion which can be used to update the version number in several places:

bumpversion -n --new-version 0.3 minor
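
In the command above, -n requests a dry run, so nothing is written. For reference, here is roughly what bumping one part of a major.minor.patch version means. This is an illustrative sketch, not bumpversion's actual implementation; the function name bump and the three-part scheme are assumptions.

```python
PARTS = ("major", "minor", "patch")


def bump(version, part):
    """Increment `part` of a dotted version and reset the lower parts to 0."""
    numbers = [int(n) for n in version.split(".")]
    # Pad short versions such as "0.2" to three components.
    numbers += [0] * (len(PARTS) - len(numbers))
    index = PARTS.index(part)
    numbers[index] += 1
    for lower in range(index + 1, len(PARTS)):
        numbers[lower] = 0
    return ".".join(str(n) for n in numbers)


print(bump("0.2.1", "minor"))  # prints 0.3.0
```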

Testing

Several tools help greatly with testing:

  • py.test: Provides a very convenient way of gathering anything that looks like a test and running it. Has many useful plugins. I use: pytest-cov, pytest-flake8, and pytest-runner.
  • tox: Automation for running your tests in sandboxed environments. Can test against different versions of Python, and ensures that your dependencies are completely specified.
  • flake8: Syntax checking against PEP8 (using pep8) and with PyFlakes. Optionally provides complexity testing with McCabe.

  • nose: I now use py.test as nose is no longer maintained.

To run these tools together, I provide a custom class in setup.py that overrides the standard test command so that everything I want is run with python setup.py test. Configuration options, such as which warnings and errors to ignore, are specified in setup.cfg:

[flake8]
# E221 Spaces before operators (x = 4  + 4)
# E225 Requires space around *all* operators (I like compact x**2 for example.)
# E226 Requires whitespace around arithmetic operators
# E241 Extra spaces after commas a = (1,  2) is not allowed
# W293 Blank line contains whitespace
# W503 line break before binary operator
ignore = E225,E226,W293,W503
#jobs=1
max-complexity = 12

# This is the limit at which I can get 3 full emacs windows open.
max-line-length = 85

[aliases]
test=pytest
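
The custom class itself is not shown here. As a rough sketch of the kind of logic the run method of such a command might delegate to, the following runs several tools in order and stops at the first failure. The helper name run_checks and the CHECKS list are hypothetical, and the list assumes that flake8 and pytest are installed in the current environment.

```python
import subprocess
import sys


def run_checks(commands):
    """Run each command in order; return the first non-zero exit code, else 0."""
    for command in commands:
        code = subprocess.call(command)
        if code != 0:
            return code  # Stop at the first failing tool.
    return 0


# Hypothetical tool list.  Using "python -m ..." ensures that the tools run
# under the same interpreter as setup.py.
CHECKS = [
    [sys.executable, "-m", "flake8"],
    [sys.executable, "-m", "pytest"],
]
```

Calling run_checks(CHECKS) returns 0 only if every tool succeeds, so the result can be passed directly to sys.exit.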

I also include the following pytest.ini file (note: you must insert your project name in two places):

[pytest]
testpaths =
    <PROJECT_NAME>
markers =
    bench: mark test as a benchmark.  (Might be slow, or platform dependent)
addopts =
    -m 'not bench'
    --doctest-modules
    --cov=<PROJECT_NAME>
    --cov-report=html
    --cov-fail-under=95
    --no-cov-on-fail
    -x
    #--pdb
    #--flake8

doctest_optionflags =
    ELLIPSIS
    NORMALIZE_WHITESPACE
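
To see what the two doctest_optionflags do, here is a small self-contained sketch using only the standard doctest module. The function squares and the helper run_doctests are made up for illustration; with --doctest-modules, pytest would collect a docstring like this one directly.

```python
import doctest

FLAGS = doctest.ELLIPSIS | doctest.NORMALIZE_WHITESPACE


def squares(n):
    """Return the squares of 0, 1, 2, and so on, up to n - 1.

    With ELLIPSIS, three dots in the expected output match anything:

    >>> squares(20)
    [0, 1, 4, ..., 361]

    With NORMALIZE_WHITESPACE, differences in spacing are ignored:

    >>> squares(3)
    [0,   1,   4]
    """
    return [i * i for i in range(n)]


def run_doctests(func, flags=FLAGS):
    """Run the doctests in the docstring of `func` with the given flags."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(optionflags=flags, verbose=False)
    for test in finder.find(func, func.__name__, globs={func.__name__: func}):
        runner.run(test)
    return runner.summarize(verbose=False)


results = run_doctests(squares)
```

Without these flags, both examples in the docstring would fail, since neither expected output matches the actual output character for character.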

Tox

Here is my tox configuration file tox.ini. Note: flake8 configuration can also be included here, but it seems that nosetests configurations need to be in setup.cfg so we need both files:

[tox]
envlist = py27

[testenv]
commands=
     python setup.py test

install_command = pip install --process-dependency-links {opts} {packages}

Note: Some of my dependencies are not released on PyPI so I need to specify the repository in my dependency_links section of setup.py. This was deemed a security risk, so pip does not honor this by default and the --process-dependency-links option needs to be manually specified.

Another issue is that tox does not support conda.

Requirements and Dependencies

The following articles helped clarify some of the issues with specifying dependencies:

  • setup.py vs requirements.txt: clarifies the difference between abstract dependencies (such as persist) and concrete dependencies (such as https://bitbucket.org/mforbes/mmfutils). The former should go in setup.py while the latter can be specified in requirements.txt.
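
As a sketch of this split, the fragment below shows the two kinds of dependency side by side. The project name myproject and its version are hypothetical, and the assumption that it depends on persist is made only for illustration. In a real setup.py the dictionary would be passed on as setuptools.setup(**SETUP_KWARGS).

```python
# Abstract dependency: setup.py names *what* is needed, not where it comes from.
SETUP_KWARGS = dict(
    name="myproject",              # Hypothetical project name.
    version="0.1",                 # Hypothetical version.
    install_requires=["persist"],  # Just the name of the requirement.
)

# Concrete dependency: requirements.txt says exactly where to get it.
# Each line has the same form as an argument to "pip install".
REQUIREMENTS_TXT = "hg+https://bitbucket.org/mforbes/persist\n"

concrete_lines = REQUIREMENTS_TXT.split()
```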

To test this, it is useful to have a clean environment. Here is how to create one with conda:

conda create -n py2.6 python=2.6 --no-default-packages
conda create -n py2.7 python=2.7 --no-default-packages

Once this is created, you can make a throwaway copy for testing with:

conda create -n test --clone py2.6

then activate it with

. activate test

When done, deactivate and remove it:

. deactivate
conda remove -n test --all

I originally intended to release my projects on PyPI so that others could install them with, for example, pip install mmfutils. In the end I opted to reserve this for packages that I felt really contributed something substantial to the community. Instead, for my uses, installation from http://bitbucket.org is sufficient:

pip install hg+https://bitbucket.org/mforbes/mmfutils
pip install hg+https://bitbucket.org/mforbes/persist

After some thought, and a comment from opensourcehacker.com, I decided that I will probably upload only one of these, the persist project (once it is stable), along with my mmf-setup package. The latter is my only "personal" package: I use it so I can bootstrap installations from anywhere with pip install mmf-setup.

Update (Nov 2017): You must now use updated tools for deploying. I am using twine, which is the recommended tool. Make sure you are using an up-to-date version of twine.

Note: uploading packages is a bit tricky. Here are some brief suggestions, but I recommend reading a more comprehensive description as outlined in the References.

  • Make sure you bump the version for every upload. Use a micro-version number for this if needed.
  • Consider using twine. (Make sure that your setuptools is not version 0.6 as this has issues. You can check with twine --version.)
  • Check the metadata against the list of classifiers. They must be exact matches.
  • Populate your ~/.pypirc file with something like this:

    # ~/.pypirc
    [distutils]
    index-servers =
        pypi
        pypitest

    [pypitest]
    repository = https://testpypi.python.org/legacy
    username:mforbes
    password:*********

    [pypi]
    username:mforbes
    password:*********
    
  • Check your distribution with:

    python setup.py check --restructuredtext --strict

  • Upload with:

    python setup.py sdist
    twine upload -r pypitest dist/*    # For testing
    twine upload -r pypi dist/*        # When you are ready
    
  • Install with:

    pip install --user -i https://testpypi.python.org/pypi mmf_setup   # From test server
    pip install --user mmf_setup                                       # From PyPI
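
Since a mistake in ~/.pypirc only shows up at upload time, it can help to check the file first. The following sketch uses the standard configparser module to confirm that every server listed under index-servers has its own section. The helper name missing_sections is hypothetical, and SAMPLE simply repeats the example file from above.

```python
import configparser

SAMPLE = """\
[distutils]
index-servers =
    pypi
    pypitest

[pypitest]
repository = https://testpypi.python.org/legacy
username:mforbes
password:*********

[pypi]
username:mforbes
password:*********
"""


def missing_sections(text):
    """Return the index servers that are listed but have no section."""
    # RawConfigParser avoids interpolation, so a "%" in a password is harmless.
    parser = configparser.RawConfigParser()
    parser.read_string(text)
    servers = parser.get("distutils", "index-servers").split()
    return [name for name in servers if not parser.has_section(name)]


print(missing_sections(SAMPLE))  # prints []
```

To check a real file, pass in its contents instead of SAMPLE.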
    

References

This really hits home, and was part of my original frustration with trying to upload my project to PyPI:

I have not really used the following much yet:
