Chapter 6. Shipping Great Code

This chapter focuses on best practices for packaging and distributing Python code. You’ll either want to create a Python library to be imported and used by other developers, or create a standalone application for others to use, like pytest.

The ecosystem around Python packaging has become a lot more straightforward in the past few years, thanks to the work of the Python Packaging Authority (PyPA)1—the people who maintain pip, the Python Package Index (PyPI), and much of the infrastructure relevant to Python packaging. Their packaging documentation is stellar, so we won’t reinvent the wheel in “Packaging Your Code”, but we will briefly show two ways to host packages from a private site, and talk about how to upload code to Anaconda.org, the commercial analogue to PyPI run by Continuum Analytics.

The downside of distributing code through PyPI or other package repositories is that the recipient must understand how to install the required version of Python and be able and willing to use tools such as pip to install your code’s other dependencies. This is fine when distributing to other developers but makes the method unsuitable for distributing applications to end users who aren’t coders. For that, use one of the tools in “Freezing Your Code”.

Those making Python packages for Linux may also consider a Linux distro package (e.g., a .deb file on Debian/Ubuntu; called “built distributions” in Python documentation). This is like freezing, but with the Python interpreter removed from the bundle. That route is a lot of work to maintain, but we give you some options in “Packaging for Linux-Built Distributions”.

Finally, we’ll share a pro tip in “Executable ZIP Files”: if your code is in a ZIP archive (.zip) with a specific header, you can just execute the ZIP file. When you know your target audience has Python installed already, and your project is purely Python code, this is a fine option.

Useful Vocabulary and Concepts

Until the PyPA’s formation, there wasn’t actually a single, obvious way to do packaging (as can be seen from this historical discussion on Stack Overflow). Here are the most important vocabulary words discussed in this chapter (there are more definitions in the PyPA glossary):

Dependencies

Python packages list their Python library dependencies either in a requirements.txt file (for testing or application deployment), or in the install_requires argument to setuptools.setup() when it is called in a setup.py file.
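As a rough illustration of the install_requires route, the sketch below collects the arguments a setup.py might pass to setuptools.setup(). The project name, version, and dependency pins are all hypothetical, chosen only for the example.

```python
# Hypothetical project metadata; every name and version below is illustrative.
SETUP_KWARGS = {
    "name": "mypackage",
    "version": "0.1.0",
    "packages": ["mypackage"],
    # Loosely pinned library dependencies that pip resolves at install time.
    "install_requires": [
        "requests>=2.0",
        "six",
    ],
}


def main():
    # Imported here so the metadata above can be inspected without setuptools.
    from setuptools import setup

    setup(**SETUP_KWARGS)
```

In a real setup.py, main() would be called at the bottom of the file. A requirements.txt for the same project would list similar names, one per line, usually with exact pins such as requests==2.9.1.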

In some projects there can be other dependencies, such as a Postgres database, a C compiler, or a C library shared object. These may not be explicitly stated, but if absent will break the build. If you build libraries like these, Paul Kehrer’s seminar on distributing compiled modules may help.

Built distribution

A format of distribution for a Python package (and optionally other resources and metadata) that is in a form that can be installed and then run without further compilation.

Egg

Eggs are a built distribution format—basically, they’re ZIP files with a specific structure, containing metadata for installation. They were introduced by the Setuptools library, and were the de facto standard for years, but were never an official Python packaging format. They have been replaced by wheels as of PEP 427. You can read all about the differences between the formats in “Wheel vs Egg” in the Python Packaging User Guide.

Wheel

Wheels are a built distribution format that is now the standard for distribution of built Python libraries. They are packaged as ZIP files with metadata that pip will use to install and uninstall the package. The file has a .whl extension, by convention, and follows a specific naming convention that communicates what platform, build, and interpreter it is for.
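To make the naming convention concrete, here is a minimal sketch that splits a wheel filename into the fields defined by PEP 427 (distribution, version, optional build tag, Python tag, ABI tag, platform tag). The helper name parse_wheel_filename is ours and is not part of pip or any packaging library.

```python
from collections import namedtuple

WheelName = namedtuple(
    "WheelName", "distribution version build python_tag abi_tag platform_tag"
)


def parse_wheel_filename(filename):
    """Split a wheel filename into its PEP 427 components."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: {!r}".format(filename))
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        # No build tag present.
        distribution, version, python_tag, abi_tag, platform_tag = parts
        build = None
    elif len(parts) == 6:
        distribution, version, build, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError("unexpected wheel filename: {!r}".format(filename))
    return WheelName(distribution, version, build, python_tag, abi_tag, platform_tag)
```

For example, parse_wheel_filename("requests-2.9.1-py2.py3-none-any.whl") reports a pure-Python wheel: it works on Python 2 and 3 (py2.py3), needs no particular ABI (none), and runs on any platform (any).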

Aside from having Python installed, regular Python packages, written only in Python, don’t need anything but other Python libraries that can be downloaded from PyPI (or eventually Warehouse—the upcoming newer location for PyPI) to run. The difficulty (which we tried to get ahead of with the extra installation steps in Chapter 2) comes when the Python library has dependencies outside of Python—on C libraries or system executables, for example. Tools like Buildout and Conda are meant to help, when the distribution gets more complicated than even the Wheel format can handle.

Packaging Your Code

To package code for distribution means to create the necessary file structure, add the required files, and define the appropriate variables to conform to relevant PEPs and the current best practice described in “Packaging and Distributing Projects” in the Python Packaging Guide,2 or the packaging requirements of other repositories, like http://anaconda.org/.

Conda

If you have Anaconda’s redistribution of Python installed, you can still use pip and PyPI, but your default package manager is conda, and your default package repository is http://anaconda.org/. We recommend following this tutorial for building packages, which ends with instructions on uploading to Anaconda.org.

If you are making a library for scientific or statistical applications—even if you don’t use Anaconda yourself—you will want to make an Anaconda distribution in order to easily reach the wide academic, commercial, and Windows-using audiences that choose Anaconda to get binaries that work without effort.

PyPI

The well-established ecosystem of tools such as PyPI and pip makes it easy for other developers to download and install your package, either for casual experiments or as part of large professional systems.

If you’re writing an open source Python module, PyPI, more properly known as The Cheeseshop, is the place to host it.3 If your code isn’t packaged on PyPI, it will be harder for other developers to find it and to use it as part of their existing process. They will regard such projects with substantial suspicion of being either badly managed, not yet ready for release, or abandoned.

The definitive source for correct, up-to-date information about Python packaging is the PyPA-maintained Python Packaging Guide.

Use testPyPI for Testing and PyPI for Real

If you are just testing your packaging settings, or teaching someone how to use PyPI, you can use testPyPI and run your unit tests before pushing a real version to PyPI. Like with PyPI, you must change the version number every time you push a new file.

Sample project

The PyPA’s sample project demonstrates the current best practice for packaging a Python project. Comments in the setup.py module give advice, and identify relevant PEPs governing options. The overall file structure is organized as required, with helpful comments about each file’s purpose and what it should contain.

The project’s README file links back to the packaging guide and to a tutorial about packaging and distribution.

Use pip, not easy_install

Since 2011, the PyPA has worked to clear up considerable confusion and considerable discussion about the canonical way to distribute, package, and install Python libraries. pip was chosen as Python’s default package installer in PEP 453, and it is installed by default with Python 3.4 (first released in 2014) and later releases.4

The tools have a number of nonoverlapping uses, and older systems may still need easy_install. This chart from the PyPA compares pip and easy_install, identifying what each tool does and does not offer.

When developing your own code, you’ll want to install using pip install --editable . so that you can continue to edit the code without reinstalling.

Personal PyPI

If you want to install packages from a source other than PyPI (e.g., an internal work server for proprietary company packages, or packages checked and blessed by your security and legal teams), you can do it by hosting a simple HTTP server, running from the directory containing the packages to be installed.

For example, say you want to install a package called MyPackage.tar.gz, with the following directory structure:

.
|--- archive/
     |--- MyPackage/
          |--- MyPackage.tar.gz

You can run an HTTP server from the archive directory by typing the following in a shell:

$ cd archive
$ python3 -m http.server 9000

This starts a simple HTTP server on port 9000 that lists all packages (in this case, MyPackage). Now you can install MyPackage using any Python package installer. Using pip on the command line, you would do it like this:

$ pip install --extra-index-url=http://127.0.0.1:9000/ MyPackage
Warning

Having a folder with the same name as the package name is crucial here. But if you feel that the structure MyPackage/MyPackage.tar.gz is redundant, you can always pull the package out of the directory and install with a direct path:

$ pip install http://127.0.0.1:9000/MyPackage.tar.gz
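The same kind of server can also be started from Python code with the Standard Library's http.server module. This is only a sketch: the function name make_package_server is ours, and the directory argument it relies on requires Python 3.7 or later.

```python
import functools
import http.server


def make_package_server(directory, port=9000):
    """Return an HTTP server that lists and serves the files under `directory`."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    # Bind to localhost only; passing port=0 asks the OS for any free port.
    return http.server.HTTPServer(("127.0.0.1", port), handler)
```

Calling make_package_server("archive").serve_forever() would then serve the archive directory until interrupted, after which pip can be pointed at http://127.0.0.1:9000/ as shown above.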

Pypiserver

Pypiserver is a minimal PyPI-compatible server. It can be used to serve a set of packages to easy_install or pip. It includes helpful features like an administrative command (-U) which will update all its packages to their latest versions found on PyPI.

S3-hosted PyPI

Another option for a personal PyPI server is to host on Amazon’s Simple Storage Service, Amazon S3. You must first have an Amazon Web Service (AWS) account with an S3 bucket. Be sure to follow the bucket naming rules—you’ll be allowed to create a bucket that breaks the naming rules, but you won’t be able to access it. To use your bucket, first create a virtual environment on your own machine and install all of your requirements from PyPI or another source. Then install pip2pi:

$ pip install git+https://github.com/wolever/pip2pi.git

And follow the pip2pi README file for the pip2tgz and dir2pi commands. Either you’ll do:

$ pip2tgz packages/ YourPackage+

or these two commands:

$ pip2tgz packages/ -r requirements.txt
$ dir2pi packages/

Now, upload your files. Use a client like Cyberduck to sync the entire packages folder to your S3 bucket. Make sure you upload packages/simple/index.html as well as all new files and directories.

By default, when you upload new files to the S3 bucket, they will have user-only permissions. If you get HTTP 403 when trying to install a package, make sure you’ve set the permissions correctly: use the Amazon web console to set the READ permission of the files to EVERYONE. Your team will now be able to install your package with:

$ pip install \
  --index-url=http://your-s3-bucket/packages/simple/ \
  YourPackage+

VCS support for pip

It is possible to pull code directly from a version control system using pip; to do so, follow these instructions. This is another alternative to hosting a personal PyPI. An example command using pip with a Git repository is:

$ pip install git+git://git.myproject.org/MyProject#egg=MyProject

Here, the egg does not have to be an egg; it is the name of the directory in your project that you want to install.
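To see how such a URL breaks down, the sketch below pulls it apart with the Standard Library's urllib.parse. It is an illustration of the URL's structure only, not pip's own parsing code, and the function name describe_vcs_url is ours.

```python
from urllib.parse import parse_qs, urlsplit


def describe_vcs_url(url):
    """Break a pip-style VCS URL into its visible pieces."""
    parts = urlsplit(url)
    # The scheme has the form "<vcs>+<transport>", e.g. "git+git" or "git+https".
    vcs, _, transport = parts.scheme.partition("+")
    # The fragment after "#" holds key=value pairs such as egg=MyProject.
    fragment = parse_qs(parts.fragment)
    return {
        "vcs": vcs,
        "transport": transport,
        "repository": "{}://{}{}".format(transport, parts.netloc, parts.path),
        "egg": fragment.get("egg", [None])[0],
    }
```

Applied to the command above, this reports git as the version control system, git://git.myproject.org/MyProject as the repository, and MyProject as the egg name.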

Freezing Your Code

To freeze your code means to create a standalone executable bundle you can distribute to end users who do not have Python installed on their computer—the distributed file or bundle contains both the application code and the Python interpreter.

Applications such as Dropbox, Eve Online, Civilization IV, and the BitTorrent client (all primarily written in Python) do this.

The advantage of distributing this way is that your application will just work, even if the user doesn’t already have the required (or any) version of Python installed. On Windows, and even on many Linux distributions and OS X, the right version of Python will not already be installed. Besides, end user software should always be in an executable format. Files ending in .py are for software engineers and system administrators.

A disadvantage of freezing is that it increases the size of your distribution by about 2–12 MB. Also, you will be responsible for shipping updated versions of your application when security vulnerabilities to Python are patched.

We compare the popular freezing tools in Table 6-1. They all interface with distutils in Python’s Standard Library. They cannot do cross-platform freezes,5 so you must perform each build on the target platform.

The tools are listed in the order they will appear in this section. Both PyInstaller and cx_Freeze can be used on all platforms, py2app only works on OS X, py2exe only works on Windows, and bbFreeze can work on both UNIX-like and Windows systems, but not OS X, and it has not yet been ported to Python 3. It can generate eggs, though, in case you need this ability for your legacy system.

Table 6-1. Freezing tools

                               PyInstaller   cx_Freeze     py2app  py2exe  bbFreeze
Python 3                       Yes           Yes           Yes     Yes     -
License                        Modified GPL  Modified PSF  MIT     MIT     Zlib
Windows                        Yes           Yes           -       Yes     Yes
Linux                          Yes           Yes           -       -       Yes
OS X                           Yes           Yes           Yes     -       -
Eggs                           Yes           Yes           Yes     -       Yes
Support for pkg_resources (a)  -             -             Yes     -       Yes
One-file mode (b)              Yes           -             -       Yes     -

a pkg_resources is a separate module bundled with Setuptools that can be used to dynamically find dependencies. This is a challenge when freezing code because it’s hard to discover dynamically loaded dependencies from the static code. PyInstaller, for example, only says they will get it right when the introspection is on an egg file.

b One-file mode is the option to bundle an application and all its dependencies into a single executable file on Windows. InnoSetup and the Nullsoft Scriptable Install System (NSIS) are both popular tools that create installers and can bundle code into a single .exe file.

PyInstaller

PyInstaller can be used to make applications on OS X, Windows, and Linux. Its primary goal is to be compatible with third-party packages out of the box—so the freeze just works.6 They have a list of PyInstaller supported packages. Supported graphical libraries include Pillow, pygame, PyOpenGL, PyGTK, PyQT4, PyQT5, PySide (except for Qt plug-ins), and wxPython. Supported scientific tools include NumPy, Matplotlib, Pandas, and SciPy.

PyInstaller has a modified GPL license “with a special exception which allows [anyone] to use PyInstaller to build and distribute non-free programs (including commercial ones)”—so the license(s) you must comply with will depend on the libraries you used to develop your code. Their team even provides instructions for hiding the source code for those making commercial applications or wanting to prevent others from altering the code. But do read the license (consult a lawyer if it’s important or https://tldrlegal.com/ if it’s not that important) if you need to modify their source code to build your app, because you may be required to share that change.

The PyInstaller Manual is well organized and detailed. Check the PyInstaller requirements page to confirm that your system is compatible: for Windows, you need XP or later; for Linux systems, you’ll need several terminal applications (the documentation lists where you can find them); and for OS X, you need version 10.7 (Lion) or later. You can use Wine (a compatibility layer for running Windows programs) to cross-compile for Windows while running under Linux or OS X.

To install PyInstaller, use pip from within the same virtual environment where you are building your app:

$ pip install pyinstaller

To create a standard executable from a module named script.py, use:

$ pyinstaller script.py

To create a windowed OS X or Windows application, use the --windowed option on the command line like this:

$ pyinstaller --windowed script.py

This creates two new directories and a file in the same folder where you executed the pyinstaller command:

  • A .spec file, which can be rerun by PyInstaller to re-create the build.

  • A build folder that holds some log files.

  • A dist folder, that holds the main executable and some dependent Python libraries.

PyInstaller puts all the Python libraries used by your application into the dist folder, so when distributing the executable, distribute the whole dist folder.

The script.spec file can be edited to customize the build, with options to:

  • Bundle data files with the executable.

  • Include runtime libraries (.dll or .so files) that PyInstaller can’t infer automatically.

  • Add Python runtime options to the executable.

This is useful, because now the file can be stored with version control, making future builds easier. The PyInstaller wiki page contains build recipes for some common applications, including Django, PyQt4, and code signing for Windows and OS X. This is the most current set of quick tutorials for PyInstaller. Now, the edited script.spec can be run as an argument to pyinstaller (instead of using script.py again):

$ pyinstaller script.spec
Note

When PyInstaller is given a .spec file, it takes all of its options from the contents of that file and ignores command-line options, except for: --upx-dir=, --distpath=, --workpath=, --noconfirm, and --ascii.
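Data files bundled through the .spec file end up in a different place at runtime than during development, so application code usually needs a small helper to find them. The sketch below relies on two attributes PyInstaller sets in a frozen app, sys.frozen and sys._MEIPASS; the helper names and the choice of the current directory as the development-time fallback are our assumptions.

```python
import os
import sys


def is_frozen():
    """Return True when running from a frozen bundle rather than plain Python."""
    return bool(getattr(sys, "frozen", False))


def resource_path(relative_path):
    """Return the absolute path to a bundled data file."""
    # In a PyInstaller bundle, sys._MEIPASS points at the folder holding the
    # bundled files; otherwise fall back to the current working directory.
    base_dir = getattr(sys, "_MEIPASS", os.path.abspath("."))
    return os.path.join(base_dir, relative_path)
```

With this in place, code can open resource_path("data/config.ini") and behave the same whether it runs from source or from the dist folder, provided the file was bundled under the same relative path.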

cx_Freeze

Like PyInstaller, cx_Freeze can freeze Python projects on Linux, OS X, and Windows systems. However, the cx_Freeze team does not recommend compiling for Windows using Wine because they’ve had to manually copy some files around to get the app to work. To install it, use pip:

$ pip install cx_Freeze

The easiest way to make the executable is to run cxfreeze from the command line, but you have more options (and can use version control) if you write a setup.py script. This is the same setup.py as is used by distutils in Python’s Standard Library—cx_Freeze extends distutils to provide extra commands (and modify some others). These options can be provided at the command line or in the setup script, or in a setup.cfg configuration file.

The script cxfreeze-quickstart creates a basic setup.py file that can be modified and version controlled for future builds. Here is an example session for a script named hello.py:

$ cxfreeze-quickstart
Project name: hello_world
Version [1.0]:
Description: "This application says hello."
Python file to make executable from: hello.py
Executable file name [hello]:
(C)onsole application, (G)UI application, or (S)ervice [C]:
Save setup script to [setup.py]:

Setup script written to setup.py; run it as:
    python setup.py build
Run this now [n]?

Now we have a setup script, and can modify it to match what our app needs. The options are in the cx_Freeze documentation under “distutils setup scripts.” There are also example setup.py scripts and minimal working applications that demonstrate how to freeze applications that use PyQT4, Tkinter, wxPython, Matplotlib, Zope, and other libraries in the samples/ directory of the cx_Freeze source code: navigate from the top directory to cx_Freeze/cx_Freeze/samples/. The code is also bundled with the installed library. You can get the path by typing:

$ python -c 'import cx_Freeze; print(cx_Freeze.__path__[0])'

Once you are done editing setup.py, you can use it to build your executable using one of these commands:

$ python setup.py build_exe  # (1)
$ python setup.py bdist_msi  # (2)
$ python setup.py bdist_rpm  # (3)
$ python setup.py bdist_mac  # (4)
$ python setup.py bdist_dmg  # (5)

(1) This is the option to build the command-line executable.

(2) This is modified by cx_Freeze from the original distutils command to also handle Windows executables and their dependencies.

(3) This is modified from the original distutils command to ensure Linux packages are created with the proper architecture for the current platform.

(4) This creates a standalone windowed OS X application bundle (.app) containing the dependencies and the executable.

(5) This one creates the .app bundle, then packages it into a DMG disk image.

py2app

py2app builds executables for OS X. Like cx_Freeze, it extends distutils, adding the new command py2app. To install it, use pip:

$ pip install py2app

Next, autogenerate a setup.py script using the command py2applet, like this:

$ py2applet --make-setup hello.py
Wrote setup.py

This makes a basic setup.py, which you can modify for your needs. There are examples with minimal working code and the appropriate setup.py scripts that use libraries including PyObjC, PyOpenGL, pygame, PySide, PyQT, Tkinter, and wxPython in py2app’s source code. To find them, navigate from the top directory to py2app/examples/.

Then, run setup.py with the py2app command to make two directories, build and dist. Be sure to clean the directories when you rebuild; the command looks like this:

$ rm -rf build dist
$ python setup.py py2app

For additional documentation, check out the py2app tutorial. The build may exit on an AttributeError. If so, read this tutorial about using py2app—the variables scan_code and load_module may need to be preceded with underscores: _scan_code and _load_module.

py2exe

py2exe builds executables for Windows. It is very popular, and the Windows version of BitTorrent was made using py2exe. Like cx_Freeze and py2app, it extends distutils, this time adding the command py2exe. If you need to use it with Python 2, download the older version of py2exe from SourceForge. Otherwise, for Python 3.3+, use pip:

$ pip install py2exe

The py2exe tutorial is excellent (apparently what happens when documentation is hosted wiki-style rather than in source control). The most basic setup.py looks like this:

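from distutils.core import setup
import py2exe  # importing py2exe registers the "py2exe" command with distutils

setup(
    windows=[{'script': 'hello.py'}],  # build hello.py as a windowed (GUI) app
)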

The documentation lists all of the configuration options for py2exe and has detailed notes about how to (optionally) include icons or create a single file executable. Depending on your own license for Microsoft Visual C++, you may or may not be able to distribute the Microsoft Visual C++ runtime DLL with your code. If you can, here are the instructions to distribute the Visual C++ DLL alongside the .exe file; otherwise, you can provide your application’s users with a way to download and install the Microsoft Visual C++ 2008 redistributable package or the Visual C++ 2010 redistributable package if you’re using Python 3.3 or later.

Once you have modified your setup file, you can generate the .exe into the dist directory by typing:

$ python setup.py py2exe

bbFreeze

The bbFreeze library currently has no maintainer and has not been ported to Python 3, but it is still frequently downloaded. Like cx_Freeze, py2app, and py2exe, it extends distutils, adding the command bbfreeze. In fact, older versions of bbFreeze were based on cx_Freeze. The appeal here may be for those who maintain legacy systems and would like to package built distributions into eggs to be used across their infrastructure. To install it, use pip:

$ pip install bbfreeze  # bbFreeze can't work with Python3

It is light on documentation, but has build recipes for Flup, Django, Twisted, Matplotlib, GTK, and Tkinter, among others. To make executable binaries, use the command bdist_bbfreeze like this:

$ bdist_bbfreeze hello.py

It will create a directory dist in the location where bbfreeze was run that contains a Python interpreter and an executable with the same name as the script (hello.py in this case).

To generate eggs, use the new distutils command:

$ python setup.py bdist_bbfreeze

There are other options, like tagging builds as snapshots or daily builds. Get more usage information using the standard --help option:

$ python setup.py bdist_bbfreeze --help

For fine-tuning, you can use the bbfreeze.Freezer class, which is the preferred way to use bbfreeze. It has flags for whether to use compression in the created ZIP file, whether to include a Python interpreter, and which scripts to include.

Packaging for Linux-Built Distributions

Creating a Linux built distribution is arguably the “right way” to distribute code on Linux: a built distribution is like a frozen package, but it doesn’t include the Python interpreter, so the download and install are about 2 MB smaller than when using freezing.7 Also, if a distribution releases a new security update for Python, your application will automatically start using that new version of Python.

The bdist_rpm command from the distutils module in Python’s Standard Library makes it trivially easy to produce an RPM file for use by Linux distributions like Red Hat or SuSE.

Having said all that, here are links to the Python packaging instructions for some popular Linux distributions:

If you want a faster way to package code for all of the flavors of Linux out there, you may want to try the effing package manager (fpm). It’s written in Ruby and shell, but we like it because it packages code from multiple source types (including Python) into targets including Debian (.deb), RedHat (.rpm), OS X (.pkg), Solaris, and others. It’s a great fast hack but does not provide a tree of dependencies, so package maintainers may frown upon it. Or Debian users can try Alien, a Perl program that converts between Debian, RedHat, Stampede (.slp), and Slackware (.tgz) file formats, but the code hasn’t been updated since 2014, and the maintainer has stepped down.

For those interested, Rob McQueen posted some insights about deploying server apps at work, on Debian.

Executable ZIP Files

It’s a little-known secret that Python has been able to execute ZIP files containing a __main__.py since version 2.6. This is a great way to package pure Python apps (applications that don’t require platform-specific binaries). So, if you have a single file __main__.py like this:

if __name__ == '__main__':
    # print() with a single argument behaves the same in Python 2 and Python 3
    print('ping!')

And you create a ZIP file containing it by typing this on the command line:

$ zip machine.zip __main__.py

You can then send that ZIP file to other people, and so long as they have Python, they can execute it on the command line like this:

$ python machine.zip
ping!

Or if you want to make it an executable, you can prepend a POSIX “shebang” (#!) to the ZIP file—the ZIP file format allows this—and you now have a self-contained app (provided Python is reachable via the path in the shebang). Here is an example that continues on from the previous code:

$ echo '#!/usr/bin/env python' > machine
$ cat machine.zip >> machine
$ chmod u+x machine

And now it’s an executable:

$ ./machine
ping!
Note

Since Python 3.5, there is also a module zipapp in the Standard Library that makes it more convenient to create these ZIP files. It also adds flexibility so that the main file need no longer be called __main__.py.
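As a sketch of what the zipapp route can look like, the function below builds the same “ping!” application in a temporary directory with zipapp.create_archive() and then runs it. The function name, the file names machine_src and machine.pyz, and the python3 shebang are assumptions made for the example; it also uses subprocess.run(capture_output=...), which needs Python 3.7 or later.

```python
import pathlib
import subprocess
import sys
import tempfile
import zipapp


def build_and_run_ping():
    """Create a tiny executable archive with zipapp, run it, return its output."""
    with tempfile.TemporaryDirectory() as tmp:
        source = pathlib.Path(tmp) / "machine_src"
        source.mkdir()
        (source / "__main__.py").write_text("print('ping!')\n")

        target = pathlib.Path(tmp) / "machine.pyz"
        # The interpreter argument becomes the shebang line at the top of the file.
        zipapp.create_archive(source, target, interpreter="/usr/bin/env python3")

        result = subprocess.run(
            [sys.executable, str(target)],
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout.strip()
```

To use an entry point other than __main__.py, create_archive() also accepts a main argument of the form "package.module:function", in which case the source directory should not already contain a __main__.py.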

If you vendorize your dependencies by placing them inside your current directory, and change your import statements, you can make an executable ZIP file with all dependencies included. So, if your directory structure looks like this:

.
|--- archive/
     |--- __main__.py

and you are running inside a virtual environment that has only your dependencies installed, you can type the following in the shell to include your dependencies:

$ cd archive
$ pip freeze | xargs pip install --target=packages
$ touch packages/__init__.py

The xargs command takes standard input from pip freeze and turns it into the argument list for the pip command, and the --target=packages option sends the installation to a new directory, packages. The touch command creates an empty file if none exists; otherwise, it updates the file’s timestamp to the current time. The directory structure will now look something like this:

.
|--- archive/
     |--- __main__.py
     |--- packages/
          |--- __init__.py
          |--- dependency_one/
          |--- dependency_two/

If you do this, make sure to also change your import statements to use the packages directory you just created:

#import dependency_one  # not this
import packages.dependency_one as dependency_one

And then just recursively include all of the directories in the new ZIP file using zip -r, like this:

$ cd archive
$ zip machine.zip -r *
$ echo '#!/usr/bin/env python' > machine
$ cat machine.zip >> machine
$ chmod ug+x machine

1 Rumor has it they prefer to be called the “Ministry of Installation.” Nick Coghlan, the BDFL-delegate for packaging related PEPs, wrote a thoughtful essay on the whole system, its history, and where it should go on his blog a few years ago.

2 There appear to be two URLs mirroring the same content at the moment: https://python-packaging-user-guide.readthedocs.org/ and https://packaging.python.org.

3 PyPI is in the process of being switched to the Warehouse, which is now in an evaluation phase. From what we can tell, they are changing the UI, but not the API. Nicole Harris, one of the PyPA developers, wrote a brief introduction to Warehouse, for the curious.

4 If you have Python 3.4 or higher without pip, you can install it on the command line with python -m ensurepip.

5 Freezing Python code on Linux into a Windows executable was attempted in PyInstaller 1.4, but dropped in 1.5 because it didn’t work well except for pure Python programs (so, no GUI applications).

6 As we’ll see when looking at other installers, the challenge is not just in finding and bundling the compatible C libraries for the specific version of a Python library, but also in discovering peripheral configuration files, sprites or special graphics, and other files that aren’t discoverable to the freezing tool by inspecting your source code.

7 Some people may have heard these called “binary packages” or “installers”; the official Python name for them is “built distributions,” meaning RPMs, or Debian packages, or executable installers for Windows. The wheel format is also a type of built distribution, but for various reasons touched on in a micro-rant about wheels it is often better to make platform-specific Linux distributions as described in this section.