This chapter walks through CPython installation on the Mac OS X, Linux, and Windows platforms. Sections on packaging tools (like Setuptools and pip) are repetitive, so you should skip straight to the section for your particular operating system, and ignore the others.
If you are part of an organization that recommends you use a commercial Python distribution, such as Anaconda or Canopy, you should follow your vendor’s instructions. There is also a small note for you in “Commercial Python Redistributions”.
If Python already exists on your system,
do not, on any account, allow anybody to change the
symbolic link to the python executable to point at anything
other than what it is already pointing at.
That would be almost as bad as reading
Vogon poetry
out loud. (Think of the system-installed code that
depends on a specific Python in a specific place…)
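If you want to see which interpreter a given python command resolves to without changing anything, you can ask the interpreter itself. The following is a minimal sketch; the function name describe_interpreter is made up for this example and is not part of any library.

```python
import os
import sys


def describe_interpreter():
    """Report where the running interpreter lives, following symbolic links."""
    # sys.executable is the path used to start this interpreter; it may be
    # a symbolic link, and it can be empty when Python is embedded.
    executable = sys.executable or ""
    return {
        "executable": executable,
        # realpath follows every symbolic link to the final target.
        "resolved": os.path.realpath(executable) if executable else "",
        "version": "%d.%d.%d" % sys.version_info[:3],
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print("%-10s %s" % (key, value))
```

Running the script with each python command on your system (python, python2, python3) shows where each one points, which is all you need to know before deciding to leave the system links alone.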
The latest version of Mac OS X, El Capitan, comes with its own Mac-specific implementation of Python 2.7.
You don’t need to install or configure anything else to use Python. But we strongly recommend installing Setuptools, pip, and virtualenv before you start building Python applications for real-world use (i.e., contributing to collaborative projects). You’ll learn more about these tools and how to install them later in this section. In particular, you should always install Setuptools, as it makes it much easier for you to use other third-party Python libraries.
The version of Python that ships with OS X is great for learning, but it’s not good for collaborative development. It may also be out of date relative to the official current Python release, which is considered the stable production version.1 So, if all you want to do is write scripts for yourself to pull information from websites or process data, you don’t need anything else. But if you are contributing to open source projects, or working on a team with people who may have other operating systems (or ever intend to in the future2), use the CPython release.
Before you download anything, read through the end of the next few paragraphs for notes and warnings. Before installing Python, you’ll need to install GCC. It can be obtained by downloading Xcode, the smaller Command-Line Tools (you need an Apple account to download it), or the even smaller osx-gcc-installer package.
If you already have Xcode installed, do not install osx-gcc-installer. In combination, the software can cause issues that are difficult to diagnose.
While OS X comes with a large number of Unix utilities, those familiar with Linux systems will notice one key component missing: a decent package manager. Homebrew fills this void.
To install Homebrew, open Terminal or your favorite OS X terminal emulator and run the following code:
$ BREW_URI=https://raw.githubusercontent.com/Homebrew/install/master/install
$ ruby -e "$(curl -fsSL ${BREW_URI})"
The script will explain what changes it will make and prompt
you before the installation begins.
Once you’ve installed Homebrew, insert the Homebrew directory
at the top of your PATH environment variable.3
You can do this by adding the following
line at the bottom of your ~/.profile file:
export PATH=/usr/local/bin:/usr/local/sbin:$PATH
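The order of entries in PATH matters because directories are searched from left to right and the first match wins. The sketch below imitates that lookup in Python to illustrate the idea; first_on_path is a hypothetical helper written for this example (the standard library's shutil.which performs a similar search for real use).

```python
import os


def first_on_path(command, path_string):
    """Return the first executable file named `command` in `path_string`.

    Directories are searched left to right, which is why entries placed at
    the front of PATH take precedence over later ones.
    """
    for directory in path_string.split(os.pathsep):
        candidate = os.path.join(directory, command)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


if __name__ == "__main__":
    # Show which python3 the current PATH would select, if any.
    print(first_on_path("python3", os.environ.get("PATH", "")))
```

With /usr/local/bin placed first, a Homebrew-installed interpreter would be found before the one in /usr/bin, and the system copy is left untouched.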
And then to install Python, run this once in a terminal:
$ brew install python3
Or for Python 2:
$ brew install python
By default, Python will then be installed in
/usr/local/Cellar/python3/ or /usr/local/Cellar/python/
with symbolic links4
to the interpreter at /usr/local/bin/python3 or /usr/local/bin/python.
People who use the --user option to pip install
will need to work around a bug
involving distutils
and the Homebrew configuration. We recommend
just using virtual environments, described in
“virtualenv”.
Homebrew installs Setuptools and pip for you. The executable installed with pip will be mapped to pip3 if you are using Python 3 or to pip if you are using Python 2.
With Setuptools, you can download and install any
compliant5
Python software over a network (usually the Internet)
with a single command (easy_install). It also enables you
to add this network installation capability to your own
Python software with very little work.
Both pip’s pip command and Setuptools’s easy_install command are tools to install and manage Python packages. pip is recommended over easy_install
because it can also uninstall packages,
its error messages are more digestible, and
partial package installs can’t happen (installs that fail
partway through will unwind everything that happened so far).
For a more nuanced discussion, see pip vs easy_install in the
Python Packaging User Guide, which should be your first reference for current packaging
information.
To upgrade your installation of pip, type the following in a shell:
$ pip install --upgrade pip
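To confirm which versions of the packaging tools are installed for a given interpreter, you can query the installed package metadata from Python. This is a sketch that assumes Python 3.8 or newer, where importlib.metadata is part of the standard library; installed_version is a name chosen for this example.

```python
from importlib import metadata  # standard library in Python 3.8 and newer


def installed_version(distribution):
    """Return the installed version string of a distribution, or None."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    for name in ("pip", "setuptools", "virtualenv"):
        print(name, installed_version(name) or "not installed")
```

The answer applies only to the interpreter that runs the script, so run it with the same python or python3 command you intend to use.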
virtualenv creates isolated Python environments.
It creates a folder containing all the necessary
executables to use the packages that a Python project would need.
Some people believe best practice is to install nothing
except virtualenv and Setuptools and to then always
use virtual environments.6
To install virtualenv via pip, run pip at the command line
of a terminal shell:
$ pip3 install virtualenv
or if you are using Python 2:
$ pip install virtualenv
Once you are in a virtual environment, you can always use the command pip, whether you are working with Python 2 or Python 3, so that is what we will do in the rest of this guide. “Virtual Environments” describes usage and
motivation in more detail.
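If you are ever unsure whether a shell is currently using a virtual environment, the interpreter can tell you. The following sketch relies on two attributes: sys.real_prefix, which older virtualenv releases set, and sys.base_prefix, which differs from sys.prefix inside environments created by newer virtualenv releases or the standard library's venv module. The function name is an assumption of this example.

```python
import sys


def in_virtual_environment():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Older virtualenv releases set sys.real_prefix inside an environment.
    if hasattr(sys, "real_prefix"):
        return True
    # Newer virtualenv releases and the venv module make sys.prefix point at
    # the environment while sys.base_prefix points at the base installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("virtual environment active:", in_virtual_environment())
```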
Ubuntu started releasing with only Python 3 installed
(and Python 2 available via apt-get) as of Wily Werewolf (Ubuntu 15.10).
All of the details are on Ubuntu’s Python page.
Fedora 23 is the first release with only Python 3 installed (both Python 2.7
and 3 are installed on releases 20–22);
Python 2.7 remains available via its package manager.
Most parallel installations of Python 2 and Python 3
make a symbolic link from python2 to a Python 2
interpreter and from python3 to a Python 3 interpreter.
If you decide to use Python 2, the current recommendation on Unix-like
systems (see Python Enhancement Proposal [PEP 394])
is to explicitly specify python2 in your shebang notation (e.g., #!/usr/bin/env python2
as the first line in the file)
rather than rely on the environment python pointing where
you expect.
Although not in PEP 394, it has also become convention
to use pip2 and pip3 to link to the respective pip
package installers.
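A script can combine an explicit shebang line with a runtime check, so that a mismatch produces a clear message instead of a confusing syntax error later on. The sketch below targets Python 3 for illustration (a Python 2 script would use python2 in the shebang and 2 as the required version); check_major_version and REQUIRED_MAJOR are names invented for this example.

```python
#!/usr/bin/env python3
"""Example script that states which major version of Python it expects."""
import sys

REQUIRED_MAJOR = 3  # hypothetical requirement for this example


def check_major_version(required, version_info=None):
    """Raise RuntimeError if the interpreter's major version is not `required`."""
    version_info = sys.version_info if version_info is None else version_info
    if version_info[0] != required:
        raise RuntimeError(
            "This script needs Python %d but is running under Python %d.%d"
            % (required, version_info[0], version_info[1])
        )


if __name__ == "__main__":
    check_major_version(REQUIRED_MAJOR)
    print("Running under Python %d.%d" % sys.version_info[:2])
```

The shebang line only takes effect when the file is executable and run directly (for example, ./script.py); the runtime check also covers the case where someone runs it as python script.py.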
Even if pip is available through a package installer
on your system, to ensure you get the most recent version, follow these steps.
First, download get-pip.py.7
Next, open a shell, change directories to the same location as get-pip.py, and type:
$ wget https://bootstrap.pypa.io/get-pip.py
$ sudo python3 get-pip.py
or for Python 2:
$ wget https://bootstrap.pypa.io/get-pip.py
$ sudo python get-pip.py
This will also install Setuptools.
With the easy_install command that’s installed with Setuptools, you can download and install any
compliant8
Python software over a network (usually the Internet). It also enables you
to add this network installation capability to your own
Python software with very little work.
pip is a tool that helps you easily install and manage
Python packages. It
is recommended over easy_install
because it can also uninstall packages,
its error messages are more digestible, and
partial package installs can’t happen (installs that fail
partway through will unwind everything that happened so far).
For a more nuanced discussion, see “pip vs easy_install” in the
Python Packaging User Guide, which should be your first reference for current packaging
information.
Almost everyone will at some point want
to use Python libraries that depend on C extensions.
Sometimes your package manager will have these, prebuilt, so you
can check first (using yum search or apt-cache search);
and with the newer wheels format
(precompiled, platform-specific binary files), you
may be able to get binaries directly from PyPI, using pip.
But if you expect to create C extensions in the future, or if the
people maintaining your library haven’t made wheels for your platform,
you will need the development tools for Python: various C libraries, make,
and the GCC compiler.
The following are some useful packages that use C libraries:
The linear algebra library NumPy
The numerical toolkit SciPy
The machine learning library scikit-learn
The plotting library Matplotlib
The interface to the HDF5 data format h5py
The PostgreSQL database adapter Psycopg
The database abstraction and object-relational mapper SQLAlchemy
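Before installing compilers and development headers, it can help to check which of these libraries your interpreter can already import. The sketch below is one way to do that; the module names are the usual import names for the libraries listed above (psycopg2 assumes the Psycopg 2 series), and missing_modules is a helper written for this example.

```python
import importlib.util

# Import names for the libraries listed above. The import name can differ
# from the project name: scikit-learn, for instance, is imported as sklearn.
C_BACKED_MODULES = [
    "numpy", "scipy", "sklearn", "matplotlib", "h5py", "psycopg2", "sqlalchemy",
]


def missing_modules(names):
    """Return the top-level module names that the interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(C_BACKED_MODULES)
    print("missing:", ", ".join(missing) if missing else "none")
```

Anything reported as missing is a candidate for your package manager, a wheel from PyPI, or a local build using the development tools described here.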
On Ubuntu, in a terminal shell, type:
$ sudo apt-get update --fix-missing
$ sudo apt-get install python3-dev  # For Python 3
$ sudo apt-get install python-dev  # For Python 2
Or on Fedora, in a terminal shell, type:
$ sudo yum update
$ sudo yum install gcc
$ sudo yum install python3-devel  # For Python 3
$ sudo yum install python2-devel  # For Python 2
After that, pip3 install --user desired-package will be able to build packages that must be compiled (or pip install --user desired-package for Python 2). You will also need the underlying tool itself installed (for HDF5, see the HDF5 installation documentation for details on how to do this). For PostgreSQL on Ubuntu, you’d type this in a terminal shell:
$ sudo apt-get install libpq-dev
or on Fedora:
$ sudo yum install postgresql-devel
virtualenv is a command installed with the virtualenv package that creates isolated Python environments.
It creates a folder containing all the necessary
executables to use the packages that a Python project would need.
To install virtualenv using Ubuntu’s package manager, type:
$ sudo apt-get install python-virtualenv
or on Fedora:
$ sudo yum install python-virtualenv
Or via pip, run pip at the command line
of a terminal shell, and use the --user option to
install it locally for yourself rather than doing a
system install:
$ pip3 install --user virtualenv
or if you are using Python 2:
$ pip install --user virtualenv
Once you are in a virtual environment, you can always use the command pip, whether you are working with Python 2 or Python 3, so that is what we will do in the rest of this guide. “Virtual Environments” describes usage and motivation in more detail.
Windows users have it harder than other Pythonistas—because it’s harder
to compile anything on Windows,
and many Python libraries
use C extensions under the hood.
Thanks to wheels, binaries can
be downloaded from PyPI using pip (if they exist), so things have
gotten a little easier.
There are two paths here: a commercial distribution (discussed in “Commercial Python Redistributions”) or straight-up CPython. Anaconda is much easier, especially when you’re going to do scientific work. Actually, pretty much everyone who does scientific computing on Windows with Python (except those developing C-based Python libraries of their own) will recommend Anaconda. But if you know your way around compiling and linking, if you want to contribute to open source projects that use C code, or if you just don’t want a commercial distribution (what you need is free), we hope you consider installing straight-up CPython.9
As time progresses, more and more packages with C libraries
will have wheels on PyPI, and so can be obtained via pip.
The trouble comes when required C library dependencies are not bundled
with the wheel. This dependency problem is another reason you may prefer commercial Python redistributions like Anaconda.
Use CPython if you are the kind of Windows user who:
Doesn’t need Python libraries that rely on C extensions
Owns a Visual C++ compiler (not the free one)
Can handle setting up MinGW
Is game to download binaries by hand10 and then pip install the binary
If you will use Python as a substitute for R or MATLAB, or just want to get up to speed quickly and will install CPython later if necessary (see “Commercial Python Redistributions” for some tips), use Anaconda.11
If you want your interface to be mostly graphical (point-and-click), or if Python is your first language and this is your first install, use Canopy.
If your entire team has already committed to one of these options, then you should go with whatever is currently being used.
To install the standard CPython implementation on Windows, you first need to download the latest version of Python 3 or Python 2.7 from the official website. If you want to be sure you are installing a fully up-to-date version (or are certain you really, really want the 64-bit installer12), then use the Python Releases for Windows site to find the release you need.
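Once Python is installed, you can check whether you ended up with a 32-bit or 64-bit interpreter, which matters when you later download prebuilt binaries. The sketch below uses the size of a C pointer as reported by the struct module; pointer_bits is a name chosen for this example.

```python
import struct
import sys


def pointer_bits():
    """Return the pointer width of the running interpreter in bits."""
    # "P" is the struct format code for a C pointer (void *).
    return struct.calcsize("P") * 8


if __name__ == "__main__":
    print("Python %d.%d, %d-bit" % (sys.version_info[0], sys.version_info[1], pointer_bits()))
```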
The Windows version is provided as an MSI package. This format allows Windows administrators to automate installation with their standard tools. To install the package manually, just double-click the file.
By design, Python installs to a directory with the version number embedded
(e.g., Python version 3.5 will install at C:\Python35\) so that you can
have multiple versions of Python on the same system without conflicts.
Of course, only one interpreter can be the default application for Python file types.
The installer does not automatically modify the PATH environment variable,13
so that you always have control over which copy of Python is run.
Typing the full path name for a Python interpreter each time quickly gets tedious,
so add the directories for your default Python version to the PATH. Assuming
that the Python installation you want to use is in C:\Python35\,
you will want to add this to your PATH:
C:\Python35\;C:\Python35\Scripts\
You can do this easily by running the following in PowerShell:14
PS C:\> [Environment]::SetEnvironmentVariable("Path", "$env:Path;C:\Python35\;C:\Python35\Scripts\", "User")
The second directory (Scripts) receives command files when certain packages are installed, so it is a very useful addition. You do not need to install or configure anything else to use Python.
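Running a PATH-modifying command more than once appends the same directories again. To illustrate how a duplicate-safe update could work, here is a sketch of the string manipulation in Python; append_to_path is a hypothetical helper, it only computes the new value rather than changing any setting, and treating entries that differ only by case or a trailing backslash as equal is a simplifying assumption.

```python
def append_to_path(path_string, new_directories, separator=";"):
    """Return `path_string` with each new directory appended at most once."""

    def normalize(directory):
        # Simplification: ignore case and trailing backslashes when comparing.
        return directory.rstrip("\\").lower()

    entries = [entry for entry in path_string.split(separator) if entry]
    seen = {normalize(entry) for entry in entries}
    for directory in new_directories:
        if normalize(directory) not in seen:
            entries.append(directory)
            seen.add(normalize(directory))
    return separator.join(entries)


if __name__ == "__main__":
    current = "C:\\Windows\\system32;C:\\Windows"
    print(append_to_path(current, ["C:\\Python35\\", "C:\\Python35\\Scripts\\"]))
```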
Having said that, we strongly recommend installing Setuptools, pip, and virtualenv before you start building Python applications for real-world use (i.e., contributing to collaborative projects). You’ll learn more about these tools and how to install them later in this section. In particular, you should always install Setuptools, as it makes it much easier for you to use other third-party Python libraries.
The current MSI packaged installers install Setuptools and pip for you with Python, so if you are following along with this book and just installed now, you have them already. Otherwise, the best way to get them with Python 2.7 installed is to upgrade to the newest release.15 For Python 3, in versions 3.3 and prior, download the script get-pip.py,16 and run it. Open a shell, change directories to the same location as get-pip.py, and then type:
PS C:\> python get-pip.py
With Setuptools, you can download and install any
compliant17 Python software over a network (usually the Internet)
with a single command (easy_install). It also enables you
to add this network installation capability to your own
Python software with very little work.
Both pip’s pip command and Setuptools’s easy_install command are tools to install and manage Python packages. pip
is recommended over easy_install
because it can also uninstall packages,
its error messages are more digestible, and
partial package installs can’t happen (installs that fail
partway through will unwind everything that happened so far).
For a more nuanced discussion, see “pip vs easy_install” in the
Python Packaging User Guide, which should be your first reference for current packaging
information.
The virtualenv command
creates isolated Python environments. It creates a folder containing all the necessary
executables to use the packages that a Python project would need.
Then, when you activate the environment using a command in
the new folder, it prepends that folder to your PATH environment
variable—the Python in the new folder becomes the first
one found, and the packages in its subfolders are the ones
used.
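The effect of activation on PATH can be sketched as a small function. This is an illustration of the idea, not virtualenv's actual activation script: activated_path is a made-up name, and the example assumes the usual layout in which an environment keeps its executables in a Scripts folder on Windows and a bin folder elsewhere.

```python
import ntpath
import os
import posixpath


def activated_path(environment_dir, current_path, windows=(os.name == "nt")):
    """Return the PATH value with the environment's executables listed first."""
    path_module = ntpath if windows else posixpath
    separator = ";" if windows else ":"
    # Executables live in Scripts on Windows and in bin on other systems.
    scripts_dir = path_module.join(environment_dir, "Scripts" if windows else "bin")
    return scripts_dir + separator + current_path


if __name__ == "__main__":
    print(activated_path("C:\\envs\\demo", "C:\\Python35\\;C:\\Windows", windows=True))
```

Because the environment's folder comes first, typing python or pip finds the environment's copies before any system-wide installation.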
To install virtualenv via pip, run pip at the command line
of a PowerShell terminal:
PS C:\> pip install virtualenv
“Virtual Environments” describes usage and
motivation in more detail. On OS X and Linux, because
Python comes installed for use by system or third-party software,
users must specifically distinguish between the Python 2 and Python 3
versions of pip. On Windows, there is no need to do this, so
whenever we say pip3, we mean pip for Windows users. Regardless of OS, once you are in a virtual environment, you can always use the command pip, whether you are working with Python 2 or Python 3, so that is what we will do in the rest of this guide.
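When a script needs to call pip and you cannot be sure whether the right command is pip, pip2, or pip3, one option is to run pip as a module of the interpreter you are already using. The sketch below only builds the command line; pip_command is a helper invented for this example, and actually running the result assumes pip is installed for that interpreter.

```python
import sys


def pip_command(*arguments):
    """Build a command line that runs pip under the current interpreter."""
    # "python -m pip" runs the pip that belongs to this exact interpreter.
    return [sys.executable, "-m", "pip"] + list(arguments)


if __name__ == "__main__":
    # Print the command instead of running it, so nothing gets installed.
    print(" ".join(pip_command("install", "virtualenv")))
```

The resulting list can be passed to subprocess.check_call when you do want to run it.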
Your IT department or classroom teaching assistant may have asked you to install a commercial redistribution of Python. This is intended to simplify the work an organization needs to do to maintain a consistent environment for multiple users. All of the ones listed here provide the C implementation of Python (CPython).
A technical reviewer for the first draft of this chapter said we massively understated the trouble it is to use a regular CPython installation on Windows for most users: that even with wheels, compiling and/or linking to external C libraries is painful for everyone but seasoned developers. We have a bias toward plain CPython, but the truth is if you’re going to be a consumer of libraries and packages (as opposed to a creator or contributor), you should just download a commercial redistribution and get on with your life—doubly so if you’re a Windows user. Later, when you want to contribute to open source, you can install the regular distribution of CPython.
It is easier to go back to a standard Python installation if you do not alter the default settings in vendor-specific installations.
Here’s what these commercial distributions have to offer:
The purpose of the Intel Distribution for Python is to deliver high-performance Python in an easy-to-access, free package. The primary boost to performance comes from linking Python packages with native libraries such as the Intel Math Kernel Library (MKL), and enhanced threading capabilities that include the Intel Threading Building Blocks (TBB) library. It relies on Continuum’s conda for package management, but also comes with pip. It can be downloaded by itself or installed from https://anaconda.org/ in a conda environment.18
It provides the SciPy stack and the other common libraries listed in the release notes (PDF). Customers of Intel Parallel Studio XE get commercial support and everyone else can use the forums for help. So, this option gives you the scientific libraries without too much fuss, and otherwise is a regular Python distribution.
Continuum Analytics’ distribution of Python is released under the BSD license and provides tons of precompiled science and math binaries on its free package index. It has a different package manager than pip, called conda, that also manages virtual environments, but acts more like Buildout (discussed in “Buildout”) than like virtualenv, managing libraries and other external dependencies for the user. The package formats are incompatible, so each installer can’t install from the other’s package index.
The Anaconda distribution comes with the SciPy stack and other tools. Anaconda has the best license and the most stuff for free; if you’re going to use a commercial distribution, choose this, especially if you’re already comfortable working with the command line and like R or Scala (also bundled). If you don’t need all of those other things, use the miniconda distribution instead. Customers get various levels of indemnification (related to open source licenses, and who can use what when, or who gets sued for what), commercial support, and extra Python libraries.
ActiveState’s distribution
is released under the ActiveState Community License and is
free for evaluation only; otherwise it requires a license.
ActiveState also provides solutions for Perl and Tcl.
The main selling point of this distribution is broad indemnification
(again related to open source licenses) for the more than
7,000 packages in its cultivated
package index,
reachable using the ActiveState pypm tool, a replacement for pip.
Enthought’s distribution
is released under the Canopy Software License, with a
package manager, enpkg, that is used in place of pip to connect to
Canopy’s package index.
Enthought provides free academic licenses to students and staff from degree-granting institutions. Distinguishing features of Enthought’s distribution are graphical tools to interact with Python, including its own IDE that resembles MATLAB, a graphical package manager, a graphical debugger, and a graphical data manipulation tool. As with the other commercial redistributors, customers get indemnification and commercial support, in addition to more packages.
1 Other people have different opinions. The OS X Python implementation is not the same. It even has some separate OS X–specific libraries. A small rant on this subject criticizing our recommendation is at the Stupid Python Ideas blog. It raises valid concerns about collision of some names for people who switch-hit between OS X’s CPython 2.7 and the canonical CPython 2.7. If this is a concern, use a virtual environment. Or, at the very least, leave the OS X Python 2.7 where it is so that the system runs smoothly, install the standard Python 2.7 implemented in CPython, modify your path, and never use the OS X version. Then everything works fine, including products that rely on Apple’s OS X–specific version.
2 The best option is to pick Python 3, honestly, or to use virtual environments from the start and install nothing but virtualenv and maybe virtualenvwrapper according to the advice of Hynek Schlawack.
3 This will ensure that the Python you use is the one Homebrew just installed, while leaving the system’s original Python exactly as it is.
4 A symbolic link is a pointer to the actual file location. You can confirm where the link points to by typing, for example, ls -l /usr/local/bin/python3 at the command prompt.
5 Packages that are compliant with Setuptools at a minimum provide enough information for the library to identify and obtain all package dependencies. For more information, see the documentation for Packaging and Distributing Python Projects, PEP 302, and PEP 241.
6 Advocates of this practice say it is the only way to ensure nothing ever overwrites an existing installed library with a new version that could break other version-dependent code in the OS.
7 For additional details, see the pip installation instructions.
8 Packages that are compliant with Setuptools at a minimum provide enough information for it to identify and obtain all package dependencies. For more information, see the documentation for Packaging and Distributing Python Projects, PEP 302, and PEP 241.
9 Or consider IronPython (discussed in “IronPython”) if you want to integrate Python with the .NET framework. But if you’re a beginner, this should probably not be your first Python interpreter. This whole book talks about CPython.
10 You must know at least what version of Python you’re using and whether you selected 32-bit or 64-bit Python. We recommend 32-bit, as every third-party DLL will have a 32-bit version and some may not have 64-bit versions. The most widely cited location to obtain compiled binaries is Christoph Gohlke’s resource site. For scikit-learn, Carl Kleffner is building binaries using MinGW in preparation for eventual release on PyPI.
11 Anaconda has more free stuff, and comes bundled with Spyder, a better IDE. If you use Anaconda, you’ll find Anaconda’s free package index and Canopy’s package index to be helpful.
12 Meaning you are 100% certain that any dynamic-link libraries (DLLs) and drivers you need are available in 64-bit versions.
13 The PATH lists every location the operating system will look to find executable programs, like Python and Python scripts like pip. Each entry is separated by a semicolon.
14 Windows PowerShell provides a command-line shell and scripting language that is similar enough to Unix shells that Unix users will be able to function without reading a manual, but with features specifically for use with Windows. It is built on the .NET Framework. For more information, see Microsoft’s “Using Windows PowerShell.”
15 The installer will prompt you whether it’s OK to overwrite the existing installation. Say yes; releases in the same minor version are backward-compatible.
16 For additional details, see the pip installation instructions.
17 Packages that are compliant with Setuptools at a minimum provide enough information for the library to identify and obtain all package dependencies. For more information, see the documentation for “Packaging and Distributing Python Projects,” PEP 302, and PEP 241.
18 Intel and Anaconda have a partnership, and all of the Intel accelerated packages are only available using conda. However, you can always conda install pip and use pip (or pip install conda and use conda) when you want to.