Installing Packages: pip, conda, and dependencies
Before You Start
You should know:
- How to create a virtual environment using venv or conda and activate it.
You will learn:
- What a package manager is and why Python relies on them.
- How to install, list, and remove packages using pip and conda.
- What to do when pip fails on a binary package.
- How to verify an installation actually worked.
- How to freeze your environment into a reproducibility file.
- Why mixing pip and conda in the same environment is risky.
Introduction
Out of the box, Python is small. It can do arithmetic, read text files, and manipulate strings. Ask it to load a satellite image or calculate a geodesic distance and it won’t know how.
For scientific computing you extend Python’s vocabulary by installing packages — bundles of code written by others that solve specific problems. Instead of downloading .zip files manually, you use a package manager that finds, downloads, and links the right version of a library into your active virtual environment.
There are two package managers you will encounter: pip (used with standard Python) and conda (used with Miniforge/Anaconda). Make sure your environment is activated before running any install command.
pip: The Standard Package Manager
pip pulls packages from PyPI — the Python Package Index — which hosts over 500,000 packages. It is installed automatically with standard Python.
Always activate your virtual environment first. Running pip install without an active environment injects the package into your global Python — the problem virtual environments exist to prevent.
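If you are unsure whether an environment is active, you can ask the interpreter itself. The sketch below is one possible check, not an official tool: the function name active_environment is made up for illustration, and it assumes that conda sets the CONDA_DEFAULT_ENV variable on activation (which it normally does, including for the base environment).

```python
import os
import sys


def active_environment() -> str:
    """Describe which kind of Python environment this interpreter runs in."""
    # Inside a venv, sys.prefix points at the environment folder while
    # sys.base_prefix still points at the Python it was created from.
    if sys.prefix != sys.base_prefix:
        return f"venv at {sys.prefix}"
    # Assumption: conda exports CONDA_DEFAULT_ENV with the name of the
    # environment that is currently activated.
    conda_env = os.environ.get("CONDA_DEFAULT_ENV")
    if conda_env:
        return f"conda environment '{conda_env}'"
    return "no virtual environment detected (global Python)"


if __name__ == "__main__":
    print(active_environment())
```

If this reports the global Python, activate your environment before running any install command.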
Installing a package
pip install numpy
pip reaches out to PyPI, finds the latest stable version, downloads it, and places it into your .venv folder. You will see progress output as it runs.
Install multiple packages at once:
pip install pandas matplotlib scipy
Checking what is installed
pip list
You will see packages you did not explicitly ask for. These are dependencies: libraries that your requested package itself depends on. When you install pandas, pip automatically installs its dependencies, such as numpy and python-dateutil (the exact set depends on the pandas version), because pandas cannot function without them.
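To see which dependencies a package declares, you can read its installed metadata with the standard library's importlib.metadata module. This is a minimal sketch: the helper name declared_dependencies is hypothetical, and pandas is used only as an example that may or may not be installed on your machine.

```python
from importlib.metadata import PackageNotFoundError, requires


def declared_dependencies(package: str) -> list[str]:
    """Return the requirement strings a package declares, or [] if none.

    Raises PackageNotFoundError if the package is not installed in the
    environment this interpreter belongs to.
    """
    # requires() returns None when the package declares no dependencies.
    return requires(package) or []


if __name__ == "__main__":
    try:
        for requirement in declared_dependencies("pandas"):
            print(requirement)
    except PackageNotFoundError:
        print("pandas is not installed in this environment")
```

The output can include optional extras (entries carrying an "extra ==" marker), which pip installs only when you ask for them.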
Verifying the installation worked
After installing, confirm Python can actually load the library:
python -c "import numpy; print(numpy.__version__)"
The -c flag runs a short Python snippet directly from the terminal. If it prints a version number, the install worked. If it raises ModuleNotFoundError, something went wrong; the most common cause is that your environment was not activated when you ran pip install.
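When you need to check several libraries at once, the same idea can go into a small script. The following is a sketch rather than a standard utility: check_import is a hypothetical helper, and it assumes the library exposes a __version__ attribute, which is common but not guaranteed.

```python
import importlib


def check_import(module_name: str) -> tuple[bool, str]:
    """Try to import a module; return (succeeded, version or error message)."""
    try:
        module = importlib.import_module(module_name)
    except ImportError as error:
        # ModuleNotFoundError is a subclass of ImportError, so a package
        # missing from the active environment ends up here.
        return False, str(error)
    # Many libraries expose __version__, but not all of them do.
    return True, getattr(module, "__version__", "version not reported")


if __name__ == "__main__":
    for name in ("numpy", "pandas", "matplotlib"):
        ok, detail = check_import(name)
        print(f"{name}: {'OK' if ok else 'FAILED'} ({detail})")
```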
Uninstalling a package
pip uninstall numpy
pip will ask for confirmation before removing it.
When pip fails on a binary package
Some geographic libraries — rasterio, fiona, shapely, GDAL — wrap compiled C or C++ code. pip downloads these as wheels (pre-compiled binary packages) when they are available. If no wheel exists for your platform and Python version, pip tries to compile the code from source, which requires a C compiler and system headers you may not have.
On Windows especially, this fails with unhelpful error messages about missing .h files or cl.exe.
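Whether a wheel "exists for your platform and Python version" is encoded in the wheel's filename, which follows the pattern name-version[-build]-python_tag-abi_tag-platform_tag.whl. The sketch below splits a filename into those parts so you can read them. The function name and the example filename are illustrative assumptions, not something taken from a real download.

```python
def parse_wheel_filename(filename: str) -> dict[str, str]:
    """Split a wheel filename into the tags pip matches against your system."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename}")
    parts = filename[: -len(".whl")].split("-")
    # Layout: name-version[-build]-python_tag-abi_tag-platform_tag
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename layout: {filename}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


if __name__ == "__main__":
    # Hypothetical filename, used only to illustrate the layout.
    example = "rasterio-1.3.9-cp312-cp312-win_amd64.whl"
    for field, value in parse_wheel_filename(example).items():
        print(f"{field}: {value}")
```

In this example, cp312 means CPython 3.12 and win_amd64 means 64-bit Windows. If none of the wheels published for a release match your interpreter and platform, pip falls back to building from source.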
Solutions, in order of preference:
- Use conda instead of pip for these packages (see below). conda always distributes pre-compiled binaries.
- On Windows with pip: install from an unofficial pre-compiled wheel. Christoph Gohlke publishes wheels for many geographic libraries.
- Install build tools: On macOS, xcode-select --install installs the compilers pip needs. On Windows, install Microsoft C++ Build Tools.
conda: The Data Science Package Manager
If you installed Miniforge, use conda as your primary package manager. conda maintains its own registry of pre-compiled packages, sourced from conda-forge — a community channel with excellent geographic library coverage.
conda’s key advantage is that it resolves the compiled dependencies for you. conda install rasterio works reliably on Windows, macOS, and Linux because conda ships the C libraries alongside the Python bindings.
Installing with conda
Ensure your environment is active (conda activate your-env-name), then:
conda install numpy
conda pauses to “solve” the environment: it calculates whether the new package is compatible with everything already installed. It prints a summary of what it will add, change, or remove, then asks for confirmation. Type y.
Install from the conda-forge channel explicitly (Miniforge does this by default; full Anaconda users may need it):
conda install -c conda-forge geopandas
Check what is installed:
conda list
Verifying a conda install
Same technique as pip:
python -c "import geopandas; print(geopandas.__version__)"
The Golden Rule: Don’t Mix pip and conda in the Same Environment
It is tempting to use both in the same environment — perhaps conda install geopandas for the binary packages and pip install some-other-tool for everything else. This usually works initially and then breaks mysteriously later, because the two package managers do not know about each other’s installs and can overwrite shared files during upgrades.
The rule:
- If you are using a venv environment: use pip for everything.
- If you are using a conda environment: use conda for everything. Use pip only as a last resort, and only for packages that genuinely do not exist in conda-forge.
When you do have to use pip inside a conda environment, install all your conda packages first, then pip packages last. Running conda install after pip install can silently overwrite pip-installed packages.
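If you suspect an environment already contains a mix, one way to audit it is to read the optional INSTALLER file stored in each package's metadata. This is a rough sketch under an assumption: pip records "pip" in that file, and conda-built packages often record "conda", but the file is optional, so treat "unknown" entries as inconclusive. The function name packages_by_installer is hypothetical.

```python
from collections import defaultdict
from importlib.metadata import distributions


def packages_by_installer() -> dict[str, list[str]]:
    """Group installed packages by the tool recorded in their INSTALLER file."""
    groups: dict[str, list[str]] = defaultdict(list)
    for dist in distributions():
        try:
            name = dist.metadata.get("Name")
        except Exception:
            # Damaged metadata should not abort the whole audit.
            name = None
        if not name:
            continue
        # INSTALLER is an optional metadata file; a missing or empty file
        # is reported as "unknown".
        installer = (dist.read_text("INSTALLER") or "").strip() or "unknown"
        groups[installer].append(name)
    return {tool: sorted(names) for tool, names in groups.items()}


if __name__ == "__main__":
    for tool, names in packages_by_installer().items():
        print(f"{tool}: {len(names)} packages")
```

In a conda environment, a long list under "pip" suggests the environment has drifted from the rule above.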
Reproducibility: Recording Your Environment
You completed six months of analysis. You send your script to a reviewer. It crashes on their machine because they have a different library version. This is avoidable.
With pip, export your environment to a requirements file:
pip freeze > requirements.txt
This writes every installed package and its exact version number to requirements.txt. Commit this file to Git along with your code.
A colleague rebuilds your environment:
python -m venv .venv
source .venv/bin/activate # on Windows: .venv\Scripts\activate
pip install -r requirements.txt
With conda, export to a YAML file:
conda env export > environment.yml
Rebuild:
conda env create -f environment.yml
Make it a habit to regenerate requirements.txt or environment.yml every time you install a new package. The file should always reflect the current state of your environment.
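To check whether a requirements file still reflects the current environment, you can compare its pinned versions against what is installed. The sketch below assumes the simple name==version lines that pip freeze typically writes and skips anything else (editable installs, URL-based entries). The function names parse_pins and find_mismatches are hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version
from pathlib import Path


def parse_pins(requirements_text: str) -> dict[str, str]:
    """Extract name -> version from 'name==version' lines of pip freeze output."""
    pins = {}
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        # Skip blanks, options such as "-e ...", and lines without a pin.
        if not line or line.startswith("-") or "==" not in line:
            continue
        name, pinned = line.split("==", 1)
        pins[name.strip()] = pinned.strip()
    return pins


def find_mismatches(pins: dict[str, str]) -> dict[str, tuple[str, str]]:
    """Return packages whose installed version differs from the pinned one."""
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = "not installed"
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches


if __name__ == "__main__":
    path = Path("requirements.txt")
    if path.exists():
        for name, (pinned, installed) in find_mismatches(parse_pins(path.read_text())).items():
            print(f"{name}: pinned {pinned}, installed {installed}")
    else:
        print("requirements.txt not found in the current folder")
```

No output from the comparison means every pinned package matches; any line printed is a signal to regenerate the file or reinstall from it.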
Verify Your Work
- Activate your virtual environment.
- Install numpy, pandas, and matplotlib.
- Run pip list or conda list to confirm they are installed.
- Verify numpy loaded correctly: python -c "import numpy; print(numpy.__version__)".
- Export your environment: pip freeze > requirements.txt (or conda env export > environment.yml).
- Run ls to confirm the file was created.
Next Steps
With libraries installed and environments working, you are ready to start writing real interactive code in Jupyter Notebooks.