Libraries
Most of the power of a programming language is in its libraries. This is especially true for Python, which is an interpreted language and is therefore often much slower than compiled languages. However, many libraries are written in compiled languages such as C/C++ and therefore run much faster than equivalent native Python code.
A library is a collection of functions that other programs can use. Python’s standard library is included with Python and provides many functions we have worked with before (print, int, round, …). It also contains many additional modules, such as math:
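To make the speed difference concrete, here is a minimal sketch that adds up a million integers twice: once with an explicit Python loop and once with the built-in sum, which in the standard CPython interpreter is implemented in C. The helper name python_sum is made up for this illustration, and the timings will vary from machine to machine, so no specific numbers are claimed here.

```python
import timeit

def python_sum(values):
    """Add numbers one at a time in an explicit Python loop (hypothetical helper)."""
    total = 0
    for v in values:
        total += v
    return total

values = list(range(1_000_000))

# Both approaches must agree on the answer
assert python_sum(values) == sum(values)

# Time five repetitions of each approach
loop_time = timeit.timeit(lambda: python_sum(values), number=5)
builtin_time = timeit.timeit(lambda: sum(values), number=5)

print(f'Python loop:  {loop_time:.3f} s')
print(f'built-in sum: {builtin_time:.3f} s')
```

On most systems the built-in version should come out noticeably faster, although the exact ratio depends on the Python version and hardware.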
print('pi is', pi)   # NameError: pi is not defined until math is imported
import math
print('pi is', math.pi)
You can also import math’s items directly:
from math import pi, sin
print('pi is', pi)
sin(pi/6)
cos(pi)   # NameError: only pi and sin were imported above
help(math) # help for libraries works just like help for functions
from math import *   # import every public name from math; cos(pi) now works
You can also create an alias from the library:
import math as m
print(m.pi)
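As a quick check that the import styles above all reach the same library, the following sketch compares them directly. It uses only the math module from the standard library.

```python
import math
import math as m
from math import pi

# All three names refer to the same value
print(math.pi == m.pi == pi)

# The alias is just another name for the same module object
print(m is math)
```

Both lines should print True: an alias does not copy the library, it only gives it a shorter name.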
Question 10.1
What function from the math library can you use to calculate a square root without using sqrt?
Question 10.2
You want to select a random character from the string bases='ACTTGCTTGAC'. What standard library would you most expect to help? Which function would you select from that library? Are there alternatives?
Question 10.3
A colleague of yours types help(math) and gets an error: NameError: name 'math' is not defined. What has your colleague forgotten to do?
Question 10.4
Convert the angle 0.3 rad to degrees using the math library.
Virtual environments and packaging
To install a package into the current Python environment from inside a Jupyter notebook, run the following (you will probably need to restart the kernel before you can use the package):
%pip install packageName # e.g. try bson
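After installing, one way to confirm that Python can find the package is to ask the import system directly. This is a minimal sketch: the helper name is_installed is made up for this illustration, and it takes the name you would use in an import statement, which is sometimes different from the name you pass to pip.

```python
import importlib.util

def is_installed(module_name):
    """Return True if the import system can locate `module_name` (hypothetical helper)."""
    return importlib.util.find_spec(module_name) is not None

print(is_installed('math'))   # part of the standard library
print(is_installed('bson'))   # depends on whether you installed it
```

If the second line still reports False right after an install, restarting the kernel is the first thing to try.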
In Python you can create an isolated environment for each project, into which all of that project's dependencies are installed. This is useful when your projects have very different sets of dependencies. On the computer running your Jupyter notebooks, open the terminal and type:
(Important: on a cluster you must do this on the login node, not inside the JupyterLab terminal.)
module load python/3.9.6 # specific to HPC clusters
pip install virtualenv
virtualenv --no-download climate # create a new virtual environment in your current directory
source climate/bin/activate
which python && which pip
pip install --no-index netcdf4 ...
pip install --no-index ipykernel # install ipykernel (IPython kernel for Jupyter) into this environment
python -m ipykernel install --user --name=climate --display-name "My climate project" # add your environment to Jupyter
...
deactivate
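While the environment is still active (that is, before running deactivate), you can confirm from inside Python which interpreter is in use. The sketch below assumes a recent virtualenv (version 20 or later) or the standard venv module, both of which make sys.prefix differ from sys.base_prefix inside an environment; the helper name in_virtual_environment is made up for this illustration.

```python
import sys

def in_virtual_environment():
    """Return True if this interpreter runs inside a virtual environment (hypothetical helper)."""
    # Inside an environment, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix

print('interpreter:', sys.executable)
print('inside a virtual environment:', in_virtual_environment())
```

Running the same code in a notebook that uses the new kernel is a simple way to check that Jupyter picked up the environment rather than the system Python.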
Quit all your currently running Jupyter notebooks and the Jupyter dashboard. If running on syzygy.ca, log out of your session and then log back in.
Whether running locally or on syzygy.ca, open the notebook dashboard, and one of the options in New below Python 3 should be climate.
To delete the environment, in the terminal type:
jupyter kernelspec list # `climate` should be one of them
jupyter kernelspec uninstall climate # remove your environment from Jupyter
/bin/rm -rf climate
Quick overview of some of the libraries
- pandas is a library for working with 2D tables / spreadsheets
- numpy is a library for working with large, multi-dimensional arrays, along with a large collection of linear algebra functions
  - provides missing uniform collections (arrays) in Python, along with a large number of ways to quickly process these collections ⮕ great for speeding up calculations in Python
- matplotlib and plotly are two plotting packages for Python
- scikit-image is a collection of algorithms for image processing
- xarray is a library for working with labelled multi-dimensional arrays and datasets in Python
  - “pandas for multi-dimensional arrays”
  - great for large scientific datasets; writes into NetCDF files
- “