Intro to high-performance computing (HPC)
Monday, June 19th
2:00pm–5:00pm Pacific Time
This course is an introduction to High-Performance Computing on the Alliance clusters.
Instructor: Alex Razoumov (SFU)
Prerequisites: Working knowledge of the Linux Bash shell. We will provide guest accounts on one of our Linux systems.
Software: All attendees will need a remote secure shell (SSH) client installed on their computer in order to participate in the course exercises. On Windows we recommend the free Home Edition of MobaXterm. On Mac and Linux computers SSH is usually pre-installed (type ssh in a terminal to make sure it is there).
- Please download a ZIP file with all slides (single PDF combining all chapters) and sample codes.
- We’ll be using the same training cluster as in the morning – let’s try to log in now.
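To connect, run your SSH client from a terminal; the username and hostname below are placeholders for the guest account and cluster address handed out at the start of the session:

```bash
# replace userXX and the hostname with the values provided by the instructors
ssh userXX@training.cluster.address
```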
Part 1
Question 1: cluster filesystems
Let’s log in to the training cluster. Try to access /home, /scratch, and /project on the training cluster. Note that these only emulate the real production filesystems and have no speed benefits on the training cluster.
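For example, you can poke at the three filesystems with standard commands (per-user paths follow the usual Alliance layout and may differ slightly on the training cluster):

```bash
ls -ld /home /scratch /project   # do the directories exist and who owns them?
ls -l ~                          # your home; look for scratch/projects symlinks if present
df -h /home /scratch /project    # sizes here are not representative of production
```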
Question 2: edit a remote file
Edit a remote file in nano, vi, or emacs. Use cat or more to view its content in the terminal.
Question 3: gcc compiler
Load the default GNU compiler with the module command. Which version is it? Try to understand what the module does: run module show on it, echo $PATH, which gcc.
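One possible sequence (the exact default version depends on the cluster's module environment):

```bash
module load gcc      # load the default GNU compiler
gcc --version        # which version did you get?
module show gcc      # what the module actually does (PATH and other variables)
echo $PATH           # did the module prepend a new directory?
which gcc            # which gcc binary is now first in your PATH?
```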
Question 4: Intel compiler
Load the default Intel compiler. Which version is it? Does it work on the training cluster?
Question 5: third compiler?
Can you spot the third compiler family when you do module avail?
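Something along these lines should answer both questions; whether the Intel compiler actually runs on the training cluster is for you to find out:

```bash
module load intel    # load the default Intel compiler module
icc --version        # classic Intel C compiler; newer toolchains ship icx instead
module avail         # browse the full list and look for a third compiler family
```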
Question 6: scipy-stack
What other modules does scipy-stack/2022a load?
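A quick way to check:

```bash
module show scipy-stack/2022a   # prerequisites and modules it loads
module load scipy-stack/2022a
module list                     # everything now loaded, including dependencies
```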
Question 7: python3
How many versions of python3 do we have? What about python2?
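For example:

```bash
module avail python     # python modules available on this cluster
module spider python    # more detailed, cluster-wide view
# module spider <name> also answers the next question for any other package you use
```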
Question 8: research software
Think of a software package that you use. Check if it is installed on the cluster, and share your findings.
Question 9: file transfer
Transfer a file to/from the cluster (we did this already in the bash class) using either the command line or a GUI. Type “done” into the chat when done.
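From your own computer (not from the cluster), the command-line option looks like this; the file name is just an example:

```bash
scp myfile.txt userXX@training.cluster.address:   # local file -> your home on the cluster
scp userXX@training.cluster.address:myfile.txt .  # and back again
```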
Question 10: why HPC?
Can you explain (in 1-2 sentences) how HPC can help us solve problems? Why is a desktop/workstation not sufficient? Maybe you can give an example from your field?
Question 11: tmux
Try left+right or upper+lower split panes in tmux. Edit a file in one and run bash commands in the other. Try disconnecting temporarily and then reconnecting to the same session.
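A minimal tmux session with the default key bindings:

```bash
tmux new -s work      # start a named session
#   Ctrl-b %          split into left+right panes
#   Ctrl-b "          split into upper+lower panes
#   Ctrl-b o          cycle between panes
#   Ctrl-b d          detach (simulates a dropped connection)
tmux attach -t work   # reconnect to the same session
```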
Question 12: compiling
In introHPC/codes, compile the {pi,sharedPi,distributedPi}.c files. Try running a short serial code on the login node (not longer than a few seconds: modify the number of terms in the summation).
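One way to compile all three codes (the flags are a reasonable guess; add -lm if the compiler complains about math functions):

```bash
cd introHPC/codes
gcc -O2 pi.c -o pi                            # serial
gcc -O2 -fopenmp sharedPi.c -o sharedPi       # OpenMP (shared memory)
mpicc -O2 distributedPi.c -o distributedPi    # MPI (needs an MPI module/environment)
./pi                                          # quick serial test on the login node
```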
Question 13a: make
Write a makefile to replace these compilation commands with make {serial,openmp,mpi}.
Question 13b: make (cont.)
Add target all.
Add target clean. Try implementing clean for all executable files in the current directory, no matter what they are called.
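Here is one possible Makefile, written from the shell with printf so that the mandatory TAB characters in recipe lines are explicit (\t below); the target names follow the question, and the clean rule is just one way to do it:

```bash
{
printf '.PHONY: all serial openmp mpi clean\n'
printf 'all: serial openmp mpi\n'
printf 'serial:\n\tgcc -O2 pi.c -o pi\n'
printf 'openmp:\n\tgcc -O2 -fopenmp sharedPi.c -o sharedPi\n'
printf 'mpi:\n\tmpicc -O2 distributedPi.c -o distributedPi\n'
printf 'clean:\n\tfind . -maxdepth 1 -type f -executable -delete\n'
} > Makefile
make all      # or make serial / make openmp / make mpi
make clean    # careful: removes every executable regular file in this directory
```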
Question 14: Julia
Julia parallelism was not mentioned in the videos. Let’s quickly talk about it (slide 29).
Question 14b: parallelization
Suggest a computational problem to parallelize. Which of the parallel tools mentioned in the videos would you use, and why?
If you are not sure about the right tool, suggest a problem, and we can brainstorm the approach together.
Question 15: Python and R
If you use Python or R in your work, try running a Python or R script in the terminal.
If this script depends on packages, try installing them in your own directory with virtualenv. Note that only a few of you should do this on the training cluster at the same time.
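A minimal sketch (the module and package names are examples; adjust to whatever your script needs):

```bash
module load python/3.10         # or whichever python3 module is available
virtualenv ~/env_test           # python -m venv ~/env_test also works
source ~/env_test/bin/activate
pip install numpy               # packages your script depends on
python my_script.py             # your own script
deactivate
```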
Question 16: other
Any remaining questions? Type your question into the chat, ask via audio (unmute), or raise your hand in Zoom.
Part 2
Question 17: serial job
Submit a serial job that runs the hostname command. Try playing with the sq, squeue, and scancel commands.
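A minimal job script and the commands to submit and monitor it (the resource values are small placeholders):

```bash
cat > serial.sh << 'EOF'
#!/bin/bash
#SBATCH --time=00:01:00
#SBATCH --mem=100M
hostname
EOF
sbatch serial.sh       # submit
sq                     # Alliance shorthand for your jobs; squeue -u $USER also works
scancel <jobid>        # cancel a job if you need to
```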
Question 18: serial job (cont.)
Submit a serial job based on pi.c. Try sstat on a currently running job. Try seff and sacct on a completed job.
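After submitting a similar script that runs ./pi, the monitoring commands look like this (replace <jobid> with the job ID reported by sbatch):

```bash
sstat -j <jobid>       # resource usage of a running job
seff <jobid>           # efficiency summary once the job has completed
sacct -j <jobid>       # accounting record of a completed job
```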
Question 19: optimization timing
Using a serial job, time optimized (-O2) vs. unoptimized code. Type your findings into the chat.
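For example, build two executables and time both from inside your serial job:

```bash
gcc pi.c -o pi_noopt        # no optimization
gcc -O2 pi.c -o pi_opt      # optimized
# inside the job script:
time ./pi_noopt
time ./pi_opt
```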
Question 20: Python vs. C timing
Using a serial job, time pi.c vs. pi.py for the same number of terms (cannot be too large or too small – why?). Python pros – can you speed up pi.py?
Question 21: array job
Submit an array job for different values of n (number of terms) with pi.c. How can you run a different executable for each job inside the array?
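A sketch of an array job script; how pi takes its number of terms on the command line is an assumption, so adjust it to the actual code:

```bash
#!/bin/bash
#SBATCH --time=00:05:00
#SBATCH --mem=100M
#SBATCH --array=1-4
./pi $((SLURM_ARRAY_TASK_ID * 1000000))    # a different n per array task
# ./code_${SLURM_ARRAY_TASK_ID}            # one way to run a different executable per task
```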
Question 22: OpenMP job
Submit a shared-memory job based on sharedPi.c. Did you get any speedup? Type your answer into the chat.
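A possible shared-memory job script (core count and memory are placeholders):

```bash
#!/bin/bash
#SBATCH --time=00:05:00
#SBATCH --cpus-per-task=4
#SBATCH --mem-per-cpu=500M
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
./sharedPi
```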
Question 23: MPI job
Submit an MPI job based on distributedPi.c. Try scaling 1 → 2 → 4 → 8 cores. Did you get any speedup? Type your answer into the chat.
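A possible MPI job script; resubmit it with --ntasks set to 1, 2, 4, and 8 and compare the timings:

```bash
#!/bin/bash
#SBATCH --time=00:05:00
#SBATCH --ntasks=4
#SBATCH --mem-per-cpu=500M
srun ./distributedPi
```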
Question 24: serial interactive job
Test the serial code inside an interactive job. Please quit the job when done, as we have very few compute cores on the training cluster.
Note: we have seen the training cluster become unstable when too many interactive resources are in use. Strictly speaking, this should not happen; however, there is a small chance it might. We do have a backup.
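For example, a small interactive allocation for the serial code; the same idea with --cpus-per-task or more --ntasks covers the next two questions as well:

```bash
salloc --time=0:30:0 --ntasks=1 --mem-per-cpu=500M   # wait for a shell on a compute node
./pi            # run the code interactively
exit            # give the cores back as soon as you are done
```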
Question 25: shared-memory interactive job
Test the shared-memory code inside an interactive job. Please quit when done, as we have very few compute cores on the training cluster.
Question 26: MPI interactive job
Test the MPI code inside an interactive job. Please quit when done, as we have very few compute cores on the training cluster.
Question 27: debugging and optimization
Let’s talk about debugging, profiling and code optimization.
Question 28: permissions and file sharing
Let’s talk about file permissions and file sharing.
Share a file in your ~/projects directory (make it readable) with all other users in the def-sponsor00 group.
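One possible sequence; the exact path under ~/projects depends on how the training cluster is set up, so check with ls first:

```bash
ls ~/projects                         # find the group directory, e.g. def-sponsor00
cd ~/projects/def-sponsor00/$USER     # assumed layout; adjust to what you see
echo "hello from $USER" > shared.txt
chgrp def-sponsor00 shared.txt        # usually already the group in project space
chmod g+r shared.txt                  # make it readable by the group
ls -l shared.txt                      # verify the permissions
```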
Question 29: other
Are there questions on any of the topics that we covered today? You can type your question into the chat, ask via audio (unmute), or raise your hand in Zoom.
Videos: introduction
- Introduction (3 min)
- Cluster hardware overview (17 min)
- Basic tools on HPC clusters (18 min)
- File transfer (10 min)
- Programming languages and tools (16 min)
Updates:
- WestGrid ceased its operations on March 31, 2022. Since April 1st, your instructors in this course are based at Simon Fraser University.
- Some of the slides and links in the video have changed – please make sure to download the latest version of the slides (ZIP file).
- Compute Canada has been replaced by the Digital Research Alliance of Canada (the Alliance). All Compute Canada hardware and services are now provided to researchers by the Alliance and its regional partners. However, you will still see many references to Compute Canada in our documentation and support system.
- New systems were added (e.g. Narval in Calcul Québec), and some older systems were upgraded.
Videos: overview of parallel programming frameworks
Here we give you a brief overview of various parallel programming tools. Our goal here is not to learn how to use these tools, but rather to tell you at a high level what these tools do, so that you understand the difference between shared- and distributed-memory parallel programming models and know which tools you can use for each. Later, in the scheduler session, you will use this knowledge to submit parallel jobs to the queue.
Feel free to skip some of these videos if you are not interested in parallel programming.
- OpenMP (3 min)
- MPI (message passing interface) (9 min)
- Chapel parallel programming language (7 min)
- Python Dask (6 min)
- Make build automation tool (9 min)
- Other essential tools (5 min)
- Python and R on clusters (6 min)
Videos: Slurm job scheduler
- Slurm intro (8 min)
- Job billing with core equivalents (2 min)
- Submitting serial jobs (12 min)
- Submitting shared-memory jobs (9 min)
- Submitting MPI jobs (8 min)
- Slurm jobs and memory (8 min)
- Hybrid and GPU jobs (5 min)
- Interactive jobs (8 min)
- Getting information and other Slurm commands (6 min)
- Best computing / storage practices and summary (9 min)