Scripts, functions, and variables
Shell scripts
We now know a lot of UNIX commands! Wouldn’t it be great if we could save certain commands so that we could run them later without typing them out again? As it turns out, this is extremely easy to do. A file containing a list of commands is called a “shell script”. Shell scripts can be run whenever we want, and are a great way to automate our work.
$ cd /path/to/data-shell/molecules
$ nano process.sh
#!/bin/bash
# the first line is called a shebang; it can be omitted for generic (bash/csh/tcsh) commands
echo Looking into file octane.pdb
head -15 octane.pdb | tail -5 # what does it do?
$ bash process.sh # the script ran!
Alternatively, you can change file permissions:
$ chmod u+x process.sh
$ ./process.sh
Let’s pass an arbitrary file to it:
$ nano process.sh
#!/bin/bash
echo Looking into file $1 # $1 means the first argument to the script
head -15 $1 | tail -5
$ ./process.sh cubane.pdb
$ ./process.sh propane.pdb
- head -15 "$1" | tail -5 # placing $1 in double quotes lets us pass filenames with spaces
- head $2 $1 | tail $3 # what will this do?
- $# holds the number of command-line arguments
- $@ means all command-line arguments to the script (words in a string)
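To make the last few points concrete, here is a minimal sketch that can be run as its own script. It uses set -- to simulate command-line arguments, so no input files are needed; the two file names are made up for illustration.

```shell
#!/bin/bash
# `set --` replaces the positional parameters, simulating the call:
#   ./script.sh "my file.txt" notes.txt
set -- "my file.txt" notes.txt

echo "number of arguments: $#"
echo "first argument: $1"

# Quoted "$@" keeps each argument intact, spaces included
count=0
for arg in "$@"; do
    count=$((count + 1))
    echo "argument $count: $arg"
done

# Unquoted $@ is split on whitespace, so "my file.txt" becomes two words
unquoted=0
for arg in $@; do
    unquoted=$((unquoted + 1))
done
echo "words seen without quotes: $unquoted"
```

The quoted loop sees two arguments while the unquoted loop sees three words, which is why "$@" and "$1" are usually written with double quotes.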
Question `file permissions`
Let’s talk more about file permissions.
Question 8.2
In the molecules directory (download link mentioned here), create a shell script called scan.sh containing the following:
#!/bin/bash
head -n $2 $1
tail -n $3 $1
While in that directory, you type the following command (with a space between the two 1s):
./scan.sh '*.pdb' 1 1
What output would you expect to see?
- All of the lines between the first and the last lines of each file ending in .pdb in the current directory
- The first and the last line of each file ending in .pdb in the current directory
- The first and the last line of each file in the current directory
- An error because of the quotes around *.pdb
You can watch a video for this topic after the workshop.
If statements
Let’s write and run the following script:
$ nano check.sh
for f in "$@"
do
    if [ -e "$f" ] # make sure to have spaces around each bracket!
    then
        echo "$f exists"
    else
        echo "$f does not exist"
    fi
done
$ chmod u+x check.sh
$ ./check.sh a b c check.sh
- Full syntax is:
if [ condition1 ]
then
    command 1
    command 2
    command 3
elif [ condition2 ]
then
    command 4
    command 5
else
    default command
fi
Some examples of conditions (make sure to have spaces around each bracket!):
- [ $myvar == 'text' ] checks if variable is equal to 'text'
- [ $myvar == number ] checks if variable is equal to number
- [ -e fileOrDirName ] checks if fileOrDirName exists
- [ -d name ] checks if name is a directory
- [ -f name ] checks if name is a file
- [ -s name ] checks if file name has length greater than 0
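The file tests above can be tried out safely on throwaway files. The following is a minimal sketch, assuming mktemp is available; the function name describe and the file names are made up for illustration.

```shell
#!/bin/bash
# Create a scratch directory holding one empty and one non-empty file
tmp=$(mktemp -d)
touch "$tmp/empty.txt"
echo "some text" > "$tmp/full.txt"

# Classify a path using the -d, -s and -f tests.
# -d comes first because -s is usually true for directories as well.
describe() {
    if [ -d "$1" ]; then
        echo "directory"
    elif [ -s "$1" ]; then
        echo "non-empty file"
    elif [ -f "$1" ]; then
        echo "empty file"
    else
        echo "missing"
    fi
}

describe "$tmp"            # prints: directory
describe "$tmp/full.txt"   # prints: non-empty file
describe "$tmp/empty.txt"  # prints: empty file
describe "$tmp/nope.txt"   # prints: missing
rm -r "$tmp"
```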
Question 8.3
Write a script that complains when it does not receive arguments.
Variables
We already saw variables that were specific to scripts ($1, $@, …) and to loops ($file). Variables can be used outside of scripts:
$ myvar=3 # no spaces permitted around the equals sign!
$ echo myvar # will print the string 'myvar'
$ echo $myvar # will print the value of myvar
Sometimes you can see the notation:
$ export myvar=3
Using ‘export’ makes the variable available to all child processes started from this shell. Try defining the variable newvar without and then with ‘export’, and then running the script:
$ nano process.sh
#!/bin/bash
echo $newvar
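The same experiment can be sketched in one go without editing a script file. Here bash -c starts a child shell, standing in for running process.sh; the ${newvar:-unset} form prints the word unset when the variable is empty or not defined.

```shell
#!/bin/bash
unset newvar
newvar=3                                     # plain shell variable, not exported
without=$(bash -c 'echo "${newvar:-unset}"') # the child shell cannot see it

export newvar                                # now part of the environment
with=$(bash -c 'echo "${newvar:-unset}"')    # the child shell inherits it

echo "child without export: $without"        # prints: child without export: unset
echo "child with export:    $with"           # prints: child with export:    3
```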
You can assign a command’s output to a variable to use in another command (this is called command substitution). We’ll see this later when we play with the ‘find’ command.
$ printenv # print all declared variables
$ env # same
$ unset myvar # unset a variable
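As a small preview of command substitution, mentioned above, here is a minimal sketch. The variable name shout is made up for illustration.

```shell
#!/bin/bash
# $(...) runs the command inside and is replaced by its output
shout=$(echo hello | tr 'a-z' 'A-Z')
echo "tr produced: $shout"    # prints: tr produced: HELLO

# The substitution can also be embedded directly in another command
echo "The working directory is $(pwd)"
```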
Question `using a variable inside a string`
var="sun"
echo $varshine
echo ${var}shine
echo "$var"shine
Question `variable manipulation`
myvar="hello"
echo $myvar
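echo ${myvar:offset} # general form: substring from offset to the end
echo ${myvar:offset:length} # general form: length characters starting at offset
echo ${myvar:2:3} # 3 characters starting at offset 2 (offsets count from 0)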
echo ${myvar/l/L} # replace the first match of a pattern
echo ${myvar//l/L} # replace all matches of a pattern
Environment variables are those that affect the behaviour of the shell and user interface:
$ echo $HOME
$ echo $PATH
$ echo $PWD
$ echo $PS1
It is best to define custom environment variables inside your ~/.bashrc file. It is loaded every time you start a new shell.
Question 8.6
Play with variables and their values. Change the prompt, e.g. PS1="\u@\h \w> ".
You can watch a video for this topic after the workshop.
Functions
Functions are similar to scripts, but there are some differences. A bash script is an executable file sitting at a given path. A bash function is defined in your environment. Therefore, when running a script, you need to prepend its path to its name, whereas a function – once defined in your environment – can be called by its name without a need for a path. Both scripts and functions can take command-line arguments.
A convenient place to put all your function definitions is the ~/.bashrc file, which is run every time you start a new shell (local or remote).
Like in any programming language, in bash a function is a block of code that you can access by its name. The syntax is:
functionName() {
    command 1
    command 2
    ...
}
Inside a function you can access its arguments with the variables $1, $2, …, $#, $@, exactly the same as in scripts. Functions are very convenient because you can define them inside your ~/.bashrc file. Alternatively, you can place them into a file and then source them whenever needed:
$ source allMyFunctions.sh
Here is our first function:
greetings() {
echo hello
}
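To see sourcing and function arguments working together, here is a minimal sketch. It writes a one-function file into a scratch directory and sources it; the function name greet_user is made up for illustration, and the cat << 'EOF' construct simply writes the lines up to EOF into the file.

```shell
#!/bin/bash
tmp=$(mktemp -d)

# Write a file of function definitions (the quoted 'EOF' keeps $1 and $# literal)
cat > "$tmp/allMyFunctions.sh" << 'EOF'
greet_user() {
    echo "hello $1 (received $# argument(s))"
}
EOF

source "$tmp/allMyFunctions.sh"   # defines greet_user in the current shell
greet_user Alice                  # prints: hello Alice (received 1 argument(s))
rm -r "$tmp"                      # the function stays defined after the file is gone
```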
Let’s write a function ‘combine()’ that takes all the files we pass to it, copies them into a randomly-named directory and prints that directory to the screen:
combine() {
    if [ $# -eq 0 ]; then
        echo "No arguments specified. Usage: combine file1 [file2 ...]"
        return 1 # return a non-zero error code
    fi
    dir=$RANDOM$RANDOM
    mkdir "$dir"
    cp "$@" "$dir" # quoting keeps file names with spaces intact
    echo look in the directory $dir
}
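As a possible variation, not the version used in this lesson, the directory name could come from mktemp instead of $RANDOM, which avoids the small chance of picking a name that already exists. The sketch below assumes mktemp accepts a template ending in XXXXXX; the name combine_safe is made up for illustration.

```shell
#!/bin/bash
combine_safe() {
    if [ $# -eq 0 ]; then
        echo "No arguments specified. Usage: combine_safe file1 [file2 ...]" >&2
        return 1
    fi
    local dir
    # mktemp creates a new, uniquely named directory and prints its name
    dir=$(mktemp -d ./combined.XXXXXX) || return 1
    cp -- "$@" "$dir" || return 1
    echo "$dir"
}
```

Because the function prints only the directory name, the result can be captured with command substitution, for example newdir=$(combine_safe a.txt b.txt).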
Question `swap file names`
Write a function to swap two file names. Add a check that both files exist before renaming them.
Question `archive()`
Write a function archive() to replace directories with their gzipped archives.
$ ls -F
chapter1/ chapter2/ notes/
$ archive chapter* notes/
$ ls
chapter1.tar.gz chapter2.tar.gz notes.tar.gz
Question `countfiles()`
Write a function countfiles() to count files in all directories passed to it as arguments (you will need to loop through all arguments). At the beginning add the check:
if [ $# -eq 0 ]; then
    echo "No arguments given. Usage: countfiles dir1 dir2 ..."
    return 1
fi
You can watch a video for this topic after the workshop.
Scripts in other languages
As a side note, it is possible to incorporate scripts in other languages into your bash code. For example, consider this:
function test() {
    randomFile=${RANDOM}${RANDOM}.py
    # the Python lines and the closing EOF start at the beginning of the line
    cat << EOF > "$randomFile"
#!/usr/bin/python3
print("do something in Python")
EOF
    chmod u+x "$randomFile"
    ./"$randomFile"
    /bin/rm "$randomFile"
}
Here EOF is an arbitrary delimiter string, and << tells bash to keep reading input until it reaches a line containing only that delimiter. For example, try the following:
cat << the_end
This text will be
printed in the terminal.
the_end
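One detail worth knowing, added here as a side note rather than something this lesson relies on: with a plain delimiter, variables inside the text are expanded, while quoting the delimiter keeps the text literal. The sketch below wraps both cases in a function named show_heredocs, a name made up for illustration.

```shell
#!/bin/bash
show_heredocs() {
    local name=world
    # Unquoted delimiter: $name is expanded
    cat << EOF
unquoted delimiter: hello $name
EOF
    # Quoted delimiter: the text is passed through unchanged
    cat << 'EOF'
quoted delimiter: hello $name
EOF
}

show_heredocs
# prints:
# unquoted delimiter: hello world
# quoted delimiter: hello $name
```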