This is a temporary page; it will be merged with the tutorial once it is finished.

After posting the message to the list, I realized that I was missing all the important basics, such as cd, cp, rm, ps, cat, less, mc. It would be great if somebody could write something about them.
The purpose of this section is to teach you several important bash commands. You will start with a very simple script and extend its functionality step by step. We will give you hints on how to achieve each task, but we will not give you the solution; there are usually several ways to get to the desired output. The "man" program will be a valuable tool: call it, for example, as "man sort" to get information about the sort command. Another important tool, which helps you find commands whose names you do not know, is "apropos". Use "man apropos" to find out more about it. You will probably also need to consult the "Advanced Bash-Scripting Guide", available online at http://www.tldp.org/LDP/abs/html/, many times. You do not need to read the whole guide. Often you can get ideas by copying and pasting from the examples. This tutorial is not intended to be easy, but it will help you a lot in everyday work with Unix machines, both for the bio grid and for your career at DePaul.
Part 1: General Commands
- Create an executable bash script in a file called "hello" that prints "hello world" on z-cluster. Make sure that the file is executable so that you can execute it with "./hello".
Commands you will need: echo, chmod, and a sha-bang line. Part 1.2 of the ABS Guide might help you to get started; just look at the examples.
- Replace the output with "Hello my name is ..." where ... is the name of the computer (z-cluster-01 for example). For that you should request the name of the machine and store this name into a variable MACHINENAME.
Commands you will need: hostname, backticks (bash builtin)
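As a hint, command substitution captures the output of a command in a variable. The following is only a minimal sketch that uses uname instead of the command from the exercise; the variable names KERNEL_NAME and KERNEL_AGAIN are made up for illustration.

```shell
#!/bin/bash
# Backticks and $(...) both run a command and substitute its output.
KERNEL_NAME=`uname -s`        # classic backtick form
KERNEL_AGAIN=$(uname -s)      # equivalent form, easier to nest
echo "This kernel is called $KERNEL_NAME"
```

Both forms give the same result; the $(...) form is usually easier to read once commands are nested.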
- Add the current date to the output ("Hello my name is ..., the current time is ...")
Commands you will need: date
- Create the following directory structure
./a
./b
./b/a
./b/b
Commands you will need: mkdir -p
- Create the following empty files
./a/A.cpp
./b/a/A.txt
./b/a/B.cpp
./b/a/C D.cpp (notice that this file name contains a space)
Commands you will need: touch, find out how to escape white spaces
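There are at least two common ways to protect a space from the shell. The sketch below shows both on made-up file names in a scratch directory (SCRATCH is a hypothetical variable name), so it does not create the files from the exercise.

```shell
#!/bin/bash
# Work in a scratch directory so nothing in your home directory is touched.
SCRATCH=$(mktemp -d)
# Without quoting, the shell would split each name into two arguments.
touch "$SCRATCH/my notes.txt"      # double quotes keep the space
touch "$SCRATCH"/more\ notes.txt   # a backslash escapes a single space
ls -l "$SCRATCH"
```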
- Add a function to your bash script that searches for files in all subdirectories. Make sure that find prints only file names and not directory names.
Commands you will need: find
- Modify the find command to return only .cpp files
Commands you will need: find
- Modify your script to print information about the .cpp files given by "ls -l". There are three ways you can do that and you should know about all of them.
- pipe the output of find into "xargs ls -l", you will see that you have problems with the white spaces
- use the -exec parameter of find to execute "ls -l" with the filename.
- pipe the output of find into a while loop that uses "read line" to read one input line at a time and stores the result into the variable $line, then executes "ls -l $line".
Commands you will need: find, xargs, while (bash builtin), read (bash builtin)
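The three approaches can be compared side by side on a small scratch tree. This is a hedged sketch, not the solution: it works on made-up .log files, the variable names (DEMO, VIA_XARGS, VIA_EXEC, VIA_LOOP) are hypothetical, and the -print0 / -0 pair is an addition not mentioned above that is supported by the GNU and BSD versions of find and xargs.

```shell
#!/bin/bash
# Scratch tree with one ordinary name and one name containing a space.
DEMO=$(mktemp -d)
touch "$DEMO/plain.log" "$DEMO/with space.log"

# 1. xargs splits its input on whitespace by default; -print0 together
#    with -0 separates the names by NUL bytes instead.
VIA_XARGS=$(find "$DEMO" -type f -name '*.log' -print0 | xargs -0 ls -l | wc -l)

# 2. -exec hands each name to the command as a single argument.
VIA_EXEC=$(find "$DEMO" -type f -name '*.log' -exec ls -l {} \; | wc -l)

# 3. read consumes one line at a time; quoting "$line" keeps the space intact.
VIA_LOOP=$(find "$DEMO" -type f -name '*.log' | while IFS= read -r line; do
    ls -l "$line"
done | wc -l)

echo "xargs: $VIA_XARGS, exec: $VIA_EXEC, loop: $VIA_LOOP"
```

All three counts should agree. If you drop the -print0 / -0 pair or the quotes around "$line", the file with the space is the one that causes trouble.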
- Modify your script to print only the filename (without the directory name) of the files.
Commands you will need: basename
- Delete the printing of files from your script and modify it to print only the number of the node name (01 for z-cluster-01) and the previous "Hello my name is ..., the current time is ..." line.
You will need bash's parameter substitution and variables.
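Parameter substitution can cut pieces off the front or the end of a variable's value. A small sketch on a made-up string (LABEL is not a real host name, and all variable names are chosen for illustration):

```shell
#!/bin/bash
# LABEL is a made-up value, not a real host name.
LABEL="rack-b-07"
SUFFIX=${LABEL##*-}    # delete the longest prefix matching "*-"  -> 07
PREFIX=${LABEL%-*}     # delete the shortest suffix matching "-*" -> rack-b
echo "prefix=$PREFIX suffix=$SUFFIX"
```

The doubled form (## or %%) removes the longest match, the single form (# or %) the shortest.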
- Modify your script to sleep for 2 seconds after printing the "Hello..." line if executed on z-cluster-01 to 10, and to sleep for 4 seconds after printing the "Hello..." line if executed on z-cluster-11 to 20. Use the "time" command to verify that your program takes approximately 2 or 4 seconds to run.
Commands you will need: if or case statements, sleep, time
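A case statement matches a string against shell patterns, which is often shorter than a chain of if tests. The sketch below sorts two-digit numbers into named buckets; bucket_for is a hypothetical helper and the bucket names are invented, so you still have to connect the idea to sleep yourself.

```shell
#!/bin/bash
# bucket_for is a hypothetical helper that classifies a two-digit number.
bucket_for() {
    case "$1" in
        0[1-9]|10) echo "low" ;;      # 01 to 10
        1[1-9]|20) echo "high" ;;     # 11 to 20
        *)         echo "unknown" ;;  # anything else
    esac
}
echo "07 is in bucket $(bucket_for 07)"
echo "15 is in bucket $(bucket_for 15)"
```

Patterns separated by | are alternatives, and [1-9] matches a single character in that range.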
- Modify your script to accept a parameter so that you can execute it with "./hello foobar" and it will print "foobar: Hello my name is ..., the current time is ...".
Part 2: Distributing work to nodes of z-cluster
By now you should have a program that you can execute as "./hello foobar" that prints "foobar: Hello my name is {name of the node}, the current time is {current time}" to the screen and waits for 2 or 4 seconds depending on the node where you execute it. Now we simulate some work by executing hello with the parameters 1 to 1000. This consumes a lot of time on a single node, so we want to use several computers in parallel.
- Figure out how to execute a program on a remote computer over ssh without manually opening a shell (we want to automate the execution of commands on other nodes).
- Write a for loop that prints the numbers 1 to 1000
Commands you will need: seq, for (bash builtin)
- Combine the two previous steps to execute "./hello 1" to "./hello 10" on nodes 1 to 10 - in parallel!
You will use the & to run a job in the background
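Background jobs can be tried out locally before any ssh is involved. In this sketch slow_task is a made-up stand-in for a long-running command and OUT is a hypothetical scratch file; the three tasks run at the same time, so the whole script should take about one second rather than three.

```shell
#!/bin/bash
# slow_task is a stand-in for any long-running command.
slow_task() {
    sleep 1
    echo "task $1 done"
}
OUT=$(mktemp)
# The trailing & starts each task in the background; the loop does not block.
for i in 1 2 3; do
    slow_task "$i" >> "$OUT" &
done
wait    # block until every background job has finished
cat "$OUT"
```

The order of the lines in the output file is not guaranteed, because the jobs finish independently of each other.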
- Now we want to do task allocation by a second script "runparallel" as follows
for job 1 to 1000
wait until a node is available
run "hello $job" on this node
You know how to do the first and third lines by now. How can we do the second one? We can use a producer/consumer pattern. Write a wrapper script "wrapper" that accepts as many parameters as necessary (look for $@ to reference these parameters) and performs the following simple steps:
- create a file whose name corresponds to the hostname
- execute the program given by the parameters
- delete the file
With this script we can monitor whether the execution of a program on another node has terminated (this is the case if there is no file that corresponds to the name of that host). Therefore, figure out how to test whether a file exists (hint: man bash). Now we can replace the wait loop with something like
- empty node is -1
- while empty node is -1
- for all nodes allowed
- if current node is not busy (no lock file exists), set empty node to current node and break;
- if empty node is -1 (we haven't found anything), then sleep for 1 second
The list of allowed nodes should contain 3 nodes of z-cluster. You should start with no more than 10 jobs, because an error might jam up z-cluster by starting thousands of jobs on a single node.
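The polling idea can be tried out locally, without any remote nodes. The following is only a sketch under stated assumptions: the lock files live in a scratch directory, the node names are placeholders rather than real hosts, and find_free_node, LOCKDIR, NODES and FREE are hypothetical names. It does not start any jobs and does not replace the wrapper script.

```shell
#!/bin/bash
# Hypothetical layout: one lock file per node name in a scratch directory.
LOCKDIR=$(mktemp -d)
NODES="node-a node-b node-c"     # placeholder names, not real hosts

# Print the first node without a lock file; fail if all nodes are busy.
find_free_node() {
    for node in $NODES; do
        if [ ! -e "$LOCKDIR/$node" ]; then
            echo "$node"
            return 0
        fi
    done
    return 1
}

touch "$LOCKDIR/node-a"          # pretend node-a is busy

# Poll once per second until some node has no lock file.
until FREE=$(find_free_node); do
    sleep 1
done
echo "first free node: $FREE"
```

Using the exit status of the function as the loop condition avoids the separate "-1" marker from the outline above; both variants express the same logic.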
- The parallel jobs have produced output ("1: Hello..."). Now pipe the output of "runparallel" into tee to see the progress and to store the output to a file.
- Use "sort" on that output file to see that all jobs have been completed, and use "wc" as well to verify that.
- Start "runparallel" in a screen environment (execute "screen", then "runparallel"), terminate the ssh connection and reconnect. Now use "screen -R -D" to reconnect to the session. This is how you can run jobs in the background.
- If you want to be notified when your calculation has finished, run "runparallel ; mail XXX", where XXX is your email address and some other parameters (see man page). You will receive a confirmation when the job is done.
Part 3: Other stuff
- Read the man page of nice, to make sure that you know how to run jobs with low priority
- Try sorting the output of several different echo statements like: echo B; echo A; echo C by putting them into parentheses and piping everything to sort.
- Maybe we want to add some more material about redirecting stderr to stdout, piping something into a program, and checking error values (with &&, ||, and $?)
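As a starting point for the last item, here is a minimal sketch of these mechanisms. The path in MISSING is made up and assumed not to exist, and all variable names are chosen for illustration only.

```shell
#!/bin/bash
# ls on a missing path writes its complaint to stderr and exits non-zero.
MISSING=/no/such/path/for/this/demo

# 2>&1 sends stderr to the same place as stdout, so the pipe sees the message.
LINES_SEEN=$(ls "$MISSING" 2>&1 | wc -l)

# || runs the right-hand side only if the left-hand side failed;
# $? holds the exit status of the command that just ran.
STATUS=0
ls "$MISSING" > /dev/null 2>&1 || STATUS=$?

# && runs the right-hand side only if the left-hand side succeeded.
true  && RESULT_AND="ran after success"
false || RESULT_OR="ran after failure"

echo "status=$STATUS lines=$LINES_SEEN"
echo "$RESULT_AND / $RESULT_OR"
```

The order of the redirections matters: "> /dev/null 2>&1" discards both streams, whereas "2>&1 > /dev/null" would still print the error message to the terminal.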
If you have completed the previous steps, you have learned a lot about bash scripting. Hopefully it was a little bit of fun and you have learned to value the possibilities that bash provides. If you want to learn more, check out the Advanced Bash-Scripting Guide.
--
DominicBattre - 26 Oct 2005