3. Advanced Shell Features

Last lesson we focused on very basic principles and features of the UNIX shell.

This lesson we want to go one step further and make ourselves more productive.

There are lots more features, little tricks, and efficiency hacks we can use on the shell.

Most of the features presented here are still POSIX-compliant, i.e. they work in every UNIX shell. But some may only work in certain more advanced shells such as Bash or ZSH.

I Know What You Typed Last Night: Shell History

Most modern shells save your commands in a history file. For Bash this is ~/.bash_history, for ZSH it is ~/.zsh_history.

Every line in that file corresponds to one command you ran.
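
You can also peek into that file directly (a sketch; Bash typically writes the file when the shell exits):

$ tail -n 2 ~/.bash_history
echo $?
history 5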

To look at the last 5 entries from your history, run

In [25]:
history 5
 2066  head -n-1 examplefile
 2067  echo $?
 2068  mkdir --help
 2069  echo $?
 2070  history 5

The main purpose of the history is convenience. In the shell, use the Up and Down arrow keys to recall previously executed commands, edit them as needed, and hit Enter to run them again.

History shortcut commands

Instead of the arrow keys you can also use shortcut commands:

  • !N – run the Nth command from the history again
  • !-N – run the Nth-last command again
  • !! – run the last command again (same as !-1)

For example:

~$ echo "Hello World"
Hello World
~$ !!
echo "Hello World"
Hello World
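
And a quick sketch of !-N, re-running the second-to-last command (the commands here are made up):

~$ echo "first"
first
~$ echo "second"
second
~$ !-2
echo "first"
first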

How is that useful?

Imagine you mistyped a command and want to correct it. Instead of typing everything again, press the Up arrow, correct the command and run it again.

Similarly, maybe you forgot sudo in front of a command that requires root privileges:

~$ mkdir -v /etc/somefolder
mkdir: cannot create directory ‘/etc/somefolder’: Permission denied
~$ sudo !!
sudo mkdir -v /etc/somefolder
[sudo] password for cmdline:
mkdir: created directory ‘/etc/somefolder’

Incidents

Be aware, though, that on our seminar system, you won't be able to run commands with sudo. Doing so will result in a message like this:

$ sudo somecommand
cmdline is not in the sudoers file.  This incident will be reported.

If you wonder where this report goes:

[Comic: “Incident”, XKCD, CC BY-NC 2.5]

So, don't be naughty, be nice. ;-)

Fun with Filenames: Globbing

As we saw in the last lesson, many commands can take more than one file as arguments.

But imagine we have a directory with 100 files and want to copy all of them to a new folder. Do you want to write all their names manually into the command line? Surely not.

This is where globbing (wildcarding) comes into play.

Globbing is a very powerful, yet sometimes dangerous, feature of every modern UNIX shell.

Four basic operators exist:

?              one arbitrary character
*              0 or more arbitrary characters
[abc], [a-z]   one character from the given set
{abc,def}      one string from the given set

These work in basically every shell. Some shells support more advanced globbing patterns. For example, Bash also supports [^abc] for negated character sets, and ZSH supports recursive globbing via **.
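
A quick sketch of those two (file names are made up; in Bash, ** additionally requires shopt -s globstar):

$ ls file[^3]
file1  file2
$ ls **/*.txt
notes/todo.txt  readme.txt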

Example 1: Move all files from one directory to another

In [26]:
ls -l exampledirectory1
total 8
-rw-r--r-- 1 cmdline cmdline 6 Oct 26 18:04 file1
-rw-r--r-- 1 cmdline cmdline 6 Oct 26 18:04 file2
In [27]:
ls -l exampledirectory2
total 0
In [28]:
mv ./exampledirectory1/* exampledirectory2

In [29]:
ls -l exampledirectory2
total 8
-rw-r--r-- 1 cmdline cmdline 6 Oct 26 18:04 file1
-rw-r--r-- 1 cmdline cmdline 6 Oct 26 18:04 file2

Example 2: Copy file1 and file2

In [30]:
cp ./exampledirectory2/{file1,file2} exampledirectory1

In [31]:
ls -l exampledirectory1
total 8
-rw-r--r-- 1 cmdline cmdline 6 Oct 26 18:28 file1
-rw-r--r-- 1 cmdline cmdline 6 Oct 26 18:28 file2

Example 3: List all files from both folders that end with the number 2 or 3

In [32]:
ls -l ./exampledirectory?/*[23]
-rw-r--r-- 1 cmdline cmdline 6 Oct 26 18:28 ./exampledirectory1/file2
-rw-r--r-- 1 cmdline cmdline 6 Oct 26 18:04 ./exampledirectory2/file2

Swimming with the Stream: Redirection and Pipes

There are three standard streams which are used for all input and output data:

  1. Standard Input (STDIN)
  2. Standard Output (STDOUT)
  3. Standard Error Output (STDERR)

If a program asks you interactively to enter some text, it goes through STDIN.

If a program prints some text to the console, it usually goes through STDOUT.

If that output is an error, the program may also send it through STDERR.

Each of the three streams can be accessed independently by its handle number (0, 1 and 2).

For redirecting these three streams to files (or, in the case of STDIN, from a file), we have the three operators

  • 0<
  • 1>
  • 2>

Because people are lazy, 1> can be abbreviated with > and 0< with <.

To make this clearer, let's write Hello World to a file:

In [33]:
echo "Hello World" > examplefile2

In [34]:
cat examplefile2
Hello World

We redirected STDOUT of the echo command (which would have gone to the screen otherwise) to a file.

What happens when we redirect STDERR instead?

In [35]:
echo "Hello World" 2> examplefile2
Hello World
In [36]:
cat examplefile2

Now Hello World goes to the screen (because we didn't touch STDOUT) and the file is empty because there was no error output.

Now it also makes sense why we have two standard output streams. This way we can handle “normal” command output and any error output separately.
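
A common pattern, for example, is to capture both streams in separate files (somecommand is just a placeholder):

$ somecommand > output.log 2> errors.log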

Appending lines in files

You probably noticed that > and 2> truncate the file if it exists already. If you want to append lines, use >> instead:

In [37]:
echo "Hello World" > examplefile2

In [38]:
echo "Hello Line2" >> examplefile2

In [39]:
cat examplefile2
Hello World
Hello Line2

Where nothing ever comes back

In the context of output redirection, let me introduce a very helpful device file: /dev/null.

/dev/null is basically the “black hole” of your Linux system. Everything that goes past its event horizon never comes back.

With this unique property, /dev/null is really helpful when you are only interested in, e.g., STDOUT and want to disregard any (potential) error messages:

In [40]:
cat exampledirectory1/file1 2> /dev/null
File1

If the file didn't exist, the command would fail silently instead of printing an error message.

In the same way you can also redirect STDOUT to /dev/null or both:

In [41]:
cat exampledirectory1/file2 > /dev/null 2> /dev/null

Redirecting streams to each other

The last example of writing both STDOUT and STDERR to a file is very verbose. We can make it shorter:

In [42]:
cat exampledirectory1/file2 > /dev/null 2>&1

What did we do here?

We told STDOUT to go to hell /dev/null.

After that we told STDERR (2>) to go to STDOUT (&1).

Note that the order matters: 2>&1 sends STDERR to wherever STDOUT points at that moment, so it has to come after the STDOUT redirection.
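
To see the order in action, swap the two redirections (a sketch; the exact error text may differ on your system):

$ cat nonexistentfile 2>&1 > /dev/null
cat: nonexistentfile: No such file or directory

Here 2>&1 pointed STDERR at the terminal (where STDOUT still went at that moment), so the error message is not silenced.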

The ugly duckling: STDIN redirection

There is nothing special about redirecting STDIN compared to STDOUT or STDERR. As shown before, we use < to feed a file to STDIN of the given command.

The only downside is: we hardly ever need it.

Almost all commands that accept input from STDIN also let you specify files on the command line.

The following usage of cat is possible, but also pointless. (WHY?)

In [43]:
cat < exampledirectory1/file2
File2

Easier would be

In [44]:
cat exampledirectory1/file2
File2

Input from /dev/null is also possible, but only rarely useful, e.g. when you need “nothing” as input for some reason:

In [45]:
cat < /dev/null

Where Things Get Interesting: Piping

Redirection to files is nifty, but not very exciting once you know about piping.

With piping we can redirect STDOUT of one command directly to STDIN of another command. You can pipe as many commands together as you like.

The syntax is:

command1 | command2 | command3 | ...

Want to get the second word of "Hello World"? Easy. (HOW?)

In [46]:
echo "Hello World" | cut -d" " -f2
World

Using echo to pipe arbitrary strings into commands is a pretty common shell pattern. But you can do way more advanced things.

Using only commands already presented in this course, we can cut out the third word of the second line of a file. (HOW?)

In [47]:
cat examplefile
This Is Line1
This Is Line2
This Is Line3
This Is Line4

Cut out the wanted part:

In [48]:
head -n2 examplefile | tail -n1 | cut -d" " -f3
Line2

Alternatively:

cut -d$'\n' -f2 examplefile | cut -d" " -f3

(works only with GNU Coreutils 8.23 and above)

Piping STDERR

Pipes only operate on STDOUT. If we want to pipe STDERR, we need to redirect it to STDOUT first.

As a stupid example, let's count the characters of the produced error message:

In [70]:
cat nonexistentfile 2>&1 | wc -m
48

Background Jobs

By default, any command that you run will block the input prompt until it has finished.

By sending a program to the background, you can continue to use your shell while the process is doing its work.

To demonstrate this, we need a process that runs forever (or at least for quite some time). We'll use:

bash -c "while true; do sleep 1; done"

Notice how your shell seems to hang because we just started an endless loop.

To interrupt this process, hit Ctrl+Z (^Z). This sends a terminal stop signal (SIGTSTP) to the process, which causes it to suspend and gives control back to the shell.

If you now run the jobs command, you will see a list of all suspended processes in the current shell (i.e. “Jobs”):

$ jobs
[1]  + suspended  bash -c "while true; do sleep 1; done"

To resume the suspended process, we have the commands fg and bg to continue execution in either foreground or background mode.

Resuming the process in the foreground is as if you had never suspended it. More interesting is the bg command:

$ bg %1
[1]  + continued  bash -c "while true; do sleep 1; done"

Nothing seems to change, but in fact, the process is now running again. This time, however, in the background.

We can verify this by running jobs:

[1]  + running    bash -c "while true; do sleep 1; done"

You can bring the process back to the foreground at any time by typing fg:

$ fg %1
[1]  + running    bash -c "while true; do sleep 1; done"

Terminating a hanging process

Since our process runs forever, we want to terminate it manually at some point. To do this, hit Ctrl+C (^C). This sends an interrupt signal (SIGINT), which kindly asks the process to exit.

If the process is running in the background, you can also use kill instead of Ctrl+C:

$ kill -INT %1
[1]  + interrupt  bash -c "while true; do sleep 1; done"

Forcefully closing processes

If you use kill without the -INT flag, it sends a termination signal (SIGTERM) instead of an interrupt signal. SIGTERM is a direct request to the program to terminate itself. In most cases, this results in the same behavior as SIGINT.

If the program is really frozen and doesn't react anymore, you can also ask the kernel to terminate the process for you by sending a KILL signal using kill -KILL:

$ kill -KILL %1
[1]  + killed     bash -c "while true; do sleep 1; done"

This will end the process instantly, no matter what.

Signal numbers

Every signal also has a number that you can use instead. For instance, the number for SIGKILL is 9, therefore you can also run:

$ kill -9 %1
[1]  + killed     bash -c "while true; do sleep 1; done"

To get a list of all available signals and their numbers use

In [72]:
kill -l
 1) SIGHUP	 2) SIGINT	 3) SIGQUIT	 4) SIGILL	 5) SIGTRAP
 6) SIGABRT	 7) SIGBUS	 8) SIGFPE	 9) SIGKILL	10) SIGUSR1
11) SIGSEGV	12) SIGUSR2	13) SIGPIPE	14) SIGALRM	15) SIGTERM
16) SIGSTKFLT	17) SIGCHLD	18) SIGCONT	19) SIGSTOP	20) SIGTSTP
21) SIGTTIN	22) SIGTTOU	23) SIGURG	24) SIGXCPU	25) SIGXFSZ
26) SIGVTALRM	27) SIGPROF	28) SIGWINCH	29) SIGIO	30) SIGPWR
31) SIGSYS	34) SIGRTMIN	35) SIGRTMIN+1	36) SIGRTMIN+2	37) SIGRTMIN+3
38) SIGRTMIN+4	39) SIGRTMIN+5	40) SIGRTMIN+6	41) SIGRTMIN+7	42) SIGRTMIN+8
43) SIGRTMIN+9	44) SIGRTMIN+10	45) SIGRTMIN+11	46) SIGRTMIN+12	47) SIGRTMIN+13
48) SIGRTMIN+14	49) SIGRTMIN+15	50) SIGRTMAX-14	51) SIGRTMAX-13	52) SIGRTMAX-12
53) SIGRTMAX-11	54) SIGRTMAX-10	55) SIGRTMAX-9	56) SIGRTMAX-8	57) SIGRTMAX-7
58) SIGRTMAX-6	59) SIGRTMAX-5	60) SIGRTMAX-4	61) SIGRTMAX-3	62) SIGRTMAX-2
63) SIGRTMAX-1	64) SIGRTMAX	

Using kill with process IDs

kill can, of course, be used not only with job numbers, but also directly with process IDs.

Every process on a Linux system has a unique process ID (PID), which you can find out by running ps a or by using top (or htop).

When using PIDs, omit the leading % symbol in front of the number.
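
A short sketch (the PID is made up):

$ ps a | grep sleep
13004 pts/0    S      0:00 bash -c while true; do sleep 1; done
$ kill -TERM 13004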

Starting processes directly in the background

If you want to start a process in the background, but don't want to go through Ctrl+Z and bg %1 every time, you can add an ampersand (&) to the end of the command line:

$ bash -c "while true; do sleep 1; done" &
[1] 13004

This will start the process in the background and print both the job and the process number.

Configuration at Runtime: Environment Variables

Environment variables are probably one of the most important features of any shell.

Environment variables can be used to store arbitrary strings or numbers and to configure certain aspects of your working environment (hence the name environment variables).

Your shell usually comes with a lot of pre-defined environment variables. For a list of all currently defined variables, run env.

Defining custom variables is easy:

In [50]:
VARNAME="some string"

Note that there must not be any spaces around the assignment operator =.

Since shell variables are also called parameters, the replacement of a variable with its value is called parameter expansion. Expanding (i.e. using) variables is just as simple as defining them:

In [51]:
echo $VARNAME
some string

Alternative notation:

In [52]:
echo ${VARNAME}
some string

Parameter expansion in strings

Shell variables can also be used within double-quoted strings:

In [53]:
echo "This is ${VARNAME}"
This is some string

If you want to prevent expansion within strings, either escape the variable symbol or use single-quoted strings:

In [54]:
echo "This is \${VARNAME}"
This is ${VARNAME}
In [55]:
echo 'This is ${VARNAME}'
This is ${VARNAME}

Pitfalls and when quotes are really needed

Parameter expansion in the shell works a little differently than variable substitution in other programming languages.

The contents of a variable can easily take on unexpected syntactic meaning.

The following code snippet will most likely do something other than what you intended:

In [56]:
touch "My Example File.txt"

In [57]:
mkdir somedirectory
FILENAME="My Example File.txt"
cp $FILENAME somedirectory/$FILENAME
cp: target ‘File.txt’ is not a directory

(WHY?)

The cp expression gets expanded to:

cp My Example File.txt somedirectory/My Example File.txt

However, what you probably wanted was:

cp "My Example File.txt" "somedirectory/My Example File.txt"

To achieve that, you have to put the variables in quotes:

In [59]:
cp "$FILENAME" "somedirectory/$FILENAME"

In [60]:
rm "My Example File.txt"
rm -R somedirectory

Variable scopes

Shell variables have three different scopes in which they're visible:

  • local
  • global
  • exported

By default, any defined variable is global. That means it is visible to every sub-routine in the current shell instance.

However, separate sub-processes will not inherit global variables:

In [61]:
unset FOO

In [62]:
FOO="bar"

In [63]:
bash -c 'echo "FOO says ${FOO}"'
FOO says 

To make the value of $FOO visible to sub-processes, we have to export the variable:

In [64]:
export FOO

In [65]:
bash -c 'echo "FOO says ${FOO}"'
FOO says bar

Exported variables are what we actually call environment variables.

Variables can also be declared locally, which is mainly useful for scripting when you use sub-routines.

Local variables are declared with the keyword local:

In [66]:
some_sub_routine() {
    local local_variable="local"
    GLOBAL_VARIABLE="global"
    echo "From sub-routine: ${GLOBAL_VARIABLE}, ${local_variable}"
}

In [67]:
some_sub_routine
From sub-routine: global, local
In [69]:
echo "From main program: ${GLOBAL_VARIABLE}, ${local_variable}"
From main program: global, 

Pre-defined environment variables

There are quite a number of pre-defined environment variables. The three most important ones are:

  • $HOME – contains the absolute path to your home directory
  • $LANG – your current system locale
  • $PATH – contains a colon-separated (:) list of directories to search for commands

$PATH is the reason why you are able to type vim instead of the full path /usr/bin/vim. Because /usr/bin is in your $PATH, the shell searches that directory for an executable called vim.
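
You can ask the shell where it found a command with the POSIX built-in command -v (the path may differ on your system):

$ command -v vim
/usr/bin/vim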

The contents of a typical $PATH variable may look like:

In [97]:
echo $PATH
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
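
The other two can be inspected the same way (your values will differ):

$ echo $HOME
/home/cmdline
$ echo $LANG
en_US.UTF-8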

Command substitution

Similar to parameter expansion, commands can also be evaluated and substituted.

Any command that appears between backticks `...` or within $(...) is evaluated and the result is passed back as a string to the command line:

In [99]:
echo "My current username is $(id -un)"
My current username is cmdline
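
The backtick notation works the same way, although it cannot be nested as easily:

$ echo "My current username is `id -un`"
My current username is cmdline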

Basic Scripting

If we can use variables, can we also use other elements of a regular programming (scripting) language?

Of course we can. In fact, any sequence of shell commands can also be written as a script file and then executed as a program.

In order to create a working shell script, the script file needs to

  • be executable (+x permissions)
  • start with a so-called hashbang

Hashbang

A hashbang (also called shebang) is a “magic” character sequence at the beginning of an executable text file that specifies the interpreter that is used to execute the following script.

A hashbang consists of the sequence #! followed by a path.

For example:

#!/bin/bash

or

#!/usr/bin/python

“Dynamic” hashbangs

By using #!/bin/bash you do nothing wrong in most cases, but you might not want to hard-code the location of Bash (especially when you're coding for low-end machines such as embedded devices).

In that case you can use /usr/bin/env to find the path for you:

#!/usr/bin/env bash

/usr/bin/env will look through the directories in your $PATH variable for a suitable executable and try to execute Bash from there.

Even though it's not needed in most cases, it's best practice to use /usr/bin/env.
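
Putting both requirements together, a minimal script could look like this (hello.sh is a made-up name; chmod +x gives it execute permissions):

$ cat hello.sh
#!/usr/bin/env bash
echo "Hello from a script"
$ chmod +x hello.sh
$ ./hello.sh
Hello from a script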

Control structures

The hashbang can be followed by any code that is executable by the given interpreter. In this case, we will continue to use Bash.

Since we want to do “real” scripting, we need more than the ability to write a static sequence of commands. That's where control structures come into play.

Like most other programming languages, Bash also supports if (and elif/else), for and while. The syntax may be a little strange at first, though.

if, elif, else

A conditional statement is introduced with if, which is followed by any executable command that returns a true (zero*) or false (non-zero) exit code. The block is terminated by fi.

In [96]:
if true; then
    echo "Hello World"
fi
Hello World

*) no, that's not a mistake

A condition may be notated within [ and ] (mind the spaces between brackets and statement):

In [88]:
if [ "a" != "b" ]; then
    echo "Alphabet still unchanged"
fi
Alphabet still unchanged

More branches can be introduced with elif and else:

In [79]:
if [ "a" == "b" ]; then
    echo "Huh?"
elif [ "b" == "c" ]; then
    echo "Woot?"
else
    echo "Nothing of that"
fi
Nothing of that

Special comparison operators exist for numeric values:

  • -eq – is equal to
  • -ne – is not equal to
  • -gt – is greater than
  • -ge – is greater than or equal to
  • -lt – is less than
  • -le – is less than or equal to

Example:

In [82]:
if [ 1 -lt 2 ]; then
    echo "Math still works"
fi
Math still works

There is also a set of (unary) file testing operators. The most common are:

  • -e – file exists
  • -f – file exists and is a regular file (not a directory or device file)
  • -s – file is not zero size
  • -d – file is a directory
  • -h – file is a symbolic link
  • -r – file has read permissions
  • -w – file has write permissions
  • -x – file has execute permissions

Example:

In [84]:
if [ -f examplefile ]; then
    cat examplefile
fi
This Is Line1
This Is Line2
This Is Line3
This Is Line4

Statements can also be negated with !. Multiple conditions can be chained with && (logical AND) or || (logical OR):

In [89]:
if [ ! -d examplefile ] && [ -f examplefile ]; then
    cat examplefile
fi
This Is Line1
This Is Line2
This Is Line3
This Is Line4

for

The for statement can be used to iterate over a set of space-separated values, such as a file listing. The loop body is notated between do and done:

In [91]:
for i in ./*; do
    echo "File name: $i"
done
File name: ./exampledirectory1
File name: ./exampledirectory2
File name: ./examplefile
File name: ./examplefile2
File name: ./seminar

while

The while statement executes a code block as long as its condition evaluates to true. The loop syntax is the same as for for; the syntax of the conditional expression is the same as for if:

In [93]:
i=0
while [ $i -lt 5 ]; do
    echo $i
    i=$(($i + 1))
done
0
1
2
3
4
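
By the way, the $(( ... )) construct used above is arithmetic expansion: the shell evaluates the expression inside and substitutes the result:

$ echo $((2 + 3))
5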

while may also be used together with piping. It's really simple to read a file line by line:

In [95]:
cat examplefile | while read line; do
    echo "Current line: $line"
done
Current line: This Is Line1
Current line: This Is Line2
Current line: This Is Line3
Current line: This Is Line4

The read command will read one line of input and write it into the variable $line. If an EOF (end of file) appears, the command will exit with a non-zero exit code (which evaluates to false).

You can also use read without piping to ask the user of a script to enter a line of text into your program.

Special variables

When you use Bash for scripting, you may also find the following special variables useful:

  • $0 – first positional parameter, i.e. the script name (similar to argv[0] in C)
  • $1..$9 – more positional parameters, i.e. script parameters
  • ${10}..${N} – positional parameters beyond $9 (braces required)
  • $@ – all positional parameters, except $0
  • $# – the number of given arguments, not counting $0
  • $? – the return code of the last command you executed
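
A small sketch of the most common ones (argdemo.sh is a made-up name):

$ cat argdemo.sh
#!/usr/bin/env bash
echo "Script name: $0"
echo "First argument: $1"
echo "Number of arguments: $#"
$ ./argdemo.sh foo bar
Script name: ./argdemo.sh
First argument: foo
Number of arguments: 2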

End of Lesson 2

That marks the end of lesson 2.

We covered the most basic principles and techniques. Next time we'll focus on a list of useful commands that make your life a lot easier and prepare your way to ultimate shell mastery. :-)

Thanks for listening.