LINUX 4 FRESHER

Sunday, December 6, 2009

Different Kinds of Shells

The great-grandfather of all shells is /bin/sh, called simply sh or the Bourne Shell, named after its developer, Stephen Bourne. When it was first introduced in the late 1970s, it was almost a godsend because it allowed interaction with the operating system. This is the "standard" shell that you will find on every version of UNIX (at least all those I have seen). Although many changes have been made to UNIX, sh has remained basically unchanged.

All the capabilities of "the shell" I've talked about so far apply to sh. Anything I've talked about that sh can do, the others can do as well. So rather than going on about what sh can do (which I already did), I am going to talk about the characteristics of some other shells.

Later, I am going to talk about the C-Shell, which kind of throws a monkey wrench into this entire discussion. Although the concepts are much the same between the C-Shell and other shells, the constructs are often quite different. On the other hand, the other shells are extensions of the Bourne Shell, so the syntax and constructs are basically the same.

Be careful here. This is one case in which I have noticed that the various versions of Linux are different. Not every shell is in every version. Therefore, the shells I am going to talk about may not be in your distribution. Have no fear! If there is a feature that you really like, you can either take the source code from one of the other shells and add it or you can find the different shells all over the Internet, which is much easier.

Linux includes several different shells, and we will get into the specifics of many of them as we move along. In addition, many different shells are available as either public domain, shareware, or commercial products that you can install on Linux.

As I mentioned earlier, environment variables are set up for you as you are logging in, or you can set them up later. Depending on the shell you use, the files used and where they are located are going to be different. Some variables are made available to everyone on the system and are accessed through a common file. Others reside in the user's home directory.

Normally, the files residing in a user's home directory can be modified. However, a system administrator may wish to prevent users from doing so. Often, menus are set up in these files to either make things easier for the user or to prevent the user from getting to the command line. (Often users never need to get that far.) In other cases, environment variables that shouldn't be changed need to be set up for the user.

One convention I will be using here is how I refer to the different shells. Often, I will say "the bash" or just "bash" to refer to the Bourne-Again Shell as a concept and not the program /bin/bash. Likewise, I will use "sh" to refer to the Bourne Shell as an abstract entity and not specifically to the program /bin/sh.

Why the Bourne-Again Shell? Well, this shell is compatible with the Bourne Shell, but has many of the same features as both the Korn Shell (ksh) and C-Shell (csh). This is especially important to me as I flail violently when I don't have a Korn Shell.

Most of the issues I am going to address here are detailed in the appropriate man-pages and other documents. Why cover them here? Well, in keeping with one basic premise of this book, I want to show you the relationships involved. In addition, many of the things we are going to look at are not emphasized as much as they should be. Often, users will go for months or years without learning the magic that these shells can do.

Only one oddity really needs to be addressed: the behavior of the different shells when moving through symbolic links. As I mentioned before, symbolic links are simply pointers to files or directories elsewhere on the system. If you change directories into symbolic links, your "location" on the disk is different than what you might think. In some cases, the shell understands the distinction and hides from you the fact that you are somewhere else. This is where the problem lies.

Although the concept of a symbolic link exists in most versions of UNIX, it is a relatively new aspect. As a result, not all applications and programs behave in the same way. Let's take the directory /usr/spool as an example. Because it contains a lot of administrative information, it is a useful and commonly accessed directory. It is actually a symbolic link to /var/spool. If we are using ash as our shell, when we do a cd /usr/spool and then pwd, the system responds with: /var/spool. This is where we are "physically" located, despite the fact that we did a cd /usr/spool. If we do a cd .. (to move up to our parent directory), we are now located in /var. All this seems logical. This is also the behavior of csh and sh on some systems.

If we use bash, things are different. This time, when we do a cd /usr/spool and then pwd, the system responds with /usr/spool. This is where we are "logically". If we now do a cd .., we are located in /usr. Which of these is the "correct" behavior? Well, I would say both. There is nothing to define what the "correct" behavior is. Depending on your preference, either is correct. I tend to prefer the behavior of ksh, which, like bash, keeps track of the logical path. However, the behavior of ash is also valid.
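To see both views side by side, here is a minimal sketch that builds a stand-in for /usr/spool and /var/spool inside a temporary directory, so nothing on the real system is touched. It assumes a POSIX-style shell such as bash, where pwd and cd accept the standard -L (logical) and -P (physical) options. Those options are not discussed above, and the directory layout and variable names are only for illustration.

```shell
# Build a private copy of the layout: usr/spool is a symbolic link to var/spool.
start_dir=$PWD
workdir=$(mktemp -d)
workdir=$(cd "$workdir" && pwd -P)   # resolve any links in the temp path itself
mkdir -p "$workdir/var/spool" "$workdir/usr"
ln -s ../var/spool "$workdir/usr/spool"

cd "$workdir/usr/spool"
logical=$(pwd -L)      # the path we typed: .../usr/spool
physical=$(pwd -P)     # where we really are: .../var/spool

cd ..                  # logical parent (the bash behavior described above)
after_logical=$(pwd -L)

cd "$workdir/usr/spool"
cd -P ..               # physical parent (the ash behavior described above)
after_physical=$(pwd -P)

echo "logical:  $logical  -> cd ..    -> $after_logical"
echo "physical: $physical -> cd -P .. -> $after_physical"
cd "$start_dir"
```

The first line of output ends in usr/spool and usr, the second in var/spool and var, which matches the two behaviors described for bash and ash.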

Pipes and Redirection

Perhaps the most commonly used character is "|", which is referred to as the pipe symbol, or simply pipe. This enables you to pass the output of one command to the input of another. For example, say you would like to do a long directory listing of the /bin directory. If you type ls -l /bin and then press Enter, the names flash by much too fast for you to read. When the display finally stops, all you see is the last twenty entries or so.

If instead we ran the command

ls -l /bin | more

the output of the ls command will be "piped through more". In this way, we can scan through the list a screenful at a time.

In our discussion of standard input and standard output in Chapter 1, I talked about standard input as being just a file that usually points to your terminal. In this case, standard output is also a file that usually points to your terminal. The standard output of the ls command is changed to point to the pipe, and the standard input of the more command is changed to point to the pipe as well.

The way this works is that when the shell sees the pipe symbol, it asks the kernel to create a pipe, an unnamed buffer that connects the two commands. The pipe has no name or directory entry. On Linux it is held in kernel memory rather than on the hard disk, although some older UNIX implementations did back pipes with disk space. Because both the terminal and the pipe are seen as files from the perspective of the operating system, all we are saying is that the system should use different files instead of standard input and standard output.
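One way to convince yourself that the pipe really does replace the terminal as standard input is to ask the shell what kind of file descriptor 0 is. This is only a sketch: the function name describe_stdin is made up for this example, and the -p test on /dev/stdin assumes bash or a Linux system where /dev/stdin exists.

```shell
# Report what standard input is currently connected to.
describe_stdin() {
    if [ -t 0 ]; then
        echo "terminal"          # fd 0 is a terminal device
    elif [ -p /dev/stdin ]; then
        echo "pipe"              # fd 0 is the read end of a pipe
    else
        echo "other"             # a regular file, /dev/null, and so on
    fi
}

# On the right-hand side of "|", stdin is the pipe, not the keyboard.
piped=$(echo hello | describe_stdin)
echo "$piped"
```

Run from the command line without a pipe in front of it, describe_stdin should report terminal instead.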

Under Linux (as well as other UNIX dialects), there exist the concepts of standard input, standard output, and standard error. When you log in and are working from the command line, standard input is taken from your terminal keyboard and both standard output and standard error are sent to your terminal screen. In other words, the shell expects to be getting its input from the keyboard and showing the output (and any error messages) on the terminal screen.

Actually, the three (standard input, standard output, and standard error) are references to files that the shell automatically opens. Remember that in UNIX, everything is treated as a file. When the shell starts, the three files it opens are usually the ones pointing to your terminal.

When we run a command like cat with a file name as its argument, it reads that file and displays it on the screen. Although it may appear that the standard input is coming from that file, the standard input (referred to as stdin) is still the keyboard. This is why, when the file is large and you are using something like more to display it one screen at a time, you can continue after each page by pressing either the Spacebar or the Enter key.

As it is running, more is displaying the contents of the file to the screen. That is, it is going to standard output (stdout). If you try to do a more on a file that does not exist, the message

file_name: No such file or directory

shows up on your terminal screen as well. However, although it appears to be in the same place, the error message was written to standard error (stderr). (I'll show how this differs shortly.)

One pair of characters that is used quite often, "<" and ">", also deals with stdin and stdout. The more common of the two, ">", redirects the output of a command into a file. That is, it changes standard output. An example of this would be ls /bin > myfile. If we were to run this command, we would have a file (in the current directory) named myfile that contains the output of the ls /bin command. This is because, while the command runs, stdout is the file myfile and not the terminal. Once the command completes, stdout returns to being the terminal. What this looks like graphically, we see in the figure below.

Now, we want to see the contents of the file. We could simply say more myfile, but that wouldn't explain about redirection. Instead, we input

more < myfile

This tells the more command to take its standard input from the file myfile instead of from the keyboard or some other file. (Remember, even when stdin is the keyboard, it is still seen as a file.)
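Here is a small sketch of both directions of redirection together. It uses wc -l instead of more so that it runs without any key presses, and it writes into a temporary directory rather than the current one. The variable names are mine, not part of any standard.

```shell
workdir=$(mktemp -d)

# ">" points stdout at a file for the duration of this one command.
ls /bin > "$workdir/myfile"

# "<" points stdin at that file. wc is never given a file name.
count_redirected=$(wc -l < "$workdir/myfile")

# The same count, this time handing the listing over through a pipe.
count_piped=$(ls /bin | wc -l)

echo "redirected: $count_redirected, piped: $count_piped"
# Remove "$workdir" when you are finished with it.
```

Both counts should be the same, because wc simply reads whatever its standard input happens to be connected to.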

What about errors? As I mentioned, stderr appears to be going to the same place as stdout. A quick way of showing that it doesn't is by using output redirection and forcing an error. If we wanted to list two directories and have the output go to a file, we could run this command:

ls /bin /jimmo > /tmp/junk

We then get this message:

/jimmo not found

However, if we look in /tmp, there is indeed a file called junk that contains the output of the ls /bin portion of the command. What happened here was that we redirected stdout into the file /tmp/junk. It did this with the listing of /bin. However, because there was no directory /jimmo (at least not on my system), we got the error /jimmo not found. In other words, stdout went into the file, but stderr still went to the screen.

If we want to get the output and any error messages to go to the same place, we can do that. Using the same example with ls, the command would be:

ls /bin /jimmo > /tmp/junk 2>&1

The new part of the command is 2>&1, which says that file descriptor 2 (stderr) should go to the same place as file descriptor 1 (stdout). By changing the command slightly

ls /bin /jimmo > /tmp/junk 2>/tmp/errors

we can tell the shell to send any errors someplace else. In shell scripts throughout the system, you will quite often find error messages being sent to /dev/null. This has the effect of ignoring the messages completely. They are neither displayed on the screen nor saved anywhere.
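The three variations can be tried safely with a sketch like the following. Instead of /jimmo it uses a path inside a fresh temporary directory, which is guaranteed not to exist, and all the file names are placeholders of my own choosing. Because ls returns a non-zero exit status when one of its arguments is missing, each command ends in || true so the sketch also works in scripts that stop on errors.

```shell
workdir=$(mktemp -d)
missing="$workdir/no_such_dir"     # plays the role of /jimmo

# 1. stdout and stderr both end up in the same file.
ls /bin "$missing" > "$workdir/both" 2>&1 || true

# 2. stdout goes to one file, stderr to another.
ls /bin "$missing" > "$workdir/out" 2> "$workdir/errors" || true

# 3. stdout goes to a file, the error message is thrown away.
ls /bin "$missing" > "$workdir/quiet" 2> /dev/null || true

# Show which files mention the missing directory.
grep -l no_such_dir "$workdir"/* || true
```

On my reading of the commands, only the files both and errors should be listed by the final grep, since those are the only two places the error message was sent.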

Note that this command does not work as you would think:

ls /bin /jimmo 2>&1 > /tmp/junk

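The reason is that the shell processes redirections from left to right. When it handles 2>&1, stdout still points to the terminal, so stderr is pointed at the terminal as well. Only then is stdout redirected to the file. So, stderr goes to the screen, but stdout goes to the file specified.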
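The effect of the ordering can be checked with a short sketch. Here a command substitution stands in for the screen, so that whatever would have appeared on the terminal is collected in a variable instead. The path no_such_dir and the variable names are invented for this example.

```shell
workdir=$(mktemp -d)
missing="$workdir/no_such_dir"     # plays the role of /jimmo

# 2>&1 comes first, so stderr is attached to the "screen" (here, the
# command substitution). Only afterwards is stdout moved to the file.
screen=$(ls /bin "$missing" 2>&1 > "$workdir/junk") || true

echo "what reached the screen: $screen"
echo "lines that reached the file: $(wc -l < "$workdir/junk")"
```

The error message ends up in the variable, while the listing of /bin ends up in the file.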

Redirection can also be combined with pipes like this:

sort < names | head

or

ps | grep sh > ps.save

In the first example, the standard input of the sort command is redirected to point to the file names. Its output is then passed to the pipe. The standard input of the head command (which takes the first ten lines) comes from the pipe. This would be the same as the command

sort names | head

which we see here:

In the second example, the output of the ps command (process status) is piped through grep, and the output of grep is redirected to the file ps.save.
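To try the first example without needing a file called names on your system, a sketch like this creates a small one in a temporary directory. The three names in it are made up, and head -n 1 is used instead of plain head just to keep the output short.

```shell
workdir=$(mktemp -d)
printf '%s\n' carol alice bob > "$workdir/names"

# sort reads the file through its redirected stdin and feeds head via the pipe.
first_redirected=$(sort < "$workdir/names" | head -n 1)

# Here sort opens the file itself. The result is the same.
first_named=$(sort "$workdir/names" | head -n 1)

# A pipe and an output redirection on one line, as in the ps example.
sort < "$workdir/names" | grep o > "$workdir/names.save"

echo "$first_redirected $first_named"
cat "$workdir/names.save"
```

Only the last command in the pipeline has its output redirected, so names.save holds what grep printed, not what sort printed.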

If we want to redirect stderr, we can. The syntax is similar:

command 2> file

It's possible to input multiple commands on the same command line. This can be accomplished by putting a semicolon (;) between the commands, which are then run one after the other. I have used this on occasion to create command lines like this:

man bash | col -b > man.tmp; vi man.tmp; rm man.tmp

This command redirects the output of the man-page for bash into the file man.tmp. (The pipe through col -b is necessary because of the way the man-pages are formatted.) Next, we are brought into the vi editor with the file man.tmp. After I exit vi, the command continues and removes my temporary file man.tmp. (After about the third time of doing this, it got pretty monotonous, so I created a shell script to do this for me. I'll talk more about shell scripts later.)
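The same create, use, remove pattern can be sketched without man and vi, so that it runs without any interaction. This is only an illustration of the semicolon, not the script I mentioned. The temporary file comes from mktemp and the variable names are arbitrary.

```shell
tmpfile=$(mktemp)

# Three commands on one line: write the file, read it back sorted, remove it.
printf 'b\na\n' > "$tmpfile"; sorted=$(sort "$tmpfile"); rm "$tmpfile"

echo "$sorted"
```

Each command starts only after the previous one has finished, which is why the file still exists when sort reads it and is gone afterwards.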
