Sunday, December 6, 2009

Pipes and Redirection

Perhaps the most commonly used character is "|", which is referred to as the pipe symbol, or simply pipe. This enables you to pass the output of one command through the input of another. For example, say you would like to do a long directory listing of the /bin directory. If you type ls -l and then press Enter, the names flash by much too fast for you to read. When the display finally stops, all you see is the last twenty entries or so.

If instead we ran the command

ls -l | more

the output of the ls command will be "piped through more". In this way, we can scan through the list a screenful at a time.

In our discussion of standard input and standard output in Chapter 1, I talked about standard input as being just a file that usually points to your terminal. In this case, standard output is also a file that usually points to your terminal. The standard output of the ls command is changed to point to the pipe, and the standard input of the more command is changed to point to the pipe as well.

The way this works is that when the shell sees the pipe symbol, it creates a temporary file on the hard disk. Although it does not have a name or directory entry, it takes up physical space on the hard disk. Because both the terminal and the pipe are seen as files from the perspective of the operating system, all we are saying is that the system should use different files instead of standard input and standard output.

Under Linux (as well as other UNIX dialects), there exist the concepts of standard input, standard output, and standard error. When you log in and are working from the command line, standard input is taken from your terminal keyboard and both standard output and standard error are sent to your terminal screen. In other words, the shell expects to be getting its input from the keyboard and showing the output (and any error messages) on the terminal screen.

Actually, the three (standard input, standard output, and standard error) are references to files that the shell automatically opens. Remember that in UNIX, everything is treated as a file. When the shell starts, the three files it opens are usually the ones pointing to your terminal.

When we run a command like cat, it gets input from a file that it displays to the screen. Although it may appear that the standard input is coming from that file, the standard input (referred to as stdin) is still the keyboard. This is why when the file is large enough and you are using something like more to display the file one screen at a time and it stops after each page, you can continue by pressing either the Spacebar or Enter key. That's because standard input is still the keyboard.

As it is running, more is displaying the contents of the file to the screen. That is, it is going to standard output (stdout). If you try to do a more on a file that does not exist, the message

file_name: No such file or directory

shows up on your terminal screen as well. However, although it appears to be in the same place, the error message was written to standard error (stderr). (I'll show how this differs shortly.)

One pair of characters that is used quite often, "<" and ">," also deal with stdin and stdout. The more common of the two, ">," redirects the output of a command into a file. That is, it changes standard output. An example of this would be ls /bin > myfile. If we were to run this command, we would have a file (in my current directory) named myfile that contained the output of the ls /bin command. This is because stdout is the file myfile and not the terminal. Once the command completes, stdout returns to being the terminal. What this looks like graphically, we see in the figure below.


Now, we want to see the contents of the file. We could simply say more myfile, but that wouldn't explain about redirection. Instead, we input

more
This tells the more command to take its standard input from the file myfile instead of from the keyboard or some other file. (Remember, even when stdin is the keyboard, it is still seen as a file.)

What about errors? As I mentioned, stderr appears to be going to the same place as stdout. A quick way of showing that it doesn't is by using output redirection and forcing an error. If wanted to list two directories and have the output go to a file, we run this command:

ls /bin /jimmo > /tmp/junk

We then get this message:

/jimmo not found

However, if we look in /tmp, there is indeed a file called junk that contains the output of the ls /bin portion of the command. What happened here was that we redirected stdout into the file /tmp/junk. It did this with the listing of /bin. However, because there was no directory /jimmo (at least not on my system), we got the error /jimmo not found. In other words, stdout went into the file, but stderr still went to the screen.

If we want to get the output and any error messages to go to the same place, we can do that. Using the same example with ls, the command would be:

ls /bin /jimmo > /tmp/junk 2>&1

The new part of the command is 2>&1, which says that file descriptor 2 (stderr) should go to the same place as file descriptor 1 (stdout). By changing the command slightly

ls /bin /jimmo > /tmp/junk 2>/tmp/errors

we can tell the shell to send any errors someplace else. You will find quite often in shell scripts throughout the system that the file that error messages are sent to is /dev/null. This has the effect of ignoring the messages completely. They are neither displayed on the screen nor sent to a file.

Note that this command does not work as you would think:

ls /bin /jimmo 2>&1 > /tmp/junk

The reason is that we redirect stderr to the same place as stdout before we redirect stdout. So, stderr goes to the screen, but stdout goes to the file specified.

Redirection can also be combined with pipes like this:

sort < names | head

or

ps | grep sh > ps.save

In the first example, the standard input of the sort command is redirected to point to the file names. Its output is then passed to the pipe. The standard input of the head command (which takes the first ten lines) also comes from the pipe. This would be the same as the command

sort names | head

which we see here:


In the second example, the ps command (process status) is piped through grep and all of the output is redirected to the file ps.save.

If we want to redirect stderr, we can. The syntax is similar:

command 2> file

It's possible to input multiple commands on the same command line. This can be accomplished by using a semi-colon (;) between commands. I have used this on occasion to create command lines like this:

man bash | col -b > man.tmp; vi man.tmp; rm man.tmp

This command redirects the output of the man-page for bash into the file man.tmp. (The pipe through col -b is necessary because of the way the man-pages are formatted.) Next, we are brought into the vi editor with the file man.tmp. After I exit vi, the command continues and removes my temporary file man.tmp. (After about the third time of doing this, it got pretty monotonous, so I created a shell script to do this for me. I'll talk more about shell scripts later.)

No comments:

Post a Comment

Privacy policy

At http://linux4fresher.blogspot.com/, the privacy of our visitors is of extreme importance to us. This privacy policy document outlines the types of personal information is received and collected by http://linux4fresher.blogspot.com/ and how it is used.

Log Files
Like many other Web sites, http://linux4fresher.blogspot.com/ makes use of log files. The information inside the log files includes internet protocol ( IP ) addresses, type of browser, Internet Service Provider ( ISP ), date/time stamp, referring/exit pages, and number of clicks to analyze trends, administer the site, track user’s movement around the site, and gather demographic information. IP addresses, and other such information are not linked to any information that is personally identifiable.

Cookies and Web Beacons
http://linux4fresher.blogspot.com/ does use cookies to store information about visitors preferences, record user-specific information on which pages the user access or visit, customize Web page content based on visitors browser type or other information that the visitor sends via their browser.

DoubleClick DART Cookie
Google, as a third party vendor, uses cookies to serve ads on http://linux4fresher.blogspot.com/.
Google's use of the DART cookie enables it to serve ads to users based on their visit to http://linux4fresher.blogspot.com/ and other sites on the Internet.
Users may opt out of the use of the DART cookie by visiting the Google ad and content network privacy policy at the following URL - http://www.google.com/privacy_ads.html

Some of our advertising partners may use cookies and web beacons on our site. Our advertising partners include ....
Google Adsense


These third-party ad servers or ad networks use technology to the advertisements and links that appear on http://linux4fresher.blogspot.com/ send directly to your browsers. They automatically receive your IP address when this occurs. Other technologies ( such as cookies, JavaScript, or Web Beacons ) may also be used by the third-party ad networks to measure the effectiveness of their advertisements and / or to personalize the advertising content that you see.

http://linux4fresher.blogspot.com/ has no access to or control over these cookies that are used by third-party advertisers.

You should consult the respective privacy policies of these third-party ad servers for more detailed information on their practices as well as for instructions about how to opt-out of certain practices. http://linux4fresher.blogspot.com/'s privacy policy does not apply to, and we cannot control the activities of, such other advertisers or web sites.

If you wish to disable cookies, you may do so through your individual browser options. More detailed information about cookie management with specific web browsers can be found at the browsers' respective websites.

Disclaimer :
All the pictures and material on this blog are assumed to be taken from public domain. The copyright (if any) of these pictures and articles belongs to their orginal publisher / photographer / copyright holder as the case may be. We claim no ownership to them. If anybody has reservations / objection on the use of these material/images or find any copy-righted material on this site, then please e-mail us with the details of copy right etc. In case, the objections is found to be appropriate, the offensive material / pictures will be removed from this site immediately.