Csh - the C Shell

Last modified: Fri Nov 27 09:44:52 2020

You are allowed to print copies of this tutorial for your personal use, and link to this page, but you are not allowed to make electronic copies, or redistribute this tutorial in any form without permission.

Original version written in 1994 and published in the Sun Observer

This section describes C Shell (CSH/TCSH) programming. It covers conditional testing, control loops, and other advanced techniques.

This month begins a tutorial on the bad-boy of UNIX, lowest of the low, the shell of last resort. Yes, I am talking about the C shell. FAQs flame it. Experts have criticized it. Unfortunately, this puts UNIX novices in an awkward situation. Many people are given the C shell as their default shell. They aren't familiar with it, but they have to learn enough to customize their environment. They need help, but get criticized every time they ask a question. Imagine the following conversation, initiated by a posting on USENET:

Novice: How do I do XYZ using the C shell?

Expert: You shouldn't use the C shell. Use the Bourne shell.

Novice: I try to, but I get syntax errors.

Expert: That's because you are using the C shell. Use the Bourne shell.

Novice: I'm now using the Bourne shell. How do I create aliases and do command-line editing in the Bourne shell?

Expert: You can't. Use bash, ksh or tcsh.

Novice: I don't have these shells on all of the systems I use. What can I use?

Expert: In that case, use the C shell.

Novice: But you told me I shouldn't use the C shell!?!

Expert: Well, if you have to, you can use the C shell. It's fine for interactive sessions. But you shouldn't use it for scripts.

Novice: It's really confusing trying to learn two shells at once. I don't know either shell very well, and I'm trying to learn JUST enough to customize my environment. I'd rather just learn one shell at a time.

Expert: Well, it's your funeral.

Novice: How do I do XYZ using the C shell?

Another Expert: You shouldn't be using the C shell. Use the Bourne shell.

Novice: @#%&!

C shell problems

The C shell does have problems. (See My top 10 reasons not to use the C shell.) Some can be fixed. Others cannot. Some are unimportant now, but later on might cause grief. I'll mention these problems. But I'll let you decide if you want to continue to use the C shell, or start using the Bourne shell. Switching shells is difficult, and some may wish to do so gradually. If you want to use the C shell, that's fine. I'll show you the pitfalls, so you can intelligently decide. No pressure. You can switch at any time. But be aware that the C shell is seductive. It does have some advantages over the Bourne shell. But sometimes what seems like an advantage turns into a disadvantage later. Let me discuss them in detail.

Quoting long strings, $ and !

The first problem I faced with the C shell involved another language. I had a problem that required a sed or awk script. The C shell has a "feature" that warns programmers if they forgot to terminate a quoted string. The following command will generate a warning in the C shell:

echo "This quote doesn't end

The Bourne shell would continue till the end of the script. This is a good feature for an interactive shell, as it warns you if you forgot to close a quote. But if you want to include a multi-line string, such as an awk script inside a C shell script, you will have problems. You can place a backslash at the end of each line, but this is error prone, and also hard to read. Some awk constructs require a backslash at the end. Using them inside a C shell script would require two backslashes in a row.

There are some other strange quoting problems. Like the Bourne shell, the C shell has three ways to quote characters. You can use the single quote, double quote and backslash. But combine them, and you find some strange behavior. You can put anything inside single quotes in a Bourne shell script, except single quotes. The C shell won't let you do that. You can type

echo hi!

but the following generates an error:

echo 'hi!'

You have to use a backslash if you want to do this:

echo 'hi\!'

Also, the following works:

echo money$

but this generates an error:

echo "money$"

But in this case you cannot even use a backslash. The following is an error in the C shell:

echo "money\$"

Unix shells have many special characters, and quoting them marks them as normal ASCII - telling the shell not to interpret them. And this is true with every Unix shell there is, except the C shell. In the above cases, putting quotes around some characters makes them special in the C shell, instead of preventing the special interpretation. Strange, huh?

The ad hoc parser

The second problem is subtle, but may be the next problem you discover. The Bourne shell has a true syntax parser: the lines are scanned, and broken up into pieces. Some pieces are commands. Other pieces are quoted strings. File redirection is handled the same way. Commands can be combined on one line, or span several lines. It doesn't matter. As an example, you can use

if true; then echo yes; fi

or

if true
then
echo yes
fi

The parsing of file redirection is independent of the particular command. If and while commands can get file redirection anywhere. The following is valid in the Bourne shell:

echo abc | while read a
do
echo a is $a
done >/tmp/f1

The same holds true for other Bourne shell commands. Once you learn the principles, the behavior is predictable.

The C shell does not have a true parser. Instead, the code executes one section for the if command, and another for the while command. What works for one command may not work for another. The while loop above cannot be done in the C shell. It uses two redirections, a pipe into the loop and a file redirection out of it, and the C shell can't do either. Also, in the C shell, certain words must be the first word on the line. Therefore you might try something that works with one command, only to discover that it doesn't work on other commands. I've reported a lot of bugs to Sun, and to their credit, many have been fixed. Try the same code on other systems, however, and you might get syntax errors.

The parsing problem is also true with shell built-in commands. Combine them, and discover strange messages. Try the following C shell sequence:

time | echo

versus

time | /bin/echo

and notice the strange error message. There are other examples of this. These are the types of problems that sneak up on you when you don't expect them. The Bourne shell has the -n flag, which lets you check the script for syntax errors, including branches you didn't take. You can't do this with the C shell. The C shell seems to act on one line at a time and some syntax errors may not be discovered unless they get executed.

Reading one line at a time

Sometimes you have to ask a person for input in the middle of a script. Sometimes you have to read some information from a file. The Bourne shell allows you to specify the source of information for each command. Even though a script is connected to a pipe, you can ask the user for input. The C shell does not have this flexibility. It has a mechanism to get a line from standard input, but that is all it can do. You cannot have a C shell script get input from both a file and the terminal.
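Here is a minimal Bourne shell sketch of the idea, assuming a POSIX-style sh. The file name and its contents are made up for the example. The data file is read on file descriptor 3, so standard input stays free for anything else, such as a question for the user.

```shell
#!/bin/sh
# Sketch: read a data file on descriptor 3, leaving standard input alone.
# The file name and its contents are made up for this example.
datafile=${TMPDIR:-/tmp}/names.$$
printf 'alpha\nbeta\n' > "$datafile"

names=""
# "read name <&3" takes its line from descriptor 3. Descriptor 3 is
# attached to the file for the whole loop by the "3<" after "done".
while read name <&3
do
    # A question for the user could go here, for example
    #     read answer </dev/tty
    # and it would not disturb the file being read on descriptor 3.
    names="$names $name"
done 3< "$datafile"

rm -f "$datafile"
echo "Read:$names"
```

Each read names its own source, which is the flexibility the C shell lacks.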

File redirection

With respect to file redirection, the Bourne shell has no limitations, while the C shell is very limited. With the Bourne shell, you can send standard error to one place, and standard out to another file. You can discard standard output, but keep the error. You can close any file descriptor, save current ones, and restore them. The C shell can't do any of these steps.
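The Bourne shell side of this can be sketched in a few lines. This is only an illustration, assuming a POSIX-style sh. The function "report" and the file names are invented for the example.

```shell
#!/bin/sh
# Sketch of Bourne shell redirections. "report" is a made-up function
# that writes one line to standard output and one to standard error.
report() {
    echo "normal output"
    echo "error output" >&2
}

dir=${TMPDIR:-/tmp}/redir.$$
mkdir "$dir"

# Standard output to one file, standard error to another.
report > "$dir/out" 2> "$dir/err"

# Discard standard output, but keep the errors.
report > /dev/null 2> "$dir/err_only"

# Save standard output on descriptor 3, redirect it, then restore it
# and close the spare descriptor.
exec 3>&1
exec > "$dir/saved"
echo "this goes to the file"
exec 1>&3 3>&-

out=`cat "$dir/out"`
err=`cat "$dir/err"`
err_only=`cat "$dir/err_only"`
saved=`cat "$dir/saved"`
rm -rf "$dir"
echo "standard output is back on the terminal"
```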

Signals, Traps and child processes

If you want to make your script more robust, you must add signal processing to it. That is, your script must terminate gracefully when it is aborted. The C shell has limited abilities. You can either do nothing, ignore all signals, or trap all signals. It's an all or nothing situation. The Bourne shell can trap particular signals, and call a special routine when the script exits normally. You can retain the process ID of a background process. This allows you to relay signals received to other processes under your control. The C shell cannot do this.
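As a rough Bourne shell sketch of these abilities, assuming a POSIX-style sh: the subshell below stands in for a separate script, and the file names and the "cleanup" function are invented for the example.

```shell
#!/bin/sh
# Sketch: trap particular signals, and run a cleanup routine on exit.
# The file names are made up for this example.
log=${TMPDIR:-/tmp}/trap.$$

(
    tmpfile=${TMPDIR:-/tmp}/work.$$
    cleanup() {
        rm -f "$tmpfile"
        echo "cleaned up" >> "$log"
    }
    trap cleanup 0            # 0 means "when the script exits"
    trap 'exit 1' 1 2 15      # HUP, INT and TERM lead to a clean exit
    touch "$tmpfile"
    echo "working" >> "$log"
)   # the subshell stands in for a separate script

# $! holds the process ID of the last background command, so a signal
# can be relayed to that process later.
sleep 30 &
child=$!
kill -15 "$child"
status=0
wait "$child" || status=$?

result=`cat "$log"`
rm -f "$log"
echo "$result"
echo "child exit status: $status"
```

The cleanup routine runs even though the script never calls it directly, and the exit status of the child shows that it was ended by the signal.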

A time bomb

You can use the C shell for simple scripts. If you don't add many new features, and only write scripts for yourself, the C shell may be fine for you. But it is a time bomb. There are many times I wanted to add a new feature to a C shell script, and couldn't because it didn't support the idea. Or else I tried to port a C shell script to a different system and found that it didn't work the same way. Yes, you can use the C shell. Use it for as long as you want. Your choice.

Tick.... Tick... Tick...

Quoting C Shell Meta-Characters

This is my second tutorial on the C shell. This month, I will discuss quoting and meta-characters.

Like all shells, the C shell examines each line, and breaks it up into words. The first word is a command, and additional words are arguments for the command. The command

more *

uses a meta-character, the asterisk. The shell sees the asterisk, examines the current directory, and transforms the single character to all of the files in the current directory. The "more" program then displays all of the files. There are many other meta-characters. Some are very subtle. Consider this meta-character example:

more a b

The meta-character? It's the space. In this case, the space indicates the end of one filename and the start of the next filename. The space, tab, and new-line-character are used by the C shell to indicate the end of one argument, and the beginning of the next. (The Bourne shell allows more control, as any character can be specified to have this function).

These meta-characters are an integral part of UNIX. Or rather, an integral part of the shell. A meta-character is simply a character with a special meaning. The file system doesn't really care about meta-characters. You can have a filename that contains a space, or an asterisk, or any other character. Similarly, you can specify any meta-character as an argument to any command. Understanding which characters are meta-characters, what they do, and how to prevent them from being special characters is a skill that must be learned. Most learn by trial and error. Trouble is, the C shell is trickier than other shells.

One way to discover these characters is to use the echo built-in command, and see which characters the C shell will echo, and which ones are treated specially. Here is the list of meta-characters, and a quick description of the special meaning.

+-----------------------------------------------------------------------+
|                    List of C Shell Meta-Characters                    |
+-----------------------------------------------------------------------+
|Meta-character    Meaning                                              |
+-----------------------------------------------------------------------+
|newline           End of command                                       |
|space             End of word                                          |
|tab               End of word                                          |
|!                 History                                              |
|#                 Comment                                              |
|$                 Variable                                             |
|&                 End of command arguments, launch in background       |
|(                 Start sub-shell                                      |
|)                 End sub-shell                                        |
|{                 Start in-line expansion                              |
|}                 End in-line expansion                                |
||                 End of command arguments, Pipe into next command     |
|<                 Input Redirection                                    |
|>                 Output Redirection                                   |
|*                 Multi-character Filename expansion (a.k.a. globbing) |
|?                 Single-character Filename expansion (a.k.a. globbing)|
|[                 Character Set Filename expansion (a.k.a. globbing)   |
|]                 Character Set Filename expansion (a.k.a. globbing)   |
|;                 End of command                                       |
|'                 Strong quoting                                       |
|"                 Weak quoting                                         |
|`                 Command substitution                                 |
|\                 Sometimes Special                                    |
+-----------------------------------------------------------------------+

If you do not want one of these characters to be treated as a meta-character, you must quote it. Another term for this is escape, as in "escape the normal behavior." The Bourne shell has a predictable behavior for quoting meta-characters:

Put a backslash before each character.
Put single quotes around all of the characters.
Put double quotes around all of the characters. Exceptions: the dollar sign ($) and back-quote (`) are special, but a backslash before them will escape the special connotation.

The Bourne shell has an internal flag that specifies when quoting occurs. It is either on or off. If quoting is on, then meta-characters are escaped, and treated as normal. The C shell is similar, yet not identical. As you will see, the quoting mechanism is less predictable. In fact, it has some maddening exceptions. Let me elaborate.

Using the backslash

If you want to use a meta-character as an ordinary character, place a backslash before it. To delete a file called "a b" (there is a space in the filename between a and b), type

rm a\ b

Strings in single quotation marks

The second method for quoting meta-characters is specifying a string that begins and ends with single quotes:

rm 'a b'

In the Bourne shell, any character in single quotes is not a meta-character. This is not true in the C shell. There are two exceptions: the exclamation point, and new line. The Bourne shell allows this:

echo 'Hello!'

The C shell requires a backslash:

echo 'Hello\!'

The exclamation point is a meta-character, and the C shell uses it for its alias and history features, which I will discuss later. The other exception is a new-line character. The Bourne shell allows:

echo 'New line ->
'

The C shell requires a backslash before the end-of-line:

echo 'New line ->\
'

A novice programmer may consider this a feature, as any command with an unterminated string will generate an error. However, when your programming skills increase, and you want to include a multi-line awk script in a shell script, the C shell becomes awkward. Sometimes awk needs a backslash at the end of a line, so in the C shell, you would need two backslashes:

#!/bin/csh -f
awk '{printf("%s\t%s\t%s\n", \\
$3, $2, $1)}'

Click here to get file: CshAwk.csh
An awk script in a C shell script is extremely awkward, if you'll pardon my choice of words.

Strings in double quotation marks

The last quoting mechanism is a string that starts and ends with the double-quote marks:

rm "a b"

This is similar to the single-quote form, but it intentionally escapes most meta-characters except the dollar sign and back-quote. This allows command substitution, and variable interpretation:

echo "I am $USER"
echo "The current directory is `pwd`"

Like the single quoted string, the exclamation point is an exception:

echo "Hello\!"

I usually call the single quote the "strong" quote, and the double quote the "weak" quote. However, it is not so simple. The frustrating thing about the C shell is that inside a double-quote string, you cannot place a backslash before a dollar sign or back-quote to escape the special meaning. That is, when I execute the command

echo "\$HOME"

the shell echoes

\/home/barnett

So the backslash only works some of the time. The C shell is filled with special cases. You have to learn them all. To make it easy for you, here is a table that explains all of the exceptions. The first column is the meta-character. The second column shows what is required to get the meta-character in a string delineated by double quotes. The third column corresponds to single quotes. The last column shows what is needed when there are no quotation marks.

+--------------------------------------------------------------+
|           Meta-character interpretation in strings           |
+--------------------------------------------------------------+
|Meta-character   "..."        '....'       no quotation marks |
+--------------------------------------------------------------+
|newline          Requires \   Requires \   Requires \         |
|space            Quoted       Quoted       Requires \         |
|tab              Quoted       Quoted       Requires \         |
|!                Requires \   Requires \   Requires \         |
|#                Quoted       Quoted       Requires \         |
|$                Impossible   Quoted       Requires \         |
|&                Quoted       Quoted       Requires \         |
|(                Quoted       Quoted       Requires \         |
|)                Quoted       Quoted       Requires \         |
|{                Quoted       Quoted       Requires \         |
|}                Quoted       Quoted       Requires \         |
||                Quoted       Quoted       Requires \         |
|<                Quoted       Quoted       Requires \         |
|>                Quoted       Quoted       Requires \         |
|*                Quoted       Quoted       Requires \         |
|?                Quoted       Quoted       Requires \         |
|[                Quoted       Quoted       Requires \         |
|]                Quoted       Quoted       Requires \         |
|;                Quoted       Quoted       Requires \         |
|'                Quoted       Impossible   Requires \         |
|"                Impossible   Quoted       Requires \         |
|`                Impossible   Quoted       Requires \         |
|\                Quoted       Quoted       Requires \         |
+--------------------------------------------------------------+

The phrase "Quoted" means the meta-character does not have a special meaning. The phrase "Impossible" means the meta-character always has a special meaning, and cannot be quoted. The phrase "Requires \" says that a backslash is required to escape the special meaning.

To use the table, imagine you have a file with the same name as a meta-character. Suppose the filename was "?" (a question mark). You have three methods of specifying the exact filename when deleting it:

rm "?"
rm '?'
rm \?

If the file had the name "!" (an exclamation mark), then the table states you always need a backslash:

rm "\!"
rm '\!'
rm \!

Notice that some combinations don't work. That is, there is no way to place a single quote inside a string delineated by single quotes. Here comes the tough question. Suppose you wanted to do the impossible. How do you solve this problem?

Solving special cases

Normally, having different quotes is convenient. You can use one form of quote to contain the other form:

echo " ' "
echo ' " '

How can you place a quote within a quoted string, when the quote is the same type? The simple answer? You can't. But there is a simple trick that can be used for all complex cases. It requires a different view of quoted strings. You see, they are not really strings. Most programmers think a string is defined by the quotes at the beginning and at the end. That is, you place quotes around the string, and insert special characters in the middle to get around any tricky conditions. This does not accurately describe a UNIX shell. There is no concept of strings in the shell. Instead, the shell has an internal flag which can be enabled and disabled by the quote characters. The following examples are all identical:

rm "a b"
rm a" "b
rm "a "b
rm a" b"
rm "a"" ""b"
rm "a"' '"b"

In some cases, the letters "a" and "b" are quoted. In other cases, they are not, because they do not need to be escaped. The space, on the other hand, is always quoted in each example above. The secret is understanding which characters have to be quoted, and selecting the best way to quote them.
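To convince yourself that these forms really are the same, here is a small sketch. It is written for the Bourne shell, and "count_args" is a made-up helper, not a standard command. It reports how many words it was given, so you can see that the quoted space never splits the word.

```shell
#!/bin/sh
# Sketch: each of the first six forms produces the single word "a b".
# count_args is a made-up helper that reports how many words it received.
count_args() {
    echo "$# argument(s), first is <$1>"
}

r1=`count_args "a b"`
r2=`count_args a" "b`
r3=`count_args "a "b`
r4=`count_args a" b"`
r5=`count_args "a"" ""b"`
r6=`count_args "a"' '"b"`
r7=`count_args a b`        # unquoted space: two words

echo "$r1"
echo "$r7"
```

Only the last call, where the space is not quoted at all, sees two arguments.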

Now suppose you want to include a double quote inside a double quote? You can't. But you can switch the types of quotation marks at any point. The last example switches the quotation marks from double quotes to single quotes. This same technique can be used to delete a file with a double quote in the filename. Here are four ways to do this:

rm 'a"b'
rm a'"'b
rm a\"b
rm "a"'"'"b"

Here are some other examples:

echo "The date command returns " '"' `date` '"'
echo 'Here is a single quote:' "'"

Passing variables inside a string

A common question is how to pass a variable to the middle of a string. Suppose you wanted to write an AWK script that printed out one column. To print the first column is easy:

awk '{print $1}'

But this script always prints the first column. To pass the column number to the script requires the same techniques:

awk '{print $'$1'}'

This is hard to read, and you need a certain knack to become accustomed to understanding it. Just scan from left to right, keeping track of the current quote state, and which character was used to enable the quote condition. After a while, it becomes easy.
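Here is the same trick wrapped in a small Bourne shell function, as a sketch only. The name "print_column" is invented for the example. I also put double quotes around the shell's $1, which the one-liner above does not do, so that an odd argument cannot be split into several words.

```shell
#!/bin/sh
# Sketch: print_column is a made-up function. Its first argument is
# the column number to print from standard input.
print_column() {
    # The single quotes close just before the shell's $1 and open again
    # right after it, so awk sees {print $2} when $1 is 2.
    awk '{print $'"$1"'}'
}

second=`echo "one two three" | print_column 2`
echo "$second"
```

Scanning left to right: quote on, quote off for the shell variable, quote on again for the closing brace.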

C shell Globbing

This is my third tutorial on the C shell. This month, I will discuss filename expansion, and globbing.

One of the primary functions of a shell is to provide an easy way to execute commands, and to pass several files to a command. Before the shell determines which command to execute, it evaluates variables, and performs filename substitutions. This is a fancy way of saying you can abbreviate filenames. The Bourne shell only supports globbing. The C shell supports three forms of filename substitutions: globbing, in-line expansion, and home directory expansion.

Globbing

The first UNIX shell was called the Mashey shell. This was before the C shell and Bourne shell were written. The Mashey shell didn't have filename substitution. If you wanted to list every file in a directory, the shell did not support

ls *

Instead, you had to use the glob command:

ls `glob *`

Glob is a precise, scientific term that is defined as "a whole messa," which is not to be confused with "glom" which means "view" or "examine." An example would be "I wanna glom a whole messa files." The proper terminology, used by those with a doctorate in computer science, is of course "I wanna glom a globba files." And anyone with a similar doctorate will know exactly what this means. Try this at your next dinner party, and you too can impress the neighbors.

Needless to say, after creating, editing, printing, and deleting globs of files day after day, someone realized that life would be easier if the shell did the globbing automatically. And lo, UNIX shells learned to glob.

The C shell has this feature of course. The easiest way to learn how globbing works is to use the echo command:

echo *

This will echo every file in the directory. Well, not every one. By convention, files that start with a dot are not echoed. If you want to list all files, use

echo * .*

In other words, the dot must be explicitly specified. The slash "/" must also be matched explicitly. Other utilities, like find, use this same convention. The asterisk matches everything except these two characters, and means "zero or more characters." Therefore

echo a* b* c*

will echo all files that start with an "a," "b," or "c." Note that the shell sorts the files in alphabetical order. But each expansion is sorted separately. If you executed

echo c* b* a*

the order would first be the files starting with c, then with b, then with a. Within each group, the names would be sorted. Also note that the shell puts all of the files on one line. The echo command can be used as a simple version of the ls command. If the ls command is given the same patterns, it will sort the filenames again, so they will be in alphabetical order.
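A quick sketch shows the difference. It uses the Bourne shell and a scratch directory with made-up file names, so it can be tried without touching your own files.

```shell
#!/bin/sh
# Sketch: the directory and file names are made up for this example.
dir=${TMPDIR:-/tmp}/glob.$$
mkdir "$dir"
touch "$dir/a2" "$dir/a1" "$dir/b2" "$dir/b1" "$dir/c2" "$dir/c1"

# Each pattern is expanded and sorted on its own, in the order given.
forward=`cd "$dir" && echo a* b* c*`
backward=`cd "$dir" && echo c* b* a*`

# ls receives the same names as the second echo, but sorts them again.
listed=`cd "$dir" && ls c* b* a* | tr '\n' ' '`

rm -rf "$dir"
echo "$forward"
echo "$backward"
echo "$listed"
```

The second echo keeps the c files first, while ls puts everything back in alphabetical order.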

If you want, you can enable or disable globbing. The command

set noglob

disables globbing, and

unset noglob

enables it again. These two echo statements do the same thing:

set noglob
echo a* b* c*
echo 'a* b* c*'

Both echo the string "a* b* c*" instead of expanding to match filenames.

What happens if you type the command

echo Z*

and you have no files in your directory that start with a "Z"? The Bourne shell will assume you know what you are doing, and echo "Z*" without any complaints. Therefore if you have a Bourne shell script, and execute

myscript Z*

then the script myscript will get the argument "Z*" instead of a list of filenames. The C shell gives you a choice. If none of the filename substitutions find a file, you will get an error that says:

No match

But if you set the "nonomatch" variable:

set nonomatch

then the C shell behaves like the Bourne Shell.
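The Bourne shell behavior is easy to check with a short sketch. It assumes a POSIX-style sh with default settings. The function "show_args" is a made-up stand-in for myscript, and the empty scratch directory guarantees that nothing matches.

```shell
#!/bin/sh
# Sketch: show_args is a made-up stand-in for "myscript".
show_args() {
    echo "$# argument(s): $*"
}

# An empty scratch directory, so that Z* cannot match anything.
dir=${TMPDIR:-/tmp}/nomatch.$$
mkdir "$dir"

result=`cd "$dir" && show_args Z*`

rmdir "$dir"
echo "$result"
```

The function receives one argument, the pattern itself, which is what a C shell script would see with nonomatch set.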

You should also note that the asterisk can be anywhere in a filename, and you can use any number of asterisks. To match files that have the letters a, b and c in sequence, use

echo *a*b*c*

Match a single character

The meta-character "?" matches a single character. Therefore

echo ???

will match all filenames that have exactly three characters, while

echo ???*

will match files that have three or more characters.

Matching character sets

You can match combinations of characters, using square brackets. If any of the characters inside the square brackets match the single character in the filename, the pattern will match. You can also specify a range of characters using a hyphen. Therefore the following are equivalent:

echo a* b* c*
echo [abc]*
echo [a-c]*

To match any file that starts with a letter, you can use any of the following:

echo [a-zA-Z]*
echo [abcdefghijklmnopqrstuvwxyzA-Z]*
echo [ABCDEFGHIJKLMNOPQRSTUVWXYZa-z]*
echo [zyxwvutsrqponmlkjihgfedcbaA-Z]*
echo [A-Zzyxwvutsrqponmlkjihgfedcba]*

As you can see, the order doesn't matter, unless a hyphen is used. If you specify a range in a reverse alphabetical order, the results are unpredictable. The command

echo [c-a]*

will only match files that start with "c" using the C shell, while the Bourne shell will match files that start with "c" or "a." Use improper ranges, and the different shells give different results. The command

echo [c-b-a]*

will only match files that start with a "c" with the C shell, while the Bourne shell will match files that start with a, b or c. Apparently the Bourne shell will treat strange range values as individual characters, while the C shell ignores bogus ranges, except for the starting character.

Combining meta-characters

You can combine these meta-characters in any order. Therefore it makes sense to pick filenames that are easy to match with meta-characters. If you had ten chapters in a book, you don't want to name them

Part1.bk Part2.bk ... Part10.bk

You see, if you specified the chapters like this:

ls Part?.bk Part10.bk

the shell would expand all of the meta-characters, and then pass this to the ls command, which would then change the order. Therefore after file "Part1.bk" would be "Part10.bk" followed by "Part2.bk." Instead, use the names

Part01.bk Part02.bk ... Part10.bk

so the alphabetical order is always the logical order. This can be very useful if you have log files, and use the current date as part of the file name.

It is also important to note that the shell evaluates variables before expanding filenames. So you can specify

echo $USER??.out

and the shell will first evaluate the variable "USER," (to "barnett" in this case) and then perform the filename substitution, matching files like "barnett12.out" and finally sorting the filenames in alphabetical order. Then the list of filenames is passed to the echo command.

In-line expansion

The C shell has a unique feature, in that it can generate patterns of filenames even if they don't exist. The curly braces are used, and commas separate the pattern. Suppose you have the file "b0" in the current directory. The command

echo b[012]

will only echo

b0

But the command

echo b{0,1,2,3,0}

will generate

b0 b1 b2 b3 b0

Notice that the order is not sorted alphabetically. Also note that "b0" occurs twice. If we change this to

echo [b]{0,1,2,3,0}

the output again becomes

b0

The in-line expansion comes first, and then the filename substitution occurs. You can put meta-characters inside the curly braces. The following two are equivalent:

echo b* b?
echo b{*,?}

The number of characters can change within the commas. These two commands are equivalent:

echo A1B A22B A333B
echo A{1,22,333}B

This in-line expansion has a multiplying effect. The command

echo {0,1,2,3,4,5,6,7,8,9}{0,1,2,3,4,5,6,7,8,9}

will print out 100 numbers, from 00 to 99. The Bourne shell does not have this feature.
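For comparison, here is one way to build the same list in the Bourne shell, using two nested loops. It is only a sketch, and the variable names are mine.

```shell
#!/bin/sh
# Sketch: build the 100 two-digit combinations without in-line expansion.
digits="0 1 2 3 4 5 6 7 8 9"
numbers=""
for tens in $digits
do
    for ones in $digits
    do
        numbers="$numbers $tens$ones"
    done
done

# Load the list into the positional parameters to count and inspect it.
set -- $numbers
count=$#
first=$1
for last in "$@"
do
    :       # do nothing; after the loop, "last" holds the final value
done
echo "$count numbers, from $first to $last"
```

Eight lines of loops against one line of curly braces: this is the kind of convenience that makes the C shell seductive.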

Home directory expansion

The last form of filename expansion is the "~" character. By itself, it expands to the value of the built-in variable "home." You can also provide a username. This allows you to specify a filename in someone's home directory, without knowing where that directory is:

more ~smith/.login

The C shell determines this value by examining the password file.

C Shell Variable Usage

An essential part of understanding the C shell is mastering variables and variable lists. Setting variables is simple:

set myvariable = word
set myvariable = "a string"

The second form is necessary if spaces are needed inside the variable. I always put spaces around the equals sign, as the C shell seems to complain less with extra spaces. To clear a variable, set the variable to an empty string:

set myvariable = ""

To remove a variable, so it is no longer defined, use the unset command:

unset myvariable

Passing arguments to a shell script

If you create your own shell script, you can pass parameters similar to the Bourne shell. The C shell even uses the same convention of $1 and $*:

#!/bin/csh
echo First argument is $1, second argument is $2
echo All of the arguments are $*

Click here to get file: Csh1.csh
However, there is no $# variable, which holds the number of arguments in the Bourne shell.

How can you check the number of arguments? The C shell has a second way of specifying the arguments passed to the shell script, using a predefined variable list, called "argv." The C shell borrowed this concept from the C programming language. Wonder why? (Hint: it is called the "C" shell for a reason.) The value $argv[1] is equivalent to $1, and $argv[2] is the same as $2. Furthermore, $argv[*] is equivalent to $*. The variable $#argv contains the number of arguments passed to the shell:

#!/bin/csh
echo There are $#argv arguments
echo First argument is $argv[1], second argument is $argv[2]
echo All of the arguments are $argv[*]

Click here to get file: Csh2.csh
There is a subtle difference between $2 and $argv[2]. If there is only one parameter, the line

echo Second argument is $2

will print

Second argument is

while

echo Second argument is $argv[2]

will generate an error, and print

Subscript out of range

The C shell has a mechanism to prevent this error. It is also used to specify a subset of the list of values:

echo All of the arguments but the first are $argv[2-$#argv]

This does not generate an error. With typical UNIX terseness, the above can be replaced with either of the two following statements:

echo All of the arguments but the first are $argv[2-*]
echo All of the arguments but the first are $argv[2-]

This is a general purpose mechanism, as any value can be used to specify the range:

echo first through third arguments are $argv[1-3]

If you specify a specific value, the argument has to be there. That is, $argv[2] will generate an error if the second argument does not exist. $argv[2-] will not. If the first value of the range is missing, the default is 1, or the first parameter.

Variables are allowed inside square brackets.

set first = 1
set last = $#argv
echo first argument is $argv[$first]
echo last argument is $argv[$last]
echo all of the arguments are $argv[$first-$last]

Arguments generalized as array lists

As you can see, the C shell allows you to easily specify portions of the argument list using ranges of values. This can be very convenient. Apparently the author of the C shell agreed, because any variable can be an array list. The simplest way to create an array list is to use the backquote characters:

set mydate = `date`
# returns a date, like
# Tue Jan 7 17:26:46 EST 1997
echo Month is $mydate[2], year is $mydate[$#mydate]

The Bourne shell has an array list, but only one, and it doesn't have a name. It is the argument list, which is set when the script is called. You can change this with the "set" command. You must, however, remember to save the current values if you want to use them again. So if you need to manage several arrays, and perhaps use one index to access several arrays, the C shell might make the task easier. Yes, I realize some consider my statement blasphemous. Today, I'm wearing my "Don't bug me because I'm using the C shell" pin, so I don't care. If you want to give me grief, read my "Curse of the C shell" column three months ago. I'm not going to sugar coat the C shell beast. It's covered with warts, but it does have some convenient features. Just watch out for the warts. Speaking of which...

The C shell does not elegantly handle variables that contain spaces. There is no equivalent to the $@ variable. Consider this:

set a = "1 2 3"
set b = $a

Any sane person might expect that variables "b" and "a" have the same value. Surprise! This doesn't work in the C shell. On some versions of SunOS, the variable "b" is empty. On other systems, you get "set: syntax error."

How can you copy an array? The command below should work, but doesn't:

set copy = $argv[*]

It behaves like the example above. Unpredictable. If you enclose the variable in double quotes, like this:

set copy = "$argv[*]"

then all of the parameters are in $copy[1], and $copy[2] is not defined. A slight improvement. The C shell has a special mechanism for specifying arrays, using parentheses:

set copy = ( $argv[*] )

Therefore the following two statements are equivalent:

set mydate = `date`
set mydate = ( `date` )

Parentheses are preferred for several reasons. The following commands do not work:

set c = $*
set c = $argv[*]
set c = a b
set c = "a" "b"
set c = `date` `date`
set c = `date` $a "b c"

Some do nothing. Some generate an error. Some cause a core dump. Life would be dull without the C shell. However, adding parentheses solves almost all of the problems:

set c = ( $* )
set c = ( $argv[*] )
set c = ( a b )
set c = ( "a" "b" )
set c = ( `date` `date` )
set c = ( `date` $a "b c" )

This eliminates the unpredictable behavior. The only problem left is spaces inside variables. Since there is no $@ variable, the only way to retain spaces in an array is to copy each element over, one by one:

set b[1] = "$a[1]"
set b[2] = "$a[2]"
etc.

This, by the way, uses a special form of the "set" command that can modify a single array element. However, the array must exist first, so the code to copy an array is complicated, and darn ugly too.

In general, the C shell is not suited for variables that contain spaces. Also, if there is any possibility that an argument contains a space, you should use

set a = ( $argv[1] )

or

set a = "$argv[1]"

instead of

set a = $argv[1]

Alternately, just tell anyone who uses spaces in arguments to perform an impossible biological act. I've learned this from my colleagues, who've come to the conclusion that anyone who worries about spaces inside arguments is someone who has way too much free time. I don't know anyone like that, however.

Clearing array lists

The command:

set a = ""

defines variable $a[1], and sets it to an empty string. $#a is equal to 1. However, the command

set a = ()

empties the array, so variable $a[1] does not exist. The variable $#a is equal to 0. The command

unset a

removes the definition of the variable.

Testing for variable existence

The Bourne shell lets you refer to variables that do not exist. If you ask for the value, you will get an empty string. The C shell will give you a warning if the variable does not exist, or the array element does not exist. Using $1 instead of $argv[1] can help, as well as using ranges, like $argv[1-]. There is another method, by using the special variable $?x, where x is the name of the variable. If the variable does not exist, it will have the value of zero. If the variable exists, it will have the value of one. You can combine this with an "if" statement, which is somewhat more cumbersome than the Bourne shell technique.

The Shift command

The special command "shift" can be used to remove the first array element. It "pops" the first value off the stack of the argument list. The "shift" command can take an optional variable name:

set a = ( a b )
shift a
# same as
# set a = ( b )

If you listen carefully, you can hear a slight noise as each variable pops off the stack. I've suddenly realized I've been working too hard. I'll continue next month. Until then, take care.

C Shell Flow Control

We've been talking about variables, lists, and strings. Time to start doing something useful with the C shell. Let's start with a simple way to branch.

myprogram && echo The program worked

If the program "myprogram" has no errors, the script echoes "The program worked." In other words, if the program "myprogram" exits with a status of zero, the "echo" program is executed. If any errors occur, the program would exit with a positive value, which typically indicates the error that occurred. To test for an error, use "||" instead:

myprogram || echo the program failed

These can be combined:

myprogram && echo program passed || echo program failed

This can be used for many tests, but there are some points to watch out for. If the program "myprogram" generates any output, it will be sent to standard output, which may not be what you want. You can capture this by redirecting standard output:

myprogram >/dev/null && echo program passed

If the program might generate an error, you can capture this by using the special combination ">&." This merges standard error with standard output:

myprogram >& /dev/null && echo program passed

This type of conditional operation can be enclosed in parentheses to keep standard input flowing through the different programs. For instance, if you wanted to test standard input for the word "MATCH," and either add the word "BEFORE" if the match is found, or add "AFTER" if no match is found, you can use the following ugly code:

#!/bin/csh
tee /tmp/file |
( grep MATCH /tmp/file >/dev/null &&
( echo BEFORE; cat - ) || ( cat - ; echo AFTER) )
/bin/rm /tmp/file

If you save this script in a file called "prog" and type

echo MATCH | prog

the script will echo

BEFORE
MATCH

If instead you execute

echo no | prog

the script will output

no
AFTER

The parentheses are necessary, because the "echo" command does not read standard input. It discards it. Putting parentheses around the "echo" and "cat" commands connects the pipe to the input of both programs.

The C shell has some similarities to the Bourne shell. In fact, the above script could be a Bourne shell script. The two shells act the same in this case. However, early versions of the C shell had a bug. The meanings of "&&" and "||" were reversed. This was a long time ago, and I think most versions of the C shell have this bug fixed. Still, it may not be portable. The second difference between the shells is the Bourne shell allows the curly braces to be used as well as the parentheses. Parentheses cause an extra shell to be executed. So the script above requires 4 shells to be executed. The Bourne shell could do it with one shell process. Still, I find it amusing that the above script works for both the C shell and the Bourne shell. But I don't get out much.

The second mechanism for doing tests in the C shell uses the "if" command. There are two formats, a short form and a long form:

# first the short form
if ( expression ) command
# then the long form
if ( expression ) then
command
endif

The expression is a numerical expression. If the result is zero, the expression is false. If non-zero, the expression is true. These three statements will all output "yes:"

if (1) echo yes
if (-1) echo yes
if (2) echo yes

If you want to use a program in the "if" statement, similar to the "&&" test, it can be placed in back quotes. I'll use the long form of the "if" statement:

if ( `myprogram` ) then
echo yes
endif

In this case, the exit status of the program is not used. Therefore, the only output from these statements is one "yes," while the other commands do not print:

if (`echo`) echo no
if (`echo 0`) echo no
if (`echo ""`) echo no
# Only the next statement is true:
if (`echo 1`) echo yes

This might strike you as inconsistent, and you would be right. The Bourne shell uses the "test" program to convert numbers and strings into status. The "&&" and "||" commands also use the status to branch, in both the C shell, and the Bourne shell. The "if" command does not use status. Assume I created a script, called "myexit," that was simply this:

#!/bin/csh
exit $*

The following expressions would be true:

true && echo yes
myexit 0 && echo yes
if (1) echo yes

This is confusing, but the trick is to remember that zero is true in the exit status, while non-zero is true in expressions. There is a way to get the exit status inside an expression. It's not a common technique, however. The solution is to enclose the command inside curly braces. If the command within curly braces is successful (status of zero), then the expression is 1, or true:

if ( { myprogram } ) echo myprogram worked

However, if you try to redirect standard output, it will not work:

if ( { grep pattern file >/dev/null } ) echo found pattern

This only tests for no errors. Most of the time people use the special variable "$status," which contains the actual exit status of the last command. (The tcsh variant also accepts "$?" as a synonym.)

Optional forms of the if command

The short version of the "if" command without a "then" statement only takes one statement. You cannot use parentheses to add additional commands. The following statement generates an error:

if (1) (echo a;cat) # an error

The "then" word must be included to allow multiple statements. If you have the longer form, you can optionally add "else" and "else if" commands:

if ($debug) then
echo debug mode
else if ($verbose) then
echo verbose mode
else
echo "normal mode"
endif

Any number of "else if" statements can be used, including none.

Problems with the if command

The C shell has many problems with the "if" command, that the Bourne shell does not have. At first, the C shell seems adequate, but when you try to accomplish something difficult, it may not work. I've already mentioned:

if (1) (echo a;cat) # an error

The solution is to use the long form, with the "then" word.

Suppose you wanted to optionally empty a file, removing all contents. You might try the following:

if ($empty) echo "" >file

However, this does not work as expected. The file redirection is started before the command is executed, and before the value of the variable is determined. Therefore the command always empties the file. The solution is again, to use the long form.

You can nest "if" statements inside "if" statements, but combining long and short forms doesn't always work. Notice a pattern? I personally avoid the short form, because if I add an "else" clause, or a nested "if" statement, I get a syntax error.

The next problem is more subtle, but an indication of the ad hoc parsing done by the C shell. The commands that deal with flow control must be the first command on the line. As you recall, the Bourne shell specifies commands have to be on a new line, but a semicolon works as well as a new line. The C shell requires the command to be the first word. "So what?" you may ask. You may have never experienced this problem, but I have. Try to pipe something into a conditional statement. The C shell won't let you do this without creating another script.

The While command

The second command used for flow control is the "while ...end" command. This statement will execute the statements inside the loop while the condition is true. I often type the following in a window to examine a print queue:

while (1)
lpq
sleep 10
end

This command runs until I press control-C. "While" and "end" must be the first words on their lines.

As for problems, it's hard to use the C shell to read a file one line at a time.

The Foreach command

The third command, "foreach," is used to loop through a list. One common use is to loop through the arguments of a script:

#!/bin/csh
# arguments are $*
foreach i ( $* )
echo file $i has `wc -l <$i` lines
end

It can also be used with other ways of getting a list:

# remember the shell expands meta-characters
foreach file ( A*.txt B*.txt )
echo working on file $file
# rest of code
end

# here is another example
foreach word ( `cat file|sort` )
echo $word is next
# rest of code
end

The Switch statement

The last flow-control command is the "switch" statement. It is used as a multiple "if" statement. The test is not an expression, but a string. The case statement evaluates variables, and strings with the meta-characters "*," "?," "[" and "]" can be used. Here is an example that shows most of these variations. It prints out the day of the week:

#!/bin/csh
set d = (`date`)
# like
# set d = ( Sun Feb 9 16:08:29 EST 1997 )
# therefore
# d[1] = day of week
set day2="[Mm]on"
switch ( $d[1] )
case "*Sun":
echo Sunday; breaksw
case $day2:
echo Monday;breaksw;
case Tue:
echo Tuesday;breaksw;
case Wed:
echo Wednesday;breaksw;
case Th?:
echo Thursday;breaksw;
case F*:
echo Friday;breaksw;
case [Ss]at:
echo Saturday;breaksw;
default:
echo impossible condition
exit 2
endsw

The statements can't be on the same line as the "case" statement. Again, the ad hoc parser is the reason. The "breaksw" command causes the shell to skip to the end. Otherwise, the next statement would be executed.

Complex Expressions

The Bourne shell uses the program expr to perform arithmetic. It uses the test program to compare the results. The C shell can both calculate complex expressions and test them at the same time. This provides simplicity, but there are penalties, which I will discuss later. Table 1 shows the list of operators, in order of precedence. Operators in the same box have the same precedence.

+------------------------------------+
| Operator   Meaning                 |
+------------------------------------+
| (...)      Grouping                |
+------------------------------------+
| ~          One's complement        |
+------------------------------------+
| !          Logical negation        |
+------------------------------------+
| *          Multiplication          |
| /          Division                |
| %          Remainder               |
+------------------------------------+
| +          Addition                |
| -          Subtraction             |
+------------------------------------+
| <<         Bitwise shift left      |
| >>         Bitwise shift right     |
+------------------------------------+
| <          Less than               |
| >          Greater than            |
| <=         Less than or equal      |
| >=         Greater than or equal   |
+------------------------------------+
| ==         Equal to                |
| !=         Not equal to            |
| =~         Pattern match           |
| !~         Pattern doesn't match   |
+------------------------------------+
| &          Bitwise AND             |
+------------------------------------+
| ^          Bitwise exclusive OR    |
+------------------------------------+
| |          Bitwise inclusive OR    |
+------------------------------------+
| &&         Logical AND             |
+------------------------------------+
| ||         Logical OR              |
+------------------------------------+

The operators "==," "!=," "=~" and "!~" operate on strings. The other operators work on numbers. The "~" and "!" are unary operators. The rest require two numbers or expressions. Null or missing values are considered zero. All results are strings, which may also represent decimal numbers. Table 2 shows some expressions, and the value after being evaluated.

		+--------------------------+
		|Expression	   Results |
		+--------------------------+
		|~ 0			-1 |
		|~ 1			-2 |
		|~ 2			-3 |
		|~ -2			 1 |
		|! 0			 1 |
		|! 1			 0 |
		|! 2			 0 |
		|3 * 1			 3 |
		|30 / 4			 7 |
		|30 % 4			 2 |
		|30 + 4			34 |
		|4 - 30		       -26 |
		|512 >> 1	       256 |
		|512 >> 2	       128 |
		|512 >> 4		32 |
		|2 << 4			32 |
		|3 << 8			768 |
		|4 < 2			0 |
		|4 >= 2			1 |
		|ba =~ b[a-z]		1 |
		|7 & 8			0 |
		|7 ^ 8			15 |
		|15 & 8			8 |
		|15 ^ 8			 7 |
		|15 ^ 7			 8 |
		|15 | 48		63 |
		|15 | 7			15 |
		|15 && 8		 1 |
		|15 || 8		 1 |
		+--------------------------+

The C Shell also supports file operators, shown in table 3.

+-----------------------------------------------------------+
|Operator      Meaning					    |
+-----------------------------------------------------------+
|-r filename   Returns true, if the user has read access    |
|-w filename   Returns true, if the user has write access   |
|-x filename   Returns true, if the user has execute access |
|-e filename   Returns true, if the file exists		    |
|-o filename   Returns true, if the user owns the file	    |
|-z filename   Returns true, if the file is empty	    |
|-f filename   Returns true, if the file is a plain file    |
|-d filename   Returns true, if the file is a directory	    |
+-----------------------------------------------------------+

If the file does not exist, or is inaccessible, the test returns false.

Commands that use expressions

Only two flow-control commands support complex expressions, "while," and "if." The "exit" command also takes a complex expression. This allows sophisticated passing of exit codes to a calling program, but I've never seen a C shell script that makes use of this. Surprisingly, the "set" command does not use complex expressions. If you execute the command

set a = ( 1 + 2 )

This creates a list of three elements, and $a[2] has the value of "+." There is a mechanism to assign complex expressions to variables. A special command, called "@" is used. You must have spaces after the "@" character. Spaces are almost always required between all operators and expression elements. The C shell likes spaces, and gets grumpy if you don't include enough. A vitamin deficiency, I guess. The "@" command also supports the "++" and "--" postfix operators, which increment or decrement the variable specified. This construct was taken from the C language.

Also borrowed from the C language are the assignment operators +=, -=, *=, /=, and %=. The expression

@ a %= 4

is the same as

@ a = $a % 4

Other examples of the "@" command are:

@ a++
@ b=$a + 4
@ c*=3
@ c=4 + $b

Examples

Suppose you want to source a file, but are afraid that someone might replace it with another file. A crude example that checks if a file is owned by you, and readable, would be:

if ( -o $file && -r $file ) then
	source $file
endif

This example isn't 100% secure, but it is slightly better than blindly sourcing a file without checking who owns it. Most of the time the file test operators are used to prevent runtime errors caused by files that are not readable, or executable.

Notice the foreach command does not support complex expressions. You can emulate the C language "for" construct using "while." This code fragment counts up to 10 using a list:

foreach i ( 1 2 3 4 5 6 7 8 9 10 )
echo $i
end

However, if you wish to count to 100, this becomes impractical. Here is how you can do it using complex expressions:

@ a = 1
@ MAX=100

# count from 1 to $MAX

while ( $a <= $MAX )
echo $a
@ a++
end

Tricky expressions to test

One stumbling block people discover is looking for command line arguments. Suppose your script will accept a "-r" option. However, the following line will not work:

if ( $argv[1] =~ -r ) echo found it

If the first argument is "-r," then this is evaluated as:

if ( -r =~ -r ) echo found it

The C shell will assume you meant to use a file operator, and test the file "=~" to see if it is readable. Then it sees the next operator, which is again a "-r," but in this case there is no filename afterwards. This generates a syntax error. The solution is to place a dummy character before both strings:

if ( X$argv[1] =~ X-r ) echo found it

Parenthesis in the C shell

In complex expressions, parentheses can be used to alter the default precedence in evaluation. To put it another way, when in doubt, use parentheses. Both expressions below do the same thing:

if ( $a + 2 > 5 ) echo yes
if ( ( $a + 2 ) > 5 ) echo yes

However, parentheses have several jobs. The context specifies how parentheses are used. This is where the parsing of the C shell shows some additional warts. In these examples, the parentheses are used to specify a list:

set i = ( a b c d e f g )
foreach i ( a b c d e f g )
echo $i
end

These show expressions:

if ( $x > 2 ) echo yes
while ( $x )
@ x--
end

This is an example of creating a subshell:

(stty 9600;cat file) >/dev/printer

And this is an example of grouping:

@ x = ( ( $b + 4 ) & 255 ) << 2

And here is an example where the parentheses have two different meanings:

if ( ( $b + 4 ) > 10 ) echo yes

I've tried to combine several of these uses into one statement, and it generates errors. I'm not surprised.

Break and continue

The C shell has special "escape" commands, used to exit from "while" and "foreach" loops. The break command will escape out and terminate the loop. The continue command will go to the end of the loop, but cycle through again. Here is a complete shell script that prints out the numbers 2, 4, 6 and 8, but it's nothing to cheer about.

#!/bin/csh

foreach i ( 1 2 3 4 5 6 7 8 9 10 11 12 )

# if 9, exit
if ( $i == 9 ) break
# if odd, then don't echo
if ( $i % 2 == 1 ) continue
# Even numbers only
echo $i
sleep 1
end
echo 'Who do we appreciate?'
sleep 1; echo $USER
sleep 1; echo $USER
sleep 1; echo $USER
sleep 1; echo 'Yeah!!!!!!'

I think this covers most of the issues with complex expressions. Let me know if you have any questions.

Interactive Features of the C shell

Bill Joy's Legacy

In my discussion of the C shell, I've described the good points and bad points of the C shell. For those who are keeping score, the Bourne shell is ahead 10 to 2. Why is the C shell so popular? To explain this requires a short history lesson. Forgive me.

When I went to college in the early 70s, programming meant going to the keypunch station and carrying around decks of punch cards. The first time I used an interactive terminal, directly connected to a computer, it was a hard-copy device. A popular interactive terminal at that time was the Teletype. It had a keyboard, and printed on an ugly roll of yellow paper. Some models supported a paper tape reader and paper tape punch. It was a bargain, because one machine was a terminal, printer, and backup device. A complete I/O system in one unit, for less than $10,000!

We had a programmer whose job was to edit paper tapes, punching new holes, and splicing paper. The only devices used were a Teletype and a tape splicer. If a mistake was made, the "programmer" would position the paper tape just right, and punch out all of the holes to erase that letter. In case you wondered, this is why the ASCII code for delete is 11111111 in binary. Each "1" corresponds to a hole, and a row of 8 holes corresponded to a deleted character. If you made a mistake, you could "erase" the mistake without starting a brand new tape. Just back up and punch out the error.

We felt fortunate when the boss ordered several video terminals, which cost more than a complete PC system nowadays. Editing was done by the terminal, which had memory, and keys with arrow characters. We were truly excited. With our new program, that only worked with our new terminal, and 64K of RAM, and 5 MB of hard disk, we now had a real computer system. I imagine Berkeley had computers of a similar configuration, and was equally excited when they got their first VAX.

Trouble was, the default editor, ed, was designed for those old-fashioned hard copy terminals. Ed has a consistent user interface. If you typed something right, it said nothing. If you typed something the program didn't understand, it printed out a single question mark. The authors felt that this was sufficient information, and the interpretation of the error message should be obvious. Nowadays people comment on this statement as proof that UNIX was not originally user-friendly. Wrong! You see, Teletypes were incredibly loud. Teletypes were also incredibly slow. I remember pounding on the keys on a model 33 Teletype. Pounding is the right word. I imagine martial-art students practiced on the keyboard, in their attempt to develop strong-as-steel fingers. Alas, I lacked the skill, and the instant I made a mistake, the Teletype let loose a 70 decibel machine-gun rat-a-tat-tat for 5.7 seconds, as it typed "Syntax error, please type command again - you stupid person." Needless to say, this irritated me immensely. When I am irritated, I make mistakes. If the operating systems I used only printed a question mark, the years of electro-shock therapy might not have been necessary.

There must be a better way. There was. Bill Joy decided the standard editor sucked, and wrote an editor that did not require special hardware, and allowed you to see what your file looked like while you were editing it. It was a WYSIWYG editor for ASCII characters. He wrote a library called termcap to go along with his vi editor. Most people forget what a major breakthrough this was. Going through my 1980 edition of the Berkeley UNIX manual, I see that Bill Joy wrote the Pascal interpreter and compiler, along with vmstat, apropos, colcrt, mkstr, strings, whereis, whatis, vgrind, soelim, msgs, and the Pascal modifications to ctags and eyacc. He also wrote a significant part of the virtual memory, demand paging, and page replacement algorithm for Berkeley Unix.

Bill Joy also wrote the C shell. Twenty years later it's easier to find fault in the C shell, compared with current shells. But at the time, the C shell had many new ideas, and many still favor it for interactive sessions. Several years elapsed before the Korn shell was written. And several more years elapsed before it or similar shells became commonly available. I'd like to meet someone who feels they could have done a better job than Bill, in the same conditions.

Enough ranting.

C shell File Redirection

All shells support file redirection using "program > file" and appending to a file using "program >> oldfile." This only redirects standard output. If you want to capture both standard output and error output, the C shell has a simple syntax. Just add a "&" to the angle brackets. This also works with pipes. A list of the different combinations follows:

+---------------------------------------------------------------+
|Characters	 Meaning					|
+---------------------------------------------------------------+
||		 Pipe standard output to next program		|
||&		 Pipe standard and error output to next program |
|>		 Send standard output to new file		|
|>&	 Send standard and error output to new file	|
|>>	 Append standard output to file			|
|>>&	 Append standard and error output to file	|
+---------------------------------------------------------------+

This is very simple, and takes care of 98% of the needs of the typical user. If you want to discard standard output, and keep the error output, you can use

(program >/dev/null) >& file

This takes care of 99% of the cases. It is true the Bourne shell is more flexible when it comes to file redirection, but the C shell is very easy to understand.

The noclobber variable

One of the features the C shell has for new users is a special variable that can prevent a user from "shooting oneself in the foot." Consider the following steps to create and execute a script:

vi script
chmod +x script
script > script .out

One small typo, i.e. the space between "script" and ".out," and the script is destroyed. Here is another example:

program1 >>log
program2 >>log
program3 >>log
program4 >>lag

Because of a typo in the last line, the information is sent to the wrong log file.

Both problems can be prevented very easily. Just set the noclobber variable:

set noclobber

In the first case, you will get an error that the file already exists. The second will generate an error that there is no such file. When the "noclobber" variable is set, ">" must only point to a new file, and ">>" must only point to a file that already exists.

This seems like a great idea, but there are times when this feature gets in the way. You may want to write over an existing file. Or you may want to append to a file, but don't know if the file exists or not. If you want to disable this feature, type

unset noclobber

You may wish to keep this feature enabled, but disable it on a line-by-line basis. Just add a "!" after the angle brackets. This is like saying "I really mean it!" Here are some examples:

# Create new file
program >out
# overwrite the same file
program >! out
# append to a file, even if it doesn't exist.
program >>! log

The noclobber variable also affects the ">&" and ">>&" combinations:

#capture error and standard output
program >&! file
program >>&! log

If you sometimes use the noclobber variable, you have to change your style to use the exclamation point when needed. That is, when you want to append to a file that doesn't exist, or write to a file that may exist. A complete list of all of the variations, and their meaning, is below. Notice how the meaning changes depending on the noclobber variable:

+--------------------------------------------------------------------------------------+
|Characters	  Noclobber	   Meaning					       |
+--------------------------------------------------------------------------------------+
||		  Doesn't Matter   Pipe standard output to next program		       |
||&		  Doesn't Matter   Pipe standard and error output to next program      |
|>		  Not set	   Send standard output to old or new file	       |
|>		  Set		   Send standard output to new file		       |
|>&	  Not set	   Send standard and error output to old or new file   |
|>&	  Set		   Send standard and error output to new file	       |
|>>	  Not set	   Append standard output to old or new file	       |
|>>	  Set		   Append standard output to old file		       |
|>>&	  Not set	   Append standard and error output to old or new file |
|>>&	  Set		   Append standard and error output to old file	       |
|>!		  Doesn't Matter   Send standard output to new or old file	       |
|>&!	  Doesn't Matter   Send standard and error output to new or old file   |
|>>!	  Doesn't Matter   Append standard output to old or new file	       |
|>>&!	  Doesn't Matter   Append standard and error output to old or new file |
+--------------------------------------------------------------------------------------+

Safe Aliases

A second modification that Berkeley made was to add the "-i" options to the cp, mv and rm commands. These options warned the user if a file was going to be destroyed. The C shell supported an alias feature that allows you to define new commands. If you type

alias move mv

then when you execute the command "move," the C shell performs a substitution, and executes the "mv" command. The new options, along with the alias command, allowed new users to specify

alias mv mv -i
alias cp cp -i
alias rm rm -i

and before any file is destroyed, the system would warn you. If you define these aliases, and wish to ignore them, just place a backslash before the command:

\rm *

This turns off the alias mechanism.

C Shell Start-up Files

Since I've started talking about the C shell interactive features, it's time to discuss the start-up files, or files whose name starts with a dot. The C shell uses three dot-files:

+-----------------------------------------------+
| File      Purpose                             |
+-----------------------------------------------+
| .cshrc    Used once per shell                 |
| .login    Used at session start, after .cshrc |
| .logout   Used at session termination         |
+-----------------------------------------------+

To be more accurate, the .logout file is more like a "finish-up" file, but I'll discuss that shortly.

The .cshrc file

The .cshrc file is scanned, or more accurately sourced, every time a new shell starts. The shell executes the source command on the file. If a C shell script starts with

#!/bin/csh -f

or you explicitly execute a C shell script with

csh -f script

then the start-up file is not executed. Think of the -f flag as the fast option. I recommend that every C shell script start with the "#!/bin/csh -f" option. Without it, the C shell executes the $HOME/.cshrc file. Remember - this is the user's personal file. My file is different from yours. If I executed that script, it might not work the same as when you execute the script. A shell script that depends on the contents of the .cshrc file is likely to break when other users execute it: a bad idea.

It is important to understand when these files are sourced. When you execute a program like cmdtool, shelltool or xterm, the value of the SHELL environment variable is used, and that shell is executed. If the shell is the C shell, then the .cshrc file is sourced. If you execute a remote command using rsh, the shell specified in the /etc/passwd file is used. If this is the C shell, then .cshrc is used at the start of the process.

The .login file

The second startup file is the .login file. It is executed when the user logs onto a system. Such a session is called a login shell. Assuming you have specified the C shell as your default shell, if you type your username at the "login:" prompt on a console, or use the telnet or rlogin command, then this is a login shell, and the .login file is sourced. Ever wonder how the shell knows this? The mechanism is simple, but most people don't know about it. When the login program executes a login shell, it makes the first character of the program name a hyphen. That is, if you execute "-csh" instead of "csh", then you would be starting a login shell. You can prove this by copying or linking /bin/csh to a filename that starts with a hyphen:

cd /tmp
cp /bin/csh ./-csh
-csh

Try this and see. (Typing "-csh" only works if the directory holding the copy is in your searchpath.) If you execute "csh", the .cshrc file is read. If you execute "-csh", both the .cshrc and .login files are read. A shell created in this fashion executes the .cshrc file first, then the .login file. Without the hyphen, just the .cshrc file is executed.
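
The test a shell applies to its own name can be imitated in a few lines of Bourne shell. This is only a sketch of the idea, not the C shell's actual code, and the function name is_login_name is made up for illustration:

```sh
#!/bin/sh
# Sketch: classify a program name the way a shell decides whether it
# was started as a login shell. "is_login_name" is a hypothetical helper.
is_login_name() {
    case "$1" in
        -*) echo "login shell" ;;      # the name starts with a hyphen
        *)  echo "ordinary shell" ;;
    esac
}

is_login_name -csh    # prints "login shell"
is_login_name csh     # prints "ordinary shell"
```

Only the first character of the name matters, which is why a copy called "-csh" is enough to trigger the .login file.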

The .logout file

The last file the C shell executes is the .logout file. This only happens when the shell is a login shell.

What goes where?

Knowing when each file is used is very important if you want to keep your account as efficient as possible. People have a tendency to add commands to any file, or in some cases both files. Finally the system behaves the way the user wants, and the changes are kept where they are without understanding the whys and wherefores. As always, I believe in providing tables, allowing you to look up the exact behavior in each condition. This is a summary of those actions:

+-----------------------------------------------------------------+
| Condition                  Files sourced                        |
+-----------------------------------------------------------------+
| Logging on the console     .cshrc, then .login, finally .logout |
| rlogin system              .cshrc, then .login, finally .logout |
| rsh system                 .cshrc, then .login, finally .logout |
| rsh system command         .cshrc                               |
| csh filename               .cshrc                               |
| csh -f filename            -                                    |
| C shell Script without -f  .cshrc                               |
| C shell Script with -f     -                                    |
| Starting a new shell       .cshrc                               |
| Opening a new window       .cshrc                               |
+-----------------------------------------------------------------+

The tricky part about start-up files

There are a couple of problems people have with their start-up scripts. Let me list them.

Determining which commands are the ones you want to execute. You discover a useful setting, and want to add this feature to all sessions.
Learning when you want to execute these commands. Does this feature need to be set once, or for every shell?
Executing commands at the wrong time. Some people put "biff y" in their .cshrc file. This is wrong. It should be in the .login file.
Learning when the two files are sourced, and in which order. Some people think the .login file is executed before the .cshrc file. This seems logical, because the .login file is executed during login, but it is wrong. The .cshrc file is always sourced before the .login file.
Bloat. Don't add commands without thinking about where they go. Commands and options placed in the wrong file may be meaningless, and may slow down your shell.

All of these problems confound the new C shell user. So how can you distinguish between these different cases? Well, the C shell sets various variables under different conditions. The operating system also has variables, independent of the shell. I'll briefly describe the difference, and provide a template.

The C shell prompt variable

I mentioned earlier that you can check if a variable is defined by using "$?" before the variable name. For example, if variable "abc" is defined, then "$?abc" has the value of 1. The $prompt variable is defined when the C shell is in interactive mode. However, when the shell executes a script, the $prompt variable is undefined. Therefore you can tell the two cases apart with code like this:

if ( $?prompt == 0 ) then
# This is a shell script
else
# This is interactive
endif

If you set the prompt in your .cshrc file without testing that variable, then you will be unable to distinguish between interactive sessions and scripts. Why is this important? I have a large number of aliases. How large? I currently have about 300 aliases. You may or may not think this is large. It is significant, however. I noticed my shell was taking longer and longer to execute scripts. When I started customizing my C shell, I tried

if ( $?prompt ) then
...
...
# 300 lines later
endif

This does make the shell faster. However, I found there are two problems with this. As the number of aliases I had grew, it became harder to remember the association between the if/then/endif commands, because they were six pages apart. Good coding style says we should keep modules short and easy to understand. The other problem was a matter of efficiency. Even though the shell didn't have to execute the 300 lines of aliases, it still had to read each line, looking for the "endif" command. This slowed down the shell. Therefore I currently use something like this:

if ( ! $?prompt ) exit
if ( -f ~/.aliases ) source ~/.aliases

This keeps my .cshrc file short, and allows the shell to skip reading a large file when it doesn't have to.

The current terminal

A second important condition used to customize your shell session is the current terminal. This is learned by executing the program "tty." This will either be a serial device, like "/dev/ttya" or a pseudo terminal, like "/dev/pts/1" or the master console on the workstation, which is "/dev/console." Many users customize their shell, so that they automatically start up a window system. I often see something like the following in a .login file:

if ( "`tty`" =~ "/dev/console" ) then
# start up the window system
/usr/openwin/bin/openwin
endif

I place double quotes around the command. This is good practice, because if the command ever fails, the result is an empty string. The double quotes prevent this from becoming a syntax error.

Local variables vs. Environment variables

There are two kinds of C shell variables: local and environmental. Local variables are local to the current shell. If a new shell is created, then it does not have these variables set. You have to set them for each shell. The .cshrc file is used to set local variables.

Environment variables are exported to all sub-shells. That is, if you set an environment variable, and then create a new shell, the new shell inherits the value of this variable. Inherit is the essential concept. The child can inherit the traits of the parent, but anything the child does to itself does not affect the parent. If you specify an environment variable before you start up the window system, then all new shells, i.e. all new windows, will inherit the environment variables from the parent shell. But if you set an environment variable to a different value in each window, this has no effect on the parent process, which is your login shell.
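
The one-way nature of inheritance is easy to try out. Here is a small sketch in the Bourne shell, since the rule is the same for every shell; the variable name DEMO_VAR and the function name are made up for this example:

```sh
#!/bin/sh
# Sketch: a child shell inherits an exported variable, but a change
# made in the child never reaches the parent. DEMO_VAR is a made-up name.
show_inheritance() {
    DEMO_VAR=parent
    export DEMO_VAR
    # The child sees "parent", then changes only its own copy.
    sh -c 'echo "child starts with: $DEMO_VAR"; DEMO_VAR=child; echo "child now has: $DEMO_VAR"'
    echo "parent still has: $DEMO_VAR"
}

show_inheritance
```

The last line printed is "parent still has: parent", because the assignment happened in a separate process.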

You can set environment variables in your .cshrc file. However, this is unnecessary, because any variable set in the .login file will be inherited by all sub-shells. There are two reasons you need to set an environment variable in the .cshrc file. The first is because you need to customize it for each shell. Perhaps different windows have different values. The second reason is that you need to look up something, by executing a program, and want to optimize your shell, so that this only has to be done once. Suppose you wanted to learn what your current computer is called. You could use the following test:

if ( "`hostname`" =~ "grymoire" ) then
...
endif

However, this executes the program "hostname" in every shell. If you want to optimize your shell, then only do this once. The logical place is to do this in your .login file. But you may want to use this information for something that is set in your .cshrc file. One way to solve this problem is to check for a special environment variable, and if it is not set, then execute the command, and set the variable. An example is:

if ( ! $?HOSTNAME ) then
setenv HOSTNAME `hostname`
endif
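
For comparison, the same "look it up once, then inherit it" idea can be sketched in the Bourne shell. The variable name CACHED_HOST and the function name are assumptions of mine, and I use "uname -n" here in place of "hostname":

```sh
#!/bin/sh
# Sketch: look up the host name only if no parent shell has done so.
# CACHED_HOST is a hypothetical variable name; uname -n prints the node name.
cache_host() {
    # ${VAR+set} expands to "set" only when VAR is defined.
    if [ -z "${CACHED_HOST+set}" ]; then
        CACHED_HOST=$(uname -n)
    fi
    export CACHED_HOST
}

cache_host
echo "$CACHED_HOST"
```

The first shell pays for the external command. Every shell started from it finds the variable already defined and skips the lookup.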

Other conditions

There are some other special cases. Many people perform different actions based on the current terminal type. If you log onto a system with a device local to the system, the terminal type is known. If you use rlogin or rsh to create an interactive session, the terminal type is communicated to the remote system. If you use the telnet command, the terminal type is unknown, and must be determined somehow. Once the terminal type is known, the user often customizes the keyboard configuration. In particular, some characters, especially the delete character, differ on different terminals. One terminal may have an easily accessible backspace key, and another has a convenient delete key. The command to modify this is the "stty" command. The .login file typically adjusts this parameter, based on the terminal type.

# if the terminal type is an HP terminal,
# change the delete character
if ( $TERM =~ "hp*" ) then
stty erase '^h'
endif

Sample startup files

Here is a sample .login file:

# Sample C shell .login file
# Created by Bruce Barnett
# This file is sourced after .cshrc is sourced
# set up the terminal characteristics
if ( -f ~/.terminal ) source ~/.terminal
# define the environment variables
if ( -f ~/.environment ) source ~/.environment
# set search path
if ( -f ~/.path ) source ~/.path

# Start up window system, but first, learn the terminal type once
if ( ! $?tty ) then
set tty = `tty`
endif

# You may wish to start a window system
# Here is one way:

if ( "$TERM" =~ "sun*" && "$tty" =~ "/dev/console" ) then
# some people like to wait 5 seconds
# echo "starting window system in 5 seconds"
# sleep 5;

# By using 'exec', then when you exit the window system,
# you will be logged out automatically
# without the exec, just return to the shell
/usr/openwin/bin/openwin
# exec /usr/openwin/bin/openwin

# - any more window systems?
# elsif ( $TERM =~ "abc*" && "$tty" =~ "/dev/console" ) then
# start up another window system

endif

And here is a sample .cshrc file:

# Sample .cshrc file
# Bruce Barnett 
# This part is executed on the following occasions:
#	1. "rsh machine command"
#	2. "csh scriptname"
#	3. All scripts that start with #!/bin/csh (without -f)	
# Read the minimum .cshrc file, if there

if ( -f ~/.cshrc.min ) source ~/.cshrc.min

# if run as a script, then $?prompt == 0
# if run as a remote shell, then $?prompt == 0 && $?TERM == 0
# if $USER is not defined, then "~" doesn't have the proper value
#     so bail out in this case

if ( ! ( $?USER && $?prompt && $?TERM )) exit

# This is an interactive shell

#---Local variables
# examples:
#     set noclobber
#     set myvariable = "value"
#     set mylist = ( a b c d e f g)


#----aliases
if ( -f ~/.aliases ) source ~/.aliases

#----Searchpath
if ( -f ~/.path ) source ~/.path


C Shell Searchpath

An essential part of Shell Mastery is understanding what a searchpath is, and how to optimize it. Following the principle of modularity, most of the UNIX commands are separate programs. Only a few are built into the shell. This means you can change your shell, and still use 99% of the commands without change. These external programs may be scattered in dozens of directories. When you ask for a command that the shell doesn't understand, it searches the directories in the order specified by the search-path, and executes the first command it finds with the name you specified. And trust me on this, systems that do not behave consistently are bad for your mental health. I knew a programmer who was writing software for a system that behaved unpredictably. He receives excellent care nowadays, but he always asks me the same question. "Two plus two is ALWAYS four, right?" Poor guy. The world will never be safe for programmers until we eliminate all non-deterministic systems.

Bourne shell and C shell paths

The searchpath is stored in an environment variable "PATH." You set or change the Bourne shell PATH variable using commands like:

PATH=/usr/bin:/usr/ucb:/usr/local/bin
PATH=$PATH:$HOME/bin
export PATH

The C shell has a different syntax for setting environment variables:

setenv PATH /usr/bin:/usr/ucb:/usr/local/bin
setenv PATH ${PATH}:~/bin

Notice that the tilde can be used instead of $HOME. The curly braces are necessary in this case, because the C shell has the ability to perform basename-like actions if a colon follows a variable name. The curly braces turn this feature off. Without the braces, you would get a syntax error. The braces could be added in the Bourne shell example above, but they aren't required.

The C shell has an alternate way to modify the searchpath, using a list. Here is the same example as before:

set path = ( /usr/bin /usr/ucb /usr/local/bin )
set path = ( $path ~/bin )

The variable name is lower case, the syntax is the list form, and a space is used to separate directories. The Bourne shell uses a colon as a separator, and a blank value is used to indicate the current directory. In the C shell list, any number of spaces can be used as a separator, so something else must be used to indicate the current directory. The period is used instead. This is logical, because "." refers to the current directory. The following command:

set path = ( . $path )

specifies that the current directory is searched for commands before all other directories. Important! This is a security problem. See the sidebar to fully understand the danger of this action.

It might seem confusing that there are two path variables, one in upper case, and the other in lower case. These are not two different variables. Just two different ways to set the same variable. Change one, and the other changes.

Because the C shell uses a list, this allows some convenient mechanisms to examine and manipulate the searchpath. For instance, you can add a directory to the third place of a searchpath using

set path = ( $path[1-2] ~/new $path[3-] )

Examining files in the searchpath is easy. Suppose you want to write a simple version of the "which" command. The C shell makes this an easy job:

#!/bin/csh -f
# A simple version of which that prints all
# locations that contain the command
if ( $#argv != 1 ) then
    echo "Wrong number of arguments - usage: 'which command'"
    exit 1
endif
foreach dir ( $path )
    if ( -x $dir/$1 ) echo $dir/$1
end


Here is the same script using the Bourne shell:

#!/bin/sh
# A simple version of which that prints all
# locations that contain the command
# ${1:?message} prints the message and exits if no argument was given
file=${1:?"Wrong number of arguments - usage: 'which command'"}
paths=`echo $PATH | sed '
s/^:/.:/g
s/:$/:./g
s/::/:.:/g
s/:/ /g
'`
for dir in $paths
do
    [ -x $dir/$file ] && echo $dir/$file
done

As you can see, the Bourne shell version is much more complicated. Surprisingly, my measurements show the Bourne shell version is faster. Well, I found it surprising. I expected the C shell version to be faster, because it doesn't use any external programs. The Bourne shell version executes three additional processes because of the back quotes. Therefore four programs compete with one C shell script. And the C shell still loses. Hmmm. Does this tell you something?

The system comes with a C shell script called "which." Not only does it find the first reference, but it reports if the command is an alias. This is fine, but I prefer the above script, because it tells me about all of the commands, and runs much faster. I called it "Which," with a capital "W," so I can use either one. The Korn shell has a built-in command, called "whence."
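
The sed step in the Bourne shell version is not the only way to split the searchpath. As a rough sketch, assuming a POSIX shell, the IFS variable can do the splitting inside the shell itself. The function name which_all is made up, and I have not timed this variant:

```sh
#!/bin/sh
# Sketch: list every executable named $1 in a colon-separated search path
# ($2, defaulting to $PATH) without starting sed. "which_all" is a made-up name.
# The body runs in a subshell, so IFS and "set -f" do not leak to the caller.
which_all() (
    cmd=$1
    search=${2-$PATH}
    # Field splitting drops a trailing empty entry, so make it explicit.
    case $search in
        *:) search="$search." ;;
    esac
    IFS=:
    set -f                      # do not expand wildcards in directory names
    for dir in $search; do
        [ -n "$dir" ] || dir=.  # an empty entry means the current directory
        if [ -f "$dir/$cmd" ] && [ -x "$dir/$cmd" ]; then
            echo "$dir/$cmd"
        fi
    done
)

which_all sh
```

Like the scripts above, it prints every match in searchpath order rather than stopping at the first one.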

When to change searchpaths

Most people specify their searchpath in their ".cshrc" file. But this really isn't necessary. Like all environment variables, all newly created shells get their environment from their parent. Some people therefore specify it in their ".login" file. All new shells will have this searchpath. I set my searchpath before I start my window system. This is very flexible, and quite easy to do. Just create a shell script that specifies your new searchpath, and then start up the windowing system. If you use OpenWindows, an example might be

if ( ! $?OPENWINHOME ) then
setenv OPENWINHOME /usr/openwin
endif

set path = ( $path $OPENWINHOME/bin )

$OPENWINHOME/bin/openwin

If you want to add a new directory to your searchpath, change it. If you then create a new window using a command like shelltool, or xterm, that window will inherit the searchpath from its parent. I specify my searchpath in a file called ".path." It contains something like this:

set mybin = ( ~/bin )
set standardbin = ( /usr/ucb /usr/bin /bin )
if ( $?OPENWINHOME ) then
    set winpath = ( $OPENWINHOME/bin )
else
    set winpath = ( )
endif
# extra lines omitted
set path = ( $mybin $standardbin $winpath )

I start my window system like this:

#!/bin/csh -f
# start OpenWindows
setenv OPENWINHOME /usr/openwin
source ~/.path
$OPENWINHOME/bin/openwin

This way I have one place to change my searchpath for all conditions. Any time I want to, I can define or undefine variables, and source the same file to reset my path. This allows me to use different windowing systems, or different versions, and have one main file to control my search-path.

Another way I change my searchpath is with an alias. You may want to define the following aliases:

alias .path 'set path = ( . $path )'
alias poppath 'set path = ( $path[2-] )'

The ".path" alias adds the current directory, while "poppath" removes the first one from the list.

You can make aliases as simple or as complicated as needed. For instance, you can radically change your environment with the simple

alias newstuff "source ~/.anotherpath"
alias newwin "source ~/.anotherpath;cmdtool&"

You can create multiple personalities, so to speak.

Undoing any changes

One of the simplest ways to reset your path is to type the new path on the command line. If you have problems, you can always change your ".cshrc" file to have a precise path, or have it source the ".path" file you created. After this, all new shell windows you create will have the new path. Another convenient way to undo the change is to execute another shell. That is, before you experiment, type

csh

Then modify your searchpath. When done, type a Control-D. This forces the current shell to terminate, and the environment variables of the earlier shell are restored.

Summary

I've discussed several ways to change your searchpath. As a summary, here are the methods:

Explicitly set the path.
Specify it in your .cshrc file.
Specify it in your .login file, or before you start the windowing system.
Specify it in another file, and source this file.
Execute a shell script that changes the path, and then executes another program or shell.
Use an alias to do any or all of the above.

No one method is right for everyone. Experiment. But don't think you have to set your searchpath in your ".cshrc" file. Other solutions are possible. In fact, next month, I will discuss how to optimize your searchpath.

Sidebar

Some people put their current directory in their searchpath. If you type

echo $path

and you see a dot, then you are vulnerable. If you ever change directories to one owned by someone else, you may be tricked into executing their commands instead of the ones you expect. Don't believe me? Okay. You've forced my hand. Suppose I created a file called "ls" that contains the following command:

/bin/rm $HOME/*

If I placed it in the "/tmp" directory, then as soon as you typed

cd /tmp
ls

all of your files would be deleted!

If you must include the current directory in your searchpath, put it at the end:

set path = ( $path . )

This is a little better, but you can still fall victim to the same trap. Have you ever made a typo when you executed a command? I could call the script "mroe" or something similar, and still delete all of your files. Now - are you willing to risk this? I hope not. I tell people how dangerous this is. Remember, other actions could be taken instead.
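
If you would rather have a script do the looking, here is a minimal sketch in the Bourne shell. It checks the colon-separated PATH form, where the current directory can appear either as "." or as an empty entry. The function name path_has_dot is hypothetical:

```sh
#!/bin/sh
# Sketch: succeed if a colon-separated search path contains the current
# directory, written either as "." or as an empty entry.
# "path_has_dot" is a hypothetical helper name.
path_has_dot() {
    # Wrapping the path in colons lets one pattern cover the first,
    # last and middle positions.
    case ":$1:" in
        *:.:* | *::*) return 0 ;;
        *)            return 1 ;;
    esac
}

if path_has_dot "$PATH"; then
    echo "warning: the current directory is in your searchpath"
fi
```

It only recognizes the exact entries "." and the empty string. Other relative names are not flagged.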

Personally, I don't have the current directory in my searchpath. It was painful at first, but I soon learned how to adjust. When I want to execute a command in the current directory, I just type

./command

I do this when I debug the script. When done, I move the command into my private "bin" directory:

cp command ~/bin

Is that so hard? I sleep much better at night.

Optimizing the C Shell Searchpath

Last month I discussed various ways to modify your C shell searchpath. This month, I will discuss ways to optimize your path.

What do I mean by optimization? Well, to tell the truth, I have looked at a lot of C shell start-up files, and I shudder at what I see. Some people have dozens of directories in their searchpath. I also see people walking around with a gloomy look, as if a storm cloud is circling over their head.

"What's the matter?" I ask.

"My file server is down. I can't get any work done." they reply.

"Oh." I reply. "I didn't notice."

And it's true. A server went down, and it didn't bother me at all because it wasn't in my searchpath. When some people discover a directory that contains useful programs, they add it to their searchpath. The searchpath grows and grows, until it becomes so convoluted, no one understands it. That's the wrong thing to do.

You see, I created a special directory that contained symbolic links to files on remote systems. Because the directory is on my local workstation, I am not affected if a server goes down. The only time I have a problem is if I use one of those executables. That process will freeze, but only that process. All I have to do is create a new window, and continue working. I call this a cache directory, and I know that isn't the best name for it. A better name ought to be "favored," but what the heck.

I will admit that solving this problem isn't trivial. In fact, I wrote a program to help me figure out what to do. Several programs, as it turns out. Let me describe them.

Programs to Optimize the Searchpath

The first script, and the most important one, creates and maintains the cache directory. If you want to use the directory "/local/cachebin" as a cache directory, and want to eliminate the directory "/public/bin," just type

CreateCacheDir /local/cachebin /public/bin

This will look at all of the executables in "/public/bin" and create a symbolic link to them in the "/local/cachebin" directory. You can then remove "/public/bin" from your searchpath, and if the server providing this directory goes down, you will not be affected. The script CreateCacheDir follows:

#!/bin/sh
# Argument #1 is the directory where we will cache
# a filename. Actually, not a cache, but a link.
# where is the cache directory?
# Usage
# CreateCacheDir Cachedir dir1 [dir2 ...]
# Function:
# - Create a symbolic link in cachedir, pointing to the files in dir1, etc.
#
CACHE=${1:?'Target directory not defined'}
if [ ! -d "$CACHE" ]
then
    echo making cache directory "$CACHE"
    mkdir "$CACHE"
fi

shift

# The rest of the arguments are directories to cache

verbose=false # true to see what happens
debug=false # true if you want extra information
doit=true # false if you don't want to change anything

for D in "$@"
do
    $verbose && echo caching directory "$D"
    # list files, but ignore any that end with ~, or # or % (backup copies)
    # (the && keeps ls from listing the wrong directory if cd fails)
    for f in `cd "$D" && ls | grep -v '[~#%]$'`
    do
        if [ -f "$CACHE/$f" ]
        then
            $debug && echo "$CACHE/$f" already exists
        else
            if [ -f "$D/$f" -a -x "$D/$f" ]
            then
                echo "$D/$f"
                $verbose && echo ln -s "$D/$f" "$CACHE/$f"
                $doit && ln -s "$D/$f" "$CACHE/$f"
            elif [ -d "$D/$f" ]
            then
                $verbose && echo linking directory: ln -s "$D/$f" "$CACHE/$f"
                $doit && ln -s "$D/$f" "$CACHE/$f"
            else
                $verbose && echo linking other: ln -s "$D/$f" "$CACHE/$f"
                $doit && ln -s "$D/$f" "$CACHE/$f"
            fi
        fi
    done
    echo you can now take "$D" out of your searchpath

done
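
A cache directory of links needs a little housekeeping, because a link stays behind when the program it points to is removed. The following is only a sketch of one possible check, not part of the script above, and the name stale_links is made up:

```sh
#!/bin/sh
# Sketch: print the symbolic links in a cache directory whose targets
# no longer exist. "stale_links" is a hypothetical helper name.
stale_links() {
    for f in "$1"/*; do
        # -h is true for a symbolic link; -e follows the link to its target.
        if [ -h "$f" ] && [ ! -e "$f" ]; then
            echo "$f"
        fi
    done
}

# Example: stale_links /local/cachebin
```

The result is only meaningful while the file servers are reachable, since the test has to follow each link to its target.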

This does the work, but how do you know which directories to replace? I wrote some programs to measure how good (or bad) my searchpath is. You might find them useful.

Programs to measure your efficiency

One way to simplify the searchpath is to identify directories that are really symbolic links to other directories. For instance, there is no reason to have both /bin and /usr/bin in your searchpath, if one points to the other. This script, called ResolveDir, will identify these cases:

#!/bin/sh
# Find the unique and final location of a directory
# Usage:
# ResolveDir directoryname
#
# If the directory is called ".", return "."
dir=${1:?'Missing argument'}

[ "$dir" = "." ] && { echo "."; exit 0; }
cd $dir >/dev/null 2>&1 && { pwd; exit 0; }
exit 1;
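
One caution: in a POSIX shell, a plain "pwd" after a "cd" through a symbolic link normally reports the name you typed, not the place the link points to. Here is a sketch of the same idea for such shells, using "pwd -P". The function names are hypothetical:

```sh
#!/bin/sh
# Sketch: print the physical location of a directory, following symbolic
# links. Assumes a POSIX shell, where "pwd -P" reports the resolved path.
# The body runs in a subshell, so the caller's directory is unchanged.
resolve_dir() (
    cd "$1" >/dev/null 2>&1 || exit 1
    pwd -P
)

# Succeeds when two names lead to the same directory.
same_dir() {
    a=$(resolve_dir "$1") && b=$(resolve_dir "$2") && [ "$a" = "$b" ]
}

# Example: same_dir /bin /usr/bin && echo "one of these is redundant"
```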

The next script, when given a directory, returns the system name on which the directory resides. If the file is on the local system, it returns the host name. This script is called Dir_to_System.

#!/bin/sh
# Given a directory, return the system it is mounted on
# If it is localhost, then echo `hostname`
dir=${1:?'No directory specified'}
cd $dir

# On SunOS 4.x, use /usr/bin/df before /usr/5bin
# On Solaris 5.x, use /usr/ucb/df before /usr/bin
# Solve the problem by specifying the explicit path
PATH=/usr/ucb:/usr/bin:/usr/5bin:$PATH
export PATH

x=`df . | grep -v Filesystem`
# use expr to extract the system name
server=`expr "$x" : '\(.*\):'`
# with sed, I could do server=`echo $x | sed 's/:.*//'`
if [ "$server" ]
then
echo $server
else
hostname
fi
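
The extraction step can also be done without expr or sed, using the shell's own pattern removal. This is a sketch assuming a POSIX shell; df_server is a made-up name, and it expects one line of df output as its argument:

```sh
#!/bin/sh
# Sketch: given one line of df output, print the server name if the
# filesystem column looks like "server:/path", and nothing otherwise.
# "df_server" is a hypothetical helper; the sample lines are made up.
df_server() {
    dev=${1%% *}                    # keep the first column only
    case $dev in
        *:*) echo "${dev%%:*}" ;;   # text before the first colon
    esac
}

df_server 'server1:/export/home 1000 400 600 40% /home/server1'
df_server '/dev/sd0a 1000 400 600 40% /'
```

The first call prints "server1". The second prints nothing, which is the case where the script above falls back to hostname.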

Using these programs, I wrote some scripts that evaluate my current searchpath. Since I am talking about the C shell, I started to write the script in this shell. However, because of limitations of the C shell, I had to split one script into three files. The main script, AnalyzePaths, calls two scripts:

#!/bin/csh -f
# this could be one script instead of three if
# we were using the Bourne Shell
# But the C shell isn't very good at piping
# commands inside of loops.
#
# Therefore we generate the pipe in one script,
# and feed AWK in another script
GetPaths | analyze_paths.awk HOSTNAME=`hostname`

The script GetPaths outputs one line for each directory in the searchpath. Each line contains the directory name, the final name (if a symbolic link) and the system that provides the directory:

#!/bin/csh -f
foreach p ( $path )
    if ( -d $p ) then
        set newp = (`ResolveDir $p`)
        set server = (`Dir_to_System $p`)
        echo "${p}:${newp}:${server}"
    else
        echo "${p}:?:?"
    endif
end
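
For comparison, here is a rough sketch of what the comment in AnalyzePaths alludes to: in the Bourne shell, the output of a loop can be piped straight into awk within a single script. This cut-down example only counts searchpath entries that are not directories. The name count_missing is hypothetical, and it assumes a POSIX shell:

```sh
#!/bin/sh
# Sketch: pipe the output of a loop directly into awk.
# Counts the entries of a colon-separated search path that are not
# directories. "count_missing" is a hypothetical helper name.
count_missing() (
    IFS=:
    set -f
    for p in $1; do
        [ -n "$p" ] || p=.
        if [ -d "$p" ]; then
            echo "$p:$(cd "$p" && pwd -P)"
        else
            echo "$p:?"
        fi
    done | awk -F: '$2 == "?" { n++ } END { print n + 0 }'
)

count_missing "$PATH"
```

The loop and the awk program live in one file, so no helper script is needed to generate the pipe.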

The last script is a complex nawk script, but it outputs useful information. For instance, it reports directories that don't exist, redundant directories, and specifies remote directories that can be eliminated. For example, it could report

Directories that don't exist on this system:
/local/bin/SunOS
/usr/etc
You are dependent upon the following systems:
server1, directories: /home/barnett/bin
Directory /usr/X11R6/bin used 2 times (symbolic links: 1)
symbolic links for '/usr/X11R6/bin' are: /usr/X11/bin

The script is:

#!/usr/bin/nawk -f
# Do this before you read any input
BEGIN {
    FS=":";
}

# do this for each line
(NF==3) {
    if ($3 ~ /\?/ ) {
        # then it is a directory that does not exist
        missing[$1]=1;
        number_of_missing++;
    } else if ( $1 == "." ) {
        # ignore it - it is the "." directory
    } else {
        # count how many times each directory is used
        used_count[$2]++;
        # is it a duplicate,
        if ($1 != $2) {
            links[$2]++;
            # remember it, by catenating two strings
            used[$2] = used[$2] " " $1;
        }

        # Is it a remote system?
        if ($3 !~ HOSTNAME) {
            systems[$3]++;
            # remember each directory this system provides
            system_to_dir[$3] = system_to_dir[$3] " " $1;
            remote_systems++;
        }
    }

    # printf("%s\t%s\t%s\n",$1, $2, $3);
}
# Do this at the end of the file
END {
    # Any directories that do not have to be included?

    if (number_of_missing>0) {
        printf("Directories that don't exist on this system:\n");
        for (i in missing) {
            printf("\t%s\n", i);
        }
    }

    # how many computer systems are needed?

    if (remote_systems) {
        printf("You are dependent upon the following systems:\n");
        for (i in systems ) {
            printf("\tsystem %s, directories: %s\n", i, system_to_dir[i]);
        }

    } else {
        printf("Good! You are not dependent upon any other servers,");
        printf(" except for your current directory\n");
    }

    # What about duplicate directories?

    for (i in used ) {
        if (used_count[i]>1) {
            printf("\nDirectory %s used %d times (symbolic links: %d)\n",
                i, used_count[i], links[i]);
            if (links[i]>0) {
                printf("\tsymbolic links for '%s' are: %s\n", i, used[i]);
            }
        }
    }

}

There you go. Using these scripts will give you a set of tools to improve your searchpath. Do it right, and may you never have your workstation freeze again.

Specifying system-specific searchpaths

So far, I've explained how to optimize your searchpath. But that is only part of the problem. If you only used one computer, simplifying your searchpath would be easy. But logging onto many different computers makes life "interesting," as the ancient Chinese curse goes. You could do what most people do, and keep adding directories ad nauseam until every possible directory is in your searchpath. Of course, as I explained last month, this makes your account as stable as a one-legged elephant. I have a solution, based on specific rules. Here is the short list:

Understand directory characteristics.
Define a consistent naming convention.
Define a strategy.
Reduce the complexity.
Use local link/caching directories.
Consider a local home directory.

Let me describe each of these rules.

Understand directory characteristics.

The UNIX file system was designed to be the single common structure to perform all operations. All input and output, all devices, and the internals of the operating system are reachable by using special files. With NFS, this was extended to include remote files, making them look like local files, and making all file-based UNIX utilities capable of operating on remote files.

However, not all directories have the same characteristics. Here are some of the variations:

· A directory that contains standard architecture-specific executables. The directory is read-only, and identical on other systems of the same architecture. Example: /usr
· System-specific configuration files. Typically writable by super-users only. Example: /etc
· System-specific directory to hold temporary files. Writable by various processes, programs, and users. Often not backed up. Not visible by other systems. Example: /var
· Same as the above, but backed up frequently. Example: /private
· Common files shared across multiple groups. Log onto other systems, and see the same files. Example: /home/server/project
· Directories which contain locally added executables. This may be private to the system. Example: /local
· Same as above, but containing executables that are shared among several systems of the same architecture. Examples: /usr/local, /project/bin
· Same as the above, but containing executables and/or data for multiple architectures. Example: /public

As you can see, there are many different characteristics, and every user has to understand the difference. Make sure you understand how the files at your site are organized. Most new workstations have lots of disk space, and it is rare that the systems don't make it available. Users should know if the files are visible to other systems, and if the files are backed up or not.

Some directories are project-related. Executables are only used when working on the project. Other directories, such as "/projects/bin," are used for certain rarely-used functions. These are what I call "optional" directories. If they are missing, you can still do other work. Other directories are mandatory.

Define a consistent naming convention.

Some companies do not have a consistent naming strategy. This is a serious problem. If the location of a single file changes when different machines are used, then the user has to deal with this extra complexity, which is very difficult, and if not taken care of, can lead to irritability, moroseness, and eventually--insanity. Sun's automounter can help. For instance, you can set up a network so that your home directory always has the same name, even if it is moved to another system. Some sites also use a mechanism such as "/work/projectname" to specify the location of a project. Another popular and practical convention is to specify machine-specific files using a path that contains the name of the machine. If all machines use the same convention, then you can make sure that the "data" directory on computer "server" is called "/home/server/data" on every machine.

Define a strategy.

You may want to write up your strategy. This helps clarify some of the decisions. For instance, here is one similar to what I use, leaving out the standard directories:

/home/systemname - Shared system-specific files. Backed up.
/work/Projectname - Project directories.
/private - Local to system, not backed up, unless I do it myself.
/local - Local to current system. Not shared.
~/SunOS/5.5.1/sun4u/bin - Executables for a specific system type

The important directories are the last two, because these contain the executables for my computer. But deciding on a naming strategy is the first step.

Reduce the complexity.

It doesn't matter what the convention is, as long as it is consistent. But if it's too complicated, errors can occur. When possible, keep things simple. Remove non-essential directories. If you don't need to use certain executables all the time, then add them when you need them using an alias. To repeat the lesson from two months ago, don't set the path in your ".cshrc" or ".login" file. Inherit it from your environment. When you want to add a directory, change your environment for that window.
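
As a rough sketch of the "add them when you need them" idea, an alias can append an optional directory to the searchpath of the current window only. The alias name "useproj" is made up for this illustration; "/projects/bin" is the optional directory mentioned earlier.

```csh
# Hypothetical alias: add the optional project directory to this shell only.
# Single quotes delay the expansion of $path until the alias is used.
alias useproj 'set path = ( $path /projects/bin )'

# Later, in the one window where the project tools are needed:
useproj
```

Other windows keep the shorter searchpath, so they never wait on a directory they do not use.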

Use local link/caching directories.

Once you have removed all of the directories you use occasionally, you still might have several directories that are on other systems. What then?

The solution is simple. Create a directory on your local system, that contains symbolic links to the real files. Then remove the original directory from your searchpath, and add the new one. I discussed this in detail last month. But there is one more step.
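
Here is one way the link directory might be built. This is a sketch under assumptions: "~/cachebin" is the local directory, and "/net/server/tools/bin" is a stand-in for whatever remote directory you are replacing. Taking the remote directory out of the searchpath depends on how your path is built, so that part is left out.

```csh
# Assumed names: ~/cachebin is the local link directory, and
# /net/server/tools/bin stands in for the remote directory being replaced.
set remote = /net/server/tools/bin
if ( ! -d ~/cachebin ) mkdir ~/cachebin

# Link each remote executable into the local directory, keeping its name.
foreach f ( $remote/* )
    ln -s $f ~/cachebin/$f:t
end

# Search the local links first.
set path = ( ~/cachebin $path )
```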

Consider a local home directory

Most large installations share user home directories. No matter which system you log onto, you are in the same spot. This is convenient, but there are two problems. The first concerns efficiency, and the second security. Efficiency is a concern if the server providing the directory goes down. Security is a concern because anyone who can gain superuser access on any system that mounts that home directory can break into that account if you use rlogin, and have access to your files. To solve both problems, create a directory on the local system that has copies of some of the critical files, like ".cshrc" and ".login." Since the local ".rhosts" file is not available to other systems through NFS, a hacker cannot modify your ".rhosts" file and log onto your workstation. A better solution is to eliminate the "rlogin" and "rsh" services, using a program like "ssh" instead. But that's another topic.

Having a home directory local to your main computer can increase your efficiency, by removing the dependence on other systems. You can have your own directory for executables, for instance.

Customizing your environment

I'll go through it a step at a time. Remember, I do not recommend you always set your searchpath in your ".cshrc" or ".login" file. Set it when you need it, like starting up a window system. Assume the searchpath is controlled by a file called ".path." Here is my version of this file:

# example .path file

if ( -f ~/.path.default ) then
source ~/.path.default
else
# this is a default searchpath
set path = ( ~/bin /usr/ucb /usr/bin /usr/local/bin /local/bin )
endif

The system searches for a file called ".path.default." If this file is found, then it is used. Otherwise a default searchpath is specified. The file ".path.default" is a bit more complicated, but it is designed to handle special cases in a simple manner:

# .path.default modified for SunOS

if ( ! $?HOSTNAME ) then
setenv HOSTNAME `hostname`
endif

# get host specific path, if there
if ( -f ~/.path.$HOSTNAME ) then
source ~/.path.$HOSTNAME
endif

if ( ! $?SYSTEM ) then
set SYSTEM = "`uname -s`"
endif

# get system-specific searchpath, if there
if ( -f ~/.path.$SYSTEM ) then
source ~/.path.$SYSTEM
endif

# define these if .path.$SYSTEM doesn't

# define the system defined paths
if (! $?SYSPATH ) then
set SYSPATH = ( /usr/ucb /usr/bin /bin )
endif

# define the places a window system's files are located

if (! $?WINPATH ) then
set WINPATH = ( /usr/X11/bin /usr/X11R6/bin /usr/openwin/bin )
endif

# Private executables
if ( ! $?MYBIN ) then
set MYBIN = ( ~/bin ~/bin/$SYSTEM )
endif

# local to machine
if ( ! $?LOCALBIN ) then
set LOCALBIN = ( /local/bin /local/bin/$SYSTEM )
endif

#set CACHEBIN = ( $HOME/cachebin )
set CACHEBIN = ( )

#If TOTALPATH is defined, use it, else build it up from the pieces
if ( $?TOTALPATH ) then
set path = ( $TOTALPATH )
else
set path = ( $CACHEBIN $MYBIN $LOCALBIN $WINPATH $SYSPATH )
# set path = ( $CACHEBIN $MYBIN $LOCALBIN $WINPATH $SYSPATH . )

endif

There are many things to point out in this file. First, let me describe the variables:

HOSTNAME - The system hostname
SYSTEM - The operating system type
SYSPATH - The standard vendor-supplied directories
WINPATH - Directories used for the window system
MYBIN - Directories that contain my personal executables
LOCALBIN - Where extra executables are kept
CACHEBIN - My local cache directory

Suppose you are on a system called "pluto." If so, then the script searches for the file ".path.pluto." If this file is found, it is sourced. This file can be quite simple. For instance, if the file ".path.pluto" contained these two lines only:

set SYSTEM = "FunnyUnix"
set TOTALPATH = ( ~/bin /usr/bin /usr/ucb /usr/local/bin )

Then the value of TOTALPATH will be used to set the searchpath. No other programs will be executed. The system doesn't even need the "uname" executable.

Here is another example, named ".path.neptune," for a computer called neptune:

set SYSTEM = "SunOS"
set MYBIN = ( ~/bin ~/SunOS/bin )

In this case, the file ".path.SunOS" would be sourced, and the default values in that file would specify the searchpath. The exception would be the value of "MYBIN." I have set up these files so that each system can have its own collection of paths that define the searchpath. You can customize the directory used for the window system on a particular machine, yet still add a common NFS directory to all systems, by modifying the value of $path near the end of the file. Still, this doesn't solve all cases. In particular, SunOS 4.X and 5.X systems have different searchpaths. Here is a file that shows one way to automatically change the searchpath based on the version of the operating system:

# .path.SunOS
# copied from .path.default, and modified

if ( ! $?SYSTEM ) then
set SYSTEM = "`uname -s`"
endif

# Having this here would cause an infinite loop
# Make sure it is commented out
#if ( -f ~/.path.$SYSTEM ) then
# source ~/.path.$SYSTEM
#endif

# define these if .path.$SYSTEM doesn't

if ( ! $?MACHINE ) then
set MACHINE = "`uname -m`"
endif

if ( ! $?REV ) then
set REV = "`uname -r`"
endif

# define the system defined paths
if (! $?SYSPATH ) then
if ( "$REV" =~ 4.[012].* ) then
set SYSPATH = ( /usr/ucb /usr/bin /usr/5bin /bin /usr/etc )
else if ( "$REV" =~ 5.[0-6].* ) then
set SYSPATH = ( /opt/SUNWspro/bin /usr/ccs/bin \
/usr/ucb /usr/bin /usr/sbin )
else
# How did I get here?
set SYSPATH = ( /usr/ucb /usr/bin /bin )
endif
endif

# define the places a window system's files are located
# I could look at DISPLAY, and change this depending on
# the value

if (! $?WINPATH ) then
if ( ! $?OPENWINHOME ) then
setenv OPENWINHOME /usr/openwin
endif
set WINPATH = ( $OPENWINHOME/bin )
endif

# define the places where architecture-specific binaries are
#
if ( ! $?MYBIN ) then
set MYBIN = ( ~/bin ~/$SYSTEM/$REV/$MACHINE/bin )
endif

if ( ! $?LOCALBIN ) then
set LOCALBIN = ( /local/bin )
endif

#set CACHEBIN = ( )
set CACHEBIN = ( $HOME/cachebin )

# If I set this,
# then the variables in .path.default will be skipped
#set TOTALPATH = ( $CACHEBIN $MYBIN $LOCALBIN $WINPATH $SYSPATH )

As you probably noticed, I used

~/$SYSTEM/$REV/$MACHINE/bin

as a directory for a machine-specific searchpath. This becomes, when evaluated,

~/SunOS/5.5.1/sun4u/bin

You can move an executable there, and make links from other directories to that file if you want to. Other approaches also make sense. You could have a searchpath that includes several directories in a particular order, such as the following:

~/bin
~/SunOS/bin
~/SunOS/5.5.1/bin
~/SunOS/5.5.1/sun4u/bin

The first one could hold all shell scripts. The second one could contain all Sun-specific shell scripts. The third one could contain all executables for a SunOS 5.5.1 system. And the fourth directory can contain machine-dependent executables. Or you could use underscores instead of slashes:

set MYBIN = ( ~/bin ~/SunOS_bin ~/SunOS_5.5.1_bin )
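
If you prefer the four-directory ordering listed above, a loop can build the list and skip directories that do not exist on the current machine. This is a sketch, not one of my files. It assumes the same SYSTEM, REV and MACHINE variables that ".path.SunOS" sets.

```csh
# A sketch: keep only the directories that actually exist.
# SYSTEM, REV and MACHINE are set the same way as in .path.SunOS.
set SYSTEM  = "`uname -s`"
set REV     = "`uname -r`"
set MACHINE = "`uname -m`"

set MYBIN = ( )
foreach d ( ~/bin ~/$SYSTEM/bin ~/$SYSTEM/$REV/bin ~/$SYSTEM/$REV/$MACHINE/bin )
    if ( -d $d ) set MYBIN = ( $MYBIN $d )
end
echo $MYBIN
```

Skipping missing directories keeps the searchpath short, which matters when each entry may be on another system.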

I hope you find this setup easy to use and easy to modify. It is a bit complex, but it offers a lot of flexibility.

C Shell History - Forward into the Past

Prehistoric History

In ancient times, before mice existed, large beasts roamed the earth. These included band printers, 9 track tape drives, (oddly 9-track was before 8-track), and disk platters. The most fearsome of all the beasts was the TeleType, able to cause piercing headaches. Quieter machines evolved, like the Silent 700, and the DecWriter. A new beast was seen wandering the dark and desolate laboratories. It was quick. It was silent. It was the VKB, or Video Keyboard, a choice morsel for those higher in the evolutionary plane i.e. the Hacker (Hackerus Keyboardus).

A common sound was the stampede of fingertips, as the Hacker stalked his or her prey - the perfect program. Faster, faster went the fingers, but it was never fast enough. That perfect program was just around the corner, but rarely was it caught.

One hacker, in an effort to go faster still, decided to enhance his shell in such a way that all he had to do was take a tiny step in a direction, and the shell knew what he wanted to do. That hacker added history. And thus was the history mechanism born.

Enabling History

Nowadays shells (like ksh) support a history mechanism that can visually modify the command line, allowing the user to edit the command line and call up previous commands by pressing the up arrow. The C shell does not do this. It was designed for any terminal, including the prehistoric hard-copy terminals. The C shell has no problems running within Sun's cmdtool, for instance. To enable history, set the history variable:

set history = 100

This tells the shell to remember the last 100 commands. You can make this number bigger or smaller. But the larger the number, the more memory the shell uses.

The command "history" will print out the history information. If you specify a number such as 20, it will list the last 20 commands. You can save a series of commands, and read them in later:

history -h 20 >file
... (do something else )
source -h file

The "-h" option is used to make the format suitable for sourcing files. The history command normally adds a number before each line for reference. The "-h" turns this off, so it can be sourced. You can reverse the order by adding a "-r" to the history command. The "source -h" command reads the file in, but does not execute the commands.

Automatic History Saving

You can automate this. The "savehist" variable specifies the number of lines to remember. Therefore, when you type

set savehist = 10

and you log out, the program will save the last 10 lines in the file "~/.history." The next time the C shell starts up, it will read this file, effectively executing the

source -h ~/.history

command. Large values of savehist will slow down the C shell during the start-up process.

Changing your prompt

The C shell numbers each command for reference. Many people like to see these numbers automatically. The C shell allows you to put this number into the prompt. Placing the magic character "!" in the prompt does this. The backslash keeps the history mechanism from interpreting the "!" when the command is read:

set prompt = '\!% '

The command "history" will display the list of past commands. Add a number afterwards, and it limits the list to that number of commands. Some people create aliases to make it easier to view the history:

alias h history

alias h "history 20"
alias H "history 100 | more"
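
Pulling these settings together, a ".cshrc" fragment might look like the sketch below. The numbers are arbitrary choices, and the test on $?prompt is a common idiom (not something discussed above) for skipping the block when the shell is not interactive.

```csh
# Sketch of a .cshrc fragment; the values are examples, not recommendations.
if ( $?prompt ) then
    set history = 100        # commands remembered in this shell
    set savehist = 50        # commands saved in ~/.history at logout
    set prompt = '\!% '      # show the command number in the prompt
    alias h 'history 20'
    alias H 'history 100 | more'
endif
```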

Repeating Past Commands

The history mechanism isn't just used to keep track of your past commands. You can execute them again very easily. The command

!!

executes the last command again. Besides repeating the same command, it can be used in other ways:

ls file?
# now rename each of these files
foreach i (`!!`)
mv $i $i.old
end

Repeating older commands

You can execute ANY command stored in the history list. If you know the number in the history list (or the number that was in the prompt), just specify the number. Suppose the history command listed the following:

31 vi file1.c
32 make file1
33 ./file1

You could edit the file again with

!31

You can also specify the relative position, by putting a minus before the number. The command "!-2" executes the command before the last. Remember, the position is relative. So you can guess the relative position, and if you get the wrong command, type the same history command, and you will execute the next one in the list. In the vi/make/file1 example before, you can keep executing "!-3" to execute the same three commands in a rotating order.

A third form uses the first few letters of the command. To repeat the vi command, you can use

!v

You only need to specify enough letters to match the desired command. The first matching command, searching backwards, is executed. This is my favorite method. It is not necessary to know the number, or to list the history.

Suppose you typed the following commands:

make prog1;./prog1
make prog2;./prog2

If you type any of the following commands, you will execute the second make command:

!m
!ma
!mak
!make

You cannot specify a pattern with a space. For instance, the command

!make prog1

does not execute the first make command. To execute the first make command, without knowing the number, you can tell the C shell to search for a string, and execute the command that contains the string. This is done with the special pattern "?pattern?." Therefore, to execute the first make command, type

!?prog1?

Appending to a previous command

Often I find myself executing a command, and wish to repeat the previous command, but append something to the end. Just execute the "!!" command, and append the addition to the end. (This is why you cannot search for a command that contains a space). I often develop complex filters on the fly. I type a command, and add filters one step at a time. Here is an example, where I want to find all lines that contain "keyword" but not "ignore." Once I find this, I want the second word of the line, sorted. One way to do this, without writing a shell script, is:

grep keyword file*
!! | grep -v ignore
!! | awk '{print $2}'
!! | sort
!! | more

If you want to append to the last command, without adding a space, you can. The following will print files "prog" and "prog1":

lpr prog
!!1
# same as
# lpr prog1

Suppose you want to append to a command that was not the last command? This is no problem if you want a space between the old command and the new word. If you edited file1.c, and wanted to edit that file and a second file, you can type:

!v file2.c

This edits file1.c and file2.c. Now suppose you type the following commands:

vi File
chmod +x File
./File
cp File File1

and now you want to edit File1 without typing the complete line? You can't type

!v1

and

!v 1

will edit the wrong files. There are two solutions. The first is to execute the vi command again, and then append a "1"

!v
!!1

There is another way. Remember that most UNIX shells have two ways to specify a variable, $x and ${x}? The C shell history mechanism has a similar feature. To append the number "1" to the previous vi command, type

!{v}1

This works for all of the examples discussed. That is, you can use the curly braces to enclose the history specification:

!{-2}
!{32}
!{?File1?}

Adding the number "1" without adding a space can be done by typing:

!{-2}1
!{32}1
!{?File1?}1

Using history to reuse words

So far, I have only talked about using the history feature to repeat entire command lines. You can also use it to repeat words. The special variable "!$" refers to the last word on the last line. This is very useful, if you want to save keystrokes. Here is an example:

touch File
chmod +x !$
vi !$
./!$
cp !$ !$.orig

This is the same as typing

touch File
chmod +x File
vi File
./File
cp ./File ./File.orig

The variable "!^" refers to the first argument on the line. That is, instead of typing

touch File
cp File File.backup
vi File
chmod +x File
File abc

you could type

touch File
cp !^ !^.backup
vi !^
chmod +x !^
!$ abc

!* - All of the arguments

The string "!!*" refers to all of the words on the previous line except the first one, the command name. In other words, it is the complete argument list. You can use this to reuse arguments:

ls *% *~ *.old *.backup
# A lot of files - move them into a directory
mv !* OldDirectory

In this case, "!*" and "!!*" mean the same. There is another mechanism for specifying part of the argument list. A hyphen can specify a range. By itself, it is the same as the entire line, including the command, but without the last word. Put a number afterwards, and it places a limit on the range:

ls f1 f2 f3 f4 f5 f6 f7 f8 f9
!!-4
# this is the same as
ls f1 f2 f3 f4

I mention this for completeness. Next month I will discuss more elaborate variations. Notice that "!-4" and "!!-4" are not the same.

!% - find the word

I've mentioned that you can search for a command containing a particular word. For instance, to repeat the command containing "ABC" type:

!?ABC?

It doesn't matter where the string "ABC" is on the line, the command will be executed again. But sometimes you don't want to execute the command again. You just want to use the word. As an example, suppose you execute

more A_file_with_a_long_name

Now suppose you type some other commands, and then decide to print the file. You could type

!more
lpr !$

If you executed the more program a second time, this will not work. You could type

history | grep A

and then repeat the command. Or you could search for that line, and print the line out without executing it, by typing

# print the command containing "A"
echo !?A?
lpr !$

But there is an easier way. If you use the character "%" after a search, it is the same as the entire word matched. In other words, you can search for the line, extract the word, and use it in one step:

lpr !?A?%
# same as
# lpr A_file_with_a_long_name

Editing the last command

One of the most useful features of the C shell is the ability to correct a mistake. If you make the following typo:

mroe File

and you want to change "mroe" to "more," you can simply type:

^ro^or^

The shell changes "ro" to "or," and repeats the command.

Someone once asked me if there was a way to do this with a function key. They kept pressing the backquote by mistake. That is, they typed

ls`

and wanted an easy way to fix this. If this was done in a shelltool window, there is a solution. Place the following in your .ttyswrc file:

# i.e. csh history
mapi F7 ^`^\n

This maps the F7 function key to effectively type

^`^

and type return at the same time. You can also use this to duplicate other common operations. Alas - this only works with shelltool and not in cmdtool mode.

One more thing: recent versions of Solaris do not require you to type the second up-arrow. If you want to delete the letter "x" from the previous command, just type

^x

and the letter will be deleted.

Word Modifiers in History

Last month I discussed the history mechanism. I showed how you could use it to execute commands without typing the entire command again, by using the "!" character. I also discussed the substitution mechanism, triggered by using the "^" character. Different characters can be used, by modifying the value of the histchars variable, which has the default value of "!^."

There are three types of history substitutions: events, words, and word modifiers. I discussed event designators last month. These correspond to lines typed at the keyboard. A summary of event designators follows:

+-----------------------------------------------------------+
|Format   Example   Meaning                                 |
+-----------------------------------------------------------+
|n        !2        Command from line #2                    |
|-n       !-3       Command 3 commands ago                  |
|#        !#        The current command                     |
|!        !!        The last command                        |
|s        !abc      The last command that starts with 'abc' |
|?s?      !?abc?    The last command that contained 'abc'   |
+-----------------------------------------------------------+

I didn't mention the "!#" event last time. This is a feature of newer versions of the C shell. It matches the current line. It isn't much used unless you use some of the features described below.

long_word !#

is the same as

long_word long_word

Yeah. That's real useful.

Word events

The previous events refer to complete lines. You don't have to recall previous lines, and use the entire line. You can just use part of the line. Last month I discussed four word designators. If the last command was

program arg1 special2 arg3 arg4

Then the value of these word designators would be

+-----------------------------------------------+
|Character   Example    Value                   |
+-----------------------------------------------+
|^           !^         arg1                    |
|*           !*         arg1 special2 arg3 arg4 |
|$           !$         arg4                    |
|%           !?spec?%   special2                |
+-----------------------------------------------+

These word designators are specialized abbreviations of a more flexible, but complex syntax. All five examples below are identical:

program arg1 arg2
# now print arg2
lpr !$
lpr !!$
lpr !!:$
lpr !{!:$}
lpr !:$

Numeric word events

Using the last syntax, it is possible to specify any word on the previous command line, besides the first or last. Just specify the number:

lpr !:2

It is important to remember that the history mechanism comes early, before the meta-characters are expanded. For instance, suppose the following command was typed:

program abc{1,2,3} *.? "a b";program2>file.out

This table lists the words that can be recalled in a history event:

+------------------+
|Word   Value      |
+------------------+
|!:0    program    |
|!:1    abc{1,2,3} |
|!:2    *.?        |
|!:3    "a b"      |
|!:4    ;          |
|!:5    program2   |
|!:6    >          |
|!:7    file.out   |
+------------------+

Notice how the shell separates words not based on white-spaces, but on meta-characters like ">" and ";."

I've already mentioned that "!!:5" is the same as "!:5" when used to refer to the fifth word of the previous line. Two different syntaxes exist, because you can use these word modifiers to refer to any command stored in the history memory. If the history contained the following command:

cc -o prog prog.c
./prog >A_long_file_name

then the following table shows some of the values

+-----------------------------+
|Modifier    Value            |
+-----------------------------+
|!c:2        prog             |
|!?long?:2   A_long_file_name |
+-----------------------------+

:- - Ranges of words

You can specify a range of words, using the "-" modifier. Assuming the command

program a b c d e f g h

then the following table shows possible ranges of words:

+---------------------------------+
|Variable   Value                 |
+---------------------------------+
|!!:2       b                     |
|!!:4       d                     |
|!!:2-4     b c d                 |
|!!:2-      b c d e f g           |
|!!:2-$     b c d e f g h         |
|!!:2*      b c d e f g h         |
|!!:-       program a b c d e f g |
|!!:1-$     a b c d e f g h       |
|!!:*       a b c d e f g h       |
+---------------------------------+

As you can see, if the value before the hyphen is omitted, the range starts at word 0, the command name. If the value after the hyphen is omitted, the range ends at the next-to-last argument. In contrast, "*" acts like "-$" when used after a number. I should also mention that the rules for abbreviations are confusing. "!!:-" is the same as "!:-" but it is not the same as "!-" because the last looks like the relative event identifier, i.e. "!-2."
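
Here is a short illustration of ranges applied to an older event instead of the last one. The file names and the "/backup" directory are made up, and the lines starting with "#" are annotations rather than something to type.

```csh
cp a.c b.c c.c /backup
# reuse only the three file names (words 1 through 3)
ls -l !!:1-3
# expands to: ls -l a.c b.c c.c

# reuse the destination of the earlier cp command
ls !cp:$
# expands to: ls /backup

# words 2 through the next-to-last word of the cp command
echo !cp:2-
# expands to: echo b.c c.c
```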

Events, words, arguments, aliases

I haven't discussed aliases in detail, but the mechanisms used for history are also used in defining aliases. For instance, to make an alias so that

compile xyz

does the same as

cc -o xyz xyz.c

you would use the alias

alias compile 'cc -o \!:1 \!:1.c'

Find this confusing? Wait till you learn about variable modifiers.

Variable Modifiers

The C shell has variable modifiers that can be used to extract part of a variable. That is, if a variable "x" has the value "file.c," and if you wanted to remove the ".c" part, then you can use the "root" modifier ":r." That is, "$x:r" is the same as "file." There are four useful modifiers that extract the head, tail, root and extension of a variable. Assume a variable has been set by typing:

set a = "/a/b/c/d.e.f.g"

Then the four modifiers would give the following results:

+--------------------------------------------------------+
|Modifier   Meaning               Example   Results      |
+--------------------------------------------------------+
|r          Root                  $a:r      /a/b/c/d.e.f |
|e          Extension             $a:e      g            |
|h          Head (or Directory)   $a:h      /a/b/c       |
|t          Tail (or Filename)    $a:t      d.e.f.g      |
+--------------------------------------------------------+

One way to think of this is to realize that Root and Extension are matching pairs. That is, "${a:r}.${a:e}" is the same as "$a." Head and Tail, or rather, Directory and Filename are also matching. "${a:h}/${a:t}" is the same as "$a."
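
A small script makes the pairing easy to check for yourself. This is only a sketch, using the same value of "a" as above. The last two lines rebuild the original value from each matching pair.

```csh
#!/bin/csh -f
set a = "/a/b/c/d.e.f.g"
echo $a:r     # /a/b/c/d.e.f
echo $a:e     # g
echo $a:h     # /a/b/c
echo $a:t     # d.e.f.g
# Each matching pair puts the original value back together:
echo ${a:r}.${a:e}     # /a/b/c/d.e.f.g
echo ${a:h}/${a:t}     # /a/b/c/d.e.f.g
```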

Earlier, I mentioned how the syntax of the C shell can be complex. These variable modifiers can be used for variables, arguments, history and aliases. Let me give some examples. If you wanted to rename all files with the extension ".old" to the extension ".new," then use

foreach i ( *.old )
mv $i $i:r.new
end

To make a shell script called "compile" that did the same as the alias above (except the .c is not needed), use

#!/bin/csh -f
cc -o $1:r $1:r.c

An alias to do the same would be

alias compile 'cc -o \!:1:r \!:1:r.c'

Lastly, suppose you wanted to execute the following:

diff file.old file

There are several ways to do this. You can use the in-line expansion:

diff file{.old,}

You can also use the "!#" value, combined with a way to get one word from the event, with a variable modifier:

diff file.old !#:$:r

Of course that example is a bad one for several reasons. The abbreviation is longer than typing "file." Also - it doesn't work. I tried it, got a segmentation fault, the shell exited, and my window disappeared. It's so much fun researching these little-used features. Using "!#:1:r" worked, however. Go figure.

:p - Print modifier

There are other modifiers. You can use ":p" as a "print, but not execute" modifier. If you wanted to see the command that started with "abc" but did not want to execute it, you could type:

echo !abc

The ":p" modifier does this, with fewer characters. Just type

!abc:p

and the history value is printed, but not executed. This modifier isn't useful for the other cases. If you can find a use for variables, arguments or aliases, let me know.

:s - Substitute modifier

A useful modifier is ":s," as it can be used to perform simple substitutions on all variables. If you wanted to change "old" to "new" in a filename, use

#!/bin/csh -f
mv $1 $1:s/old/new/

An alias would be

alias old2new 'mv \!:1 \!:1:s/old/new/'

This substitution always changes the first string found. Any character can follow the "s" character; it is used as the delimiter. Note that "!!:s^" is the same as "^" when the latter is used at the beginning of a line.

The string substituted is not a regular expression. The characters are ordinary. There is one special character, when used in the second pattern - "&." This matches the string in the first part. Suppose you typed

echo abc.{1,2}

(which prints abc.1 abc.2) and you wanted to change this to

echo {abc,def}.{1,2}

(which prints abc.1 abc.2 def.1 def.2), you can use the "&" character like this:

^abc^{&,def}

:& - Repeat substitution

Suppose you type the following command:

cc -o trial trial.c

and you want to change "trial" to "program." If you type

!!:s^trial^program^

or more simply

^trial^program

then the first change will be made. This will be the same as typing

cc -o program trial.c

This doesn't change the second occurrence. If you want to repeat it, then either type the same command a second time, or use the ":&" modifier, which says to repeat the last substitution. If you add a "g" before the "&" then the substitution is repeated on each word. Instead of typing

program abc.1 abc.2 abc.3 abc.4
program def.1 def.2 def.3 def.4

You could type

program abc.1 abc.2 abc.3 abc.4
# Now change 'abc' to 'def', but don't execute it
^abc^def^:p
# Now repeat it for each argument
!!:g&

The last step will echo "program def.1 def.2 def.3 def.4" and then execute it. This only changes the first occurrence in each word. Suppose the following lines were executed:

echo a aa aaa aaaa
^a^A^:p
!!:g&

This last line is equivalent to executing

echo A Aa Aaa Aaaa

The tcsh variation has an "a" modifier in addition to the "g" modifier. If present, then the change is made several times in each word. If your shell has this, you could change all lower case "a's" to upper case "A's."

:q - Quote modifier

The C shell and Bourne shell behave differently when a filename pattern is used, and nothing matches. Suppose you have a shell script, named "test_me," that says

# The test_me script, either sh or csh
echo $1

and you call it using:

test_me ?

The Bourne shell will echo "?" while the C shell will report "echo: No match." You can disable this error by setting the nonomatch variable:

#!/bin/csh -f
set nonomatch
echo $1

The other way to do this is to use the ":q" modifier, which quotes the variable:

#!/bin/csh -f
echo $1:q

:x - Quotes with spaces

Suppose you had a script that was called with the following argument:

test_1 'This is a question?'

This script has one argument. If you wanted to pass it to a second script, and wanted to retain the meta-characters like "?" and "*," simply use the ":q" modifier described above:

#!/bin/csh -f
# This is the test_1 script
test_2 $1:q
# or use
set nonomatch
test_2 "$1"

The second script will get the identical argument the first one has. But what would you do if you wanted to separate the words into arguments, but leave the question mark alone? You could use the nonomatch variable:

#!/bin/csh -f
# This is the test_1 script
set nonomatch
test_2 $1

Another choice is to use the ":x" modifier, which is like the ":q" modifier, but separates words when whitespace is found:

#!/bin/csh -f
test_2 $1:x
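
One way to see the difference for yourself is to count the words each modifier produces. This is a sketch with made-up variable names. If the modifiers behave as described above, I would expect the first count to be 1 and the second to be the number of space-separated words, but try it on your own shell.

```csh
#!/bin/csh -f
# Sketch: count the words produced by each modifier.
set arg = 'This is a question?'
set q = ( $arg:q )     # quoted, kept as a single word
set x = ( $arg:x )     # quoted, but split at white space
echo "with :q there are $#q word(s)"
echo "with :x there are $#x word(s)"
```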

:u and :l - Upper and Lower Case

The tcsh variant supports two additional modifiers. ":u" makes the first lowercase letter uppercase, and ":l" makes the first uppercase letter lowercase.

The End of History?

There are a few more points to mention. You can have multiple variable modifiers on a single variable. In some cases, braces are needed to specify the variable part. In complex cases, the tcsh variant of the C shell behaves differently.

Well, that is everything I know about the history feature of the C shell. It is confusing, but I tried to come up with examples of all of the main features. I hope you found this interesting.

C shell Aliases

I'm lazy.

If there is something that I do on the computer a lot, I look for ways to make my life easier. If there is something that takes ten seconds to type, I'll do whatever it takes to save nine seconds of those ten, even if it takes me days to accomplish this. Weeks, even. Months. Years. No. Not quite. I'm not compulsive, you know. Don't be silly.

C shell users have a very important tool to save those extraneous seconds: aliases. There are several alternative techniques to save keystrokes. In the last two columns I discussed the history function in detail. You can also create shell scripts. However, shell scripts run in a new shell process. You cannot use them to change your current shell process. Most of you have tried this, by creating a script that says something like:

#!/bin/csh -f
cd newdirectory
setenv VAR `pwd`

If you execute this script as shown below, will the directory be changed?

myscript;pwd;echo $VAR

The answer is no. The current shell executes a new shell, which changes its directory. However, this shell, being a child, does not affect the parent shell. Another technique is needed.

The third method is to use variables as commands. Many people forget about this particular approach. The following commands will execute the contents of the C shell script in the current shell, changing the current directory:

set DO = "source ~/bin/myscript"
$DO

The fourth way is to use aliases, which is the subject of this month's tutorial.

alias DO "source ~/bin/myscript"
DO

Advantages and disadvantages of C shell aliases

There are several advantages to aliases.

· You can save keystrokes.

· Simple aliases are easy to understand.

· Changes affect the current shell.

· Meta-characters can be passed without quoting.

But there are several disadvantages:

· Complex aliases have a confusing syntax.

· Certain characters must be quoted to pass them into an alias. This makes the syntax more complex.

· Aliases are defined on a single line. This makes it difficult to document complex aliases, and multiple-line aliases must have the newline quoted.

· Because the C shell syntax requires certain words at the beginning of the line, it is difficult to include commands like while or foreach in the alias definition. Using these in aliases requires additional quoting, adding more complexity.

Bourne shell functions were added long after C shells had the alias. They act like aliases, but do not suffer from the above problems. Instead, Bourne shell functions have the following advantages:

· Bourne shell functions have the same syntax as shell scripts.

· Bourne shell functions can be easily documented, especially those that are several lines long.

· Bourne shell functions don't require additional quoting, and the quoting is more consistent.

All-in-all, the concept of C shell aliases is flawed. They are fine for simple definitions, but if you need anything more complex, I suggest you use a shell script (in whatever shell you prefer) or define an alias to source the script, as I did above.

A file for aliases

I have the following in my .cshrc file:

if ( -f ~/.aliases && -o ~/.aliases ) source ~/.aliases

This sources my alias file if the file exists and I own it. This isn't high security, but I no longer wake up screaming in the middle of the night. You can make sure that you only read this file once, by using a special variable:

if ( ! $?ALIAS_def ) then
if ( -f ~/.aliases && -o ~/.aliases ) source ~/.aliases
endif

Inside the file .aliases you need to add the line:

set ALIAS_def

Some people include the following aliases:

alias rm rm -i
alias mv mv -i
alias cp cp -i

This is a good idea. But remember that if you change to another user ID, (like the superuser), these aliases may not be there. So don't assume they always exist.

Aliases can also abbreviate commands:

alias a alias
# create aliases for "more" and "emacs"
a m more
a e emacs

There are variations of the ls command, that I've seen many people define as aliases:

alias lf ls -F
alias la ls -a
alias li ls -i
alias ll ls -l
# yada yada...

These aliases, composed of a single command and options, allow arguments:

lf A*
ll *.c

Disabling aliases

There are two ways to disable an alias. The first is to remove the definition:

unalias rm mv cp

If you want to ignore an alias once, then quote it. That is, make part of the command quoted. This by-passes the alias expansion. Examples:

\rm file*
"mv" file1 file2
""cp new old

Aliases can refer to other aliases:

alias ll ls -l
alias llF ll -F

The second alias uses the first alias. Be careful of alias loops:

alias AA BB
alias BB AA

If you execute either command, the C shell will report "Alias loop."

Alias with pipes

Aliases can be several commands, piped together:

# do an ls -l, but only list directories
alias ldir 'ls -l|grep "^d"'

I often use the aliases below. They sort various files, based on size, date, or accessed date:

# sort by date:
alias newest 'ls -lt | head -30'
alias oldest 'ls -ltr | head -30'
# sort by size:
alias biggest 'ls -ls|sort -nr | head -30'
alias smallest 'ls -ls|sort -n | head -30'
# sort by time last used
alias popular 'ls -lut | head -30'
alias unpopular 'ls -lutr | head -30'
# Sort by link time
alias lsst 'ls -lct | head -30'
alias lsstr 'ls -lctr | head -30'

These last few aliases, however, do not take arguments in the proper place. Typing

biggest [A-Z]*

is the same as

ls -ls|sort -nr | head -30 [A-Z]*

This isn't what a user might expect.

Arguments to aliases

It is possible to pass arguments to aliases. The syntax for aliases is based on the same crufty syntax for the history mechanism. The string "!*" matches all arguments. The exclamation mark must be quoted by a backslash:

# sort by date:
alias newest 'ls -lt \!* | head -30'
alias oldest 'ls -ltr \!* | head -30'

Some aliases will only work with a single argument. That is, if you give it two arguments, the program will complain with an error message. Let's assume you want an alias to recursively search for a file. That is, the alias will search from the current directory, and go inside any subdirectories. The only argument is a pattern that describes the filename:

% findfile test*
-rw-r--r-- 1 barnett staff 22027 Nov 9 23:32 ./test.pl
-rw-rw-r-- 1 barnett staff 18520 Aug 26 12:13 ./Old/test.out
-rwxrwxr-x 1 barnett staff 18557 Sep 12 09:43 ./Tmp/test.sort

Notice the asterisk is not quoted. If this was a shell script, you would have to quote the meta-character like this:

findfile test\*
or
findfile 'test*'

The C shell alias doesn't need to have meta-characters quoted, because the alias substitution happens before filename meta-characters are expanded.
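
The definition of "findfile" is not shown above. One possible definition, offered only as a sketch and not necessarily the alias that produced the listing, uses the standard find command. The double quotes around the argument are an assumption on my part: they keep the pattern from being expanded in the current directory once the alias substitution has been made.

```csh
# hypothetical definition; the find options are an assumption
alias findfile 'find . -name "\!:1" -exec ls -l {} \;'
```

With this in place, "findfile test*" hands the pattern "test*" to find untouched, and find does the matching in each subdirectory.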

You can reuse a single argument. Here is an alias I use when I want to work on a copy of a file, and leave the original alone, keeping the date the same:

alias orig "mv \!:1 \!:1.orig;cp \!:1.orig \!:1"

Typing "orig file.c" does the following:

mv file.c file.c.orig
cp file.c.orig file.c

I use this alias, because it keeps the original version of the file unchanged. If I ever examine the dates of the new and old file, the original file has the original date. If I just executed "cp file.c file.c.orig" and then edited "file.c" the dates of the two files would be close to identical.

Compiling C programs can be complex, especially if you do not have a makefile to go with them. Here are some aliases that can be used to compile C programs:

alias ccc 'cc -o \!:1:r \!*'
alias ccg 'cc -g -o \!:1:r \!*'
alias cco 'cc -fast -o \!:1:r \!*'
alias ccx 'cc -o \!:1:r \!* -L${OPENWINHOME}/lib -lX11'

Use them as follows:

ccg main.c part1.c part2.c

This compiles programs without needing a makefile. Here is one that is useful to kill a process:

alias slay 'set j=`/usr/ucb/ps -ax| grep \!*|head -1`; \
kill -9 `echo $j[1]`'

Notice how it uses a C shell array to extract the first word on the line (i.e. the PID or the Process Identification Number).

You can pick and choose the arguments you want. If you wanted an alias to move to a directory, with the directory as the first argument, use

# as in
# moveto directory *.c
alias moveto 'mv \!:2* \!:1'

Some examples of arguments inside an alias follow:

+----------------------------------+
|        Aliases arguments         |
+----------------------------------+
|Variable  Meaning                 |
+----------------------------------+
|!:1       Argument 1              |
|!:2       Argument 2              |
|!*        Arguments 1 through n   |
|!:1-$     Arguments 1 through n   |
|!:1*      Arguments 1 through n   |
|!$        Nth argument            |
|!:-       Argument 1 through n-1  |
|!:2-      Argument 2 through n-1  |
|!:2-$     Argument 2 through n    |
|!:2*      Argument 2 through n    |
|!:2-4     Arguments 2, 3 and 4    |
+----------------------------------+
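
To see which words each entry in the table selects, you can define a throwaway alias that just echoes its arguments. The alias name "showargs" is made up for this sketch, and each exclamation mark is quoted with a backslash, as required inside an alias definition:

```csh
# hypothetical alias, only for experimenting with the table above
alias showargs 'echo first: \!:1 , rest: \!:2* , last: \!$'
showargs a b c d
```

With the four arguments shown, "!:1" selects "a", "!:2*" selects "b c d", and "!$" selects "d".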

Redefining commands

You can redefine any command with an alias:

alias cat mycat
alias /bin/cat mycat
alias /usr/bin/cat mycat

You can also "trap" commands, and execute a special command:

alias command 'source ~/.command; /bin/command'

Multiple-line aliases

You can create an alias that spans over several lines. Here is one that compiles a C program. If you tell it to compile a program that does not end with a ".c," an error occurs. That is, if you type:

C file1.f

The message is this:

Error: this is for C programs only
Extension should be .c on file file1.f, not .f

On the other hand, if you typed

C prog.c file1.c file2.c file3.c

and the file "prog" existed, the alias would do the following:

mv prog prog.old
cc -o prog prog.c file1.c file2.c file3.c

Here is the alias:

alias C 'eval "if (\!:1 \!~ *.c) then \\
echo "Error: this is for C programs only" \\
echo "Extension should be .c on file \!:1, not .\!:1:e" \\
else \\
if ( -e \!:1:r ) mv \!:1:r \!:1:r.old \\
cc -o \!:1:r \!:1:r.c \!:2* \\
endif \\
"'

Notice how the alias includes an "eval" call. This makes sure the meta-characters are expanded. They are quoted in the alias so that they stay unchanged while the alias is being defined, and the "eval" command reverses this quoting. Also note the liberal use of the ":r" variable modifier.

I have tried to use aliases that use foreach and while commands. To be honest, I find them impossible to write. I suggest you use a shell script if possible. There are, however, times when a shell script doesn't work well. Suppose you want to source several files. You could type

foreach i ( *.alias )
source $i
end

This, however, could not be done with a shell script. You can try it, but your current shell would not know about the aliases, because a different shell is executing the commands. Below is an alias that will solve the problem, however. It has the advantages of an alias, without the disadvantages of trying to get the C shell to work on one line. The alias is

alias FOREACH 'set a = \!:1; \
set b = (\!:2*); \
source ~/source_alias/FOREACH.csh;'

Really, this alias is in two parts. The first part, the alias itself, sets two variables and sources a file. That file is the second part. I have created a special directory that contains these sourced aliases. This allows me to source them in easily, and to keep them in one location. The file looks like this:

#!/bin/echo use source to execute this command
# variable a = filename metacharacter
# variable b = command to execute
foreach i ( $a )
eval $b $i
end

To use this, type

FOREACH *.csh source

Note how the first line starts with a "#!/bin/echo" command. This is to make sure that people source the file, and not execute it.

Now, for a special treat, (I'm proud of this next trick) I have a more powerful version. Suppose we wanted to take each file, and rename it with a ".old" extension. That is, suppose you type:

FOREACH *.csh mv # #.old

The character "#" is used as a special character, and it changes to match each argument in the list. This does the same as typing the following, but uses only one line.

foreach i ( *.csh )
mv $i $i.old
end

Since this is starting to get a bit complex, I combined the definition of the alias with the script itself. You source this file to define the alias. You also source the same file to execute the script. Therefore both parts are in the same file:

#!/bin/echo use source to execute this command
# variable a = filename metacharacter
# variable b = command to execute

# source this to define the alias
if ( ! $?FOREACH_def) then
alias FOREACH 'set a = \!:1; \
set b = (\!:2*); \
source ~/source_alias/FOREACH.csh;'
set FOREACH_def
else
echo $b | grep '#' >/dev/null
# remember status
set r = $status
foreach i ( $a )
if ( $r == 0 ) then
# change '#' to $i
set B = `echo $b | sed 's/#/'$i'/g'`
else
# Add a '$i' to the end, if it is missing
set B = `echo $b | sed 's/$/ '$i'/'`
endif
eval $B
end
endif

This script is upwardly compatible with the previous script. If you typed

FOREACH *.csh source #

this does the same as typing

FOREACH *.csh source

In other words, if you omit the "#" it assumes there is one at the end. However, this script does not remove the extension. If I wanted a script that converted all files ending in .c to files ending in .old.c, I could not use the above script. Here is another one, slightly different, called "FOREACHr," that removes the extension from the arguments. I added an "r" on the end to remind me that it uses the ":r" variable modifier. To use it, type

FOREACHr *.c mv #.c #.old.c

The script is almost identical to the previous:

#!/bin/echo use source to execute this command
# variable a = filename metacharacter
# variable b = command to execute

# source this to define the alias
if ( ! $?FOREACHr_def) then
alias FOREACHr 'set a = \!:1; \
set b = (\!:2*); \
source ~/source_alias/FOREACHr.csh;'
set FOREACHr_def
else
echo $b | grep '#' >/dev/null
# remember status
set r = $status
foreach i ( $a )
# Here is the different part
set j = $i:r
if ( $r == 0 ) then
# change '#' to $j
set B = `echo $b | sed 's/#/'$j'/g'`
else
set B = `echo $b | sed 's/$/ '$j'/'`
endif
eval $B
end
endif

I hope you enjoyed my examples. I tried to illustrate the important points, and give useful examples. Next month, I'll tackle directories and prompts.

Directories and the C shell

This section talks about the "cd" command. Wait! Come back. Sure, everyone knows this command. It's the first command UNIX users learn. Sounds like an excellent reason to discuss it. Ah, you question my sanity. It's okay. I'm used to it. It's a good topic, because it has subtle abilities and most people are ignorant of them.

Directories and paths are either absolute (starting with "/") or relative (starting with any other character). The directories "." and ".." are special. The first specifies the current directory, and the second the parent of the current directory. The command

cd ./dir/../dir/../.

will not change the current directory.

If you get lost, or you simply want a quick way to go back to your home directory, the command

cd

with no arguments takes you to your login directory.

Normally, when you start up your window system, each new window starts up with the working directory being the same as the one you were in when you started up the windowing system. However, "cd" with no arguments still takes you to your home directory. Alternately, you can change your home directory:

set home = /usr/project

Then every time you type "cd" or "cd ~" you will go to the projects directory instead of your normal home directory. The variable "home" is special because it is automatically copied to the "HOME" environment variable, and exported. The character "~" is an abbreviation of the variable "$home."

Your true home directory, or any user's home directory, which is determined by the entries in the "/etc/passwd" file, can be accessed by

cd ~username

even if you change the variable "home."

cdpath - favorite directories

You may occasionally type the "cd" command and get the error "No such file or directory." Perhaps you forgot which directory you were in, or forgot to add "~/" or "../" before the name. This can be fixed. If you have a set of favorite directories where you normally create new directories, you can specify this set by setting the variable "cdpath." This variable accepts a wordlist. Example:

set cdpath = ( .. ~/Src /usr/projects ~ )

Now when you type

cd prog

the system will look for

./prog
../prog
~/Src/prog
/usr/projects/prog
~/prog

in that order.

Variables

Another way to access favorite directories is to define variables with the names of the directories. Example:

set Src = "~/Src"
set P = "/usr/projects"

You can now use the commands

cd $Src
cd $Src/program
cp $P/data .

The command "cd" will test for one more condition before reporting an error. It will look for a variable with the same name as its argument. Therefore the two commands below are equivalent:

cd $P
cd P

Remember that the C shell first looks for the directory in the current directory, then the list in "cdpath," and finally the variable.

This can be used by adding the following alias to your .cshrc file:

alias = 'set \!:1=$cwd'

Then you can type

= A

This remembers the directory. Whenever you type

cd A

you go to that directory. You can also go to another directory, and refer to the old directory:

= A
cd NewDirectory
cp $A/file .

Remember that only "cd" will test for a variable. So commands like

cp A/file .

won't work.

You can also use aliases. Executing the command

alias projects 'cd /usr/projects'

will define a shortcut to specify a change in a directory.

Specifying the current directory

There are two ways to find out what your current working directory is. The first is with the program "pwd" - print working directory. The second is with the variable "cwd," which can be examined by

echo $cwd

or by creating a new alias

alias cwd 'echo $cwd'

Examining the internal variable "cwd" is faster than executing the stand-alone program "pwd," but there is one subtle difference. If you access a symbolic link in your directory path, then "pwd" will show the actual directory path, while "cwd" will show the path you used to get to the directory. For instance, the command

cd /usr/tmp; pwd; echo $cwd

might output the following:

/var/tmp
/usr/tmp

The first is the actual directory, while the second is the directory you think you are in. You can force directories to be displayed with the true path by the C-shell command

set hardpaths

NOTE: what about cd /usr/tmp/..?

The C-shell also provides a convenient method of changing between a common set of directories. It keeps a "directory stack" - which can be displayed with the command

dirs

There is an optional "-l" argument which expands the "~" abbreviations.

You can make use of this stack by using the command "pushd" - push directory - instead of "cd." They both change directories, but "pushd" adds your old directory to the stack, and puts the new directory on the top of the same stack.

Here is an example (lines starting with "%" are typed by the user)

% dirs
~
% pushd /usr; dirs; pwd
/usr ~
/usr

To return to the old directory (i.e. the second directory on the stack), use the command

popd

If you merely wish to exchange the first and second directory, the command "pushd" with no arguments will do this. If you repeat this, you go back again. "Pushd" is very convenient when you wish to go back and forth between two directories.

"Cd" only changes the top directory. If, therefore, you are in a projects directory, but are interrupted, you can "pushd" to the new directory, and "cd" to several other directories. When you are done, "popd" will return you to the directory before the interruption.

Both "pushd" and "popd" can have an argument consisting of the character "+" followed by a number. "popd" will discard the "nth" entry in the stack, without changing the current directory. "pushd" will change the current directory to the "nth" entry, rotating the entire stack so that entry is now on top.

For instance, if you had the stack

/usr/spool/news/comp/sys /usr/spool/news/comp /usr/spool/news /usr/spool /usr

the command

pushd +2

will change the stack to

/usr/spool/news /usr/spool /usr /usr/spool/news/comp/sys /usr/spool/news/comp

bringing entry 2 to the top, entry 3 to be second from the top, etc. The top of the stack is entry 0.

Some people like to keep their current stack of directories in an array. This is simple:

alias pushd 'pushd \!* && set dirs = (`dirs`)'
alias popd 'popd \!* && set dirs = (`dirs`)'

This allows you to use "$dirs[2]" in command lines, and to search for files:

cp $dirs[2]/file $dirs[3]
foreach dir ( $dirs )
if ( -f $dir/DATA ) echo $dir/DATA found
end

You can clear the stack by typing

repeat 9 popd

Others like to create aliases:

alias +1 pushd +1
alias +2 pushd +2
alias +3 pushd +3
alias -1 popd +1
alias -2 popd +2
alias -3 popd +3

chdir versus cd

Be aware that "cd" is built into the shell. The command "chdir" is a synonym, convenient when creating aliases for "cd." So executing cd in a shell script will not change your current shell's directory. If you want to do some clever things, like changing your prompt to reflect your current directory, you have to use the alias feature. But when the alias is combined with the "source" command in the C-shell, a lot of flexibility is available. But more on that later.

This alias will echo your current directory whenever you change it.

alias cd 'cd \!*;echo $cwd'

However, you must be careful of alias loops. As a general rule, I use "chdir" instead of "cd" in aliases, to prevent a potential loop. Therefore the above should be:

alias cd 'chdir \!*;echo $cwd'

Suppose you want to execute a special command whenever you change directories. Or you want to change your aliases whenever you change directories. The following example will execute the commands in the file ".dir" whenever you change directories and it finds the file ".dir" in the directory and you also own it.

alias cd 'chdir \!*; if ( -f .dir && -o .dir ) source .dir '

The test for ownership, "-o," protects you from executing someone else's copy of ".dir." You could instead use two files, say called ".in" and ".out" and execute one when you enter a directory, and another when you leave. Both choices are dangerous, because someone might have write access to one of your files, and trick you into executing a command.

These aliases allow you to move up and down directories, and the command "." is easier to type, and faster, than "pwd."

alias . 'echo $cwd'
alias .. 'set dot=$cwd;cd ..'
alias , 'cd $dot '

Some people simply use the alias

alias cwd echo $cwd

Many people like to have their prompt change to show the current directory. Others like to display the history number. All of these can be done. The character "!" has a special meaning when the "prompt" variable contains it. It is expanded to be the current command number. You don't need to type the "history" command to learn the number of a previous command. Just type

set prompt="! % "

Or create an alias

alias SP 'set prompt="! % "'
SP

Others like to show the current directory in the prompt. This can be combined with the cd alias above:

alias cd 'chdir \!*;SP'
alias SP 'set prompt="! $cwd % "'

Finally, here is an elaborate example that displays

command number
hostname
user
current directory

with each prompt with the format of "10 hostname {username} cwd >"

alias cd 'chdir \!*;SP'
alias SP 'set prompt="! `hostname` {$USER} $cwd > "'

If you would rather have just the base directory (the last directory in the path) instead of the entire path, try:

alias SP 'set prompt="! `hostname` {$USER} `basename $cwd` > "'

or

alias SP 'set prompt="! `hostname` {$USER} $cwd:t > "'

instead of the above line.

Some people even use multi-line prompts:

set hostname = `hostname`
alias SP 'set prompt="\\
${hostname}::${cwd}\\
! % "'

Next month, I will explain my system for integrating directory management with the window system. I have used more elaborate aliases that execute programs which provide directory selection from the current stack using cursor keys. But that is beyond this tutorial. Instead, I would like to end with a popular set of aliases that changes the black bar on top of each SunView window to show the current hostname and directory stack. In addition, it adds the following commands:

+---------------------------------------+
|Command   Function                     |
+---------------------------------------+
|+         alias for "pushd"            |
|-         alias for "popd"             |
|~         alias for "cd ~"             |
|.         update top stripe, show jobs |
|..        go up one directory          |
|,         go back down (used with ..)  |
|--        go to last directory         |
+---------------------------------------+

This was documented in the SunOS Release 3.5 manual, but that version had some errors. The characters "^[" below must be changed to an escape character.

#!/bin/sh
# Zappath: created by Bruce Barnett
# this script edits the output of the C shell "dirs"
# command and shortens the strings to a compact form
#
# Some example editing:
#
# remove the automount /tmp_mnt prefix
# change /usr/export/home to /home
# change /homedisk to /home
# change $HOME into ~
# change /home/abc to abc
# change /usr/etc to ./etc
# change */*/ABC to ./ABC
# one more thing:
# must change ~ to be -
sed '
s:/tmp_mnt::g
s:/usr/export/home:/home:g
s:/homedisk:/home:g
s:'${HOME}':~:g
s:/home/::g
s:/usr/:./:g
s:[^/ ]*/[^/ ]*/:./:g
s:~:-:g
'


# C shell aliases to display hosts, directories, and directory stacks
# in the window title bar, icon string, and directory prompt
# created by Bruce Barnett
# Based on the SunOS 3.5 Release notes
# Usage:
# if ( -f ~/.header && -f ~/bin/Zappath ) source ~/.header
#
# the SP alias sets the directory prompt
# the setbar alias changes the window title bar and icon string
# define them here as the default case

alias SP 'set prompt="$cwd:t% "'
alias setbar 'echo \!* >/dev/null'

# If not a sun cmdtool or shelltool, or not an X terminal, exit.
if ( ! ( $term =~ sun* ) && ! ( $term =~ xterm )) goto getout

# if using a raw console, don't do anything
if ( `tty` =~ /dev/console ) goto getout

# set a few variables for later
# but only if the environment variable is not set

if ( ! $?HOSTNAME ) then
setenv HOSTNAME `hostname`
endif

# find the home machine
# is there a file that has a machine name in it?
if ( -f ~/.display ) then
set mymachine = `sed -e 's/:.*//' <~/.display`
else
# obviously change this to match your
# default system. Mine is "grymoire" - of course
set mymachine = "grymoire"
# alternately, some people use "who am i"
endif

set console = '<< CONSOLE >>'

# figure out how to generate escape, bell,
# and echo commands without a line terminator
# I may have done this before. If so, the variable E is set

# have I executed this script before on this system?
if ( $?E ) then
# echo "already set the echo variables">/dev/tty
else if ( -f ~/.echo.${HOSTNAME} ) then
source ~/.echo.${HOSTNAME}
else if ( `echo -n |wc -l` == 0 ) then
# echo "built in echo is bsd" >/dev/tty
# then berkeley style echo
echo 'set ech = "echo -n"' >~/.echo.${HOSTNAME}
echo "set E = `echo a | tr a '\033'`" >> ~/.echo.${HOSTNAME}
echo "set B = `echo a | tr a '\007'`" >> ~/.echo.${HOSTNAME}
echo 'set N = ""' >> ~/.echo.${HOSTNAME}
source ~/.echo.${HOSTNAME}
else
# echo "built in echo is sysV" >/dev/tty
echo 'set ech = "echo"' >~/.echo.${HOSTNAME}
echo 'set E = "\033"' >> ~/.echo.${HOSTNAME}
echo 'set B = "\007"' >> ~/.echo.${HOSTNAME}
echo 'set N = ""' >> ~/.echo.${HOSTNAME}
source ~/.echo.${HOSTNAME}
endif

# Are we using shelltool, cmdtool or xterm?
# duplicate these aliases here to avoid problems
if ( $term =~ sun* ) then
# Sun Aliases
alias Header '${ech} "${E}]l\!:1${E}${N}"'
alias IHeader '${ech} "${E}]L\!:1${E}${N}"'
else if ( $term =~ xterm ) then
alias Header '${ech} "${E}]2;\!:1${B}${N}"'
alias IHeader '${ech} "${E}]1;\!:1${B}${N}"'
endif

# There are three different combinations:
# 1) A window on a remote machine
# 2) A window on my machine
# 3) A console on my machine

# test for each case:

if (${HOSTNAME} != $mymachine ) then
# it is a remote machine, therefore:
# window title has machinename and dirs
# icon has machine name only
# prompt has machine name+directory

alias setbar 'Header "${HOSTNAME}: `dirs | ~/bin/Zappath`" ; IHeader ${HOSTNAME}'

# use either of these two lines to suit your tastes
# the first one shows the command number and full directory path
# The second shows just the hostname and the tail of the directory name

# alias SP 'set prompt="[\!] ${HOSTNAME}:$cwd % "'
alias SP 'set prompt="${HOSTNAME}:$cwd:t% "'

else if ( `tty` == "/dev/console" ) then
goto getout
else if ( `tty` == "/dev/ttyp0" ) then

# in this case an assumption is made that the first pty
# window to appear is the console window.
# It's not always true, but it usually works

# window title has <<CONSOLE>> and dirs
# icon has "console" only
# prompt has directory

alias setbar 'Header "${console} `dirs | ~/bin/Zappath`"'

# both of these work - pick one that suits you
# alias SP 'set prompt="[\!] $cwd % "'
alias SP 'set prompt="$cwd:t% "'
else
# a plain window on my localhost
# window title has dirs
# icon has cwd only
# prompt has directory
# The next line must be one line, and not split onto two
alias setbar 'Header "`dirs | ~/bin/Zappath `"; IHeader "`echo $cwd| ~/bin/Zappath`"'
# alias SP 'set prompt="[\!] $cwd % "'
alias SP 'set prompt="$cwd:t% "'
endif

# redo current window
alias . 'dirs|~/bin/Zappath;setbar;jobs'

# change cd to change prompt, window and icon title
alias cd 'chdir \!*; SP;setbar'
alias pushd 'pushd \!*; SP;setbar'
alias popd 'popd \!*; SP;setbar'

SP;setbar
getout:
# end


Thanks

My other Unix shell tutorials can be found here. Other shell tutorials can be found at Heiner's SHELLdorado and Chris F. A. Johnson's Unix Shell Page. This document was translated by troff2html v0.21 on September 22, 2001.