Friday, March 02, 2012

C-shell: basic features compared to ksh

Sign-on files

The following files, if they exist, are read at sign-on:

/etc/csh.login instead of /etc/profile - login shells only
~/.cshrc instead of $ENV
/etc/csh.cshrc if supported; not all C shells use this
~/.login instead of .profile - login shells only

On sign-off the optional file ~/.logout is executed (no equivalent in ksh).

In-line scripts

To execute a script in-line, that is without creating a sub-shell, use the source command (instead of the . command), as in bash. For example:

source ~/.login


Filename Completion

An interactive C shell can complete a partially typed filename or user name. By default this feature is switched off. To turn it on:

set filec


When the user types characters followed by the Escape key (ESC), the shell attempts to fill in the remaining characters of a matching filename, assuming the characters typed are enough to identify a single file.

If the filename prefix is followed by EOF (usually Ctrl-D), the shell lists all filenames that match. It then prompts once again, supplying the incomplete command line typed so far.

Starting the name with a tilde (~) indicates that a username is required instead.

Aliases

The syntax for alias in the C shell is slightly different to other shells:

alias new-alias existing-command

i.e. there is no assignment character (=).

Functions

No. There are none.

History mechanism

The history list (and the history command) exists as in other shells, but the syntax for retrieving lines from it is different from ksh. The following table refers to lines in the history list:


!n             The nth line in the history list
!-n            The nth previous line
!!             The last line
!prefix        The most recent line with the specified prefix
^xx^zz         The last line, with the string xx replaced by zz
!n:s/xx/zz/    The nth line, with the string xx replaced by zz
!*             All the arguments of the last command
!$             The last argument of the last command (instead of $_)
!^             The first argument of the last command
!:n            The nth argument of the last command


This is also supported by bash.

Variables

Pre-defined variables differ from those in other shells; the most noticeable difference is that many exist in both lower and upper case. The list of variables is large, so here are a few:

The exit status of the previous command is $status (instead of $?).

The command line prompt is $prompt (instead of $PS1). It may contain a large range of format characters, too many to list here; see the man pages for details. Some versions may also support $prompt2 and $prompt3.

Command line arguments are set in the array $argv (see below).

The current working directory is in $cwd (not $PWD).

Lowercase versions of path, home, and term also exist. The lowercase version of path is set from PATH, but is only used by the shell. This does not mean that path can be altered without affecting PATH, since the two are kept synchronised. A similar scheme exists with term, home, and others. Only the upper-case versions are exported. Note that path is a list, i.e. the directory names are separated by white-space, not colons.

Setting values

Local variables may be assigned values using set (which is more like BASIC than C), for example:

set var = "She sold sea-shells"


Note that spaces are optional around the assignment symbol (=) – hooray! The = is mandatory.

To create a variable in the environment block (export), use setenv:

setenv var "She sold sea-shells"

They may be removed using unsetenv.

There must be no assignment symbol (=) with setenv.

If you have an existing local variable of the same name then it is not overwritten; a separate environment variable is created. For example:


$ set var=red
$ setenv var blue
$ echo $var
red
$ csh # Create a child
$ echo $var
blue
$ exit


In the example above, if we had just done
    setenv var 

then the child process would see an empty variable.

Arrays

Like ksh and bash, csh cannot export arrays. However, unlike ksh and bash, csh does not allow you to try: attempting to setenv an array gives an error.

An array is initialised from a list (as in bash and ksh93), where the elements are space separated and delimited by parentheses, for example:

set arry = (HPUX AIX DGUX Dynix Tru64 DRSNX SunOS Linux) 


The old ksh88 method of initialising an array with set -A is not supported.

Access an element of the array using the index, counting from 1 (which is weird, since in C we count from zero).

echo $arry[3]
DGUX

We can also access a range of elements:

echo $arry[2-4]
AIX DGUX Dynix

and the whole array using an index of * :

echo $arry[*]
HPUX AIX DGUX Dynix Tru64 DRSNX SunOS Linux

This is the same as printing the array variable (echo $arry).

The number of elements in the array is held in a variable with the same name as the array, prefixed with a #, for example:

echo $#arry
8

so getting the last element is simple:

echo $arry[$#arry]
Linux

With other shells we would have to use braces ({…}), which are not needed in this context, but may still be used to delimit a variable name:

set money = "dollar"
echo "Give me your ${money}s"


Command-line arguments

The list of command line arguments is held in the array argv. To get the complete list of arguments, use $argv[*] (instead of $* or $@). The variables $0..$n are still available, and $1..$n may alternatively be obtained using $argv[n]. A C programmer would also expect a variable called argc, but the number of arguments is held in $#argv instead. See also Arrays.

The command shift is supported as in ksh and bash.

Modifiers

The C shell also supports modifier codes. These are used by suffixing a variable name with a colon (:), followed by the code letter:

h    remove a trailing path name component (similar to dirname)
t    remove all leading path name components (similar to basename)
r    remove a file "extension" (a suffix preceded by a dot)
e    remove everything except the file "extension"

When used on a whole list or array, they should be preceded by a g (gh, gt, gr, ge) otherwise they will only be applied to the first element. The exception is the modifier code q, which places the extracted data in quotes (thus avoiding $*/$@ differences).

For example:

set file = /home/user1/Seas/North.c

echo $file:h
/home/user1/Seas

echo $file:t
North.c

echo $file:r
/home/user1/Seas/North

echo $file:e
c



Other variable operations:

Shell Options

There are no shell options of the kind ksh and bash have. Instead everything is controlled by shell variables, many of which have the same names as options in ksh, such as noclobber, ignoreeof, and noglob. See the man pages for a complete list.

To set a shell option you normally only have to create the variable; you often do not need to give it a value. For example:

set ignoreeof


Variable types

No, not in csh: there is no equivalent of the ksh typeset command. For example, to convert to uppercase we have to use:

set var = `echo $var | sed 'y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/'`

Reading from stdin

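To read a variable from the keyboard (stdin), use $<, which is replaced by the next line of input, for example:

set var = $<

Used on a line of its own, $< simply waits for the user to press Return, which is handy after a "to continue…" prompt.

Command substitution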

Command substitution uses back-ticks ($(…) is not supported), for example:

set DIR = `ls`

Note that this will return a list, and make DIR into an array. This reduces the need for slicing, for example:

set now_date = `date '+%a %H %M %S'`

returns a four element array, where:

$now_date[1] is the weekday
$now_date[2] is the hour
$now_date[3] is the minute
$now_date[4] is the second

Arithmetic

Arithmetic commands are prefixed with the @ symbol, which is actually a built-in command and so must be followed by a space.

The usual C/C++/Java arithmetic operators, including post-fix increment and decrement (but not pre-fix), are supported, for example:

set x = 0 # variable must already exist
set j = 0 # variable must already exist
@ x++ # Increment $x
@ j = ($j + 42) * $x # Parentheses are supported

# To continue over > 1 line, escape end-of-line

@ now_secs = ($now_date[2] * $hour_secs) + \
($now_date[3] * 60) + $now_date[4]


Conditionals

As well as the usual conditional statements (see below) the C shell supports a short cut for testing if a variable exists, the $? prefix. Confusingly it returns 0 if the variable does not exist, and 1 if it does. For example:

echo $?non_existent_variable
0

echo $?existing_variable
1

Relational operators

!=    not equal
==    equal (note: two = characters)
>     greater than
>=    greater than or equal to
<     less than
<=    less than or equal to
=~    matches a pattern
!~    does not match a pattern

Loops

As in most shells, there are two basic loops: the foreach loop, for list processing, and the while loop. The syntax is familiar, but note that, as in Perl, the foreach loop does not contain an in:

foreach variable-name ( list )

# loop body

end


while boolean-expression

# loop body

end

The commands break and continue have the same meaning as in most languages: break exits the loop prematurely, and continue starts the next iteration at once.


Unconditional flow


The command onintr is similar to trap in ksh: it executes specific code on a signal, but it cannot pick specific signals to trap. There are three forms:

onintr          use default signal handling
onintr -        ignore signals
onintr label    jump to the specified label and continue execution from there

The syntax of a label is: label:

There is also a goto command, which jumps unconditionally to a label.

Redirection

Standard channel (stdin / stdout / stderr) redirection is similar to, but not the same as, other shells. Channel numbers are not used:

command > filename      redirect stdout
command >> filename     append stdout
command >& filename     redirect stdout and stderr to the same file
command >>& filename    append stdout and stderr to the same file

The only way to direct stdout and stderr separately is by invoking a subshell, for example:

(command > out_file) >& err_file

Pipes are supported as normal, but co-processes are not.

The set command sets shell variables, so to set option noclobber:

set noclobber

That is, no -o. To override noclobber, append a !, as in:

command >! filename

Background jobs

Very similar to ksh – in fact Korn "stole" the idea from csh. The jobs and bg commands are supported.

Quote from man csh

"Although robust enough for general use, adventures into the esoteric periphery of the C shell may reveal unexpected quirks". You have been warned!


That Python next/send demo

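import glob
import os

def get_dir(path):
    # Yield each sub-directory of path; send() a new path to switch directory
    while True:
        pattern = path + '/*'
        count = 1
        for file in glob.iglob(pattern):
            if os.path.isdir(file):
                print(count)
                count += 1
                # next() makes yield return None; send(x) makes it return x
                path = yield file
                if path: break

        # No new path was sent, so the directory is exhausted
        if not path: break


gen = get_dir('C:/QA/Python')

print("about to next")
print(next(gen))
print(next(gen))
print(gen.send('C:/QA'))
print(next(gen))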

Gives:


about to next
1
C:/QA/Python\AdvancedPython
2
C:/QA/Python\Appendicies
1
C:/QA\Android
2
C:/QA\blanket
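
The same next/send pattern can be tried without a particular directory layout. What follows is only a minimal sketch for Python 3: the generator name walk_items and the sample lists are made up for illustration and are not part of the original demo. It walks a list, and a value passed to send() replaces the list being walked, just as a new path replaced the directory above.

```python
def walk_items(items):
    """Yield items one at a time; send() a new list to switch to it."""
    while True:
        replacement = None
        for item in items:
            # next() resumes here with None, send(x) resumes here with x
            replacement = yield item
            if replacement:
                break
        if not replacement:
            break              # list exhausted and nothing new was sent
        items = replacement    # start again on the list that was sent


gen = walk_items(['a', 'b', 'c'])
first = next(gen)
second = next(gen)
switched = gen.send(['x', 'y'])   # abandons 'c' and restarts on the new list
after = next(gen)
rest = list(gen)                  # nothing left, so the generator finishes

print(first, second, switched, after, rest)
```

Note that send() itself returns the next yielded value, which is why the first item of the new list comes back from the send() call rather than from the following next().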

A little Python introspection

In Python we can load a module and assign an alias:

import module-name as alias-name

I was asked how a programmer can find the mapping between an alias and the real module. That was rather beyond the scope of the course, but here goes.

The locals() built-in, and its sister, globals(), were mentioned in the course almost as a 'by the way'. In fact both are very useful for introspection. In this case locals() will give us the names of the module aliases. It also gives us all the other names that are local, but we are only interested in modules. We can use inspect.ismodule() to get only those names that refer to modules. Watch it: locals() returns a dictionary where the keys are the names, but the values are the objects themselves - handle with care. Fortunately, stringifying the object gives us a text string like this:

<module 'GetProcs' from 'C:\Python27\lib\site-packages\GetProcs.pyd'>

and a simple regular expression is enough to extract the names.

We still have issues. Most obvious is that any modules used for the introspection (inspect and re) are included in the data. In Python 3 we also get issues trying to read the locals dictionary because it changes during a loop (items() returns a view in Py3, not a list). We can solve both by putting the code into a different module (although there are other solutions). But then, how can we look at our caller's namespace? Simple: sys._getframe(1), which exposes just about everything, warts and all. The locals dictionary is available through f_locals.
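
The Python 3 problem can be reproduced with an ordinary dictionary standing in for the namespace. This is only a sketch: the function names and the sample dictionary are invented for illustration. It also shows one of the "other solutions", which is to loop over a list() snapshot of the items.

```python
def grow_while_iterating(namespace):
    """Add keys while looping over the live view; return the error text."""
    try:
        for name, value in namespace.items():
            namespace['copy_of_' + name] = value   # dictionary grows mid-loop
    except RuntimeError as exc:
        return str(exc)
    return None


def grow_over_snapshot(namespace):
    """Same loop, but over a list() copy of the items, so it is safe."""
    for name, value in list(namespace.items()):
        namespace['copy_of_' + name] = value
    return sorted(namespace)


message = grow_while_iterating({'glob': 1, 're': 2})
names = grow_over_snapshot({'glob': 1, 're': 2})

print(message)
print(names)
```

At module level the loop variables themselves are new names in the dictionary being read, which is how an innocent-looking loop over locals().items() ends up changing it.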

So, here is my module, which I named Spection:

import inspect,sys,re

def look():
    # Examine the caller's local names, not our own
    for name,val in sys._getframe(1).f_locals.items():
        if inspect.ismodule(val):
            fullnm = str(val)
            # Skip built-in modules and this module itself
            if not '(built-in)' in fullnm and \
               not __name__ in fullnm:
                m = re.search(r"'(.+)'.*'(.+)'",fullnm)
                module,path = m.groups()
                print("%-12s maps to %s" % (name,path))

The user's code looks like this:

import Spection
import glob as wildcard
import ConfigParser as parser
import GetProcs as ps

Spection.look()

And the output looks like:

ps           maps to C:\Python27\lib\site-packages\GetProcs.pyd
parser       maps to C:\Python27\Lib\ConfigParser.pyc
wildcard     maps to C:\Python27\Lib\glob.pyc
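
For Python 3, a variation on the same idea is sketched below. It is not the Spection module itself: the function name module_aliases is hypothetical, it reads the module's __name__ and __file__ attributes instead of parsing the stringified object, and it returns a dictionary rather than printing. It assumes the standard library names glob and configparser (the Python 3 spelling of ConfigParser).

```python
import inspect
import sys


def module_aliases(namespace=None):
    """Return {alias: (real module name, file path or None)}."""
    if namespace is None:
        namespace = sys._getframe(1).f_locals    # the caller's local names
    found = {}
    for name, val in list(namespace.items()):    # snapshot, safe in Python 3
        if inspect.ismodule(val):
            # Built-in modules have no __file__, so fall back to None
            found[name] = (val.__name__, getattr(val, '__file__', None))
    return found


import glob as wildcard
import configparser as parser

aliases = module_aliases()
for alias, (real, path) in sorted(aliases.items()):
    print("%-12s maps to %s (%s)" % (alias, real, path))
```

Because everything here lives in one file, inspect and sys appear in the result as well, which is the issue described earlier; moving module_aliases into its own module, as with Spection, keeps them out.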