The Darke Side: 2012

Thursday, September 06, 2012

It's a bazaar

I never really agreed with Eric Raymond's seminal work "The Cathedral and the Bazaar". He was making the point that closed source software development takes place in a cathedral-like environment, i.e. hierarchical, whereas open source development was more like a bazaar where everyone is equal. I can name quite a few open source projects that are hierarchical, complete with cult figures and religious fervour. In the mobile market, though, I can now see Raymond's point.

Fear, Uncertainty, and Doubt (FUD) was a phrase I first heard relating to a certain purveyor of large blue mainframes. It could equally apply to any large corporate player. "No one ever got fired for choosing IBM" was a familiar phrase in the 1970s; replace IBM with Microsoft and that brings it up to date.

But not in the mobile market. Corporates buy their IT systems based on measured evaluations and empirical evidence (and if you believe that you should not be in sales), whereas mobile devices are as much bling as work-a-day tools. The market has many more players, and lacks the religious loyalty familiar elsewhere. Windows dominates the desktop; will Windows 8 penetrate the mobile market? Microsoft is now selling in the bazaar.

Nokia's recent launch of their Lumia 920 has had reasonable reviews. It is early days, and I'm not sure why Nokia's shares plummeted by 11% after the launch – what did the City expect? Microsoft's shares were virtually unchanged. Despite Nokia's close ties with Microsoft, several other manufacturers will offer Windows 8 telephones, including those also offering Android. Microsoft cannot afford to turn away players like Samsung, but where does that leave Nokia? Their new telephone has some interesting technical innovations, but I would love to see it running Android. It won't take long for the competition to catch up. I don't find the idea of an electronic wallet appealing, though, with its opportunities for NFC pick-pockets.

Windows has yet to get that critical mass required to make a Windows phone cool. Meanwhile it has to shout against all the other traders in the bazaar.

All eyes now on Apple – with the attributes of the cathedral – and the iPhone 5.

Wednesday, September 05, 2012

The march of mobile

Those of you who reside on another planet might have missed the largest change to the IT industry since the release of the IBM PC – mobile computing.

There are parallels with the mid-1980s. Early versions of PC-DOS were dire, and so were the first few versions of Android. There are lots of differences though: there is a solid, mature operating system at the heart of Android (Linux), and an even older BSD UNIX is the stable base of iOS and OS X, the Apple operating systems.

The fight between UNIX-Linux-Windows is an old one and is continuing in the mobile market today, although the products are very different from the originals.

Comparing the approach of each supplier is interesting. Apple does not license its software to anyone, and is the sole supplier of hardware. This means they have complete control over everything from manufacture through marketing, sales and support, and even the programming language used for applications (Objective-C). Software suppliers only have to test on a small range of hardware, and are closely controlled.

Google, who produce Android, have taken the Open Source route. Linux, and other Open Source software, has long been a favourite of device manufacturers. Why? Because they are free to modify it to run effectively on their own devices. Software components like telephony are usually proprietary and can be plugged into a modified Linux. Most of Android is Open Source, and this enables even start-ups to be involved and encourages innovation. The range of Android devices is huge: some software testers have a stash of over 400 devices on which to test their code, and that will have to expand with TVs and games consoles using Android more and more.

Android application development mostly uses Java. Google have avoided issues with Oracle (who now own Java) by using their own runtime environment. Anyone can produce applications for Android, on any subject, including some distinctly dodgy ones.

Unusually, Microsoft occupies a middle ground. They are closely allied with Nokia, an alliance Google described as "two turkeys don't make an eagle", yet many other manufacturers have licences to produce Windows phones, including Samsung, the largest. Whether the Windows 8 interface will appeal or not remains to be seen, but no one should underestimate Microsoft. They only have about 3% of the mobile market right now, but this is an incredibly fast-moving world. Microsoft have several advantages: their user interface is familiar to most people, for a start. There have been mixed reviews of the UI for Windows 8, but can anyone remember a Windows Beta release that has been any different? Not everything that Microsoft touches turns to gold: remember Vista, and their Windows CE attempts, which "also ran".

Microsoft Office dominates corporate office applications, yet none of the "compatibles" on Android are close. That might change, and there is certainly an opportunity for Adobe there, but in the short term Microsoft Office could be the killer app. Personally I have doubts whether that monolith will perform well on mobile devices, but we shall see. Hardware is getting more powerful day by day, with quad-cores being the norm for high-end mobile devices.

The position of .Net as a development environment is interesting, in that it will run on all three operating systems using a layer called Mono. This makes cross-platform development a possibility, and its performance compares favourably with Java, even on Android.

I have been working with Ian Wallington on producing an Android course "Developing Android Applications" (QAANDDEV), and we have been struck by how fast changing the environment is. We have been in IT a long time and have worked in many environments between us, but the Android world takes some beating for the rate of change. There is at least one new version of Android each year – that is not so different to iOS, but that is not the only variable. Because development environments and tools come from different suppliers they are not co-ordinated with the Android release - changes and new products come at an astonishing rate. Devices we bought at the start of this project six months ago are now uncool and out-of-date. During the course development I had to scrap a chapter and start again because the development environment changed. Ian has been trying to decipher the latest development techniques when there is little (accurate) documentation. I have found books from usually reliable publishers to be out-of-date, even those published a few months ago, and full of bugs.

This dynamic environment shows no sign of slowing; the odd patent lawsuit will just speed innovation to get around it. We had better get used to it. In my view (and remember that I'm biased) 20th-century monolithic cultures like Apple and Microsoft will not be able to keep up with this rate of change. Google embraces the culture and is a true 21st-century company, combining corporate muscle with community effort. Our aim is to keep at the forefront of this pace so we can help our clients exploit the benefits, regardless of which platform they decide on.

Wednesday, July 18, 2012

Korn Shell programming and being "dotted"

While teaching a Korn shell course I was asked "where do you stand on exit versus return"? To be honest I had not realised you could use return outside a function in Korn shell.
Well you can, but not in Bash, and the POSIX standard says that the effect is undefined.

Why would you want to? The effect is exactly the same as exit, unless you have been executed through the dot command, "sourced", or "dotted". The argument goes that if you use exit, and your script is invoked using 'dot', then it will exit the caller, whereas return will not.
I find the argument fundamentally flawed. It is folly to assume that it is safe to execute any old script in this way, because it breaks a fundamental rule of programming - encapsulation.
"Dotted" files do not run in their own namespace or environment. Any change, for example a cd command, will alter the caller. Therefore writing a script to be safe would also involve restoring any changes.
Now, what about variables and functions? ANY declared in the called script will overwrite those in the calling program. If you are unaware of the script you are using ("I don't know if it has exit in it") then you will be unaware of its variable names and functions. It could do absolutely anything to your environment (I use the term loosely). An important principle of encapsulation is that you have your own namespace; with 'dot', of course, you do not.
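The namespace point can be modelled outside the shell. This is only a sketch in Python, not ksh: the dictionary caller_env stands in for the calling shell's variables, and the script text and both function names are made up for the illustration.

```python
# Model "dotting" a script: its code runs against the caller's own namespace.
caller_env = {"x": 42, "cwd": "/home/user1"}

# Hypothetical script body; it happens to reuse the names x and cwd.
script = 'x = "report.txt"\ncwd = "/tmp"\n'

def run_dotted(source, env):
    """Like '. script': no new namespace, so every assignment lands in env."""
    exec(source, env)

def run_as_child(source, env):
    """Like running the script normally: it works on a copy, which is discarded."""
    exec(source, dict(env))

run_as_child(script, caller_env)
after_child = {k: caller_env[k] for k in ("x", "cwd")}

run_dotted(script, caller_env)
after_dot = {k: caller_env[k] for k in ("x", "cwd")}

print(after_child)   # caller untouched
print(after_dot)     # caller overwritten by the script
```

The "child" run leaves the caller alone, while the "dotted" run silently replaces both values, which is the behaviour the argument above relies on.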

It works both ways. Take the following scenario:

typeset -i x
. myscript

If myscript has a variable called 'x', and it is used for non-numeric text, then the typeset will have the effect of altering that text to "0". What if x contains a filename in the script? This could completely alter or invalidate the action of myscript.
The dot command is designed specifically so that the called code WILL change your current process, but the code has to be written specifically for that job. Having blind faith that a script will not trash your current session, and will even work in your environment, is taking a big risk.
I suggest, in ksh93, that you put this at the head of your scripts:

if [[ $0 != ${.sh.file} ]]
then
    echo "Do not . this file!" >&2
    
    return 1
fi

Sunday, April 29, 2012

Are we losing C skills? Should we care?

A sign of the times. In the last three weeks I have taught two Python courses and one Perl. Last week a C programming course was running with just one delegate. I have not taught C for several years despite being the only trainer for C in the company (I do not include Objective-C). This week I will be teaching UNIX programming using C for the first time in four years.

So what? Software moves on, no one uses C anymore. Really?

Perl, PHP, Python, and Ruby are all written in C. Most of Windows and Linux is written in C. Sure, there are versions like Jython, but the use of that variant is relatively small. Programmers of my generation are gradually "logging-out"; what happens then? The number of new C programmers seems very low. That will be fine for a while, but we are losing the skills to operate at the lower levels.

Will the systems of the future be able to build on the C base we have, or will everyone be using high level tools? Does it matter? I think it does: the foundations of software should be solid, not top-heavy. I don't think you can really understand UNIX or Linux unless you understand C. Who will pick up the baton?

Friday, March 02, 2012

C-shell: basic features compared to ksh

Sign-on files

The following files, if they exist, are read at sign-on:

/etc/csh.login instead of /etc/profile - interactive (login) sessions only
~/.cshrc instead of $ENV
/etc/csh.cshrc if supported, not all c shells use this
~/.login instead of .profile - interactive (login) sessions only

On sign-off the optional file ~/.logout is executed (no equivalent in ksh).

In-line scripts

To execute a script in-line, that is without creating a sub-shell, use the source command (instead of the . command), as in bash. For example:

source ~/.login

Filename Completion

An interactive C shell can complete a partially typed filename or user name. By default this feature is switched off, to turn it on:

set filec

When the user types characters followed by <ESC>, the shell attempts to fill in the remaining characters of a matching filename, assuming the characters typed are enough to identify a single file.

If the filename prefix is followed by EOF (usually Ctrl-D), the shell lists all filenames that match. It then prompts once again, supplying the incomplete command line typed so far.

Ending the name with a tilde (~) indicates that a username is required instead.

Aliases

The syntax for alias in the C shell is slightly different to other shells:

alias new-alias existing-command

i.e. there is no assignment character (=).

Functions

No. There are none.

History mechanism

The history list (and the history command) exists as in other shells, however the syntax for retrieving lines from it is different to ksh. The following table refers to lines in the history list:


!n             The nth line in the history list
!-n            The nth previous line
!!             The last line
!prefix        The most recent line with the specified prefix
^xx^zz         The last line, with the string xx replaced by zz
!n:s/xx/zz/    The nth line, with the string xx replaced by zz
!*             All the arguments of the last command
!$             The last argument of the last command (instead of $_)
!^             The first argument of the last command
!:n            The nth argument of the last command

This is also supported by bash.

Variables

Pre-defined variables are different from other shells, the most noticeable feature being that they are in lower and upper case. The list of variables is large; here are a few:

The exit status of the previous command is $status (instead of $?).

The command line prompt is $prompt (instead of $PS1). Its contents may include a large range of format characters, too many to list here; see the man pages for details. Some versions may also support $prompt2 and $prompt3.

Command line arguments are set in the array $argv (see below).

The current working directory is in $cwd (not $PWD).

Lowercase versions of path, home, and term also exist. The lowercase version of path is set from PATH, but is only used by the shell. This does not mean that path can be altered without affecting PATH, since the two are kept synchronised. A similar scheme exists with term, home, and others. Only the upper-case versions are exported. Note that path is a list, i.e. the directory names are separated by white-space, not colons.

Setting values

Local variables may be assigned values using set (which is more like BASIC than C), for example:

set var = "She sold sea-shells"

Note that spaces are optional around the assignment symbol (=) – hooray! The = is mandatory.

To create a variable in the environment block (export), use setenv:

setenv var "She sold sea-shells"

They may be removed using unsetenv.

There must be no assignment symbol (=) with setenv.
If you have an existing local variable of the same name then that is not overwritten; a new, separate environment variable is created. For example:


$ set var=red
$ setenv var blue
$ echo $var
red
$ csh # Create a child
$ echo $var
blue
$ exit

In the example above, if we had just done

    setenv var

then the child process would see an empty variable.
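The same split between a local value and an exported one can be seen from Python, which may help if csh is unfamiliar. This is a rough analogy only: a Python variable plays the part of the set variable, os.environ plays the part of setenv, and the child is a second interpreter started through sys.executable.

```python
import os
import subprocess
import sys

var = "red"                  # like: set var=red      (visible only in this process)
os.environ["var"] = "blue"   # like: setenv var blue  (copied into child processes)

# The child is a fresh interpreter; it inherits the environment block only.
child_code = "import os; print(os.environ.get('var', ''))"
child_sees = subprocess.check_output(
    [sys.executable, "-c", child_code], text=True
).strip()

print(var)          # the parent's local value
print(child_sees)   # the exported value, as seen by the child
```

As in the csh session, the two values live side by side and the child only ever sees the exported one.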

Arrays

Like ksh and bash, csh arrays cannot be exported. However, unlike ksh and bash, csh does not allow you to try: attempting to setenv an array will give an error.

An array is initialised from a list (as in bash and ksh93), where the elements are space separated and delimited by parentheses, for example:

set arry = (HPUX AIX DGUX Dynix Tru64 DRSNX SunOS Linux)

The old ksh88 method of initialising an array with set -A is not supported.

Access an element of the array using the index, counting from 1 (which is weird, since in C we count from zero).


echo $arry[3]
DGUX

We can also access a range of elements:


echo $arry[2-4]
AIX DGUX Dynix

and the whole array using an index of * :


echo $arry[*]
HPUX AIX DGUX Dynix Tru64 DRSNX SunOS Linux

This is the same as printing the array variable (echo $arry).

The number of elements in the array is held in a variable with the same name as the array, prefixed with a #, for example:


echo $#arry
8

so getting the last element is simple:


echo $arry[$#arry]
Linux
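For anyone more used to zero-based languages, the indexing rules above translate as follows. This is a small Python sketch; csh_index and csh_range are names invented for the illustration and are not part of any shell or library.

```python
arry = ["HPUX", "AIX", "DGUX", "Dynix", "Tru64", "DRSNX", "SunOS", "Linux"]

def csh_index(words, n):
    """Return the equivalent of $words[n], where n counts from 1 as in csh."""
    return words[n - 1]

def csh_range(words, first, last):
    """Return the equivalent of $words[first-last], both ends inclusive."""
    return words[first - 1:last]

third = csh_index(arry, 3)           # $arry[3]
middle = csh_range(arry, 2, 4)       # $arry[2-4]
last = csh_index(arry, len(arry))    # $arry[$#arry]; len() plays the part of $#arry

print(third, middle, last)
```

The only real adjustment is subtracting one from the starting index; the inclusive upper bound of a csh range happens to line up with Python's exclusive slice end.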

With other shells we would have to use braces ({…}), which are not needed in this context, but may still be used to delimit a variable name:


set money = "dollar"
echo "Give me your ${money}s"

Command-line arguments

The list of command line arguments is held in the array argv. To get the complete list of arguments, use $argv[*] (instead of $* or $@). The variables $0..$n are still available, but may alternatively be obtained using $argv[n]. A C programmer would also expect a variable called argc, but the number of arguments is obtained with $#argv instead. See also Arrays.

The command shift is supported as in ksh and bash.

Modifiers

The C shell also supports modifier codes. These are used by suffixing a variable name with a colon (:), followed by the code letter:

h    remove a trailing path name (similar to dirname)
t    remove all leading path names (similar to basename)
r    remove a file "extension" (a suffix preceded by a dot)
e    remove the filename preceding a file "extension"

When used on a whole list or array, they should be preceded by a g (gh, gt, gr, ge) otherwise they will only be applied to the first element. The exception is the modifier code q, which places the extracted data in quotes (thus avoiding $*/$@ differences).

For example:


set file = /home/user1/Seas/North.c

echo $file:h
/home/user1/Seas

echo $file:t
North.c

echo $file:r
/home/user1/Seas/North

echo $file:e
c
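The four modifiers have close relatives in Python's path functions, which makes a handy cross-check. The sketch below uses posixpath so that it behaves the same on any platform; the function name modifier is made up here, and edge cases (such as names with no slash or no dot) may not match csh exactly.

```python
import posixpath

def modifier(path, code):
    """Apply one csh-style modifier (h, t, r or e) to a single path."""
    if code == "h":
        return posixpath.dirname(path)
    if code == "t":
        return posixpath.basename(path)
    root, ext = posixpath.splitext(path)
    if code == "r":
        return root
    if code == "e":
        return ext[1:]    # drop the leading dot
    raise ValueError("unknown modifier: " + code)

file = "/home/user1/Seas/North.c"
results = {code: modifier(file, code) for code in "htre"}

for code in "htre":
    print(code, results[code])
```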

Other variable operations:

Shell Options

There are no shell options of the kind ksh and bash have. Instead everything is controlled by shell variables, and many have the same names as options in ksh, like noclobber, ignoreeof, noglob, etc. See the man pages for a complete list.

To set a shell option you normally only have to create the variable; you often do not need to give it a value, for example:

set ignoreeof

Variable types

No, not in csh (there is no equivalent of the ksh typeset). For example, to convert to uppercase we have to use:

set var = `echo $var | sed 'y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/'`

Reading from stdin

To read a variable from the keyboard (stdin), use $<, for example:

set var = $<

It can also stand alone to pause a script after a "…to continue…" prompt:

$<

Command substitution

Command substitution uses back-ticks ($(…) is not supported), for example:

set DIR = `ls`

Note that this will return a list, and make DIR into an array. This reduces the need for slicing, for example:

set now_date = `date '+%a %H %M %S'`

returns a four element array, where:

$now_date[1] is the weekday
$now_date[2] is the hour
$now_date[3] is the minute
$now_date[4] is the second

Arithmetic

Arithmetic commands are prefixed with the @ symbol, which is actually a built-in command and so must be followed by a space.

The usual C/C++/Java arithmetic operators, including post-fix increment and decrement (but not pre-fix), are supported, for example:

set x # variable must already exist
set j # variable must already exist
@ x++ # Increment $x
@ j = ($j + 42) * $x # Brackets are supported

# To continue over > 1 line, escape end-of-line

@ now_secs = ($now_date[2] * $hour_secs) + \
($now_date[3] * 60) + $now_date[4]
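To check the arithmetic, here is the same calculation as a Python sketch. The original does not show where $hour_secs is set, so the value 3600 is my assumption; the sample date string is also invented so that the result is repeatable.

```python
def seconds_since_midnight(now_date, hour_secs=3600):
    """now_date is [weekday, hour, minute, second], all strings, as csh would hold them."""
    _weekday, hour, minute, second = now_date
    return int(hour) * hour_secs + int(minute) * 60 + int(second)

# A fixed sample in the '+%a %H %M %S' layout; time.strftime would give a live one.
now_date = "Thu 14 05 09".split()
now_secs = seconds_since_midnight(now_date)

print(now_date)   # four elements, like the csh array
print(now_secs)   # 14*3600 + 5*60 + 9
```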

Conditionals

As well as the usual conditional statements (see below) the C shell supports a short cut for testing if a variable exists, the $? prefix. Confusingly it returns 0 if the variable does not exist, and 1 if it does. For example:

echo $?non_existent_variable
0

echo $?existing_variable
1

Relational operators

!= not equal
== equal (note : two = characters)
> greater than
>= greater than or equal to
< less than
<= less than or equal to
=~ matches a pattern

Loops

As in most shells, there are two basic loops, the foreach loop, for list processing, and the while loop. The syntax is familiar but note, like Perl, the foreach loop does not contain an in:

foreach variable-name ( list )

# loop body

end

while boolean-expression

# loop body

end

The commands break and continue have the same meaning as in most languages: break exits the loop prematurely, and continue executes the next iteration at once.

Unconditional flow

The command onintr is similar to trap in ksh: it executes specific code on a signal, but it cannot pick specific signals to trap. There are three forms:

onintr          use default signal handling
onintr -        ignore signals
onintr label    jump to the specified label and continue execution from there

The syntax of a label is: label:

Apparently there is also a goto command.

Redirection

Standard channel (stdin / stdout / stderr) redirection is similar to, but not the same as, other shells. Channel numbers are not used:

command > filename     redirect stdout
command >> filename    append stdout
command < filename     redirect stdin
command >& filename    redirect stdout and stderr to the same file
command >>& filename   append stdout and stderr to the same file

The only way to direct stdout and stderr separately is by invoking a subshell, for example:

(command > out_file) >& err_file
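For comparison, splitting the two channels needs no trick when a process is started from Python, since each stream is handed over separately. A minimal sketch follows; the child command is a stand-in that writes one line to each channel, and the file names simply mirror the csh example.

```python
import os
import subprocess
import sys
import tempfile

# Stand-in for "command": a child that writes one line to each channel.
child = "import sys; print('to stdout'); print('to stderr', file=sys.stderr)"

workdir = tempfile.mkdtemp()
out_path = os.path.join(workdir, "out_file")
err_path = os.path.join(workdir, "err_file")

# Each stream gets its own file, the effect of (command > out_file) >& err_file
with open(out_path, "w") as out_file, open(err_path, "w") as err_file:
    subprocess.run([sys.executable, "-c", child],
                   stdout=out_file, stderr=err_file, check=True)

with open(out_path) as f:
    out_text = f.read()
with open(err_path) as f:
    err_text = f.read()

print(repr(out_text), repr(err_text))
```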

Pipes are supported as normal, however co-processes are not.

The set command sets shell variables, so to set option noclobber:

set noclobber

That is, no -o. To override noclobber, append a !, as in:

command >! filename
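The noclobber idea has a rough counterpart in Python's file modes, sketched here as an analogy rather than a description of how csh implements it: mode "x" refuses to overwrite an existing file, much as > does with noclobber set, while mode "w" overwrites regardless, like >!.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "filename")

with open(path, "x") as f:        # file does not exist yet, so this succeeds
    f.write("first\n")

try:
    with open(path, "x") as f:    # like '>' with noclobber set: refuses to overwrite
        f.write("second\n")
    refused = False
except FileExistsError:
    refused = True

with open(path, "w") as f:        # like '>!': overwrite regardless
    f.write("third\n")

with open(path) as f:
    final = f.read()

print(refused, repr(final))
```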

Background jobs

Very similar to ksh – in fact Korn "stole" the idea from csh. The jobs and bg commands are supported.

Quote from man csh

"Although robust enough for general use, adventures into the esoteric periphery of the C shell may reveal unexpected quirks". You have been warned!

That Python next/send demo

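import glob
import os

def get_dir(path):
    # Yield each sub-directory of path. If the caller send()s a new path,
    # abandon the current listing and start again on the new directory.
    while True:
        pattern = path + '/*'
        count = 1
        for file in glob.iglob(pattern):
            if os.path.isdir(file):
                print(count)
                count += 1
                # yield evaluates to None after next(), or to the value sent
                path = yield file
                if path: break

        # No new path was sent before the listing ran out, so finish
        if not path: break


gen = get_dir('C:/QA/Python')

print("about to next")
print(next(gen))
print(next(gen))
print(gen.send('C:/QA'))
print(next(gen))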

Gives:



about to next
1
C:/QA/Python\AdvancedPython
2
C:/QA/Python\Appendicies
1
C:/QA\Android
2
C:/QA\blanket
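Since that output depends on what happens to be on my disk, here is a filesystem-free sketch of the same next/send pattern that anyone can run. The TREE dictionary is a made-up stand-in for the directories, and I have used a for/else in place of the final test on path, so it is a variation rather than the demo itself.

```python
# Hypothetical directory tree standing in for the filesystem.
TREE = {
    "Python": ["AdvancedPython", "Appendicies", "Labs"],
    "QA": ["Android", "blanket"],
}

def get_dir(path):
    while True:
        for name in TREE[path]:
            new_path = yield path + "/" + name
            if new_path:           # the caller used send(): switch directory
                path = new_path
                break
        else:
            return                 # listing exhausted and no new path was sent

gen = get_dir("Python")
seen = [next(gen), next(gen), gen.send("QA"), next(gen)]

for item in seen:
    print(item)
```

The value passed to send() becomes the result of the yield expression inside the generator, and send() itself returns the next value yielded, which is why the third item already comes from the new directory.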

A little Python introspection

In Python we can load a module and assign an alias:

import module-name as alias-name

I was asked how a programmer can find the mapping between an alias and the real module. That was rather beyond the scope of the course, but here goes.

The locals() built-in, and its sister, globals(), were mentioned in the course almost as a 'by the way'. In fact both are very useful for introspection. In this case locals() will give us the names of the module aliases. It also gives us all the other names that are local, but we are only interested in modules. We can use inspect.ismodule() to get only those names that refer to modules. Watch it: locals() returns a dictionary where the keys are the names, but the values are the objects themselves - handle with care. Fortunately, stringifying the object gives us a text string like this:

<module 'GetProcs' from 'C:\Python27\lib\site-packages\GetProcs.pyd'>

and it takes only a simple regular expression to extract the names.

We still have issues. Most obvious is that any modules used for the introspection (inspect and re) are included in the data. In Python 3 we also get issues trying to read the locals dictionary because it changes during a loop (items() returns a live view rather than a list in Py3). We can solve both by putting the code into a different module (although there are other solutions). But then, how can we look at our caller's namespace? Simple: sys._getframe(1), which exposes just about everything, warts and all. The locals dictionary is available through f_locals.

So, here is my module, which I named Spection:


import inspect,sys,re

def look():
    # Frame 1 is the caller; f_locals is its local namespace
    for name,val in list(sys._getframe(1).f_locals.items()):
        if inspect.ismodule(val):
            fullnm = str(val)
            # Skip built-in modules (no file) and this module itself
            if not '(built-in)' in fullnm and \
               not __name__     in fullnm:
                m = re.search(r"'(.+)'.*'(.+)'",fullnm)
                if m:
                    module,path = m.groups()
                    print("%-12s maps to %s" % (name,path))

The user's code looks like this:


import Spection
import glob as wildcard
import ConfigParser as parser
import GetProcs as ps

Spection.look()

And the output looks like:


ps           maps to C:\Python27\lib\site-packages\GetProcs.pyd
parser       maps to C:\Python27\Lib\ConfigParser.pyc
wildcard     maps to C:\Python27\Lib\glob.pyc
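One of the "other solutions" hinted at earlier is to skip the string parsing and read the module's own attributes. This is a sketch of that alternative, not the Spection module above: module_aliases is a name I have made up, json stands in for ConfigParser so that it runs on Python 3, and the namespace is passed in explicitly rather than dug out of the caller's frame.

```python
import inspect

def module_aliases(namespace):
    """Map each alias in namespace to (real module name, file path or None)."""
    found = {}
    # list() takes a snapshot, so the mapping can change while we loop
    for alias, obj in list(namespace.items()):
        if inspect.ismodule(obj):
            found[alias] = (obj.__name__, getattr(obj, "__file__", None))
    return found

import glob as wildcard
import json as parser

# An explicit namespace for the demo; locals() or globals() could be passed instead.
namespace = {"wildcard": wildcard, "parser": parser, "answer": 42}
aliases = module_aliases(namespace)

for alias, (real_name, path) in sorted(aliases.items()):
    print("%-12s maps to %s (%s)" % (alias, real_name, path))
```

Built-in modules have no __file__ attribute, hence the getattr with a default of None.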
