Saturday, June 19, 2004

A few days back I found what seemed to be a book about Ruby, which was being discussed on the Ruby mailing list. It’s called “A Little Ruby” or, more precisely, “A Little Ruby, A Lot of Objects”. You can find it here:

http://web.archive.org/web/20030618203059/visibleworkings.com/little-ruby/

(Someday it will be available here: http://www.visibleworkings.com/little-ruby/ )

 

Instead of rewriting the whole thing myself or copy-pasting it, I ask you to simply go read the book. That is my blog entry for the day.

 

The “Little Ruby” book is a conversation between two people in which some sublime ideas about the design philosophy of the Ruby language are discussed. The book itself is a pleasure to read and, more importantly, to think about. (It is an incomplete book, only 3 chapters so far; the author, Brian Marick, said on the Ruby list that he hopes to complete it sometime.)

 

Reading “Little Ruby” put a phrase into my thinking: “Model of Computation”. I don’t know if this sounds sober, but I think this is what I am really looking for.

In all my tinkering with languages, compilers, runtimes and other things, I am looking for a Model of Computation: a fundamental set of programmatic thought abstractions that are beautiful and can encompass various forms of programming.

 

The Little Ruby book talks about a model of computation where all computation is simply built around the idea of passing messages to objects. It is a simple, concrete idea with which the rest of the Ruby world is built (apart from syntactic sugar). I don’t know if you are used to thinking in this way, but it is a powerful form of thought.

 

Let me quote from one of the conversations toward the end of the third chapter (the last chapter written so far):

 

“A language that provides lots of features will always be missing that one feature you need.”

“But a language that chooses the right simple rules for you to combine lets you build the features you need.”

 

This is the basic idea of composition: small integral units that compose to produce powerful behavioral entities. Have you ever wondered why a unix command shell guy never really thought much of a Win/DOS user? It is because the way the shell forces you to think in terms of composing small do-one-thing-well tools into powerful meta-tools is a greater thought pattern.
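
To make the idea a little more concrete, here is a minimal sketch of composition in Python. It is not taken from the book; the helper `compose` and the three small functions are names I made up purely for illustration. Each function does one thing, and the "meta-tool" is nothing more than their combination.

```python
from functools import reduce


def compose(*stages):
    """Chain single-purpose functions left to right, like a shell pipeline."""
    return lambda value: reduce(lambda acc, stage: stage(acc), stages, value)


# Three small tools, each doing exactly one thing.
def split_words(text):
    return text.split()


def lowercase(words):
    return [w.lower() for w in words]


def unique_sorted(words):
    return sorted(set(words))


# A "meta-tool" built purely by combining the small ones.
vocabulary = compose(split_words, lowercase, unique_sorted)

print(vocabulary("the Shell composes The small tools"))
```

None of the small functions knows about the others; the only rule they share is "take a value, return a value", which is roughly the role text streams play between unix tools.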

 

You might have heard this being said about tools in the old unix culture (I say ‘old’ because I have different opinions of ‘unix’ culture as it is now):

 

"This is the Unix philosophy. Write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because that is a universal interface."

--Doug McIlroy

 

The “Little Ruby” book is inspired by the old book “The Little LISPer”, which is now on my reading list, though I can’t seem to get a copy of it anywhere. The present edition of the book is called “The Little Schemer”. It is co-written by Prof Daniel P Friedman of Indiana University and Prof Matthias Felleisen of Rice University. “The Little Schemer” discusses a different model of computation from the one “Little Ruby” describes.

 

I did not know this then, but sometime last year I was in email correspondence with Prof Friedman. Had I known at the time that he is the author of a respected LISP textbook, I might have been frightened off the prospect of asking this, but in one of the mails I had asked “why Lisp?”

 

Roshan,

 

The most fundamental building block of computation is composition. If the language does not support composition in a trivial way, then I have no use for it. ML, Haskell, LISP, and Scheme each give a kind of composition. Composition is the building block of Category Theory, which is a unifying tool that helps clarify much of mathematics and logic. So, thinking that it would be okay to use a language that does not support composition is impossible for me.

 

(I quote this here without his permission for now; I believe he would be ok with it, though.)

I didn’t understand him then. But now, after a year, I think I am closer to understanding him.

 

What would a unified model of computation be? Can such a thing exist? Can we think of all computation using a minimal and powerful set of abstractions, such that every other form of computation can be built out of it? Can it be one that is easy and fun to use, so that we could interact with this force on a day-to-day basis?

 

And what forms the underlying foundation for computation might then also form the underlying basis for other systems of organized thought. This is like the dream of a Grand Unified Field Theory in physics. Can something like that exist in computational systems as well?

 

I don’t know enough to guess. However, I believe that as long as we keep pursuing computing in a way that is fun and simple, we are probably on the right track.

 

 

To end this entry I want to quote from the preface of “Little Ruby”:

 

Welcome to my little book. In it, my goal is to teach you a way to think about computation, to show you how far you can take a simple idea: that all computation consists of sending messages to objects. Object-oriented programming is no longer unusual, but taking it to the extreme - making everything an object - is still supported by only a few programming languages.

 

Can I justify this book in practical terms? Will reading it make you a better programmer, even if you never use "call with current continuation" or indulge in "metaclass hackery"? I think it might, but perhaps only if you're the sort of person who would read this sort of book even if it had no practical value.

 

The real reason for reading this book is that the ideas in it are neat. There's an intellectual heritage here, a history of people building idea upon idea. It's an academic heritage, but not in the fussy sense. It's more a joyous heritage of tinkerers, of people buttonholing their friends and saying, "You know, if I take that and think about it like this, look what I can do!"

 

As a closing note, sometime last year I was looking to do research under someone working with the SSCLI code base, and to work on virtual machines and runtimes. I wanted to do my Masters.

 

At that time the best way I could describe what I wanted to do was to say that I was looking for runtimes and virtual machines research with a specific interest in SSCLI. Now, maybe I can describe myself a little better.

 

The only way I could think of doing this at the time was to ask around in online forums and mailing lists about universities doing work with Rotor. That was accompanied by a barrage of mails to everyone who I thought might know, or might point me in the right direction. One name that came up was that of Prof Ralph Johnson of UIUC. Just now I was looking up Brian Marick (author of “Little Ruby”) on Google; Brian is a research student doing his PhD under Prof. Johnson.

 

Saturday, June 19, 2004 2:50:25 AM (Eastern Standard Time, UTC-05:00)
 Tuesday, June 15, 2004

India Advocates Day(s) 2004 happened a few weeks back (29th and 30th May). MVPs, Microsoft Regional Directors and a select few Student Champs were invited. The event this year was at the luxurious Park Hyatt in Goa.

 

 

  

 

Images from the Park Hyatt - IAD 2004 venue.

 

Pooja at IAD.

 

Me - Self picture at the Hyatt beach

 

 

MVP crowd - also (first from left) Abhishek Kant - India MVP lead and
(fourth from left) Shu-Fen Cally Ko - Regional Director for MVP Program and Community - Asia Pacific and Greater China Region

 

IAD conferencing

 

 

  

Churches of Old Goa - Bom Jesus

 

Tuesday, June 15, 2004 1:15:31 AM (Eastern Standard Time, UTC-05:00)
 Monday, June 14, 2004

I just had to drop these links, even at the expense of a separate entry for them.

http://primates.ximian.com/~miguel/archive/2004/Jun-08.html

http://primates.ximian.com/~miguel/archive/2004/May-31.html

(The Marcelo in the last picture on this entry is Marcelo Tosatti, maintainer of the Linux 2.4 kernel)

 

There are reasons why many of my friends who work with non-Linux technologies generally treat work on Linux as user-hostile and immature. I personally think that while the majority of the Linux crowd may have their heads in the clouds, the serious programmers are the same sort of freestyle systems hackers that we idealize, despite the difference in technologies. I appreciate these folk for their spirit and sometimes for their sheer smartness.

Monday, June 14, 2004 6:40:54 AM (Eastern Standard Time, UTC-05:00)
 Saturday, June 12, 2004

With some things cleared up and some groundwork done, here is one of the first things I want to talk about in Monad, the new Microsoft Command Shell (msh).

 

In my first article I talked about ‘cmdlets’ and the fact that they return .Net objects. Here is one other smart and subtle thing they did with Monad: parameters to cmdlets come not only from the command line but can also come from objects in the input object pipeline.

 

Here is some basic stuff on cmdlets and parameters, which might help set the stage. A cmdlet can define fields (member variables) as part of the cmdlet class. These fields can then be decorated with attributes so that they are treated as parameters to the cmdlet.

 

In code it actually looks like this:

 

[CmdletDeclaration( "demo", "cmdlet" )]
public class HelloWorld : Cmdlet
{
        [ParsingPromptString("Enter a string to echo: " )]
        [ParsingParameterMapping(0)]
        [ParsingMandatoryParameter]
        [ParsingAllowPipelineInput]
        private string message;

        // ... (rest of the class omitted)
}

(This snippet is a modified version of code from the beta documentation)

 

Now this might look a little over-decorated with attributes, but the thing I want you to notice is that there is a member called ‘message’ which simply has a few attributes applied. Applying those attributes causes ‘message’, of type System.String, to become a required parameter of our cmdlet.

 

Unlike with traditional command line exes, parsing the command line options and assigning them to internal variables is not the responsibility of the program; the Monad shell does it for you automatically.

 

Which is to say that by the time the code that you write in the cmdlet is executed, values have already been assigned to the parameter variables by the shell. (Not strictly, but almost.)

 

The shell can let you assign parameter values either by specifying them at a certain position on the command line (for example, argv[1] will be the value for ‘message’, argv[2] will be the value for something else, and so on) or by specifying the parameter name and then the value:

> demo-cmdlet -message "hello world" -<parameter name> <value>
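
To show the division of labour, here is a toy model of that binding step written in Python rather than against the real Monad API. Everything in it (`Parameter`, `DemoCmdlet`, `bind`, the extra `count` parameter) is a made-up stand-in for the attributes and the shell's parser; it is only a sketch of the idea that the command declares its parameters and something else fills them in.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Parameter:
    """Hypothetical stand-in for the attributes decorating a cmdlet field."""
    name: str
    position: Optional[int] = None   # plays the role of a positional mapping
    mandatory: bool = False


class DemoCmdlet:
    # The cmdlet only declares its parameters; it never parses anything.
    parameters = [Parameter("message", position=0, mandatory=True),
                  Parameter("count")]


def bind(cmdlet_class, argv):
    """Toy binder: fill a cmdlet's fields from named and positional arguments."""
    cmdlet = cmdlet_class()
    declared = {p.name for p in cmdlet_class.parameters}
    positional = []
    args = iter(argv)
    for arg in args:
        if arg.startswith("-") and arg[1:] in declared:
            setattr(cmdlet, arg[1:], next(args, None))   # "-name value" form
        else:
            positional.append(arg)
    for p in cmdlet_class.parameters:
        unset = not hasattr(cmdlet, p.name)
        if unset and p.position is not None and p.position < len(positional):
            setattr(cmdlet, p.name, positional[p.position])
        if p.mandatory and not hasattr(cmdlet, p.name):
            raise ValueError("missing mandatory parameter: " + p.name)
    return cmdlet


print(bind(DemoCmdlet, ["hello world"]).message)
print(bind(DemoCmdlet, ["-message", "hello world", "-count", "3"]).count)
```

The real shell obviously does much more (type conversion and prompting for missing values, judging by the attribute names), which this sketch leaves out.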

 

The other good thing (which is new, and what this entry is about) is that the value for a parameter can be extracted from an object on the pipeline if the object has a field/member of the same name. (The types have to be consistent.)

 

So if I have a cmdlet that generates an object of type foobar that has a member named ‘message’, I can pipe its output to the cmdlet that requires a parameter named ‘message’ and it would all work.

> create-foobar | demo-cmdlet

This would create foobar instances that get piped to demo-cmdlet, which uses ‘foobar.message’ as its message parameter.
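
Again as a rough sketch in Python, and not the actual Monad mechanism, binding by member name could look something like the following. `Foobar`, `EchoCmdlet`, `create_foobar` and `run_with_pipeline` are all hypothetical names; the point is only that the downstream command never parses anything, it just finds a same-named member on each incoming object.

```python
from dataclasses import dataclass


@dataclass
class Foobar:
    message: str
    size: int


class EchoCmdlet:
    # Parameters that are allowed to be filled from pipeline input (assumed).
    pipeline_parameters = ("message",)

    def process(self):
        return "echo: " + self.message


def create_foobar():
    """Upstream stage: emits objects rather than text."""
    yield Foobar("hello", 1)
    yield Foobar("world", 2)


def run_with_pipeline(cmdlet_class, input_objects):
    """For each incoming object, copy same-named members into the parameters."""
    for obj in input_objects:
        cmdlet = cmdlet_class()
        for name in cmdlet_class.pipeline_parameters:
            if hasattr(obj, name):
                setattr(cmdlet, name, getattr(obj, name))
        yield cmdlet.process()


for line in run_with_pipeline(EchoCmdlet, create_foobar()):
    print(line)
```

The `size` member is simply ignored here, because the receiving command never asked for a parameter of that name.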

 

Where would this be used?

 

For example, there is a cmdlet called ‘get-process’. It lists the running processes on the system. To be more precise, it returns a collection of process instances which get formatted to the console in a standard way.

MSH 5 C:/>get-process

ProcessName                  Id   HandleCount   WorkingSet
-----------                  --   -----------   ----------
CcmExec                     288           480     14008320
cmd                         804            22      1421312
csrss                       464           669      4743168
dfssvc                      316            70      3260416
DWRCS                      1444            44      2560000
explorer                   3520           366     20926464
FrameworkService           1544           303      9367552
Idle                          0             0        16384

 

Similarly, to see all the instances of notepad that are running I would simply say

MSH 16 C:/>get-process note*

ProcessName                  Id   HandleCount   WorkingSet
-----------                  --   -----------   ----------
notepad                    3912            16      1912832
notepad                    4044            16      1912832
notepad                    3056            16      1912832

 

Now there is a cmdlet called stop-process which can terminate a process if you pass it the process id as a parameter. The parameter name that stop-process expects is called ‘Id’.

MSH 11 C:/>command stop-process

Command: stop-process
Command Parameters:
        Id                    : Int32[]         : Optional
        ProcessName           : String[]        : Optional

 

So with all the earlier talk you can conclude that if I simply wanted to kill all the instances of notepad that are running I could type

MSH 11 C:/>get-process note* | stop-process

 

Now isn’t that clean? Just to add a bit of garnishing to that, Monad defines ‘ps’ as an alias to get-process and ‘kill’ as an alias to ‘stop-process’.

So now you can say

MSH 11 C:/>ps note* | kill

 

Cool? This works on Monad today.

 

Prev:

Introductory entry about Monad 

 

Next:

ObjShell: A precursor of Monad?

Saturday, June 12, 2004 6:54:31 AM (Eastern Standard Time, UTC-05:00)

Folk, after a few mails we got confirmation from Jeffrey Snover himself, architect of Monad, clearing up any NDA issues. We are free to blog, write articles, talk about it etc etc.

 

Among other things watch out for the next .Net show on MSDN, they are covering Jeffrey Snover talking about Monad. Here is a blog entry by Robert Hess:

http://blogs.msdn.com/theshow/archive/2004/05/19.aspx

 

Also, there is a build of Monad that might be made available in July which is more complete in the shell language than the present build is. That’s a lot to look forward to.

 

Meanwhile, on a personal front, this gives us the freedom to explore Monad and talk about it :-)

Saturday, June 12, 2004 3:06:32 AM (Eastern Standard Time, UTC-05:00)
 Friday, June 11, 2004

Since yesterday I have been thinking about NDAs. Yesterday I wrote the entry below about Monad and Pooja wrote hers, and I have been thinking.

 

The reason is this: before being awarded the title, MVPs have to sign an NDA that says that certain information that Microsoft may reveal to them may not be publicly disclosed. The NDA is made in good spirit, so that employees of Microsoft who are part of product teams or doing other such core work may freely interact with MVPs about future products and ideas that are still being tested. A lot of MVPs actually give direct feedback to the product teams, which is reflected in the products that you see tomorrow.

 

The MVP program by its very nature is an award program, and the winner of the title doesn’t directly commit anything to Microsoft. So a lot of the feedback from MVPs is neutral and critical in a very constructive sort of way, because MVPs really love their technology.

 

The problem with the NDA is simply that of late most MVPs (at least in the India circuit) don’t have a clear way of telling what is under NDA. We actually get to hear SO much about so many things happening that we are really not sure. So breaches of the NDA do happen, simply because one did not know that an item was under the NDA.

 

One thing that we were told is that when in doubt, check with your MVP lead. That happens, but sometimes it is not very feasible. Sometimes you don’t even think of checking about something. Which is when another ‘rule of thumb’ was proposed at the India Advocates Day, 2004. At IAD it seemed ‘common sense’ that whatever we can find on the web already is simply not under NDA; if we know something and it is not on the web yet (duh?) then it is under NDA.

 

This makes things a little tricky. For instance, when writing about Monad, I realize that a lot of information is actually available on the web, admittedly in bits and pieces, but still there. Now that I have access to the stuff as part of the beta program, can I write about it or not? We had a discussion last night with the India MVP lead, the ex India MVP lead and some of the Bangalore MVPs, and to my surprise I was hearing that none of the stuff from the beta place could actually be disclosed. Also, the above ‘rule of thumb’ stands corrected to ‘anything found on the Microsoft site is not under NDA’!

 

Now that has some obvious contradictions. How, for example, do I know that I am talking about confidential information when the information is publicly available in some form? If there is a document that marks it as confidential but I do not have access to the document, does that put me in violation of the NDA? If I do have access to the confidential document, then what happens to conclusions I can draw from public information that are not explicitly stated elsewhere (though deducible) but are present in the document?

 

Some of this got me thinking this morning about the hacker Knight Lightning’s trial, more than a decade back. Knight Lightning was brought to trial, after a US Secret Service investigation, over a stolen confidential BellSouth technical document that was valued at 70k dollars or more (forgive my fading memory). The document was the centre of the debate there, and in some sense was treated by the prosecution as being too sensitive to show even during the trial. The then newly formed Electronic Frontier Foundation under John Perry Barlow and Mitch Kapor came to Knight Lightning’s aid in the defense. It turned out that the document hardly discussed technical details of a sensitive nature. The cost of the document was a grossly exaggerated figure, piled up from sheer administrative overhead costs (things like the cost of the computer system used to typeset the document were added to the cost of the document). And as a final blow to the case, it turned out that documents of a similar but more technically detailed nature were actually on public sale for hobbyists and enthusiasts to use (for about 13 dollars?), which neither the prosecutors nor Knight Lightning knew about.

 

The issue of information being confidential while still being publicly available in some form is a very tricky one.

 

My own first exposure to the term ‘NDA’ was when I heard the recording of a speech by Richard M Stallman (founder of the Free Software Foundation) in Slovenia. RMS was talking about how an NDA imposed by Xerox on the printer driver software hurt the guys at MIT who were trying to fix a faulty laser printer that kept getting jammed. Stallman’s message was that NDAs “do have victims”. He did make several valid points, and after listening to RMS several times I was sensitized to the issue of NDAs. So admittedly, when I signed my first NDA with the company where I work, I did so after reading the document over several times, and did it with shaky hands.

 

The issue of writing about Monad itself is a simple one: I had dropped a mail to one of the contacts on the Monad team and I got a prompt response. A few clarifications are left, but it seems to me that everything is in good faith now. In the case of Monad itself it is not an issue, especially when most of the folk at MS are so approachable and prompt when it comes to a relevant issue. The MVP crowd and the people around the MVP program were also receptive and quick to respond to any queries.

 

However, the general issue of NDAs is a relevant one and could become a serious problem really quickly if communication between the parties is not as transparent as in cases like mine.

 

Add to that something I heard rather recently: you can’t reveal that you are under NDA? What? There is a lot I don’t understand. The thing about systems programming is that opinions and facts are clearly distinguished from each other; at least they are only a compilation away. Matters like this…  :-)

Friday, June 11, 2004 2:06:28 AM (Eastern Standard Time, UTC-05:00)
 Thursday, June 10, 2004

For a little less than a week now, I have been playing with the March build of the new command shell that Microsoft will be releasing in the Longhorn time frame. Being a command line/console enthusiast, I find this rather exciting news. Indeed it has been years since MS actually did anything significant to better its command shell – Monad is it.

 

I don’t really know how much I am allowed to talk about Monad itself because of some NDA material, but if you look around on the net you will find quite a lot. One thing about the way the NDA was explained to me recently was that they said: if you can find the material on the web, then assume it’s not under NDA!! (whatever...)

 

One thing that Monad does differently from other command shells, be it cmd.exe or bash/ksh/csh is that Monad is centered around the idea of an object pipeline. Let me explain:

 

Traditionally, one of the ways of defining a console application is to say that it is an application that has access to three streams by default, usually numbered 0, 1 and 2 and called stdin, stdout and stderr. These are text streams. This means that you can read/write text data from/to these streams.

 

When you pipe the output of one program to another program on the command line, what you are basically doing is ‘connecting’ the stdout of the first process to the stdin of the second process. So text that is output by one process is treated as input text for the next process. This is known territory.

 

For a long time I appreciated the beauty and simplicity of this approach. What could be better? Well, Monad could be.

 

Monad provides ‘applications’ (I use this term loosely here) on the shell with three streams, but these are not text streams but object streams. Applications that read and write objects in the object pipeline are not traditional processes but are called ‘commandlets’ (cmdlets). A cmdlet is actually just a .Net class that is decorated with certain attributes (etc etc).

 

When cmdlets are executed from the msh command line, these classes are instantiated and are given access to the object streams. So one cmdlet writes objects to the output object stream which is read in through the input object stream of the other commandlet. In essence the same idea of piping as with text streams, but this time what passes around are full blown .Net objects.
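
As an illustration of objects rather than text flowing between stages, here is a small Python sketch using generators. It is only an analogy for the Monad pipeline: the `Process` class, the `get_process`, `where` and `select` stages and the sample data are all invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Process:
    name: str
    pid: int
    working_set: int


def get_process():
    """Source stage: emits objects, not lines of text. The data is made up."""
    yield Process("notepad", 3912, 1912832)
    yield Process("explorer", 3520, 20926464)
    yield Process("cmd", 804, 1421312)


def where(objects, predicate):
    """Filter stage: passes along the objects that satisfy the predicate."""
    for obj in objects:
        if predicate(obj):
            yield obj


def select(objects, field):
    """Projection stage: reads one named field from each object."""
    for obj in objects:
        yield getattr(obj, field)


big = where(get_process(), lambda p: p.working_set > 1500000)
print(list(select(big, "pid")))
```

Each stage receives whole objects from the previous one, so a stage further down can ask for any field by name without agreeing on a text layout first.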

 

Giving this idea a little thought, you will realize that this makes for much richer shell programming. For example, a lot of time and effort in shell programming (traditionally a unix forte) is spent scraping meaningful values out of the output of other commands. Entire languages (awk) have been created largely for this kind of text scraping. For example, if you do a ps -ax, then you really need to cut out (literally) the columns of text where the process id falls so that you can automatically call a kill command with it.

 

This kind of difficulty completely disappears with Monad simply because the cmdlets down the object stream get full objects and so they can examine the fields of these objects for the parameters they want.
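
The contrast can be sketched side by side in Python. The sample `ps`-style text and the `Process` tuple below are made up for illustration, and neither function is how any real shell does it; the first shows the fragile column-cutting, the second the plain field access.

```python
from collections import namedtuple

# Made-up sample of ps-like text output.
PS_OUTPUT = """\
  PID TTY      STAT   TIME COMMAND
 3912 ?        S      0:00 notepad
 4044 ?        S      0:00 notepad
  804 pts/0    Ss     0:00 sh
"""


def pids_by_scraping(text, command):
    """Text style: skip the header, split columns, trust the layout."""
    pids = []
    for line in text.splitlines()[1:]:
        columns = line.split(None, 4)
        if len(columns) == 5 and columns[4] == command:
            pids.append(int(columns[0]))
    return pids


Process = namedtuple("Process", "pid command")
PROCESSES = [Process(3912, "notepad"), Process(4044, "notepad"), Process(804, "sh")]


def pids_from_objects(processes, command):
    """Object style: the fields are already there, with their types."""
    return [p.pid for p in processes if p.command == command]


print(pids_by_scraping(PS_OUTPUT, "notepad"))
print(pids_from_objects(PROCESSES, "notepad"))
```

The scraping version silently depends on the header row, the column order and the number of columns; change any of those and it breaks, while the object version does not care how the data would be printed.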

 

Of course, this kind of functionality would have been near impossible in the pre .Net era because there was no understandable concept of a ‘type’ that was applicable across languages. With the shell being purely a .Net shell, the cmdlets can talk .Net and can thus consume any .Net type. (Remember, this is one of the basic tenets of the .Net platform: that it tries to provide a level playing field for languages to interact.)

 

When a cmdlet returns a set of objects and there are no more cmdlets in the pipeline to consume the objects, then the objects are formatted in some standard way to the console. What that means is that if we have a command called ‘ps’, it would be expected to return a collection of objects, each of which represents a running process. Now if at the console I simply type

> ps

then there are no more cmdlets to receive the process objects returned by ‘ps’. These objects are formatted in a standard manner to the console so that the user can see them. Nice?
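
A default formatter of this kind is easy to imagine. The Python sketch below is my own guess at the general shape, not Monad's formatter: `format_table` is a hypothetical helper that walks the fields of whatever objects fall off the end of the pipeline and prints them as aligned columns.

```python
from dataclasses import dataclass, fields


@dataclass
class Process:
    ProcessName: str
    Id: int


def format_table(objects):
    """Render dataclass instances as an aligned text table (toy formatter)."""
    objects = list(objects)
    if not objects:
        return ""
    names = [f.name for f in fields(objects[0])]
    rows = [[str(getattr(obj, name)) for name in names] for obj in objects]
    # Each column is as wide as its header or its longest value.
    widths = [max(len(name), max(len(row[i]) for row in rows))
              for i, name in enumerate(names)]
    header = "  ".join(name.ljust(w) for name, w in zip(names, widths))
    rule = "  ".join("-" * w for w in widths)
    body = ["  ".join(cell.ljust(w) for cell, w in zip(row, widths))
            for row in rows]
    return "\n".join([header, rule] + body)


print(format_table([Process("notepad", 3912), Process("cmd", 804)]))
```

The commands themselves never print anything here; they only return objects, and the formatting is decided once, at the very end of the pipeline.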

 

Monad has really appealed to me in the short time I have spent with it. Traditional shells like bash may have some catching up to do in the light of power like this. Of course Monad is presently at a nascent stage. However for an early pre-beta, Monad is already a very sophisticated and elegant system.

 

I don’t know if I have breached any NDAs already. I need to check up on that before I continue.

 

 

Next:
Cmdlet parameter binding in Monad 

Other:
Some of my ranting about NDA

Clarifications on NDA related issues about Monad
Pooja on deployment of Monad on 'unsupported' OSes

 

Thursday, June 10, 2004 6:25:24 AM (Eastern Standard Time, UTC-05:00)
 Thursday, June 03, 2004

Antonio has a post about generalizing environment classes (which capture the state of a closure) using generic types, such that the class itself captures only the arity of the environment.

 

To quote:

http://rotor.di.unipi.it/cisterni/Lists/My%20Blog/DispForm.aspx?ID=15

In general we may think that the compiler generates several environment classes, for the needed arities; for instance:

 

class Environment3<A, B, C> {
  A a;
  B b;
  C c;
}
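
To make the role of such an environment class concrete, here is a small sketch of closure conversion in Python. It is only an analogy: Python generics are type hints and are not specialized at runtime the way CLR generics are, and `Environment2`, `lifted_body` and the two `make_scaler_*` functions are names invented for the example, not anything from Antonio's post or the C# compiler.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar

A = TypeVar("A")
B = TypeVar("B")


@dataclass
class Environment2(Generic[A, B]):
    """Holds two captured variables; only the arity is fixed, not the types."""
    a: A
    b: B


def make_scaler_closure(factor, offset):
    # What the programmer writes: a closure over factor and offset.
    return lambda x: x * factor + offset


def lifted_body(env, x):
    # What a compiler might generate: the closure body as an ordinary
    # function that receives its captured variables through an environment.
    return x * env.a + env.b


def make_scaler_converted(factor, offset):
    env = Environment2(factor, offset)
    # Stands in for a delegate whose target object is the environment.
    return lambda x: lifted_body(env, x)


print(make_scaler_closure(3, 0.5)(2))
print(make_scaler_converted(3, 0.5)(2))
```

Both versions compute the same thing; the converted one just makes the captured state an explicit object, which is the part the generic environment class is meant to standardize.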

 

There is one issue that I can think of off the top of my head. The CLR has an excellent approach to generics (among the best I have seen), and it is well described in Andrew Kennedy and Don Syme’s paper here:

The Design and Implementation of Generics for the .NET Common Language Runtime

http://research.microsoft.com/projects/clrgen/generics.pdf

 

The thing about CLR generics is that they are very efficient for all reference types, because for reference types there is no per-type specialization (as there is with classical C++ templates). All reference type instantiations share the same class definition at runtime.

 

However, value types cause the runtime to generate specialized classes for each value type that is used to instantiate a generic entity. Class definitions are shared between value types only when they have the same footprint with respect to the GC.

 

So it would be better if the compiler actually generated specialized classes to hold environment state whenever it knows that the types the environment needs to hold are value types. This simply provides a performance benefit, because the specialization of the class would not happen at runtime but would instead be done at compile time.

 

 

Antonio also discusses a private member access issue; again, I don’t think I fully get him. Assuming the new delegate mechanism is in place, we could have classes that look like this:

 

class Env
{
        //have only public members
}

class Foo
{
        //original method
        void bar()
        {
        }

        //anonymous compiler generated method
        void anon_bar()
        {
                //access all members of Foo here
                //access all public members of Env here
        }
}

 

Is there a need to make the members of Env private? The entire point of having Env is simply to act as a placeholder for some values. Better yet (I don’t know if the old friend method mechanism works here), if friend declarations are possible then the anonymous method can be declared as a friend in class Env. This does not add to the class definition in any way; it would simply allow for member access.

 

 

You might want to look at these links to follow the sequence of these posts:

1) Closures in CLR 2.0
2) Implementation of Closures (Anonymous Methods) in C# 2.0 (Part 6)
3) More on CLR 2.0 closures
4) Closure implementation enhancement in CLR 2.0 using the new delegate mechanism
5) Again on closures
6) this post

Thursday, June 03, 2004 12:58:34 AM (Eastern Standard Time, UTC-05:00)