Wednesday, October 12, 2005

Last night I was writing some for OS161 – I was implementing a fork(), execv(), getpid(), waitpid() and exit() for the OS161 kernel. I was using Visual Studio Express VC++ edition beta2 to write the code and I had a little ruby script to upload changed file to my linux (gasp!) box and a little script there that would build the kernel whenever the files changed. That way I get to develop code on an OS that has a notion of a unified clipboard and such pleasantries and get to build it wherever the tools are available. I also got a minimal shell like first process done.

 

Don’t get me wrong when I say this, but I sometimes like to think of myself as a fairly decent imperative programmer – I would be comparably good to many people you might meet. Ever since I came to college functional programming has been rewiring my brain. And its beginning to hurt. And I am enjoying it. And sometimes you need a break.

 

Hence writing missing pieces in a simplistic os kernel is a nice break. And I am going “Ooh! Look I have variables! Ooh! Look I can reassign the value of this thing. Look, I have loops”. And that’s a nice world to be in, if you can think in terms of state well – for those of you imperative programmers don’t know what that is, it means that things have changable values (like x=5 and x ++;) – and those of you who don’t realize you are imperative programmers, here is a simple test: if you are not a Haskell (and similar) programmer Or Scheme programmer (without set!) Or mathematician Or someone who goes from day to day without anything in the world affecting you, you are a imperative programmer of some device.

 

So I was enjoying being imperative for a while and then when I had it all going halfway into the night – I had multiple processes forking and execing and waitpiding and exiting – the whole crashes. And then it crashed again. And this was crazy, because all the of the pieces were things I had tested separately and yet they crashed when they came together. Somewhere in the back of my head there was this little elephant called lambda laughing on the floor ignoring his cons-kit for a bit. But wait a minute, I thought all this functional programming was making me a better programmer – I thought I could think more clearly about composition.

 

After about 2hours of meditation and atleast one or two sets of forays into the debugger as a last recourse, I realized what the problem was that I was not writing things back onto the user space stack in the correct alignment in some cases – and the this was some dword alignment requirement of this machine that was not mentioned anywhere. Lambda and the Turing tape looked down, looking a little sorry. I was in a bad mood by the end of it and that wasn’t fair – that was my little imperative break.

 

All of this, gets me to some of the conversations with some of my professors here. Dan Friedman is one of the kindest gentlest people you can talk to – and pretty much every other sentence some brain mutilating idea, said in the kindest subtlest way possible. So it happened that I needed to prove that something is not possible (it is related to the hypothetical apply-iterator operator I speak of here). Fifiteen minutes after the conversation I am thinking about higher order programming with continuations and my brain is hurting. (do you know what that means? If you don’t you need to go though the experience of hitting upon an idea that changes the way you look at the world. Changes the way you look at the world does not mean that the next time you eat a burger you look at it slightly more from the left, it is a little more encompassing an idea.)

 

An idea isn’t responsible for the people who believe in it.

 

Programming languages are devices to express thought. The fact that you can also create software using them are only secondary to this larger idea. Languages carry ideas of meaning and operation and you can compose these things together to create larger meanings and wider operations, or compose them to narrow down their scope. If you are a starting C programmer or an accomplished C# or Java programmer, it maybe hard to see things this way – but that is ok, you don’t always have to know the full import of what you could be doing. (I will write about Haskell and the wizard called Prof Amr Sabry some other time.)

 

One of the hardest things or like Prof Friedman says it is probably the simplest of all things is the idea of a continuation. It is an idea so simple that it drives you crazy trying to understand it. Imagine that if you did something, anything, it has rippling effects across the rest of all time – now imagine that you can take this ‘rest of all time’ and assign it to a variable and pass it around through all the operations you do, such that at some time you can go back to that rest of the world. That is a continuation. That as a programming device is good way to inflict some brain damage if you try to internalize it.

 

That’s about when I thought I should quit. That I probably should not go functional. All of this was too hard – you start out with something because you feel that there is some beauty and some truth in it. And when that fails you, maybe its time to betray you way of thinking – maybe its time to quit. By this point, you realize, that I might be quiet crazy, and that it might be rubbing off on you since you are still reading this. Betray the temple of Lambda?

 

That’s when Judas’s song from Jesus Christ Super Star (which I have been listening to a lot these days) plays in the back of my mind, when he does to Caiphus to betray Christ -

Now if I help you, it matters that you see

These sordid kinda things are coming hard to me.

It's taken me some time to work out what to do

I weighed the whole thing out before I came to you.

I have no thought at all about my own reward.

I really didn't come here of my own accord.

Just don't say I'm ... damned for all time.

 

I came because I had to; I'm the one who saw.

Jesus can't control it like he did before.

And furthermore I know that Jesus thinks so too.

Jesus wouldn't mind that I was here with you.

I have no thought at all about my own reward.

I really didn't come here of my own accord.

Just don't say I'm ... damned for all time

- Damned for All time, Judas’s Song, JCSS.

 

And then something snapped back in place and I was at it again. Why have such a device as continuations? Why bother? Well we understand little of continuations in my opinion. They are a thinking tool for the future. How can something so mind bogglingly hard be useful?

 

A long time back I remember seeing a beginner book about BASIC programming. That book had an example of the quicksort algorithm in basic written using gotos. It was a only a few lines. I spent a whole evening trying to understand what that was doing. And I didn’t succeed. I could execute the statements in my mind and see how they work, but I could see how such a beast could be created. I couldn’t see the way you would have to think to create something like that. Several years later I saw copper bars by Tylisha C Anderson and it was the same death by gotos. I could not understand it.

 

Many years ago they got together and killed off gotos and created a new-kid-on-the-block called structured programming. They took gotos and created a set of patterns that are sufficiently expressive to do anything that you could do using gotos. And then they killed off goto. The offspring forms the basis of almost all imperative programming today.

 

The real value of something powerful that we cannot tame yet is in what you can derive out of it. That’s when you say, in the conventional sense, that something is sufficiently mature – when people who don’t understand the first thing about it can use it to solve their programs (that’s the sense in which I sometimes say that windows is more mature than linux…). I think I should stop now. As you can see, college has been fun so far.

 

What then to do about this Jesus-mania?

Now how to we deal with a carpenter king?

 

Where do we start with a man who is bigger

Than John was when John did his baptism thing?

- This Jesus must die, Song of the Priests, JCSS

 

I dreamed I met a Galilean;

A most amazing man.

He had that look you very rarely find:

The haunting, hunted kind.

I asked him to say what had happened,

How it all began.

I asked again, he never said a word.

As if he hadn't heard.

                        - Pilate’s Dream, JCSS

Wednesday, October 12, 2005 9:37:40 PM (Eastern Standard Time, UTC-05:00)  #    Comments [0]  | 
 Tuesday, October 11, 2005

After a long time, one missing piece in the yield (iterator) puzzle fell in place in my head. It should have been obvious because it had been staring at me in the face for a while now. I am not sure, but it seems to me at this moment, that some of this might be of lasting importance beyond the scope of this blog entry.

 

Time warp 3 or 4 years back to when Sid and I were discussing ruby iterators and python list comprehensions. Sid had this notion that there was something missing in the Python implementation that made ruby iterators more powerful. We however could not place it at that time and we let it be. And we missed it for years after that. I am going to make some claims that some languages can and cant do certain things – I might be wrong and I hope that I am – if I am do show me how and maybe we can take this forward.

 

The most powerful yield (iterator) construct that I can think of, needs all of these –

1)     the parent function that takes arguments

2)     the parent function yields values

3)     the parent function can return a one or more values finally and terminate

4)     the parameter block of the yield should be able to ‘break’

5)     the yield should be able return values into the iterator

6)     it should possible to iterate two or more iterators in an interleaved manner.

 

1 and 2 are obvious. Almost all implementations of yield do this.

 

3 means that there is a notion of the function returning a value as the exit case of the function (such that the function instance terminates and that it will not produce any more values).

 

4 means that you should be able to do this

x.each {|y| break }

 

5 means that you should be able to do this

def foo

      n = yield 5

      puts n

end

 

6 means that you should be able to do this

a = foo1()

b = foo2()

puts a.next() + b.next()

puts b.next() + a.next()

 

Now, here is the search for yield the magnificent – I don’t a language that gives me a yield that can do all of the above and I am upset.

 

C# and Python rely on transformation of iterator (the callee) into an object that models the the state machine of iterator code. They use the objects state to maintain the environment for their closure (their state machine transformed iterator function). Read more about C# 2.0 iterator implementation here. The ruby way as I understand is that the callers block is passed to the callee as a closure. Scheme does not intrinsically have a yield, but you make an equivalent by passing a closure to a callee function. I am aware that there are other languages that implement the yield – Comega and Groovy, Icon and CLU – I don’t much about these (except that I believe that Cw is similar to C# in this regard), and so I wont discuss them here. Ever since ruby popularized the construct, it has found its way into many languages.

 

All of these support 1 and 2 above.

 

C# and Python support 4 and 6. They don’t support returning values from the function and do not support sending values into the iterator. This means that you cannot write code similar to the powerful inject and other similar ‘reduce’ like constructs. Here is what the inject (courtesy of smalltalk looks like)

 

class Array

      def inject n

            each { |v| n = yield n, v }

            return n

      end

end

 

def sum ls

      v = ls.inject(0){|n, v| n + v }

      return v

end

 

Inject is very powerful as it enables creating a large number of reduce like behavior. Inject itself is only one example of many things you can do if you can control the execution of iterator by passing it values from its consumer. The argument in favor of the return is weaker, but it is sufficient to say that it it important to separate a final value, from that of the last yielded value.

 

(I am not sure that Py cannot return a final value. If I am correct it terminates iteration of next() calls by throwing an exception. It don’t know if it captures a return value in the function object)

 

 

Scheme iterators, the simplistic approach that I have in mind is

(define iterator

       (lambda (x cls)

             (cls (+ 1 x)) ; yield 1

             (cls (+ 2 x)))) ; yield 2

(define caller

       (lambda ()

             (iterator 10

                    (lambda (n) ; this lambda is in the lexical scope of caller

                           (display n)))))

 

Should be equivalent to

def iterator  x

      yield x + 1

      yield x + 2

end

def caller

      iterator {|n|

            puts n

      }

end

 

The limitation in scheme is that you cannot do a ‘break’ the block of the caller. Atleast there is no easy way to do this without grabbing a continuation at the point of the call of the call to ‘iterator’, and then invoking the abortive continuation in the closure argument cls.

                       

If if it given that the abortive continuation is a natural way to do this, it is not obvious about how two iterators can created to lockstep iterate into a single block of code. I will discuss this further below.

 

So scheme can do 1, 2, 3 and 5. It can’t ‘break’ without continuations and cannot interleave or lockstep iteration in any obvious way.

 

Ruby, by far comes out with a lot of the above – the only thing it can’t do is 6, that you cant interleave iterations. The fact that you cannot interleave or lockstep iterations means that there whole classes of problems that you cannot solve. The simple case is the classical example of co-routines or continuations, the problem of comparing leaves of two trees. To compare the leaves of two trees the standard solution is write a routine that traverses the tree recursively and yields leaf values out of the recursive structure. You can then write a simple compare routine that initializes two instances of this procedure with two trees you want to compare, and then it merely compares the yielded values. This is the solution in Comega –

 

//the int* represents a stream of 0 or more ints. It is not a int pointer.

static int* getleaves(node n)

{

       if(n != null)

       {

             if(n.value != null)

                    yield return n.value;

             else

             {

                    if( n.left != null) yield return getleaves( n.left );

                    if( n.right != null) yield return getleaves( n.right );

             }

       }

}

 

static bool samefringe(node tree1, node tree2)

{

       en1 = getleaves(tree1).GetEnumerator();

       en2 = getleaves(tree2).GetEnumerator();

 

       t1 = en1.MoveNext();

       t2 = en2.MoveNext();

      

       //this is the lockstep iteration

       for(; t1 && t2; t1 = en1.MoveNext(),  t2 = en2.MoveNext())

       {

             if( en1.Current != en2.Current )

                    return false;

       }

       return !t1 && !t2;

}

 

As you can see, the class of problems that fall into the above category is fairly large and its annoying that they cant be solved in ruby.

 

Note: Ruby provides something called a generator to solve the problem of lockstep iteration. I like the language they describe this in –

Generator converts an internal iterator (i.e. an Enumerable object) to an external iterator.

Note that it is not very fast since it is implemented using continuations, which are currently slow.

 

The first problem is that it grabs continuations over something that already manages state. Lets assume that that’s transparent to us and is a non-issue, there is a larger problem that generator created iterators violate some of the other requirements, specifically 5: how you can send values into the iterator.

g = Generator.new { |g|

      n = "hello"

      loop {

             n = g.yield( n ) 

             puts n

             break if n == nil

      }

}

g.each {|n|

      puts n

      n + "."

}

This code does not work as expected (does not generate hellohello.hello..hello…). If there is something amiss above that you can point out, let me know.

 

All of this brings us closer to the conclusion – you will notice that there are two approaches to the implementation of iterators – one where the callee (the iterator) is closure converted (like C#, Python…) and the approach to grab a closure in the caller with a light-weight or heavy-weight abortive construct (as in Ruby and Scheme). Here is a pleasant discussion on iteration styles.

 

Finding Yield the Magnificent

 

Fixing the C# and Py approach requires two potentially simple changes. Make the C# MoveNext() or the Py next() functions take an argument. This is not strictly semantically correct, because this means that if MoveNext or next are used to start the computation then the first value passed is ‘hanging in space’. This fixes requirement 5. To fix 3 it is enough to create a new member in the closure object that retains the return value.

 

While this might be all very well in Py, its probably not as easy in C# or other strongly typed systems since you hit a syntactic limitation about where you express the return type of the function and the types of the value that yield returns to the function.

 

 

Fixing lockstep in a blocks/internal iterators/scheme-ruby-style approach can be done if it is possible to create a combinator that takes multiple iterators and a block that takes multiple values and does an iterator equivalent of scheme’s ‘apply’. I don’t if such an apply-iterator operator exists. Maybe this was the difficulty they weren’t built into ruby in the first place. I don’t have a solution to interleaved iteration when it’s the caller that is closure converted – there is probably no valid general purpose transformation of this sort possible.

 

 

There, I am glad all of this is said finally. I knew it all at the back of my mind as a shapeless dark area in my understanding of yield from which I could derive working knowledge, but nothing quantified. What started off all of this is that I finally needed a yield that did all of the above – that seemed to be the only natural way to express a particular algorithm. Now that all of this is said its quantified, weather it is all right or wrong. I will be happy to see what ideas you can contribute to the above discussion. I would also like work of the apply-iterator operator or if such a beast exists and you know about it, let me know.

 

I will get down to proof-reading this soon.

Monday, October 10, 2005 11:44:25 PM (Eastern Standard Time, UTC-05:00)  #    Comments [13]  | 
 Sunday, October 09, 2005

Mini Kanren programming gives you a few basic constructs and you are on your way doing logic programming.

 

It has a run construct which looks like this

(run* (q) <goals>)

And this means that the it generates all possible values of the variable q, that can be satisfied by all goals that are part of the run.

 

The simples goal constructor is that of unification and is denoted as a ==. The unifier tries to bind its values together. If both values are compatible the unifications succeeds.

(== 5 5)  - this unification succeeds

(== x 10) – this succeeds and binds x to 10

             (if there was no binding for x before)

(== a b)  – this binds a to b and succeeds

(== a 10) – this binds a to 10 and succeeds, hence b is also 10

(== b 20) – this fails since the b cannot be bound to 20 in this context

 

So if you say

(run* (q) (== q 10))

You get 10 as the result.

 

The conde forms the equivalent of an ‘or’ relationship between sets of goals. It returns all the possible substitutions that are got by evaluating each of its goals sets.

(run* (q)

       (conde

             ((== q 10))

             ((== q 20))

             ((== q 5) (== q 15))))

This returns values 10 and 20. The third relation under conde fails because q cant be bound to 15 after it is already bound to 5.

 

The last construct is fresh. The important thing about mk over prolog is that it has a notion of lexical scope for its logical variables. ‘run’ introduces the variable whose value it will return on evaluating the its goals. Fresh introduces or defines one or more new logical variables which can then be used in relations in the scope of the fresh.

(run* (q)

       (fresh (a)

             (== a q)

             (== a 10)))

This returns 10.

 

There! That’s all you need to know about mini-kanren.

Embed this much into scheme and you have a full (I think) logic system. Take a closer look at more Mk here.

 

The real reason I wrote all this is that I had some potentially interesting random thoughts about the logic system. But I could not say it without establishing some context first.

 

Now, if you are using the Cw version here is the simple conversion Mk.run is run*, Mk.conde is conde. == is Mk.eq. I don’t need a fresh because I use the Cw type system and lexical scoping to create logic variables as instances of class ‘Var’. Get the Cw version is here.

Saturday, October 08, 2005 11:36:23 PM (Eastern Standard Time, UTC-05:00)  #    Comments [0]  | 
 Friday, October 07, 2005

This is an implementation of the Mini Kanren logic system for the CLR. This was done last night, so you can take it with a pinch of salt. It does not cover the full set of constructs that original MK system covers. Here is a overly simplified introduction to miniknaren.

 

This implementation is done in Cw, the experimental language from MSR. You will need the COmega compiler if you wish to compile the source.  

 

This implementation covers the basic ‘run’, ‘all’ and ‘conde’ constructs. You do not need a ‘fresh’ because I use the Cw type system and scoping to give me the equivalent of fresh. MiniKanren is implemented as an embedded language in scheme – so you get to do relational programming and then drop into scheme’s functional host language as need be. Since this implementation runs over Cw, this extends and hence you get a combination of a declarative logic programming system on top of everything that is Cw.

 

I am considering adding some more constructs and maybe a relational arithmetic system. I am also considering extending Cw syntax using an external precompiler so that it becomes as little more natural to write relational programs. The present ‘syntactic beauty’ is well a little lacking..

 

All that said, here are the sources from last night.

 

q = new Var();

a = new Var();

b = new Var();

c = new Var();

 

res = Mk.run( q,

      Mk.eq( q, new object[]{ a, b}),

      Mk.eq( a, c ),

      Mk.eq( c, 10),

      Mk.conde(

            new goal[]{ Mk.eq(b, 5) },

            new goal[]{ Mk.eq(b, 10) },

            new goal[]{ Mk.eq(b, 20) },

            new goal[]{ Mk.eq(b, 25) },

            new goal[]{ Mk.eq(a, 50) }

      ));

 

//res is the potentially infinite lazy evaluated set                            

r  = res.{ return Var.ToString( it ); };

r.{ Console.WriteLine( it ); };

 

A logic programming system is a great way to implement solutions to several kinds of problems like dependency analysis, all searching any relational space (ex: A Type Discipline for Authorization Policies)

 

Disclaimer:

(I am not responsible for your divergences).

(I am also not responsible for any halting problems with your compilation process).

(The full credit of MK goes to its original authors. This implementation is merely a transliteration with a certain imperative flavor added).

Friday, October 07, 2005 1:45:47 PM (Eastern Standard Time, UTC-05:00)  #    Comments [4]  | 
 Wednesday, October 05, 2005

Its been a little over a month at IU and the experience has been overwhelming. As has been since when I can remember I have gotten myself involved in more things than I can handle – which is another way of saying that things have been fun, atleast when I am able to keep up with them.

Like Hogwarts, the university is dotted with signs of greatness from antiquity. And every once in a while you come across some one who can do things to your mundane set of ideas that is no lesser than of Gandalf.

iu-4.jpg iu-2.jpg iu-5.jpg 

iu-3.jpg iu-1.jpg
Pictures from around the Indiana University Campus

I have been hit by some hard ideas and fortunately I had been preparing my mind for a while to handle these – had I not and had I tried to absorb them at the same rate now, I might have been in for some hardship.

lindley_hall.jpg
Lindley Hall, Computer Science Dept

 I have formally enrolled for Operating Systems by Andrew Lumsdaine, Programming Languages by Dan Friedman and Quantum Programming by Amr Sabry.

OS is primarily to scratch an old itch – because of that incomplete OS I left behind in college – this course gets to complete is half done os161 kernel – so there is a lots of traditional C programming there.

Programming Languages essentially covers the whole of ‘Essentials of Programming Languages’ with Prof Friedman saying things like ‘I will let you know when you need to think’, on day one. Most of us are thinking pretty hard already. This year the course also covers the new logic system he worked on with Will Byrd and Oleg Kiselyov called Mini-Kanren (source-forge is not as updated as it should be). This also forms the basis for his third ‘little’ book – The Reasoned Schemer.  

Quantum Programming is … well a course about magic – it’s a huge thinking exercise about creating a model of computation for possible ‘quantum machines’. I will not venture to say much more here, except drop a link to an introductory paper.

OS and PL are supposed to be among the hardest courses in the CS department, many people don’t advice taking them both together. To put things in perspective, I am finding QP the hardest simply because of the breadth of the computer science from which the ideas in QP are coming from. The present state of QP is a little like the state of classical computing in the times of Church and Turing.

kirkwood.jpg iu-7.jpg
Kirkwood Hall and the walkway by Lindley

That aside, I am looking into some of these things as of right now (this is not a complete list)
-        join calculus, pi calculus and other process calculi
-        squiggol (Bird and Meertens formalism)
-        quantum circuits
-        grokking functional programming Haskell-style
-        logic programming

imu.jpg

All of this aside, Bloomington is a beautiful place - small university town with lots of charm and a pleasant weather (for now). I am told that this place has bad winters with lots of snow.

bloomington3.jpg bloomington2.jpg

bloomington1.jpg

I have a small single bedroom apartment that is a ~12 min walk from the university.

univ.JPG
The red square is Lindley Hall, the main CS department building and the green square is my apartment.
(Courtesy: Google Earth)

And among other things, I have a new weapon – its called Qubit. Qubit is a Dell Inspiron 9300, UXGA 17” monitor, with 1Gb ram, 256 mb Nvidia Geforce 6800 graphics card and a dual layer Dvd writer. 

iu-6.jpg

Wednesday, October 05, 2005 10:34:17 PM (Eastern Standard Time, UTC-05:00)  #    Comments [9]  | 
 Tuesday, August 09, 2005

cheers!

Tuesday, August 09, 2005 7:11:33 AM (Eastern Standard Time, UTC-05:00)  #    Comments [14]  | 
 Friday, August 05, 2005

Yay! Finally the search bots gave us a honorary indexing. :)
Hopefully most of the damage due to the url change will be undone soon.

Friday, August 05, 2005 12:37:12 PM (Eastern Standard Time, UTC-05:00)  #    Comments [1]  | 
 Monday, August 01, 2005

I just finished reading Erik Meijer and Peter Drayton’s work-in-progress paper on programming languages and type systems. The paper as correctly noted in the beginning is written-to-provoke and I think this line of work is required today when technologies like .Net and the JVM have opened up possibilities of experimenting with languages and language systems and potentially reduced barriers for acceptance of new languages. The value proposition the expressiveness of a new language provides can be more readily accepted today in frameworks like .Net where a programmers existing knowledge/use of an API and an existing code base will not be lost by switching over to the new language.

 

The paper touches a large variety of topics and is more in brain dump of thoughts around the general area of type systems in dynamic and static languages. They also have some very interesting pointers and references.

 

The papers discusses type inference, programming by contract as opposed to having the type system as the only means of contract, subtyping and potentially expressing subtyping as programming by contract, type-directed lifting in Cw and how that can be used as opposed to non-intuitive reflective code.

 

The paper talks mentions the SqlDataReader – even if you don’t grok anything else in the paper, try and understand this part and try and see the need to a better programming language model – especially if you have ever worked on the data tier or business logic tier of an n-tier data centric application.

 

The paper also mentions lazy evaluation and about potentially deprecating the need for using eval if you have a better system in place for most of the common uses of eval.

 

On the whole it’s a paper that worth your time if you have started thinking about why when using C++ its so damn hard to get anything right, why C# cant be a little more flexible when handling collections, XML like or SQL like data and why languages like your pet python or ruby don’t make the mainstream.

 

http://pico.vub.ac.be/~wdmeuter/RDL04/papers/Meijer.pdf

Monday, August 01, 2005 2:30:34 PM (Eastern Standard Time, UTC-05:00)  #    Comments [3]  |