Monday, May 29, 2006

I was thinking of a highly opinionated, racist (in the funny sense (yes such a usage exists (now it does))) standup comedian of Indian origin called Russel Peters who was talking about his visit to Africa. And he said something to effect of, I went to the motherland .. and quickly correct himself saying, “No not my mother land, black people’s motherland. I am Indian, we have our own motherland, England.”

 

So I am in the pseudo metaphorical  motherland. It’s a beautiful country. I didn’t see much of London – I had friends pick me up at the airport and we drove down to Cambridge. I am staying with friends of Michel Salim from back at IU. Michel was studying at Cambridge sometime back. So I was amazed and grateful for generosity of these guys who drove all the way to Gatwick to pick me up. Its been half a day but already the conversation are interesting.

 

I am also beginning to have the dormant British accent in me wake up. If you know me, you can stop laughing now (that’s for people who don’t know me to reflect on). Its odd – you have to be Indian – grow up reading Archie and Enid Blyton. And go to US, get confused by everything being reversed and learn bits of American. And then finally come to  England and hear the all to familiar British English accent and cars on the left side of the street, to know what I mean.

 

Cambridge is so beautiful. It has narrow roads, quaint brick buildings. I haven’t seen much yet, I just one short walk. Saw an old church, an amazingly beautiful cathedral. I am told that the colleges are very beautiful.

 

Also for the record, I was actually told, in as many words (yes!) that I was cool. Hear ye! Hear ye!

 

Which reminds, when the plane was landing I had the Grandmom in Kumars at No 42 singing “Hare Parky, Hare Parky..” in my mind.
Anyway …

 

For one, what I am doing for summer is cool in the fullest definition of cool in my dictionary. What are you doing for summer? Oh just flaying to UK and am going to be writing an experimental garbage collector and some other things for Haskell at MSR. And maybe write a paper or two in the meanwhile, And maybe see a lot ok UK and some of Europe (the mainland) while I am at it.  That is cool. This is a little outside what I imagined possible a few years back.

 

I am sitting on the window sill of their first floor flat (not apartment). Opposite is a Indian restaurant that says “Chutney Balti House” – ha ha – what does that mean? I wonder if anyone in India would name their restaurant like that.

 

Tomorrow I visit MSR.

 

 

I have been at Cambridge for about three weeks now. MSR is wonderful. I got to shake hands with a Knight and Turing Award winner – Tony Hoare. Random people you get to meet over lunch are pretty much the reigning best in their fields in computer science. Its like you were an insect of some sort (when I say that I don’t mean anything hoopy like a butterfly or firefly – more like one of those little cricket like things) and you managed to crawl into the midst of a league of intellectual superheroes. It’s a lot like the old Justice League of America comics http://www.hyperborea.org/flash/jla.html (with the minor difference that there aren’t many Americans here).

 

Cambridge is a wonderful place too. Its just that I haven’t had time to write. Seen a bit of Cambridge, been to London, been to Peterborough, am going to Luton today. And the the computer science is intoxicating – along with other minor details like operas, wine, beer festivals etc. Joi di Vivre!

 

An over exposed shot of the statue of Eros at Piccadilly Circus, London.

Monday, May 29, 2006 3:46:11 AM (Eastern Standard Time, UTC-05:00)  #    Comments [2]  | 
 Friday, May 05, 2006

After several weeks (running into months) of blog silence, here is a post.

 

I have been busy. Been teaching myself several new things. Life in Bloomington has been good on the whole.

 

Several good things have happened in life. I am on the verge of finishing my first formal paper in computer science. I have been teaching myself how think in formal terms, not just as a programmer, but like a computer scientist. It is an interesting process.

 

My first paper should be about the yield operator – it turned out that what was idle tinkering and curiosity over the past few years actually seems to have given rise to something of theoretical value. The paper has some formalizations of things, reference implementations, comparison with existing operators and such. Looks like I just may have enough material for a follow up paper as well.

 

 

I have an interesting summer job. I am going to be (finally) at Microsoft Research in Cambridge. I am going to be an intern working with Simon Marlow and at this time it looks like I will be working on a parallel garbage collector for Haskell. It should be good fun to get back to writing systems code for a change.

 

I have seen so many things in code that I am unsure of code that I write these days. I keep seeing demons in my code. Most of the time I don’t code. When I do, and I intend to use it, it is usually in some extreme language – ruby being one extreme and Haskell being the other. I am afraid of what code I will actually write, when I get down to it and if I actually think about the code. There is an interesting Phil Wadler paper titled “Imperative Functional Programming” – I am afraid that if you give me C these days code I write maybe better described as Functional Imperative Programming. Maybe not that bad though.. at heart I think I am still an imperative programmer – a little shaken, but I can still see the machine execute in my head.

 

 

I also spent sometime writing a little book. It is incomplete, but I offer it for download here. If you are the sort, do read it for the fun of it and give me feedback. My intention is to add for text and add more machines and make it a more complete “little” book.

 

It is a book about abstract machines – machines that will teach you the essence of programming languages and constructs. It shows you how several constructs work in a concise formal way to the extend that you can take the definitions in the book, apply the syntax of your favorite language to it and create your own interpreters and potentially your own programming languages.

 

The Little Machines

 

Thats all for now. More when time allows.

 

ps. If you have any suggestions about places to visit in the UK and such let me know. I am looking for interesting things to do in the time I can take away from work.

Friday, May 05, 2006 12:01:15 AM (Eastern Standard Time, UTC-05:00)  #    Comments [0]  | 
 Saturday, April 29, 2006

 

I now have a new weapon. This time it’s a really dangerous one.

 

rebel_xt.jpg

 

I now have a Canon Digital Rebel XT – aka the EOS 350d. I have the 18-55mm kit lens and a 50mm f1.8 prime lens. Of course, this is far from the sort of thing I can afford very often – set me back by a $900+ when everything came together, I had some help from my friends to get things to work out. But as life would have it, I had to buy the camera now or wait for ~3 months, since I was leaving the US for a while. More on that later.

 

I have decided to call him “Keeper” based on the keepers in Babylon 5. J

 

I had been Chicago on Thu and took “Keeper” along.

 

Chicago1.jpg Chicago2.jpg Chicago3.jpg

 

 

This trip also gave me the chance to visit Shedd Aquarium which I had been meaning to visit for a while.

 

Shedd1.jpg Shedd11.jpg Shedd10.jpg Shedd2.jpg Shedd3.jpg Shedd12.jpgShedd4.jpg Shedd5.jpg Shedd9.jpg Shedd6.jpg Shedd8.jpg Shedd7.jpg Shedd13.jpg Shedd14.jpg 

 

 cheers!

Saturday, April 29, 2006 2:45:54 PM (Eastern Standard Time, UTC-05:00)  #    Comments [0]  | 
 Friday, March 03, 2006

Why Haskell?

So how long does it take for a completely fledgling Haskell programmer to implement a weird language? 2 nights.

 

Its pretty amazing if you think about it – and it fun and not stressful as well. If I did as much in C or Cpp it would have been significantly more stressful and would have take a over a weekend for sure. And even after that it wouldn’t be as flexible as this is.

 

In some sense, in my current limited opinion, Haskell is to Scheme what Ruby is to C++. To add to it, it is purely functional than scheme, is lazy, comes with A REAL library, allows for pattern matching and discriminated unions etc… the goodness goes on – as matter of fact, it can even pretend to be imperative for a bit. Also, GHC generates actual Win32 exes that I can run. If you are a languages person you might think it’s a moot point, but for someone who likes to build tools and wants to sometimes share them with his (less inclined) friends, this is essential.

 

So I think Haskell is going to takes its place in my bag of tools. The only thing I really miss from Scheme land are macros. I wish Haskell had macros – better, I wish Haskell had multistage evaluation.

 

Being a almost ‘pure’ imperative programmer when I joined college, Prof Dan Friedman’s Programming languages class hurt my head really badly. And I was coming to terms with functional programming.

 

Last semester I also took a class with Prof Amr Sabry – Quantum computing – and this was my first real introduction to Haskell. I am working with him this semester and of late I have been trying to understand Monads. Its one thing to have working knowledge of a concept, and its quiet another to formally study it. Monads hurt like few other things hurt before – and the fact is that it makes you look at imperative programming in this totally other way.

 

So that’s the context for why this is in Haskell. The blog entry is really about this little language I wrote in .. in all fairness a little more than a late night. And I haven’t used the parser generator or other tools before this.

 

Yield Language 1.0

So whats the language like? Well, you’re not going to want to use it at all. But all the same it is the lambda calculus with the following additions (some of which take away its purity) – yield (with all the right behavior in place), if-then-else, numbers (+, -, ==), strings (+, ==). In other words, it’s a simple higher order language with strings and numbers and fancy control flow operator called yield, which is the only “impure” part of the language.

 

To support yield I introduce a funny tagging operator I call ‘pi’ to tag a sub-expression that can yield. A pi-expression can be called like a normal application, or can be used as a coroutine using co, next and run.

 

The actual internal implementation is what is best described as “stackless”.

 

 

f = \x ->

  ((("x = " + x) + ", ") +

  if (x == 0)

    "zero"

  else

    "not zero");

 

(f 10)

 

This evaluates and returns “x = 10, not zero”

 

Basic syntax

e =     \x->e

| x

| e1 e2

| if e1 e2 else e3

| (e1 + e2)

| (e1 – e2)

| (e1 == e2)

| <num>

| <string>

| x = e1; e2

 

The above syntactic categories as basically all pure. The last rule is a let. There is no fixed point operator and you will need to add one of your own to do recursion. Evaluation is strict, applicative order. The following are the extensions for yield support.

 

      | pi e

      | yield e

      | co e

      | next e1 e2

      | run e1 as x1 x2 -> e2 or x3 -> e3

 

Some more examples, assume that we have an applicative order Y combinator called “fix”.

-- Applicative Y combinator

fix = \f -> ((\x -> (f \y -> ((x x) y)))

          (\x -> (f \y -> ((x x) y))));

 

Here  is a relatively simple function that takes a number and yields as many times. I have used the fixed point operator so that I can loop inside the pi block.

-- An iterator that yields 'x' times

iter = \x ->

  (pi  (((fix \loop -> \x -> \s ->

                  if (x == 0)  s

                  else v = yield x;

                        ((loop (x-1)) (v+s)))

            x)

      0));

 

One can call this function in two ways – simple invocation –

((iter 1000) \x->(x+x))

 

Or using the coroutine-like behavior, that is sometimes called external-iterators or generators.

call = (fix  (\call -> \c ->

                  run c

                  as (x1 x2) -> (call (next x2 (x1 + x1)))

                  or x1 -> x1));

 

 

v1 = (call (co (iter 1000)));

 

So one can save the following code into a file and execute it from command line using the command –

langy.exe < filename.txt

 

[update: made some simple syntactic niceties to the language]

 

-- Applicative Y combinator

fix = \f -> (\x -> (f \y -> (x x y))

             \x -> (f \y -> (x x y)));

 

-- An iterator that yields 'x' times

iter = \x ->

       (pi  ((fix \ loop x s ->

              if (x == 0)  s

              else v = yield x;

              (loop (x-1) (v+s)))

             x 0));

       

                                  

-- Call will invoke an iterator any number of times

-- At each point it thunks the return value and then uses that as the

-- next value paramter.

call = (fix  \ loop c ->

        run c

        as x1 x2 -> (loop next x2 (x1 + x1))

        or x1 -> x1);

                        

 

v1 = (call co (iter 1000));

                      

v2 = (iter 1000 \x->(x+x));

                      

("Values are " + v1 + ", " + v2)

 

And this outputs –

"Values are 1001000, 1001000"

 

Downloads

 

 

Friday, March 03, 2006 2:37:42 AM (Eastern Standard Time, UTC-05:00)  #    Comments [0]  | 
 Tuesday, February 28, 2006

(A much delayed part 2. Previous article is here)

 

Pipelines

 

This article primarily describes the pipeline construct. The simplest way to describe a pipeline is to say the way the trampoline models the interaction between a caller and iterator and, the pipeline models a call tree. One can think of a pipeline as a cascade of trampolines.

 

Suppose one had a call tree where each frame in the call tree corresponded to that of an iterator. As each iterator yields or returns to a yield, what the program effectively does is that it moves values up and down the call tree.

 

Consider the following function –

def bits(n)

      if n == 1

            yield 0

            yield 1

      else

            bits(n-1){|v|

                  yield v*2

                  yield v*2+1

            }

      end

end

 

Here invoking bits with a value n, causes it to yield every possible n-bit binary number. Now this sort of thing is not easy to model with a trampoline. A trampoline lets one process in the trampoline act as a iterator and the other as a caller (though the device itself does not introduce an asymmetry).

 

To do something equivalent to code snippet above, one needs to maintain a chain of iterators. How does control transfer between iterators in call chain, assuming that none of the iterator return? Control can move up and down the call tree corresponding to when a caller returns the value to a yield statement higher up the tree OR the iterator can yield a value to a caller further down the tree. Something like  the diagram below – each up arrow being a return to a yield called by a block on top and each down arrow being a yield. (The direction of the arrow indicates which way control transfers.)

pipe-call-tree.gif

 

The pipeline device gives every process in it to communicate with the process to the left of it or to the process to the right of it. One can choose a convention here where the process to the left of a process can be considered it callee and the process to the right can be considered its caller.

 

The picture below gives a comparison of what the pipeline device does.

 

pipe-device.gif

Every process in a pipeline device is created using a lambda-pipe. A lambda-pipe, unlike the lambda-iter, makes available the binding ‘left’ and ‘right’ in the scope of its body.

 

A process can call left to pass control to the process to its immediate left. It can call righ to pass control to its immediate right. The ‘left’ and ‘right’ are similar to the yield in the sense that they take a parameter and return a value.

 

If a process in a pipe, executes a right with a values ‘v’, the value is made available to a process to its left as the returned value of a ‘left’ executed by it. Hence left and right alls by the process have to match up to facilitate control flow.

 

This is the basic idea behind the pipeline device.

 

Example:

Consider the following function basis, which is basically a rewrite of bits, but for usage in a pipe.

 

(define basis

  (lambda-pipe ()

    (let ((v (left)))

      (if (eq? v 'end)

          (right 'end)

          (begin

            (right (cons #f v))

            (right (cons #t v))

            (yield-all (basis)))))))

 

One can put multiple instances of basis into a pipe and have an initial process that will start the cascade by right yielding a empty list, immediately followed by the end marker.

 

    (foreach (command-shell-pipeline (gen-pipe '() 'end)

                                     (basis)

                                     (basis)

                                     (basis))

             (printf "num: ~a~n" (port-value it))))

 

This produces –

num: (#f #f #f)

num: (#t #f #f)

num: (#f #t #f)

num: (#t #t #f)

num: (#f #f #t)

num: (#t #f #t)

num: (#f #t #t)

num: (#t #t #t)

num: end

 

In the above code a gen-pipe, will simply right yield each of it arguments in the same order. One may notice that I use the command-shell-pipeline in a foreach, which means that it is itself an iterator. The pipe device wraps values in something called a ‘port’, to extract the value from the port, one uses the port-value function.

 

The Pipe device is an iterator: To understand why the command-shell-pipeline is itself an iterator (while trampoline is not) one has to look at the diagram above for the pipe device. The left yield of the left most process (or topmost depending on how you look at it) and the right yield of the right most process are yields that transfer control outside of the pipe device. Hence the pipe device itself is an iterator.

 

[updated] The interesting thing about the pipeline is that each process in the pipeline gets state management for free.

 

The pipe device lets you build only a linear data structure, in the sense that each process in the pipe can only communicate with its left and right neighbors. One can use pipe nested in other pipes or in trampolines to make arbitrarily complex control flow structures.

 

Caveats: In retrospect, to be honest, I haven’t really used the pipe much in actual code. It was created more to fill a missing link in the usage of yields that a device than out of practical usage patterns. One might actually resort to nesting trampoline usages more often that one might use the pipe.

 

Also, the more I think about this device I can see more than one ‘correct’ behavior in the model that it presents to the user which sometimes makes it rather un-intuitive to program with. Desipte this it is easy to see how many problems can be implemented as a pipeline of filters that can each maintain state and back track. I need to try and express more problems with the pipe to see how well they can be expressed.

 

Due to some of these concerns the code for the actual pipe device implementation, takes as a parameter a ‘semantics’ function so that the user can describe the behavior required from the pipe externally. One predefined pipe available is what I have called the command-shell-pipe.

 

 

I am also interested in finding analogous constructs to this – one of the things that I should look at is a non-determinism monad used with a monad for state threads. The processes in the pipeline device are similar to what is described as an ‘online transducer’ by Olin Shivers and Matthew Might in Continuations and Transducer Composition. After putting this up online I see that the transducers are studied in detail by many people.

 

Download code.

 

Tuesday, February 28, 2006 8:02:14 PM (Eastern Standard Time, UTC-05:00)  #    Comments [0]  | 
 Saturday, February 04, 2006

This is a series of blog entries that introduce three devices written using the yield operator for Scheme. Previously I had the following entries that described the requisite features of a complete yield operator and an implementation of such a yield operator for Scheme.

 

Yield the Magnificent

Yield Implementation for Scheme

 

Trampolines, Pipelines and Pi are programming devices or constructs that are built on top of the yield and purely the functional subset of Scheme.

 

There are many ways one can look at the power provided by the yield operator. One way to look at the yield is that it is a call-tree walker. It can move information up and down the call tree. If the yield implementation is sufficiently flexible, it can walk up and down several call-trees which exist concurrently.

 

Another way to look at the yield is to think of it as adding input and output capabilities to a function in functional system. A function during the process of computing its value changes state, creates temporary variables etc. When the result or return value of the function is computed the state of the function collapses and the value is returned. Yield offers a way for functions to do input and out with other functions while they maintain state. More about this later.

 

Trampolines

I want to start off by introducing the simplest of the three devices, the trampoline – sometimes I refer to this construct as the trampoline process as well. (Note: The name trampoline is used only in the context of the yield based programming device I describe here and is not a logical extension of other trampoline constructs you may have encountered before.)

 

The trampoline was introduced to address the following scenario – in a purely functional context what is the meaning of a parameter code block that is passed to an iterator cannot affect the caller’s state. Such a parameter code block is always a closure, in a functional context. On invoking a closure, the closure cannot maintain or modify state in the caller without resorting to some sort of a side effecting operation such as an assignment.

 

The trampoline packages such a computational effect so that the programmer need not think in non-functional terms when using the yield. Instead of using a set! or an assignment as the computational effect, the trampoline device uses a jump as the computational effect.

 

To illustrate what I mean, one has to look at the relation ship between the caller and a yielding callee in an imperative language. The flow of control is almost always as follows for a Ruby or C# yield (yes, I am abstracting out many details here) –

  1. caller calls iterator, control transfers to iterator
  2. iterator does some computation and produces a values, yields it
  3. control transfers to caller,
  4. caller does something with the value – this something is usually a side effect
  5.  caller returns a value to iterator, control transfers back to iterator
  6. steps 2 to 6 can repeat any number of times till the iterator returns.

 

The problem with this functional programming land is that the there is no use in yielding the value if the caller cant to anything with it – an alternative is pass an environment back and forth between the caller and iterator or some such thing, which is always tedious to manage.

 

The trampoline does the following. The trampoline becomes the caller for two functions, both of which are iterators that concurrently maintain state. One iterator place the role of the caller in the imperative case and the other one plays the role of the callee. Since the function playing the roles of the caller is an iterator, its free to maintain any sort of local state it chooses.

 

The trampoline device is a simple device, but to see the correct analogy that argues its purpose is a little tricky. Another way to look at a trampoline is to look at it as two iterators such that when one yields control passes to the other and when the other yields control passes back to the first. The process terminates when one of the two iterators terminate.

 

yield-tramp-1.GIF

 

Here is an example of code that uses the trampoline. When the trampoline process terminates, a list of values that are the return (or last yielded values) of both iterators is returned.

 

This is an iterator that adds values that it gets yielding the current sum at any point.

(define adder

  (lambda-iter ()

     (let loop ((s 0))

       (loop (+ s (yield s))))))

 

This is an iterator that yields the first ‘n’ Fibonacci numbers

(define fibonacci

  (lambda-iter (n)

    (let loop ((a 1)(b 1)(n n))

      (if (not (zero? n))

          (begin

            (yield a)

            (loop b (+ a b) (sub1 n)))))))

 

To sum the first n Fibonacci numbers, one create a simple trampoline between the iterators

(printf "Sum = ~a~n"

        (cadr (trampoline-process (fibonacci 10) (adder))))

 

The cadr is there because the trampoline returns a list of two values, the last value obtained from the Fibonacci and the last value obtained from the adder. We are interested in the sum yielded by the adder and so we take the cadr.

 

The trampoline process that is implemented in the library that I have shared below is a slight generalization of this idea – the generalization being that the trampoline process can take any number of iterators as arguments creates a circular list of control transfer using yield. While having two iterators in a trampoline can be thought of as the equivalent of an imperative caller and callee, there is no simple parallel I can think of for three of more iterators in a trampoline. Below is a diagram of trampoline with three iterators.

 

yield-tramp-2.GIF

 

I frequently use a trampoline while programming in scheme using yield. While it may not be easy to see why trampolines are useful just by using a simple example like the above, they useful in managing code complexity when each of the iterators involved has to do a non trivial computation.

 

As an example I wrote a CPS transformer for a subset of scheme last semester, at the heart of which was a trampoline. One iterator would recursively walk down the source expression tree and would yield out values. The other iterator would reconstruct a CPSed expression. So it gives you the conceptual clarity of writing a two pass CPSer while in effect only one pass happens. (Of course I could mash both the analysis and construction phases into one big hairy ball of mutually recursive functions, but that would not let me focus on one problem at a time and possible build them up as reusable iterator components. This of course was a little non traditional and might not have been easy on the evaluator of the assignment).

 

Download Sources for Yield, the Trampoline and the Sample Code above.

 

In the next related entry I hope to discuss the Pipeline device (aka the Pipe device).

 

Saturday, February 04, 2006 1:26:54 AM (Eastern Standard Time, UTC-05:00)  #    Comments [0]  | 
 Sunday, January 29, 2006

What is the Y combinator? I have seen lots of discussions on the web but nothing that seemed like a simple explanation of how it works or why it is needed. I finally understood the Y combinator a few days back and a lot of things snapped in place. So here I will try and explain it from my perspective. In this entry I am taking liberties with various syntaxes – please bear with me.

 

This is the Y-combinator for a normal order lazy language –

Y = \f.((\x.(f (x x))) (\x.(f (x x))))

 

(Here, the ‘\’ denotes a lambda). Scheme is an applicative order language and the above combinator cannot be used as such in scheme because it will lead to an infinite expansion. More on this later.

 

Let’s start from some basics – say we are writing a mathematical expression and we said

a = a + 1

While the above means something in imperative programming languages (like C), it does not mean anything in mathematics.

 

Now say I want to write the factorial function in mathematics, I would say something like this

fact(0) = 1

fact(n) = n * fact(n-1)

And this definition is perfectly valid. However you notice that it is similar to the a = a + 1 expression in the sense that in describing the value of something, we refer to that very thing.

 

This general principle, called a recursive definition, has a problem in the context of pure functions (like the lambda calculus). The problem is as follows – a function is a nameless entity – the only named entities that a function can use are function’s parameters (the variables introduced by the function itself) or parameters of a parent function that it is define in the scope of.

 

For example

(lambda (x y)

      ; can use x and y here

      (lambda (a)

            ; can use a, x and y here

      )

)

 

Now for a recursive definition, a function needs to be able to refer to itself (or like we say in some languages, a functions needs to ‘call’ itself).

 

So, the only way that one can define a recursive function would be to receive the function itself as one of the functions arguments. Think about this for a bit – if you are not used to programming in a language where you can pass a function, or return a function as though it were a value (similar to how you treat an integer), then this might be a hard idea. If you are from a programming background that wherein you are used to ‘higher order’ programming, then I am only stating the obvious here. (The general idea with higher order programming is that functions are first class values, like integers – which means that they can be passed as arguments, returned from functions, stored in data structures etc).

 

So we define a function such that it takes an additional argument which is itself.

Lets say

 

factorial0(factorialx, n) = if n == 0 then 1 else n * factorialx(factorialx, n -1)

factorial(n) = factorial0(factorial0, n)

Does this make sense? If it does not, then don’t bother read further. You notice that the factorial0 function takes 2 arguments one of which is the function itself.

 

Let me rewrite this in scheme-style syntax

(let ((factorial0 (lambda (fact, n)

                    (if (zero? n)

                        1

                        (* n (fact fact (sub1 n)))))))

 (let ((factorial (lambda (n)

                    (factorial0 factorial n))))

      ; I can use factorial here

  ))

 

Make sense? Ok, so now let me write the interesting function factorial0 as a nested lambda as follows -

(lambda (fact) (lambda (n)

(if (zero? n) 1 (* n ((fact fact) (sub1 n)))))

 

You notice that fact is applied to fact, to generate the inner lambda which is then invoked with n-1.

 

So now, suppose one has a handy function of the following form

omega = (lambda (x) (x x))

 

One can define factorial as follows

 (let ((factorial

       (omega (lambda (fact)

                (lambda (n)

                  (if (zero? n)

                      1

                      (* n ((omega fact) (sub1 n)))))))))

  ; use factorial here

  )

 

There you go – so now we have a way of defining recursive function definitions even if your language did not directly allow a recursive definition.

 

So why do we need the Y combinator? You notice that in the definition of factorial we had to do something that looks very strange in the definition of the function – we call the function omega to return a function that we can call. The Y combinator does away with this for us.

 

To start thinking about the Y combinator, think about things this way – suppose the parameter ‘fact’ was already ‘(omega fact)’ then we wouldn’t have to call omega inside the definition of factorial correct. Right?

 

If we wanted that we simply need to rewrite omega this way right –

(lambda (x) (x (x x)))

Such that the first parameter is already applied to itself. But then this is not enough, because when the recursive call is made, the second time the parameter is only fact and not (fact fact) or (omega fact). Well, so we can fix this by having one more level in the definition of omega, such that omega looks like this –

(lambda (x) (x (x (x x))))

 

But then the third time this would not be the correct parameter… ideally we need an infinite application of x, because we don’t know how many times the function will call itself. In other words the omega should look a little like this

(lambda (x) (x (x (x (x (x … ))))

 

This is exactly what the Y combinator does. To see how it does this consider OMEGA defined as

OMEGA = (omega omega)

(or expanding for little omega we get)

OMEGA = ((lambda (x) (x x)) (lambda (x) (x x)))

 

If you haven’t seen this before, look at this device for a bit – do you see how it is infinite self application? If you do then consider the following

((lambda (x) (f (x x))) (lambda (x) (f  (x x)))

It doesn’t matter what f is right now, but see what this does. After one reduction this would look like this -

(f (term term))

Where term = (lambda (x) (f (x x)))

The second reduction would look like this and so on

(f (f (term term)))

(f (f (f (term term))))

 

This, if you notice is similar to what we required earlier, where instead of ‘f’ we used ‘x’.

 

This is the basic mechanism for the Y combinator. And the Y combinator is simply

Y = (lambda (f) ((lambda (x) (f (x x))) (lambda (x) (f  (x x)))

 

So how do we define a recursive function using the Y? Here is factorial

(let ((factorial (Y (lambda (fact)

                      (lambda (n)

                        (if (zero? n)

                            1

                            (* n (fact (sub1 n)))))))))

  ; use factorial here

 )

 

Nice?

 

Now in the beginning I said that this Y is the Y for the last normal order language. Scheme is applicative order and non lazy. What that means is that  if you use the Y combinator defined above in scheme, during the definition of factorial it will try to do the infinite expansion and will never terminate.

 

Instead one just eta-expands the definition to get the applicative order Y combinator. Suppose you say that term is (lambda (x) (f (lambda (y) ((x x) y))))

And we define the combinator as

Y = (lambda (f) (term term))

Then we have the applicative order Y that is usable from scheme and other applicative order languages.

Look at its expansion to see what I mean

(f (lambda (y) ((term term) y))

So the first parameter to ‘f’ (recursive function we are interested in is a lambda – the application inside the lambda is done only if f calls it.

 

Let me define both the Y combinators described here in lambda form

Y = \f.((\x.(f (x x)) (\x.(f (x x)))

Y = \f.((\x.(f (\y ((x x) y))) (\x.(f (\y ((x x) y))))

 

Once you see why the Y combinator is needed and you understand omega – its easy to see what the Y is and how it useful in defining recursive functions in a mathematically pure way.

 

The Y combinator is described as a fixed point function or combinator. The idea is that is lets a function describe its behavior in terms of itself until the function reaches a fixed point – like in the case of factorial the function reaches 0 and does not need to recurse any more.

 

References –

http://en.wikipedia.org/wiki/Y_combinator

http://www.dreamsongs.com/NewFiles/WhyOfY.pdf

 

Though I myself had looked at the Y many times before I never really understood it until I had to write recursive functions in the pure lambda calculus and I used omega to do it. I then happened to mentioned to Prof Friedman that I did not understand the Y and in a minute he set my thinking right.

 

I understand that in the description above I have taken some liberties with syntax and something are described in a hand wavy sort of way. However if you have suggestions about how I can better explain this, let me know. Of course I require that you understand scheme style syntax and higher order programming (at least minimally).

 

Sunday, January 29, 2006 12:57:21 AM (Eastern Standard Time, UTC-05:00)  #    Comments [2]  | 
 Saturday, January 28, 2006

Some context –

 

The other day in Prof Dan Friedman’s Advanced Programming Languages class we had an assignment where we had to write a lambda calculus interpreter that did non deterministic application and we had to write the factorial in it. Also, reduction was to continue even if the top level term was value.

 

As a consequence of working on this we were expected to discover a method of encoding numbers using pure lambda terms. A really long time back Alonzo Church (creator of the lambda calculus) created an encoding now called the Church numerals. I created the following set when working on the factorial problem.

 

t = \a.\b.a     true

f = \a.\b.b     false

 

zero = \x.t

zero? = \x.(x f)

 

s = \x.\b.((b x) b)      successor

p = \x.(x t)             predecessor

 

(Like in Haskell the ‘\’ represent lambda). So we have a way of saying zero, we have a way of checking if a number is zero and we have the ability to add one and subtract one. That gives us all the natural numbers including zero.

 

If you notice, the predecessor define above returns a value that is an invalid number when applied to zero. An alternate ‘type safe’ definition is the following –

p = \x.(((zero? x) x) (x t))

 

Here (p zero) is zero.  

 

Now you may think that that’s not great way of implementing numbers – its inefficient etc. Yes that’s true, but remember that the lambda calculus is a formal system, it’s a mathematical formalism and hence has no notion of cost. In the study of formal systems (other than the study of their complexity) we are interested in asking questions about what the system can do and cannot do. The above encoding I believe is equivalent to the encoding of Church numerals.

 

Following the definition of numbers, I can define ‘add’ in the following obvious way -

add = (Y \add.\a.\b.(((zero? a) b) ((add (p a)) (s b)))

 

Here ‘Y‘ is the fixed point Y combinator and this is equivalent to

add(a, b)  { if a == 0 then return b else return add(a - 1, b + 1) }

 

Similarly one can define ‘mul’ and ‘factorial’ as well.

 

I went ahead and defined operators so that I have scheme/lisp style lists. I am told that there are very similar to Church’s definitions.

 

cons =  \x.\y.\b1.\b2.((b1 f) ((b2 x) y))

car =   \ls.((ls #f) #t)

cdr =   \ls.((ls #f) #f)

null =  \x.x

null? = \ls.(ls #t)

 

Hence we have –

(null? null) -> #t

(car null) -> undefined

(cdr null) -> undefined

(car (cons a b)) -> a

(cdr (cons a b)) -> b

 

Can you shoot yourself in the foot using these definitions? Of course you can – but if you are careful you can write meaningful programs as well. To do other sorts of checks, like to prevent us from doing (s cons) we need some kind of type system or validations rules – but these are not relevant to the discussion of the pure lambda calculus.

 

Here is a basic lambda calculus reference - http://en.wikipedia.org/wiki/Lambda_calculus

Here is a Church encoding reference - http://en.wikipedia.org/wiki/Church_numeral

 

Saturday, January 28, 2006 1:56:34 AM (Eastern Standard Time, UTC-05:00)  #    Comments [0]  |