Wednesday, March 19, 2008

fund0232[1]

After talking to Pooja, I felt it necessary to mention the great mathematician G. H. Hardy on my blog. Hardy is most famous outside of mathematics for his "A Mathematician's Apology". The book, written in later in life by Hardy, talks among aother things about how mathematics is young man's game. He is also somewhat know for his association with Ramanujan and for being the person responsible for bringing him to Cambridge where his greatest mathematics unfolded.

Quoting one of the mathematicians, C Snow, that Hardy worked with:
http://www-groups.dcs.st-and.ac.uk/~history/Biographies/Hardy.html

A mathematicians apology is, if read with the textual attention it deserves, a book of haunting sadness. Yes, it is witty and sharp with intellectual high spirits: yes, the crystalline clarity and candour are still there: yes, it is the testament of a creative artist. But it is also, in an understated stoical fashion, a passionate lament for creative powers that used to be and that will never come again. I know nothing like it in the language: partly because most people with the literary gift to express such a lament don't come to feel it: it is very rare for a writer to realise, with the finality of truth, that he is absolutely finished.

Hardy was a sort of purist mathematician, one who did his mathematics not for the sake of its applicability to anything, but for the sake of doing great mathematics. Hardy, along with Littlewood and Ramanujan,  is also mention in Apostolos Doxiadis' "Uncle Petros and the Goldbach Conjecture". The link above gives a short summary on his life.

Some quotes:

Asked if he believes in one God, a mathematician answered: "Yes, up to isomorphism".

http://en.wikipedia.org/wiki/G._H._Hardy:

It is never worth a first class man's time to express a majority opinion. By definition, there are plenty of others to do that.

A mathematician, like a painter or a poet, is a maker of patterns. If his patterns are more permanent than theirs, it is because they are made with ideas.

Wednesday, March 19, 2008 12:50:42 AM (Eastern Standard Time, UTC-05:00)  #    Comments [0]  | 
 Saturday, March 01, 2008

I have started liking groups so much, that I think its worth spending some time drawing out these beautiful things with some care. These mathematical abstractions have some really nice corresponding pictures.

 

image

This is the dihedral group D3, produced by
s^3 = i
(st)^2 = i

 

image   image

The quaternion group.

Quaternions are a generalization of complex numbers. I am a little fascinated with the above group because a long time back before I had known of Group theory I had constructed the above diagram in my attempt to understand quaternions and multi-dimensional geometry.

The discovery of quaternions is credited to the great mathematician Hamilton. Story has it that Hamilton had been pondering the issue for a quite a while and fundamental equation of quaternions came to him when he was taking a walk with his wife. It is said he carved the equation into bridge where he was at that time. The bridge, now has a plaque to this effect. So here is the fundamental equation:

ijk = -1

From this we can also derive i^2 = j^2 = k^2 = -1

 

image

The group produced by
s^3 = i
(-s)ts = t^2

where "-s" is to be read as "s inverse".

Here is a slightly different rendition of the same group:

image

 

Here is a different group:

image

This is the group generated by:
s^4 = i
(-s)ts = t^2
t^15 = i

 

image

This is the group generated by a variation of the above equations:
s^4 = i
(-s)ts = t^2
t^5 = i

Saturday, March 01, 2008 3:21:05 PM (Eastern Standard Time, UTC-05:00)  #    Comments [3]  | 
 Tuesday, February 12, 2008

Symmetry groups are beautiful and fascinating.

image

Cayley diagram for the Symmetry Group of a Cube, 

We can write some simple equations based on "following arrows" in the diagram. If red is "a" and blue is "b", we can say many things. "i" is the identity. If we take a sequence of arrows starting from a point and if the sequence of arrows lands us back at the starting point, then the path forms a closed loop and is equivalent to not having left the point at all.

a^4 = i
(ab)^3 = i
(abb)^2 = i
(aabb)^2 = i

Note that "a" and "b" are equivalent. They can be interchanged in the equations. Hence we may also conclude

b^4 = i
(ba)^3 = i
(baa)^2 = i
(bbaa)^2 = i

Similarly any "a" can be replaced by "-a", the reverse path, or any "b" by "-b". Also not that "aaa" is the same as "-a". Hence we can write:

aaa = -a

And hence,
(aaab)^3 = ((-a)b)^3 = i
(aaabb)^2 = ((-a)bb)^2 = i

Any sequence that forms a loop (equals i) can be started at any point (ie can be rotated) and it still forms a loop. Hence the following are equivalent.

(aba)^4 = i = aba.aba.aba.aba = aab.aab.aab.aab = (aab)^4.

hence we have,
(aab)^4 = i
(baa)^4 = i
(abba)^2 = i

Of course, I have missed some. You can fill them in with what you know now.

 

Now if we select "c" to be some string a and b, such that c^3 = 1, for example c = ab, we can draw a new cayley diagram from the resulting triangles and one of the squares (example a^4 = 1). The "c" arrows are drawn in green here.

image

This maybe visualized as a solid. A cube with each vertex replaced by a triangle and each edge having the width of the triangle's side. Now a very good diagram of the front view:

image 

 

The above group, corresponding to the diagrams, has a name - it is the symmetry group S4. It corresponds to the permutations of 4 character tuple, (A,B,C,D). How many permutations are there? There are 24. These can be though of as the 24 different points on the above diagram.

 

prev: Symmetry Group of a Tetrahedron

Tuesday, February 12, 2008 1:12:27 AM (Eastern Standard Time, UTC-05:00)  #    Comments [0]  | 
 Friday, February 08, 2008

Today I was trying to draw the Cayley diagram for symmetry groups of a regular tetrahedron. (A tetrahedron is a like a pyramid, but with a triangular base.) What I ended up with, surprised me. I had started with the idea that I would end up with something that was tetrahedron like - or atleast suggestive of the fact that it started from a tetrahedron.

 Cayley diagram for a symmetry group of the tetrahedron.

Cayley diagram for a symmetry group of the tetrahedron.

Now, this is not the only group one can generate from a tetrahedron. And even for this group, this certainly not the only Cayley diagram as well. We can visualize the above diagram as a 3d object If I were to redraw this slightly, we get -

image

And now if I imagine this as 3d object, I get something looks like this.

image image

front and back respectively.

It is as if each triangle has triangle facing in the opposite direction behind it. And there are 4 such perspectives, each corresponding to a vertex of the tetrahedron. Cute. I wonder if I can make choices for a different set of rotations such that the triangles are not inverted? 

After some playing around with it, I got this. I realized that following any sequence of red-blue-red-blue arrows always leads back to the starting point. If I were to compose every pair of red-blue arrows into a single arrow, lets color this green, then we get a different diagram. This diagram is indeed suggestive of the shape of the tetrahedron. The same shape would result if we were to include the blue arrows instead of the red ones.

Cayley diagram for a symmetry group of the tetrahedron

Cayley diagram for a symmetry group of the tetrahedron

This shape is interesting because you can imagine that looks almost like a large green tetrahedron with the each tip sliced of to reveal a small red triangle. Here is a picture I found on the web:

truncated_tetrahedron_un[1]

This diagram also lets itself to a nice explanation - each triangle at a vertex corresponds to the sub-group for rotation about that vertex.

Friday, February 08, 2008 9:15:15 PM (Eastern Standard Time, UTC-05:00)  #    Comments [0]  | 
 Wednesday, January 23, 2008

Here is a rather pleasant introduction to the philosophical point of view underlying Category Theory.

When is one thing equal to some other thing-

One of the templates of modern mathematics, category theory, offers its own formulation of equivalence as opposed to equality; the spirit of category theory allows us to be content to determine a mathematical object, as one says in the language of that theory, up to canonical isomorphism. The categorical viewpoint is, however, more than merely “content” with the inevitability that any particular mathematical object tends to come to us along with the contingent scaffolding of the specific way in which it is presented to us, but has this inevitability built in to its very vocabulary,and in an elegant way, makes profound use of this. It will allow itself the further flexibility of viewing any mathematical object “as” a representation of the theory in which the object is contained to the proto-theory of modern mathematics, namely,to set theory.

I have been spending some amount of time looking into Category theory and it is truly something elegant. A small selection of topics from Category theory have made their way into my reading list for my Oral Qualifiers. Hence, interested reinforced by need.

One of the strange things about category theory, and probably the most elegant thing about it as well is that category theory really has nothing much to do with objects in the way that Set theory does. Set theory and I think most things deal with objects, collections of objects and such. Category theory on the other hand is a theory about relationships, rather than the objects they relate. This shift, to me, is reminiscent of the Leibniz-Clarke viewpoints on the notion of space.

Wednesday, January 23, 2008 12:22:51 AM (Eastern Standard Time, UTC-05:00)  #    Comments [0]  | 
 Saturday, January 12, 2008

These are beginner level references for Category theory, ones that can be read by students of Computer Science. I haven't had a chance to look at all of this material myself. Ones that I have looked at have been marked with a star. Things that I think are better beginner material have more stars. At the time of this writing I am an absolute beginner in category theory and to most of abstract mathematics (with possible exception of Set theory), hence take my opinions here as just that - beginner level opinions. Finally, material that has been recommended to me by some of my professors has been marked with a +.

Categories
by T S Blyth (amazon) **

An Introduction to Categories in Computing
by Barry Jay (PS file)

Category Theory for Computing Science
by Michael Barr and Charles Wells (homepage of Charles Wells) (PS file of lecture notes) ++

Categories, Types and Structures: Category Theory for the working computer scientist
by Andrea Asperti and Giuseppe Longo (PS book) +

Category Theory for Beginners
by Steve Easterbrook (PDF slides)

Elements of Basic Category Theory
by Alfio Martini (citeseer)

Computational Category Theory
by D.E. Rydeheard and R.M. Burstall (PDF)

A Categorical Manifesto
by Joseph A. Goguen (citeseer)

Basic Category Theory for Computer Scientists
by Benjamin Pierce (amazon) *+

Saturday, January 12, 2008 12:13:27 AM (Eastern Standard Time, UTC-05:00)  #    Comments [2]  | 
 Saturday, January 05, 2008

I am very moved by "Uncle Petros and the Goldbach Conjecture" by Apostolos Doxiadis. It's a work of fiction that makes several references to real people and real mathematics. It seems the author has done a fair bit of research to write this book. The book is about a mathematician "Uncle Petros" who spends his entire life trying to prove the Goldbach Conjecture.

In the course of the book, Doxiadis touches on the lives of G H Hardy, S. Ramanujan, Cantor, Kurt Godel and much of the pain, isolation, failure and achievement that is involved in doing research. The book is a fascinating read and is in many ways very true to life though the main characters are fictious.

The Goldbach Conjecture simply states that every even number greater than 2 can be expressed as the sum of two primes. Wikipedia has some more detail. The conjecture has been known for about 250 years now. There is still no proof, though the fact has been verified for very large numbers. It is a conjecture and not a theorem because it has no proof. Work still continues on this. Dare to try solve it? You may also enjoy looking at the Goldbach weave.

In the early 90's a long standing famous problem, Fermat's Last Theorem was proved by Andrew Wiles. It took him 7 years of exclusive work to prove the theorem. The theorem has been know since 1637! And has escaped all these centuries without proof despite many many mathematicians working on it. One of the things that made Fermat's last theorem famous is that Fermat has scribbled in the margin of his notebook a comment to the effect that he knew a proof but that the margin is not wide enough to note it. (Wikipedia) Wiles's proof is 150 pages and uses mathematics discovered in the 20th century - this is most likely not the proof that Fermat had in mind, if indeed he had a correct proof.

The interesting things about proofs like these are not just that they confirm the truth of statements that we always suspected to be true, but that they advance the state of mathematics. In the course of chasing hard problems, often new theories and new approaches to proof theory are discovered.

Saturday, January 05, 2008 6:06:04 PM (Eastern Standard Time, UTC-05:00)  #    Comments [0]  | 
 Wednesday, November 28, 2007

An odd idea from a few days ago:

I watched this video a while back, this is Richard Feynman giving a lecture about discovering laws in physics -

He is talking about laws in physics. (the underlying Philosophy of Science that Feynman describes here is due to Karl Popper)

What Feynman is describing seems fundamentally different from what we do in the formal sciences like math/logic/cs. In the later we usually choose an axiom set that we believe to be right, based on our aesthetics, and then go on to prove other things that are right wrt our axioms.  Only things that are provable are taken to right and things that are right are irrefutably so. The system is inconsistent if we deduce False from the rules that we have. Inconsistent systems are not interesting. All formal methods work like this, in spirit - they keep track of what is right.

In what Feynman is describing, they don’t have a formal notion of right. They have a notion of what is wrong and as long as something cannot be constructively (by experiment) shown to be wrong, they can temporarily accept it to be not-wrong. If you look at this as a formal system, this is one where “what is not wrong yet” is known instead of what is right. Something is not wrong because 1) We don’t know a proof by which we can construct F from it or 2) Given our current inference rules there is no proof for it. But, we may add a new inference rule to the system in the future which may invalidate the belief that something is not wrong. It’s a feels like the opposite of what we do with logics.

Imagine a formal system or a model of computation based on notion like this. We are, in a fundamental sense, giving up the notion of consistency and completeness when we do this. I wonder if there exists a computational model that corresponds to such a “co-logic” of the sort they use in the *real* sciences. Such a system, in spirit, might be able to deal with partially correct data, incorrect assumptions etc. in a natural way. Absolutely correct data (or properties about the data) would be the exception.

(I wonder what this implies for the incompleteness theorem and such. I have been told that the "co-logic" I refer to here is actually co-induction. I see some similarities there, but I am not sure if its exactly that. )

A Tutorial on (Co)Algebras and (Co)Induction - Jacobs, Rutten

A Tutorial on Co-induction and Functional Programming

Wednesday, November 28, 2007 12:03:05 AM (Eastern Standard Time, UTC-05:00)  #    Comments [1]  | 
 Thursday, October 26, 2006

Today over dinner Will Byrd mentioned the Collatz function to Michael Adams and me. The Collatz function is a simple little function that on receiving any natural number seems to always terminate with the result 1. However the strange thing is that no one has been able to prove this property of the function. As a matter of fact one rather famous mathematician is quoted to have said that mathematics isn’t ready to prove the Collatz conjecture and such problems.

 

So what is Collatz function?

 

collatz :: Integer -> Integer

collatz 1 = 1

collatz n | n `mod` 2 == 0 = collatz (n `div` 2)

collatz n                  = collatz (3 * n + 1)

 

Let me translate that into a C* style syntax:

 

int collatz(int n)

{

      if(n == 1)

            return n;

      else if (n % 2 == 0)

            return collatz(n/2);

      else

            return collatz(n*3+1);

}

 

Here is a wikipedia page about it:

http://en.wikipedia.org/wiki/Collatz_conjecture

 

If you look at it hard enough certain patterns emerge and we took some shots at what a proof might entail. In summary its sufficient to say that we didn’t get very far. The interesting consequence though is that enough information about the nature of the function emerged that we were able to create another one:

 

Hence is our new function that we call collatz’ (read as collatz-prime)

 

collatz' :: Integer -> Integer

collatz' 1 = 1

collatz' n | n `mod` 3 == 0 = collatz' (n `div` 3)

collatz' n | n `mod` 3 == 1 = collatz' (4 * n - 1)

collatz' n                  = collatz' (4 * n + 1)

 

Cute?

Later in the evening I found this one which I call collatz’’ (read collatz-double-prime)

 

collatz'' :: Integer -> Integer

collatz'' 1 = 1

collatz'' n | n `mod` 5 == 0 = collatz'' (n `div` 5)

collatz'' n | n `mod` 5 == 1 = collatz'' (6 * n + 4)

collatz'' n | n `mod` 5 == 2 = collatz'' (6 * n + 3)

collatz'' n | n `mod` 5 == 3 = collatz'' (6 * n + 2)

collatz'' n                  = collatz'' (6 * n + 1)

 

Like the original collatz function, both collatz’ and collatz’’ seem to terminate on all inputs with the result 1. Again there is no proof that it will do so. I have (at the time of this writing) verified termination of both of these till for all numbers 1 to100,000.

 

Here is a little verifier:

 

verify :: (Integer -> Integer) -> Int -> Bool

verify f n = foldl isOne True (map f (take n [1..]))

    where

    isOne :: Bool -> Integer -> Bool

    isOne b x = (x == 1) && b

 

 

verify collatz’ 100000 returns True.

 

Here is collatz in Scheme:

 

(define C

   (lambda (n)

     (cond

       ((zero? (sub1 n)) 1)

       ((even? n) (C (quotient n 2)))

       (else (C (+ 1 (* 3 n)))))))

 

And Collatz’

 

(define C

   (lambda (n)

     (cond

       ((= n 1) 1)

       ((= 0 (modulo n 3)) (C (quotient n 3)))

       ((= 1 (modulo n 3)) (C (- (* n 4) 1)))

       (else (C (+ (* n 4) 1))))))

 

Here is a good time to make a potential conjecture: Is it true that there are infinitely many such functions?

 

Thursday, October 26, 2006 1:42:38 AM (Eastern Standard Time, UTC-05:00)  #    Comments [0]  | 
 Wednesday, October 25, 2006

Updated

 

For a long time I thought that Roman numerals were the worst counting system out there, until I got down to thinking about Microsoft Excel’s column naming convention. The last time I thought about Excel I wasn’t satisfied with the solution I had – the last time was at Applibase with Sid.

 

So what do I mean? You know the way columns are numbered in excel they go A, B, C.. till Z and then become AA, AB, AC … AZ, BA, BB, BC … BZ, CA, CB… etc. You get the idea I guess. So in some sense its fair to imagine that you can think of these as representing numbers.

 

So can you write a mapping from natural numbers to excel-numbers and vice versa?

 

Lets look at look at the problem in a reduced sense and see what we get – lets consider only A, B and C. Therefore we could count as follows – A, B, C, AA, AB, AC, BA, BB, BC, CA, CB, CC, AAA, AAB, AAC etc.

 

Lets write these down with their natural numbers next to them –

 

1 : A

2 : B

3 : C

4 : AA

5 : AB

6 : AC

7 : BA

8 : BB

9 : BC

10 : CA

11 : CB

12 : CC

13 : AAA

14 : AAB

 

Can you write a function that takes a number and returns the corresponding excel number? Its not as easy as you think. The reverse, ie, taking an excel number and producing a natural number out of it is easier, conceptually.

 

If you think about it enough you will realize that unlike many number systems out there this one is different because it does not have a zero character – hence you can’t express things like A00.

I don’t think I have a good enough solution yet. My current solution still checks if a value is 0 and if so does something otherwise does something else. I would like to write a solution in a clearly recursive fashion such that the only condition-check I do is to terminate the recursion. Think about how you convert a number from natural numbers to a base 2 representation for example – you can write this recursively and you only need a check to terminate your recursion. Here I need checks for other things. I guess in some sense what I am complaining about is that my number to excel-number conversion is not “mathematical” enough.

 

Anyway, here is solution as a piece of Haskell code – it’s a spoiler so don’t look at it until you have tried to solve it yourself. (Set base to 26 to get full A to Z numbering).

 

base = 3

 

ch :: Integer -> Char

ch n = intToDigit ((digitToInt 'a') + (fromInteger n) - 1)

 

num :: Char -> Integer

num c = toInteger $ (digitToInt c) - (digitToInt 'a') + 1

 

mdiv m n | m `mod`n == 0 = m `div` n - 1

mdiv m n = m `div` n

 

mmod m n | m `mod` n == 0 = n

mmod m n = m `mod` n

 

excel2int :: String -> Integer

excel2int s = excel2int' s 0

    where

    excel2int' (c:[]) r = r * base + (num c)

    excel2int' (c:cs) r = excel2int' cs  (r * base + (num c))

 

 

int2excel :: Integer -> String

int2excel n | n <= base = [(ch n)]

int2excel n = (int2excel (mdiv n base)) ++ [ch (mmod n base)]

 

Update

Here is a very clean solution, courtesy of Abhijit Mahabal who pointed it out in a couple of minutes. He even stated a proof and went on to show how you can do this with logarithms…

 

i2e n | n < 3 = [ch (n+1)]

i2e n = i2e (n `div` 3 - 1) ++ [ch ((n `mod` 3) + 1)]

 

The following solution (that I disagree with in spirit) is by Kyle Ross. It looks rather cute for some values of cute –

 

i2e = (gen !!) . pred

 

xs <+> ys = [x ++ y | x <- xs, y <- ys]

gen = base ++ (gen <+> base)

         where base = ["a", "b", "c"]

 

 

Wednesday, October 25, 2006 10:55:51 PM (Eastern Standard Time, UTC-05:00)  #    Comments [0]  | 
 Tuesday, October 24, 2006

Write the following into a text file:

 

A -> B;

B -> C;

C -> A;

A -> D;

D -> C;

C -> B;

 

And then run the following ruby script with the text file as a command line parameter:

 

require "win32ole"

 

def load(fn)

      s = File.new(fn).readlines().join ""

end

 

 

gr = WIN32OLE.new "WinGraphviz.DOT"

s = "digraph G {#{load(ARGV[0])}}"

 

png = gr.ToPNG(s)

png.Save("#{ARGV[0]}.png")

 

And then you get the following PNG file:

 

 

Interesting?

 

I recently needed to draw a graph (as in connected graph – in graph theory…) and looked around a bit. I came across the Graphviz library developed by the old AT&T. There is a port of the libaray available for windows and it can be downloaded from:

http://www.graphviz.org/

 

Graphviz or rather, WinGraphviz is available as a COM component which is registered by the installer. So instead of getting my hands dirty with C++,I decided to take the COM bindings in ruby for a walk.

 

This is where the ruby code comes – it creates an object of WinGraphviz.DOT component and uses it to create a graph. Notice how I can interact with a COM object as though it were a regular Ruby object – fun fun.

 

As for the language in the text file – Graphviz graphs can be specified using a language (you can find a spec on their website). The A-> B etc at the beginning of the blog post was simply a subset of this language.

 

Of course if your data is in some other format, you will have to translate it into the graph language – but then again if its in an excel file or something you can just use Ruby to talk to excel COM component and read the data out and then talk to the Graphviz component.

 

Tuesday, October 24, 2006 10:13:08 AM (Eastern Standard Time, UTC-05:00)  #    Comments [0]  | 
 Sunday, January 29, 2006

What is the Y combinator? I have seen lots of discussions on the web but nothing that seemed like a simple explanation of how it works or why it is needed. I finally understood the Y combinator a few days back and a lot of things snapped in place. So here I will try and explain it from my perspective. In this entry I am taking liberties with various syntaxes – please bear with me.

 

This is the Y-combinator for a normal order lazy language –

Y = \f.((\x.(f (x x))) (\x.(f (x x))))

 

(Here, the ‘\’ denotes a lambda). Scheme is an applicative order language and the above combinator cannot be used as such in scheme because it will lead to an infinite expansion. More on this later.

 

Let’s start from some basics – say we are writing a mathematical expression and we said

a = a + 1

While the above means something in imperative programming languages (like C), it does not mean anything in mathematics.

 

Now say I want to write the factorial function in mathematics, I would say something like this

fact(0) = 1

fact(n) = n * fact(n-1)

And this definition is perfectly valid. However you notice that it is similar to the a = a + 1 expression in the sense that in describing the value of something, we refer to that very thing.

 

This general principle, called a recursive definition, has a problem in the context of pure functions (like the lambda calculus). The problem is as follows – a function is a nameless entity – the only named entities that a function can use are function’s parameters (the variables introduced by the function itself) or parameters of a parent function that it is define in the scope of.

 

For example

(lambda (x y)

      ; can use x and y here

      (lambda (a)

            ; can use a, x and y here

      )

)

 

Now for a recursive definition, a function needs to be able to refer to itself (or like we say in some languages, a functions needs to ‘call’ itself).

 

So, the only way that one can define a recursive function would be to receive the function itself as one of the functions arguments. Think about this for a bit – if you are not used to programming in a language where you can pass a function, or return a function as though it were a value (similar to how you treat an integer), then this might be a hard idea. If you are from a programming background that wherein you are used to ‘higher order’ programming, then I am only stating the obvious here. (The general idea with higher order programming is that functions are first class values, like integers – which means that they can be passed as arguments, returned from functions, stored in data structures etc).

 

So we define a function such that it takes an additional argument which is itself.

Lets say

 

factorial0(factorialx, n) = if n == 0 then 1 else n * factorialx(factorialx, n -1)

factorial(n) = factorial0(factorial0, n)

Does this make sense? If it does not, then don’t bother read further. You notice that the factorial0 function takes 2 arguments one of which is the function itself.

 

Let me rewrite this in scheme-style syntax

(let ((factorial0 (lambda (fact, n)

                    (if (zero? n)

                        1

                        (* n (fact fact (sub1 n)))))))

 (let ((factorial (lambda (n)

                    (factorial0 factorial n))))

      ; I can use factorial here

  ))

 

Make sense? Ok, so now let me write the interesting function factorial0 as a nested lambda as follows -

(lambda (fact) (lambda (n)

(if (zero? n) 1 (* n ((fact fact) (sub1 n)))))

 

You notice that fact is applied to fact, to generate the inner lambda which is then invoked with n-1.

 

So now, suppose one has a handy function of the following form

omega = (lambda (x) (x x))

 

One can define factorial as follows

 (let ((factorial

       (omega (lambda (fact)

                (lambda (n)

                  (if (zero? n)

                      1

                      (* n ((omega fact) (sub1 n)))))))))

  ; use factorial here

  )

 

There you go – so now we have a way of defining recursive function definitions even if your language did not directly allow a recursive definition.

 

So why do we need the Y combinator? You notice that in the definition of factorial we had to do something that looks very strange in the definition of the function – we call the function omega to return a function that we can call. The Y combinator does away with this for us.

 

To start thinking about the Y combinator, think about things this way – suppose the parameter ‘fact’ was already ‘(omega fact)’ then we wouldn’t have to call omega inside the definition of factorial correct. Right?

 

If we wanted that we simply need to rewrite omega this way right –

(lambda (x) (x (x x)))

Such that the first parameter is already applied to itself. But then this is not enough, because when the recursive call is made, the second time the parameter is only fact and not (fact fact) or (omega fact). Well, so we can fix this by having one more level in the definition of omega, such that omega looks like this –

(lambda (x) (x (x (x x))))

 

But then the third time this would not be the correct parameter… ideally we need an infinite application of x, because we don’t know how many times the function will call itself. In other words the omega should look a little like this

(lambda (x) (x (x (x (x (x … ))))

 

This is exactly what the Y combinator does. To see how it does this consider OMEGA defined as

OMEGA = (omega omega)

(or expanding for little omega we get)

OMEGA = ((lambda (x) (x x)) (lambda (x) (x x)))

 

If you haven’t seen this before, look at this device for a bit – do you see how it is infinite self application? If you do then consider the following

((lambda (x) (f (x x))) (lambda (x) (f  (x x)))

It doesn’t matter what f is right now, but see what this does. After one reduction this would look like this -

(f (term term))

Where term = (lambda (x) (f (x x)))

The second reduction would look like this and so on

(f (f (term term)))

(f (f (f (term term))))

 

This, if you notice is similar to what we required earlier, where instead of ‘f’ we used ‘x’.

 

This is the basic mechanism for the Y combinator. And the Y combinator is simply

Y = (lambda (f) ((lambda (x) (f (x x))) (lambda (x) (f  (x x)))

 

So how do we define a recursive function using the Y? Here is factorial

(let ((factorial (Y (lambda (fact)

                      (lambda (n)

                        (if (zero? n)

                            1

                            (* n (fact (sub1 n)))))))))

  ; use factorial here

 )

 

Nice?

 

Now in the beginning I said that this Y is the Y for the last normal order language. Scheme is applicative order and non lazy. What that means is that  if you use the Y combinator defined above in scheme, during the definition of factorial it will try to do the infinite expansion and will never terminate.

 

Instead one just eta-expands the definition to get the applicative order Y combinator. Suppose you say that term is (lambda (x) (f (lambda (y) ((x x) y))))

And we define the combinator as

Y = (lambda (f) (term term))

Then we have the applicative order Y that is usable from scheme and other applicative order languages.

Look at its expansion to see what I mean

(f (lambda (y) ((term term) y))

So the first parameter to ‘f’ (recursive function we are interested in is a lambda – the application inside the lambda is done only if f calls it.

 

Let me define both the Y combinators described here in lambda form

Y = \f.((\x.(f (x x)) (\x.(f (x x)))

Y = \f.((\x.(f (\y ((x x) y))) (\x.(f (\y ((x x) y))))

 

Once you see why the Y combinator is needed and you understand omega – its easy to see what the Y is and how it useful in defining recursive functions in a mathematically pure way.

 

The Y combinator is described as a fixed point function or combinator. The idea is that is lets a function describe its behavior in terms of itself until the function reaches a fixed point – like in the case of factorial the function reaches 0 and does not need to recurse any more.

 

References –

http://en.wikipedia.org/wiki/Y_combinator

http://www.dreamsongs.com/NewFiles/WhyOfY.pdf

 

Though I myself had looked at the Y many times before I never really understood it until I had to write recursive functions in the pure lambda calculus and I used omega to do it. I then happened to mentioned to Prof Friedman that I did not understand the Y and in a minute he set my thinking right.

 

I understand that in the description above I have taken some liberties with syntax and something are described in a hand wavy sort of way. However if you have suggestions about how I can better explain this, let me know. Of course I require that you understand scheme style syntax and higher order programming (at least minimally).

 

Sunday, January 29, 2006 12:57:21 AM (Eastern Standard Time, UTC-05:00)  #    Comments [1]  | 
 Saturday, January 28, 2006

Some context –

 

The other day in Prof Dan Friedman’s Advanced Programming Languages class we had an assignment where we had to write a lambda calculus interpreter that did non deterministic application and we had to write the factorial in it. Also, reduction was to continue even if the top level term was value.

 

As a consequence of working on this we were expected to discover a method of encoding numbers using pure lambda terms. A really long time back Alonzo Church (creator of the lambda calculus) created an encoding now called the Church numerals. I created the following set when working on the factorial problem.

 

t = \a.\b.a     true

f = \a.\b.b     false

 

zero = \x.t

zero? = \x.(x f)

 

s = \x.\b.((b x) b)      successor

p = \x.(x t)             predecessor

 

(Like in Haskell the ‘\’ represent lambda). So we have a way of saying zero, we have a way of checking if a number is zero and we have the ability to add one and subtract one. That gives us all the natural numbers including zero.

 

If you notice, the predecessor define above returns a value that is an invalid number when applied to zero. An alternate ‘type safe’ definition is the following –

p = \x.(((zero? x) x) (x t))

 

Here (p zero) is zero.  

 

Now you may think that that’s not great way of implementing numbers – its inefficient etc. Yes that’s true, but remember that the lambda calculus is a formal system, it’s a mathematical formalism and hence has no notion of cost. In the study of formal systems (other than the study of their complexity) we are interested in asking questions about what the system can do and cannot do. The above encoding I believe is equivalent to the encoding of Church numerals.

 

Following the definition of numbers, I can define ‘add’ in the following obvious way -

add = (Y \add.\a.\b.(((zero? a) b) ((add (p a)) (s b)))

 

Here ‘Y‘ is the fixed point Y combinator and this is equivalent to

add(a, b)  { if a == 0 then return b else return add(a - 1, b + 1) }

 

Similarly one can define ‘mul’ and ‘factorial’ as well.

 

I went ahead and defined operators so that I have scheme/lisp style lists. I am told that there are very similar to Church’s definitions.

 

cons =  \x.\y.\b1.\b2.((b1 f) ((b2 x) y))

car =   \ls.((ls #f) #t)

cdr =   \ls.((ls #f) #f)

null =  \x.x

null? = \ls.(ls #t)

 

Hence we have –

(null? null) -> #t

(car null) -> undefined

(cdr null) -> undefined