Saturday, May 22, 2004

C# 2.0 has support for closures.

 

This article discusses the implementation of the closures in the C# language and how this has been done using pure compiler magic without any CLR changes. I had previously mentioned C# closures in an earlier blog entry and had linked to Antonio Cisternino’s blog entry:

Closures in CLR 2.0

This is an interesting entry and worth a read. Unfortunately, IMO, Mr. Cisternino’s entry is a not correctly titled. The support for closures is a C# only thing not a CLR level addition.

 

I intend to start off where his blog entry ends. Here I shall look into how it is done.

 

Features such as closures have been taken to legendary fame in languages like scheme and ruby. Recent language developments like Groovy also ship with closure constructs.

 

Microsoft, for some reason, has been calling the new feature of the language as anonymous methods. I don’t know why this feature hasn’t been publicly been spoken of as closure or lambda support in the C# language. Maybe there are subtle differences in the theoretical definition of closures and what the C# language achieves.

 

That said – let’s jump into the matter.

This is C# code that shows off a closure:

 

Code Showing One Anonymous Method / Closure that shares variables with its scope

//closure.cs

//Compile: csc closure.cs

using System;

 

class CMain

{

      delegate void Closure(int n);

     

      static Closure CreateClosure()

      {

            int c = 0;

            return delegate(int n) {

                  Console.WriteLine("Closure: n = {0}, c = {1}",n,c++);

            };

      }

     

      static void Main()

      {

            Closure c1 = CreateClosure();

            Closure c2 = CreateClosure();

            c1(1);

            c1(1);

            c1(1);

            c2(2);

            c2(2);

            c2(2);

      }

}

 

This code is very similar to the groovy snippet I had posted sometime back. When executed, this is the output:

 

> closure.exe

Closure: n = 1, c = 0

Closure: n = 1, c = 1

Closure: n = 1, c = 2

Closure: n = 2, c = 0

Closure: n = 2, c = 1

Closure: n = 2, c = 2

 

If you don’t know what closures are about, the easiest way to appreciate them is to take a careful look at the C# code above. Notice the function CreateClosure() is returning a delegate to a block of code that is part of the function itself. (Normally you would create a delegate to a function). If you don’t understand what a delegate is, let it be. What you need to understand is that the function returns a block of code that is a part of the function.

 

The block of code accepts an integer parameter. Also, you will notice that the local variable of the function (variable ‘c’) is being used in the code block.

 

When the block of code is returned to the caller (in this case main), the Main() can invoke the variable that represents the block of code, causing the code to run.

 

Once upon a time you could create a delegate to an independent function only and not to a code block within a function. A delegate to a function had to have the same signature as the function. The delegate could be thought of as an old C style function pointer. When the delegate is invoked, the function pointed to is called. Delegates came with the additional merit that they were type safe function pointers – most people did not think much more than that about delegates.

 

Now it is a little different. While the anonymous method within the CreateClosure() method actually looks like it simply has a nested function that does not have a name explicitly provided its not that simple. You might venture to guess that the compiler actually goes on to create a new method (by extracting this code block) and simply creates a delegate to the new function, the same way delegates once used to behave.

 

However, notice that the code block uses variable ‘c’ that is defined locally to the enclosing function. If this code block is going to be carved out of the enclosing function, how can it access variable ‘c’? Better yet, take a closer look at the output.

 

It looks like a closure returned from a call to CreateClosure() is seems to remember its value of the variable ‘c’. Some how, the state of the function CreateClosure() is captured in the delegate/closure that it returns. So much so, that the state of two invocations of CreateClosure() seemed to be maintained independent of each other.

 

This is in violation of the way simple C like functions work, where the function state is stored on the stack and the functions stack frame is torn down from the stack and state is lost when the functions return. (Refer ‘The Big Deal about Iterators’)

 

Functions that return closures seemed maintain state even after they have returned. The closure object is maintains a reference to that state. This requirement of maintaining is similar to the implementation of iterators (if you give it some thought).

 

Implementation

This is what our closure.cs looks like under ILDASM.

 

 

Notice the new class called __LocalsDisplayClass$0…1 that has been created. This is interesting culprit.

 

In essence how closures work in C# 2.0  is that the compiler creates a new class that contains member variables correspond to the local variables of CreateClosure() that are being used in the closure/anonymous method that it defines. Thus you can see the local variable ‘c’ of method CreateClosure() can be seen in class __LocalsDisplayClass$0…1.

 

So calling the CreateClosure method creates an instance of the __Locals… class. It then creates an old (classical) delegate to the __AnonymousMethod$0… method of the class. So the actual support for delegates in the CLR hasn’t changed at all. The delegate that is returned from the CreateClosure() method is a normal (old C# 1.x type) delegate.

 

All access to variables that are shared between the CreateClosure() method and the anaymous method are accesses to members of the __Locals… class.

 

Here is IL code of CreateClosure()

 

.method private hidebysig static class CMain/Closure

        CreateClosure() cil managed

{

  // Code size       30 (0x1e)

  .maxstack  3

  .locals init (class CMain/__LocalsDisplayClass$00000001 V_0,

           class CMain/Closure V_1)

  IL_0000:  newobj     instance void CMain/__LocalsDisplayClass$00000001::.ctor()

  IL_0005:  stloc.0

  IL_0006:  ldloc.0

  IL_0007:  ldc.i4.0

  IL_0008:  stfld      int32 CMain/__LocalsDisplayClass$00000001::c

  IL_000d:  ldloc.0

  IL_000e:  ldftn      instance void CMain/__LocalsDisplayClass$00000001::__AnonymousMethod$00000000(int32)

  IL_0014:  newobj     instance void CMain/Closure::.ctor(object,

                                                          native int)

  IL_0019:  stloc.1

  IL_001a:  br.s       IL_001c

  IL_001c:  ldloc.1

  IL_001d:  ret

} // end of method CMain::CreateClosure

 

Notice some things in the above code

- An object of __LocalsDisplayClass$00000001 is created.
- Access to the variable is actually access to the member ‘c’ of this class.
- The delegate is created to the instance of the __LocalsDisplayClass$00000001 class that was created and its __AnonymousMethod$00000000 method.

 

This is the code of the anonymous method, which has now become CMain/__LocalsDisplayClass$00000001::__AnonymousMethod$00000000(int32)

 

.method public hidebysig instance void  __AnonymousMethod$00000000(int32 n) cil managed

{

  // Code size       39 (0x27)

  .maxstack  5

  .locals init (int32 V_0)

  IL_0000:  ldstr      "Closure: n = {0}, c = {1}"

  IL_0005:  ldarg.1

  IL_0006:  box        [mscorlib]System.Int32

  IL_000b:  ldarg.0

  IL_000c:  dup

  IL_000d:  ldfld      int32 CMain/__LocalsDisplayClass$00000001::c

  IL_0012:  dup

  IL_0013:  stloc.0

  IL_0014:  ldc.i4.1

  IL_0015:  add

  IL_0016:  stfld      int32 CMain/__LocalsDisplayClass$00000001::c

  IL_001b:  ldloc.0

  IL_001c:  box        [mscorlib]System.Int32

  IL_0021:  call       void [mscorlib]System.Console::WriteLine(string,

                                                                object,

                                                                object)

  IL_0026:  ret

} // end of method __LocalsDisplayClass$00000001::__AnonymousMethod$00000000

 

This is quite exactly the code that we had written into the CreateClosure() function. Except that the local variable ‘c’ is not a local variable any more.

 

What do you think will happen when there is a function that defines two anonymous methods within it?

 

Code Showing Multiple Anonymous Methods / Closures that share variables with its scope

//closure2.cs

//Compile: csc closure2.cs

using System;

 

class CMain

{

       delegate void Closure(int n);

       static Closure t1,t2;

      

       static void CreateClosure()

       {

              int c1 = 0;

              int c2 = 0;

              int c3 = 0;

              int c4 = 0;

              Console.WriteLine("c4 = {0}",c4);

             

              t1 =  delegate(int n) {

                     Console.WriteLine("Closure: n={0}, c1={1}, c2={2}",

                                n,c1++,c2++);

              };

              t2 = delegate(int n) {

                     Console.WriteLine("Closure: n={0}, c1={1}, c3={2}",

                                n,c1++,c3++);

              };

       }

      

       static void Main()

       {

              CreateClosure();

       }

}

 

 

This is what happens:

 

 

Notice:

- There is still only one class that maintains state.
- The class has two methods (one for each of the anonymous methods)
- The variables that are being used by either of methods are part of the class (c1,c2, c3).
- Variables that are not being used from either closure are not part of the class (c4 is omitted).

 

One final look, what if the closure does not use any variables of its enclosing scope, how would this work?

 

Code Showing an Anonymous Method / Closure that is stateless

//closure3.cs

//Compile: csc closure3.cs

using System;

 

class CMain

{

       delegate void Closure(int n);

      

       static Closure CreateClosure()

       {

              int c = 0;

              return delegate(int n) {

                     Console.WriteLine("Closure: n = {0}",n);

              };

       }

      

       static void Main()

       {

              Closure c1 = CreateClosure();

              Closure c2 = CreateClosure();

              c1(1);

              c2(2);

       }

}

 

The code generated looks like this:

 

 

As expected there is NO hidden class generated in this case. Why? Because these is no need for the anonymous method to save state. The anonymous method itself is added to the class that contains the CreateClosure().

 

 

One more case to look at: what happens when the closure accesses both a local variable of the enclosing method as well as a member variable of the class that the enclosing method is a part of? Because state is persisted in a new class instance, how will the member variables of the original class be accessed?

 

This is C# code that shows this situation:

Code where closure accesses locals and class members

//closure4.cs

//Compile: csc closure4.cs

using System;

 

class CMain

{

       delegate void Closure(int n);

       int a = 10;

      

       Closure CreateClosure()

       {

              int c = 0;

              return delegate(int n) {

                     Console.WriteLine("Closure: n = {0}, a = {1}, c = {0}",n,a++,c++);

              };

       }

      

       static void Main()

       {

              CMain main = new CMain();

              Closure c1 = main.CreateClosure();

              Closure c2 = main.CreateClosure();

              c1(1);

              c2(2);

       }

}

 

This is what the generated code looks like:

 

 

Notice that there is a member in the _Locals… class that is called < this >. The this pointer/reference refers back to the original enclosing class where the anonymous method belonged. Thus it can access member variables.

 

Notes

I think C# closures were created as a by product of trying to implement iterators into the language. The implementation of iterators involved this sort of temporary class creation that preserved state of functions. (A little like Bjarne Stroutrup’s Function Objects).

 

I don’t know if there are reasons why this implementation of anonymous methods cannot be called as proper closures. Anonymous methods cannot use any ref or out type parameters of its enclosing function – this is a limitation. Does this limitation imply that they cannot be called closures? I don’t know. But other than that, it seems to serve the purpose of closures pretty well.

 

The fact that closures have been implemented as a compiler level hack means that there is no overhead at all for code that does not take advantage of these sort of feature. So there is no performance penalty on the CLR itself. Maybe in time, when these features gain adequate popularity there will also exist efficient means of integrating these features into the CLR in a way that does not affect functioning of traditional code.

 

Once we have closures, we can implement a large variety of constructs in the language (iterators being one of them). However since the syntax is a little clunky and the actual implementation a little slow (because there is a hidden managed class involved) this may never happen. All the same this is one amazing feature to have in a mainstream language like C#. Maybe even a little too advanced for some folk, so don’t be surprised is the coding guidelines of your company say NO to anonymous methods, the way they said to C++ macros once upon a time.

 

 

Saturday, May 22, 2004 1:10:56 AM (Eastern Standard Time, UTC-05:00)  #    Comments [4]  | 
 Friday, May 21, 2004

Today I had a rather shocking realization. I realized that C# 2.0 supports closures.

 

It was rather shocking, because here I was running up and down obscure languages looking for features like this and bang C# has it. I was pointed to this blog entry by a good friend of mine at Microsoft: Antonio Cisternino's Blog: Closures in CLR 2.0.

A lot of the content on Mr Cisternino’s blog is rather interesting and I would recommend a visit to

http://rotor.di.unipi.it/cisterni/Lists/My%20Blog/AllItems.aspx

 

The entry on closures is an interesting read. A quick search on google, showed me that the rest of the world seemed to have realized that C# has closures, a long time before I did.

 

 

 

Looking at closures brought back something from hazy old memory from a time when I was more ignorant:

 

Function Objects in C++

 

What is a function object?

 

An object that in some way behaves like a function, of course. Typically, that would mean an object of a class that defines the application operator - operator().

A function object is a more general concept than a function because a function object can have state that persist across several calls (like a static local variable) and can be initialized and examined from outside the object (unlike a static local variable). For example:

 

                class Sum {

                                int val;

                public:

                                Sum(int i) :val(i) { }

                                operator int() const { return val; }                  // extract value

 

                                int operator()(int i) { return val+=i; }           // application

                };

 

                void f(vector<int> v)

                {

                                Sum s = 0;             // initial value 0

                                s = for_each(v.begin(), v.end(), s); // gather the sum of all elements

                                cout << "the sum is " << s << "\n";

                               

                                // or even:

                                cout << "the sum is " << for_each(v.begin(), v.end(), Sum(0)) << "\n";

                }

 

Note that a function object with an inline application operator inlines beautifully because there are no pointers involved that might confuse optimizers. To contrast: current optimizers are rarely (never?) able to inline a call through a pointer to function.

Function objects are extensively used to provide flexibility in the standard library.

 

This is written by none other than Bjarne Stroustrup and you can see the full FAQ here:

http://www.research.att.com/~bs/bs_faq2.html

 

You might relate this to the brief discussion on method instances in the previous entry ‘The Big Deal about Iterators

 

Hopefully I will have a better understanding of how C# does closures, soon enough.

 

Friday, May 21, 2004 4:55:20 AM (Eastern Standard Time, UTC-05:00)  #    Comments [0]  | 
 Wednesday, May 19, 2004

This is an entry I have kept pending for a long time. I should have had this out much earlier.

 

Here I am going to talk about why iterators are a hard feature to implement in a conventional stack based language. This will probably help you understand some of the design decisions with respect to implementing iterators in a language and, on the whole, make you a better programmer with respect to this programming construct.

 

I am assuming that you have read Parts 1 and 2

Iterators in Ruby (Part - 1)

Warming up to using Iterators (Part 2)

 

After you have read this entry you should be in better light to grasp Parts 4 and 5.

SICP, Fiber api and ITERATORS ! (Part 4)

Implementation of Iterators in C# 2.0 (Part 5)

 

 

Method Calls and the Stack Frame

Since you have seen some samples of iterator based code in Part 2 lets take a look at how an iterator works. Before we delve into the related issues, first let us take a look at how method calls work on the C stack. (We could have taken an stack based language here, for example .Net IL, but I figured it is easier to stick to C).

 

Consider the following functions

 

void caller()

{

        int a;

        int b;

        int sum = callee(a, b);

}

 

 

int callee(int a, int b)

{

        int temp = a + b;

        return temp;

}

 

That is amazingly simple. However what happens here? When I say what happens here, I mean to ask what happens here at the system stack level.  To do that lets take a look at the dissembly of these functions. If you were to compile this code with the vc++ compiler, you could use the following switch to generate assembly code.

>cl /Facode.cpp.asm code.cpp

 

?callee@@YAHHH@Z PROC NEAR                      ; callee

; File d:\roshanj\work\cpp\iter\func.cpp

; Line 2

      push  ebp

      mov   ebp, esp

      push  ecx

; Line 3

      mov   eax, DWORD PTR _a$[ebp]

      add   eax, DWORD PTR _b$[ebp]

      mov   DWORD PTR _temp$[ebp], eax

; Line 4

      mov   eax, DWORD PTR _temp$[ebp]

; Line 5

      mov   esp, ebp

      pop   ebp

      ret   0

?callee@@YAHHH@Z ENDP                           ; callee

 

?caller@@YAXXZ PROC NEAR                        ; caller

; Line 8

      push  ebp

      mov   ebp, esp

      sub   esp, 12                             ; 0000000cH

; Line 11

      mov   eax, DWORD PTR _b$[ebp]

      push  eax

      mov   ecx, DWORD PTR _a$[ebp]

      push  ecx

      call  ?callee@@YAHHH@Z              ; callee

      add   esp, 8

      mov   DWORD PTR _sum$[ebp], eax

; Line 12

      mov   esp, ebp

      pop   ebp

      ret   0

?caller@@YAXXZ ENDP

 

 

The stack frame of the caller function looks like this:

 

 

And the when the caller() is calling the callee() then both the methods have their stack-frames mounted.

 

 

In Intel based systems the C stack grows downward in memory, which means that each item that is added to the stack causes the top of the stack (SP) to have a lesser address value. So the right way to draw these diagram would have been to draw them upside down. But that detail is not relevant here and so I have depicted them as a conventional stack that grows upwards.

 

When the callee() returns, the stack looks like the initial stack diagram and the variable ‘sum’ contains its required value.

 

 

To reiterate, what happens is that a method that is currently running uses the stack for storing its local variables – or more generally the state of the running method is preserved on the stack. When a method is called, it builds its own stack frame on top of whatever was already on the stack. The called method uses the stack frame to save its state, irrespective of what other methods are already on the stack.

 

You might have heard of this concept called a stack-overflow. That happens when methods calls happen to such an extent that there is not more free space left on the stack for the stack frame of a new method to be created.

 

Whenever a method returns, the part of the stack that is used for its variables is freed up – or so to speak, the methods stack frame is torn down. So when a method returns, its state information, that was on the stack is completely lost. This is necessary, because the parent function or method, might go on to call other methods that go on to use same stack space subsequently.

 

So let us say there is a calling pattern like this

 

function a()

        return

       

function b()

        call a()

        return

       

function caller()

        call a()

        call b()

        return

 

The stack usage will look like this –

 

 

Now that we have a reasonable idea of how the stack is used across method calls (though in reality there are so many many approaches), lets move on to iterators.

 

Iterators

Look at the following ruby snippet that shows an iterator. If you have an interest in C/Cpp/C# and couldn’t care less about Ruby, don’t throw your hands up in the air – the language is used here as an example of a language that implements iterators (and rather well at that), which I am using to communicate the idea.

 

def callee()

        for i in 1..10

                yield i

        end

end

 

def process(v)

        return (v * 10)

end

       

       

def caller()

        callee() {|value|

                value = process(value)

                print value

        }

end

 

If you recall the behavior of the iterator from Part 1 and 2, you will remember that when caller() calls callee() and the callee() invokes the yield statement, the control is back at the caller(). In this case, the value of ‘i’ is yielded from callee() which is received in caller() as value.

 

In C code, when the control returns to the calling function the assumption is that the called function is dead on the stack and that the stack frame is free for subsequent method calls, as shown in the stack diagram above.

 

If we have a similar diagram for the this ruby code, how would we draw it?

 

 

This is where we hit our first wall.

 

Assume that the caller() calls callee() and the callee() has yielded value 1. Now the stack frame of callee() is torn down in the old C way and the process() method is called which goes on to use the same area of the stack that was one used by callee(). After that caller() has done whatever it needs to do with the value it received from callee() it tries to invoke callee() again so that it can get the next value. This is where we hit the wall. The callee() cannot return the next value simply because the previous value it has for ‘i’ is lost when its stack frame got torn down.

 

In other words, on a conventional C stack, the callee() state is lost and therefore cannot resume execution from the point of a yield.

 

What work arounds do we have to this issue?

Let us assume that what Ruby does is simply a compiler hack – let us assume that it never really tears down the stack frame of the callee() at all, but instead it ensures that function returns back to the caller() but with the callee() stack frame still in place. That would look like this

 

 

While this is a nice diagram to look at, what does this mean? Where will the stack frame of the subsequent call to process() go? It cannot over-write the callee() stack frame – so it has to go above it. Like this?

 

 

Woo, now wait a minute, how does the caller() function know how much of the stack the callee() function is using to be able to do this? This is a difficult question to answer. Remember that the callee() is a proper function that can be using a variable amount of the stack at any point of time. So the caller() cannot predict the stack usage of the callee() but lets assume that the callee() passes back the information of its stack usage back along with the value that it is yielding.

 

Ok, so that would solve the problem of how the process() function uses the stack. But there is one more issue. What is the code block inside the caller() (the one that receives the yielded value) needs to push a value onto the stack?

 

If the code block inside the caller, needs to push a value onto the stack, then the pushed value will go on to overwrite the stack-frame of the callee above it. The obvious way to fix the problem is to shift the entire stack pointer to the top of the stack, above the callee() also. You can visualize it like this:

 

 

While this could work, there is one more issue – the local member variables of a function are accessed via EBP offsets. Now this is fine when local variables occur at known distances from the base of the stack frame for the function. To see this happen, you might want to refer back to the assembly code I posted towards the beginning of this article. You notice that most of the member variables are being accessed as _a$[ebp] or _b$[ebp], or similar syntax. The _a$ here does nothing but add a fixed positive offset to the EBP pointer. For the access of local variables when code is in the yield’s parameter block region of the caller() these rules would have to change, as there is an issue of adding an additional offset. The additional offset is introduced because now there is the stack frame of the callee() sitting squarely in the middle of what should have been the stack frame of the caller().

 

These issues crop up when using single iterators. If using multiple iterators or nested iterators and when used in conjunction with stack intensive operations like recursion, the picture because very complicated.

 

Alternative approaches of State maintenance and the concept of Method instances

While it should be clear that, to implement iterators in a language the must support the idea of functions maintaining their state even after they surrender control to their calling methods – it is not very clear as to how this can be implemented on a conventional C stack.

 

One approach is to go for a ‘stackless’ implementation. What that means is that function activations or stack frames are treated as allocated memory blocks on the heap and each function instance (when you call a function it needs to create a stack frame and that can be considered a function or method instance) lives on the heap like any other dynamically allocated object. These mini stacks as specific to function instances and so have no real concept of colliding with each other due to sharing a larger OS/platform maintained stack.

 

I believe languages like lisp/scheme behave in this manner with respect to the language stack. (I have been told this and I hope this is correct).

 

 

Another alternative is to approach the problem like this. Assume that the callee used only static variables. Immediately, the issue of maintaining state of variable of the callee on the stack is eliminated. The only issue then, is to resume execution of the function from the correct place (immediately past the yield) when the function is called again. This can be easily achieved by saving the resume position into an additional variable and implementing a switch case goto construct at the beginning of the function.

 

Now since it is persisting state in static variables we cannot use this recursively or in any other fashion. But we could fix this if we created a structure/class that contains member variables corresponding to the local variables of the function and use an instance of this structure in the function instead of using proper static variables.

 

Languages such as C# and Python use a similar approach to maintaining state of iterators between calls. You can read more about this in the Part 5 of this series.

 

For constructs like iterators to be supported in .Net and similar runtimes, in a native way, will require substantial changes in the way the runtime manages stack-frames and similar entities. Basically languages and runtimes need to be retro-fitted with a concept of function and code block instances the same way object oriented programming is fitted with concepts of class instances called objects.

 

Languages like Sun Microsystems’ Self build on similar concepts for methods/function instantiation.

 

If you give some of the ideas here a little thought, you should be well on your way to dreaming about new languages and fancy programming constructs.

Wednesday, May 19, 2004 9:09:21 AM (Eastern Standard Time, UTC-05:00)  #    Comments [0]  | 
 Tuesday, May 18, 2004

 

Trashbin finally has patch.

 

Before trashbin was written I used to wonder a bit out this mythical entity called metadata that is often talked about in the .Net framework. Metadata was mentioned on almost every discourse on .Net and libraries like reflection libs were based squarely on it. Metadata formed the central pillar of design of the self describing type/component system that is called the .Net framework.

 

However, I could find almost nothing that showed me actual metadata, in its physical form. As a consequence I decided to try and understand what metadata was all about. Thus trashbin was written. Trashbin first saw light of day when announced in a Bangalore User Group post sent in the wee hours of morning.

 

From: spark

Subject: metadata viewer: trashbin v0.1, src+bin release

Date: Mon, 09 Jun 2003 13:28:37 -0700

-----------------------------------------------------------