I had dropped a link to Antonio Cisternino’s blog entry about Closure support in CLR 2.0 when I was writing a description of how the C# compiler (Whidbey release) implements closures.
I was completely wrong about what Antonio’s blog entry was about. For some reason, my head being so full of C# that it is, I assumed he was talking about the how closures were implemented in C# for the Whidbey release. Closures in the Whidbey release (Visual Studio 2005) are officially called anonymous methods. So apologies Antonio, I had missed the point.
That said, what Antonio was referring to was a subtle change in the implementation of the delegates in CLR 2.0 so that the future CLR can be better used to implement closures and support functional languages and constructs. I exchanged some mails with him and I think I get what he was talking about.
Before I go any further I would recommend you reading Antonio’s original entry
Closures in CLR 2.0, my entry about the implementation of closures in C# 2.0 using delegates as they are available in CLR 1.x and Antonio’s follow-up entry that speaks of the distinguishing aspect of the new delegate implementation: More about Closures in CLR 2.0
The following are derived from mail exchanges with Antonio.
A delegate can be thought of as an entity that holds a pointer to a function and optionally a reference to the object for which the function is supposed to be an instance function. In CLR 2.0 a simple and subtle change has taken place where in the constraint that the instance pointer has to refer to the object for which the function pointer is member function, has been removed. The function pointer and the object reference don’t have to be related to each other.
In functional languages delegates are often implemented by having a pointer to a function which is the block of code that belongs to the closure and a pointer to an instance of the environment. The environment is the entity that holds the state for the code in the closure block.
In C# 2.0 the environment object is implemented by creating a class for every function that wraps a closure and shares state with it. Such a class is generated name-mangled as __LocalsDisplayClassXXX in C# 2.0.
The code that is part of the closure is now part of the environment class. When a closure is created an instance of the environment class is created and a delegate is returned to the instance and its member function that wraps the closure code. This is what could be done with CLR 1.x delegates as the function had to be a member of the environment class.
With CLR 2.0 delegates, what the compiler writers now have the option of doing is that they can generate a common environment class for all functions that wrap closures that need to capture similar information about its environment.
Here is an example and some notes courtesy of Antonio:
Closures in functional programming languages are often implemented with pointers pairs (env, func) where env is the pointer to the environment of the closure, and func is a pointer to a function whose first argument will be env.
With CLR 1.0 this cannot be achieved because delegates are pairs but with an additional constraint: func should be declared in the class of env.
The problem with this additional constraint on delegates is that you tend to define a class for each closure you make. In a functional programming language you'll pay a significant overhead because for each closure you have to introduce a private class with the environment and the code.
Besides removing that constraint (as it has been done in CLR 2.0) you can define a class with plenty of static methods, one per closure and define a type for each possible environment. This reduces the number of types you need to have.
For instance:
void foo() {
string s;
Cmd d = { Console.WriteLine(s); }
//...
}
void baz() {
Cmd d = { Console.WriteLine("Hello {0}", s); }
The current compiler generates two classes both having a single int field.
With CLR 2.0 the compiler could use a single class as follows:
class Env_String {
string f;
void f(Env_String env) { Console.WriteLine(env.s); }
void g(Env_String env) { Console.WriteLine("Hello {0}", env.s); }
void foo {
Env_String env = new Env_String();
Cmd d = (Cmd)Delegate.CreateDelegate(typeof(Cmd), env,
GetType().GetMethod("f"));
void baz {
GetType().GetMethod("g"));
In the above case the environment class need be only one, as in both the closures, the environment needs to capture the state of only a string variable. The environment class itself can be kept free of the closure specific code and the methods that are generated for the closure can be placed in the class that the original enclosing method was a part of.
This subtle thing had escaped my thinking for sometime. I guess when you are doing your PhD with a university that hosts one of the worlds largest .Net user-groups and are working with matters related to MSR Cambridge, you tend to pick up subtle things a lot easier.
There is one thing that has me thinking is about the implementation of closures in the case of closures being defined inside instance methods. If you refer here, I show a screen shot of an ildasm of that case.
When an instance function is being used the environment will have to have a ‘this’ pointer member that the closure block can access to access any class data members. When the C# compiler is generating one class per function wrapping a closure the ‘this’ can be statically typed to the type of the class that contained the parent method.
If the environment class is to be shared across multiple classes that implement closures, then what will be the type of the ‘this’ pointer?
I am guessing here, but I would expect that they might choose to create the environment class as generic class that is type independent on the ‘this’ pointer. Do you think that sounds right? What are the possible fallouts with that approach? Generic environment classes.
As a matter of fact when you dive into the possibility that generics opens up, it’s rather interesting. We could have the entire environment as a class that contains templated / generic types for every reference type member. This might be efficient to implement because the implementation of generics in the .Net framework does not involve specialization of the runtime class for reference types. Even for value types, I believe it does optimizations to avoid class duplication if the value types have similar footprints as far as the GC is concerned.
One thing is for sure, the future looks interesting for functional and dynamic languages leveraging the CLR.
Among other things, I am looking forward to the MVP India summit that should be happening this weekend 28th May to 31st May.
Remember Me
a@href@title, strike
Powered by: newtelligence dasBlog 2.0.7226.0
Disclaimer The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.
© Copyright 2009, Roshan James
E-mail