Thursday, June 03, 2004

Antonio has a post about generalizing environment classes (that capture state of a closure) with using generic types, such that the class itself captures only the arity of the environment.

 

To quote:

http://rotor.di.unipi.it/cisterni/Lists/My%20Blog/DispForm.aspx?ID=15

In general we may think that the compiler generates several environment classes, for the needed arities; for instance:

 

class Environment3<A, B, C> {

  A a;

  B b;

  C c;

}

 

There is one issue that I can think of, off the top of my head – the CLR has a an excellent approach to generics (among the best I have seen) and is well described in Don Syme’s paper here:

The Design and Implementation of Generics for the .NET Common Language Runtime

http://research.microsoft.com/projects/clrgen/generics.pdf

 

the thing about CLR generics is that they are very efficient for all reference types, because for reference types there is no specialization of templating behavior (as with classical c++ style generics). All reference types use the class definition during runtime.

 

However value types cause the runtime to generate specialized classes to handle type of a value type that is used is a templated entity. Classes definitions are shared by value types only when they share the same foot print with respect to the GC.

 

So it would be better is compiler actually generated specialized classes to hold environment state whenever is knows that the types the environment needs to hold are value types. This simply provides for a performance benefit, because the specialization of the class will not happen at runtime, instead will be done at compile time.

 

 

Antonio also discusses a private member access issue – again I don’t think I fully get him. Assuming the new delegate mechanism is in place we could have classes that look like this

 

class Env

{

        //have only public members

}

 

class Foo

{

        //original method

        void bar()

        {

        }

       

        //anonymous compiler generated method

        void anon_bar()

        {

                //access all members of Foo here

                //access all public members of Env here

        }

}

 

Is there a need to make members to Env private? The entire point of having Env is simply to act as a place holder for some values. Better yet (I don’t know if the old friend method mechanism works), but if friend decls are possible then the anonymous method can be declared as a friend in class Env. This does not add to the class definition in any way, it would simply allow for member access.

 

 

You might want to look at these links to follow the sequence of these posts –

1)       Closures in CLR 2.0

2)       Implementation of Closures (Anonymous Methods) in C# 2.0 (Part 6)

3)       More on CLR 2.0 closures

4)       Closure implementation enhancement in CLR 2.0 using the new delegate mechanism

5)       Again on closures

6)       this post

Thursday, June 03, 2004 12:58:34 AM (Eastern Standard Time, UTC-05:00)  #    Comments [0]  | 
 Wednesday, June 02, 2004

This is a tremendously exciting time to be thinking about programming languages and language research. I recently I have come across a lot of material that has made me think a lot.

 

Polyphonic C#

http://research.microsoft.com/%7Enick/polyphony/

            This is a C# like language that is built for concurrency control. Amazing piece of thought exercise there. I recommend looking at Modern Concurrency Abstraction for C#.

 

Xen/X# from Microsoft Research

Xen basically proposes to extend the C# language to better data handling support into the language. 

Unifying Tables Objects and Documents

This should give you a good idea of X#. This is by Erik Meijer of MSR.

Programming with Circles, Triangles and Rectangles

More – interesting reading.

 

C Omega

http://research.microsoft.com/Comega/

A combination of Xen and Polyphonic C#.

You might want to download this ppt that discusses C Omega by none other than Damian Watkins of MSR.

 

Groovy

http://groovy.codehaus.org/

            Groovy is the Ruby like language for the JVM. Ruby itself takes from the power of dynamic object oriented-ness that was so characteristic of smalltalk and whips a powerful expressive language on it. The thing is that Groovy also builds in Xen like concepts.

You might want to download this ppt that discusses Groovy by James Strachan co-author of groovy.

 

Self

http://research.sun.com/self/language.html

An old language from Sun Microsystems. Reading up about self makes you appreciate the spirit of many message passing and prototypes and cloning based pure object oriented systems.

 

 

 

I am not mentioning functional languages here because it is not fair to put up stuff I have no clue about. I must say this is an amazing time to be interested in programming languages.

 

Wednesday, June 02, 2004 7:35:22 AM (Eastern Standard Time, UTC-05:00)  #    Comments [0]  | 
 Friday, May 28, 2004

I had dropped a link to Antonio Cisternino’s blog entry about Closure support in CLR 2.0 when I was writing a description of how the C# compiler (Whidbey release) implements closures.

 

I was completely wrong about what Antonio’s blog entry was about. For some reason, my head being so full of C# that it is, I assumed he was talking about the how closures were implemented in C# for the Whidbey release. Closures in the Whidbey release (Visual Studio 2005) are officially called anonymous methods. So apologies Antonio, I had missed the point.

 

That said, what Antonio was referring to was a subtle change in the implementation of the delegates in CLR 2.0 so that the future CLR can be better used to implement closures and support functional languages and constructs. I exchanged some mails with him and I think I get what he was talking about.

 

Before I go any further I would recommend you reading Antonio’s original entry

Closures in CLR 2.0, my entry about the implementation of closures in C# 2.0 using delegates as they are available in CLR 1.x and Antonio’s follow-up entry that speaks of the distinguishing aspect of the new delegate implementation: More about Closures in CLR 2.0

 

The following are derived from mail exchanges with Antonio.

 

A delegate can be thought of as an entity that holds a pointer to a function and optionally a reference to the object for which the function is supposed to be an instance function. In CLR 2.0 a simple and subtle change has taken place where in the constraint that the instance pointer has to refer to the object for which the function pointer is member function, has been removed. The function pointer and the object reference don’t have to be related to each other.

 

In functional languages delegates are often implemented by having a pointer to a function which is the block of code that belongs to the closure and a pointer to an instance of the environment. The environment is the entity that holds the state for the code in the closure block.

 

In C# 2.0 the environment object is implemented by creating a class for every function that wraps a closure and shares state with it. Such a class is generated name-mangled as __LocalsDisplayClassXXX in C# 2.0.

 

The code that is part of the closure is now part of the environment class. When a closure is created an instance of the environment class is created and a delegate is returned to the instance and its member function that wraps the closure code. This is what could be done with CLR 1.x delegates as the function had to be a member of the environment class.

 

With CLR 2.0 delegates, what the compiler writers now have the option of doing is that they can generate a common environment class for all functions that wrap closures that need to capture similar information about its environment.

 

Here is an example and some notes courtesy of Antonio:

 

Closures in functional programming languages are often implemented with pointers pairs (env, func) where env is the pointer to the environment of the closure, and func is a pointer to a function whose first argument will be env.

 

With CLR 1.0 this cannot be achieved because delegates are pairs but with an additional constraint: func should be declared in the class of env.

The problem with this additional constraint on delegates is that you tend to define a class for each closure you make. In a functional programming language you'll pay a significant overhead because for each closure you have to introduce a private class with the environment and the code.

 

Besides removing that constraint (as it has been done in CLR 2.0) you can define a class with plenty of static methods, one per closure and define a type for each possible environment. This reduces the number of types you need to have.

 

For instance:

 

void foo() {

 string s;

 Cmd d = { Console.WriteLine(s); }

 //...

}

 

void baz() {

 string s;

 Cmd d = { Console.WriteLine("Hello {0}", s); }

 //...

}

 

The current compiler generates two classes both having a single int field.

With CLR 2.0 the compiler could use a single class as follows:

 

class Env_String {

 string f;

}

 

void f(Env_String env) { Console.WriteLine(env.s); }

void g(Env_String env) { Console.WriteLine("Hello {0}", env.s); }

 

void foo {

 Env_String env = new Env_String();

 Cmd d = (Cmd)Delegate.CreateDelegate(typeof(Cmd), env,

GetType().GetMethod("f"));

 //...

}

 

void baz {

 Env_String env = new Env_String();

 Cmd d = (Cmd)Delegate.CreateDelegate(typeof(Cmd), env,

GetType().GetMethod("g"));

 //...

}

 

In the above case the environment class need be only one, as in both the closures, the environment needs to capture the state of only a string variable. The environment class itself can be kept free of the closure specific code and the methods that are generated for the closure can be placed in the class that the original enclosing method was a part of.

 

This subtle thing had escaped my thinking for sometime. I guess when you are doing your PhD with a university that hosts one of the worlds largest .Net user-groups and are working with matters related to MSR Cambridge, you tend to pick up subtle things a lot easier. :-)

 

There is one thing that has me thinking is about the implementation of closures in the case of closures being defined inside instance methods. If you refer here, I show a screen shot of an ildasm of that case.

 

When an instance function is being used the environment will have to have a ‘this’ pointer member that the closure block can access to access any class data members. When the C# compiler is generating one class per function wrapping a closure the ‘this’ can be statically typed to the type of the class that contained the parent method.

 

 

If the environment class is to be shared across multiple classes that implement closures, then what will be the type of the ‘this’ pointer?

 

I am guessing here, but I would expect that they might choose to create the environment class as generic class that is type independent on the ‘this’ pointer. Do you think that sounds right? What are the possible fallouts with that approach? Generic environment classes.

 

As a matter of fact when you dive into the possibility that generics opens up, it’s rather interesting. We could have the entire environment as a class that contains templated / generic types for every reference type member. This might be efficient to implement because the implementation of generics in the .Net framework does not involve specialization of the runtime class for reference types. Even for value types, I believe it does optimizations to avoid class duplication if the value types have similar footprints as far as the GC is concerned.

 

One thing is for sure, the future looks interesting for functional and dynamic languages leveraging the CLR.

 

Among other things, I am looking forward to the MVP India summit that should be happening this weekend 28th May to 31st May.

 

 

Friday, May 28, 2004 1:16:56 AM (Eastern Standard Time, UTC-05:00)  #    Comments [0]  | 
 Sunday, May 23, 2004

How many of us recognize the name of this man?

 

Mitch Kapor was the founder of the Lotus corporation. He was the man who designed the Lotus 1-2-3. If you know you history, Lotus was one of the only large applications companies that was a serious challenger for Microsoft in its early years. There were years spent over the battle for the spreadsheet that was fought on both the old Mac as well as the old DOS machines.

 

Microsoft’s offering those days were called Multiplan. Multiplan was fairly beat by Lotus 123 in almost all fronts. Microsoft eventually thought through their faults and strengths and eventually released Excel – the spreadsheet battle was over.

 

Mitch Kapor himself, is one person I think of as being fairly amazing.

 

He was co founder of the EFF, the Electronic Frontier Foundation along with John Perry Barlow. The EFF was the organization that for the first time stood up for hackers rights and digital rights. This was of significant and epic proportions in the early 90s when hacker arrests and crackdowns were gaining a witch-hunt like momentum.

 

“The EFF is a non-profit civil liberties organization working in the public interest to protect privacy, free expression, and access to public resources and information online, as well as to promote responsibility in new media.”

 

The EFF was the organization that for the first time took the American Secret Service to court over the ruling and prosecution of the ‘hacker’ Knight Lightning. The EFF won and it literally brought the end of an era about how people of ‘hackers’ and the rules for information security.

 

I highly recommend reading this book called the Hacker Crackdown by Bruce Sterling. The book reflects the ethos of a time when the parameters of information security were very different from how we think of them now. Considering the license of the book, what it intends to convey and what I hope it may change about your thinking, I would recommend downloading a softcopy of the book.

 

Today I happened to come across Mitchell Kapor’s website and blog.

Website: http://www.kei.com/homepages/mkapor/

Blog: http://blogs.osafoundation.org/mitch/

 

I found this entry, right on top and I couldn’t help smiling:

 

May 09, 2004

Now I'm Mad

 

Some idiot Atkins Diet spammer just posted 53 bogus comments in this blog. I'm disabling comments (globally) shortly and figuring out if there's any recourse.

 

They don't know it yet, but they picked the wrong person to do this to.

 

 

Sunday, May 23, 2004 8:24:41 AM (Eastern Standard Time, UTC-05:00)  #    Comments [0]  | 
 Saturday, May 22, 2004

C# 2.0 has support for closures.

 

This article discusses the implementation of the closures in the C# language and how this has been done using pure compiler magic without any CLR changes. I had previously mentioned C# closures in an earlier blog entry and had linked to Antonio Cisternino’s blog entry:

Closures in CLR 2.0

This is an interesting entry and worth a read. Unfortunately, IMO, Mr. Cisternino’s entry is a not correctly titled. The support for closures is a C# only thing not a CLR level addition.

 

I intend to start off where his blog entry ends. Here I shall look into how it is done.

 

Features such as closures have been taken to legendary fame in languages like scheme and ruby. Recent language developments like Groovy also ship with closure constructs.

 

Microsoft, for some reason, has been calling the new feature of the language as anonymous methods. I don’t know why this feature hasn’t been publicly been spoken of as closure or lambda support in the C# language. Maybe there are subtle differences in the theoretical definition of closures and what the C# language achieves.

 

That said – let’s jump into the matter.

This is C# code that shows off a closure:

 

Code Showing One Anonymous Method / Closure that shares variables with its scope

//closure.cs

//Compile: csc closure.cs

using System;

 

class CMain

{

      delegate void Closure(int n);

     

      static Closure CreateClosure()

      {

            int c = 0;

            return delegate(int n) {

                  Console.WriteLine("Closure: n = {0}, c = {1}",n,c++);

            };

      }

     

      static void Main()

      {

            Closure c1 = CreateClosure();

            Closure c2 = CreateClosure();

            c1(1);

            c1(1);

            c1(1);

            c2(2);

            c2(2);

            c2(2);

      }

}

 

This code is very similar to the groovy snippet I had posted sometime back. When executed, this is the output:

 

> closure.exe

Closure: n = 1, c = 0

Closure: n = 1, c = 1

Closure: n = 1, c = 2

Closure: n = 2, c = 0

Closure: n = 2, c = 1

Closure: n = 2, c = 2

 

If you don’t know what closures are about, the easiest way to appreciate them is to take a careful look at the C# code above. Notice the function CreateClosure() is returning a delegate to a block of code that is part of the function itself. (Normally you would create a delegate to a function). If you don’t understand what a delegate is, let it be. What you need to understand is that the function returns a block of code that is a part of the function.

 

The block of code accepts an integer parameter. Also, you will notice that the local variable of the function (variable ‘c’) is being used in the code block.

 

When the block of code is returned to the caller (in this case main), the Main() can invoke the variable that represents the block of code, causing the code to run.

 

Once upon a time you could create a delegate to an independent function only and not to a code block within a function. A delegate to a function had to have the same signature as the function. The delegate could be thought of as an old C style function pointer. When the delegate is invoked, the function pointed to is called. Delegates came with the additional merit that they were type safe function pointers – most people did not think much more than that about delegates.

 

Now it is a little different. While the anonymous method within the CreateClosure() method actually looks like it simply has a nested function that does not have a name explicitly provided its not that simple. You might venture to guess that the compiler actually goes on to create a new method (by extracting this code block) and simply creates a delegate to the new function, the same way delegates once used to behave.

 

However, notice that the code block uses variable ‘c’ that is defined locally to the enclosing function. If this code block is going to be carved out of the enclosing function, how can it access variable ‘c’? Better yet, take a closer look at the output.

 

It looks like a closure returned from a call to CreateClosure() is seems to remember its value of the variable ‘c’. Some how, the state of the function CreateClosure() is captured in the delegate/closure that it returns. So much so, that the state of two invocations of CreateClosure() seemed to be maintained independent of each other.

 

This is in violation of the way simple C like functions work, where the function state is stored on the stack and the functions stack frame is torn down from the stack and state is lost when the functions return. (Refer ‘The Big Deal about Iterators’)

 

Functions that return closures seemed maintain state even after they have returned. The closure object is maintains a reference to that state. This requirement of maintaining is similar to the implementation of iterators (if you give it some thought).

 

Implementation

This is what our closure.cs looks like under ILDASM.

 

 

Notice the new class called __LocalsDisplayClass$0…1 that has been created. This is interesting culprit.

 

In essence how closures work in C# 2.0  is that the compiler creates a new class that contains member variables correspond to the local variables of CreateClosure() that are being used in the closure/anonymous method that it defines. Thus you can see the local variable ‘c’ of method CreateClosure() can be seen in class __LocalsDisplayClass$0…1.

 

So calling the CreateClosure method creates an instance of the __Locals… class. It then creates an old (classical) delegate to the __AnonymousMethod$0… method of the class. So the actual support for delegates in the CLR hasn’t changed at all. The delegate that is returned from the CreateClosure() method is a normal (old C# 1.x type) delegate.

 

All access to variables that are shared between the CreateClosure() method and the anaymous method are accesses to members of the __Locals… class.

 

Here is IL code of CreateClosure()

 

.method private hidebysig static class CMain/Closure

        CreateClosure() cil managed

{

  // Code size       30 (0x1e)

  .maxstack  3

  .locals init (class CMain/__LocalsDisplayClass$00000001 V_0,

           class CMain/Closure V_1)

  IL_0000:  newobj     instance void CMain/__LocalsDisplayClass$00000001::.ctor()

  IL_0005:  stloc.0

  IL_0006:  ldloc.0

  IL_0007:  ldc.i4.0

  IL_0008:  stfld      int32 CMain/__LocalsDisplayClass$00000001::c

  IL_000d:  ldloc.0

  IL_000e:  ldftn      instance void CMain/__LocalsDisplayClass$00000001::__AnonymousMethod$00000000(int32)

  IL_0014:  newobj     instance void CMain/Closure::.ctor(object,

                                                          native int)

  IL_0019:  stloc.1

  IL_001a:  br.s       IL_001c

  IL_001c:  ldloc.1

  IL_001d:  ret

} // end of method CMain::CreateClosure

 

Notice some things in the above code

- An object of __LocalsDisplayClass$00000001 is created.
- Access to the variable is actually access to the member ‘c’ of this class.
- The delegate is created to the instance of the __LocalsDisplayClass$00000001 class that was created and its __AnonymousMethod$00000000 method.

 

This is the code of the anonymous method, which has now become CMain/__LocalsDisplayClass$00000001::__AnonymousMethod$00000000(int32)

 

.method public hidebysig instance void  __AnonymousMethod$00000000(int32 n) cil managed

{

  // Code size       39 (0x27)

  .maxstack  5

  .locals init (int32 V_0)

  IL_0000:  ldstr      "Closure: n = {0}, c = {1}"

  IL_0005:  ldarg.1

  IL_0006:  box        [mscorlib]System.Int32

  IL_000b:  ldarg.0

  IL_000c:  dup

  IL_000d:  ldfld      int32 CMain/__LocalsDisplayClass$00000001::c

  IL_0012:  dup

  IL_0013:  stloc.0

  IL_0014:  ldc.i4.1

  IL_0015:  add

  IL_0016:  stfld      int32 CMain/__LocalsDisplayClass$00000001::c

  IL_001b:  ldloc.0

  IL_001c:  box        [mscorlib]System.Int32

  IL_0021:  call       void [mscorlib]System.Console::WriteLine(string,

                                                                object,

                                                                object)

  IL_0026:  ret

} // end of method __LocalsDisplayClass$00000001::__AnonymousMethod$00000000

 

This is quite exactly the code that we had written into the CreateClosure() function. Except that the local variable ‘c’ is not a local variable any more.

 

What do you think will happen when there is a function that defines two anonymous methods within it?

 

Code Showing Multiple Anonymous Methods / Closures that share variables with its scope

//closure2.cs

//Compile: csc closure2.cs

using System;

 

class CMain

{

       delegate void Closure(int n);

       static Closure t1,t2;

      

       static void CreateClosure()

       {

              int c1 = 0;

              int c2 = 0;

              int c3 = 0;

              int c4 = 0;

              Console.WriteLine("c4 = {0}",c4);

             

              t1 =  delegate(int n) {

                     Console.WriteLine("Closure: n={0}, c1={1}, c2={2}",

                                n,c1++,c2++);

              };

              t2 = delegate(int n) {

                     Console.WriteLine("Closure: n={0}, c1={1}, c3={2}",

                                n,c1++,c3++);

              };

       }

      

       static void Main()

       {

              CreateClosure();

       }

}

 

 

This is what happens:

 

 

Notice:

- There is still only one class that maintains state.
- The class has two methods (one for each of the anonymous methods)
- The variables that are being used by either of methods are part of the class (c1,c2, c3).
- Variables that are not being used from either closure are not part of the class (c4 is omitted).

 

One final look, what if the closure does not use any variables of its enclosing scope, how would this work?

 

Code Showing an Anonymous Method / Closure that is stateless

//closure3.cs

//Compile: csc closure3.cs

using System;

 

class CMain

{

       delegate void Closure(int n);

      

       static Closure CreateClosure()

       {

              int c = 0;

              return delegate(int n) {

                     Console.WriteLine("Closure: n = {0}",n);

              };

       }

      

       static void Main()

       {

              Closure c1 = CreateClosure();

              Closure c2 = CreateClosure();

              c1(1);

              c2(2);

       }

}

 

The code generated looks like this:

 

 

As expected there is NO hidden class generated in this case. Why? Because these is no need for the anonymous method to save state. The anonymous method itself is added to the class that contains the CreateClosure().

 

 

One more case to look at: what happens when the closure accesses both a local variable of the enclosing method as well as a member variable of the class that the enclosing method is a part of? Because state is persisted in a new class instance, how will the member variables of the original class be accessed?

 

This is C# code that shows this situation:

Code where closure accesses locals and class members

//closure4.cs

//Compile: csc closure4.cs

using System;

 

class CMain

{

       delegate void Closure(int n);

       int a = 10;

      

       Closure CreateClosure()

       {

              int c = 0;

              return delegate(int n) {

                     Console.WriteLine("Closure: n = {0}, a = {1}, c = {0}",n,a++,c++);

              };

       }

      

       static void Main()

       {

              CMain main = new CMain();

              Closure c1 = main.CreateClosure();

              Closure c2 = main.CreateClosure();

              c1(1);

              c2(2);

       }

}

 

This is what the generated code looks like:

 

 

Notice that there is a member in the _Locals… class that is called < this >. The this pointer/reference refers back to the original enclosing class where the anonymous method belonged. Thus it can access member variables.

 

Notes

I think C# closures were created as a by product of trying to implement iterators into the language. The implementation of iterators involved this sort of temporary class creation that preserved state of functions. (A little like Bjarne Stroutrup’s Function Objects).

 

I don’t know if there are reasons why this implementation of anonymous methods cannot be called as proper closures. Anonymous methods cannot use any ref or out type parameters of its enclosing function – this is a limitation. Does this limitation imply that they cannot be called closures? I don’t know. But other than that, it seems to serve the purpose of closures pretty well.

 

The fact that closures have been implemented as a compiler level hack means that there is no overhead at all for code that does not take advantage of these sort of feature. So there is no performance penalty on the CLR itself. Maybe in time, when these features gain adequate popularity there will also exist efficient means of integrating these features into the CLR in a way that does not affect functioning of traditional code.

 

Once we have closures, we can implement a large variety of constructs in the language (iterators being one of them). However since the syntax is a little clunky and the actual implementation a little slow (because there is a hidden managed class involved) this may never happen. All the same this is one amazing feature to have in a mainstream language like C#. Maybe even a little too advanced for some folk, so don’t be surprised is the coding guidelines of your company say NO to anonymous methods, the way they said to C++ macros once upon a time.

 

 

Saturday, May 22, 2004 1:10:56 AM (Eastern Standard Time, UTC-05:00)  #    Comments [4]  | 
 Friday, May 21, 2004

Today I had a rather shocking realization. I realized that C# 2.0 supports closures.

 

It was rather shocking, because here I was running up and down obscure languages looking for features like this and bang C# has it. I was pointed to this blog entry by a good friend of mine at Microsoft: Antonio Cisternino's Blog: Closures in CLR 2.0.

A lot of the content on Mr Cisternino’s blog is rather interesting and I would recommend a visit to

http://rotor.di.unipi.it/cisterni/Lists/My%20Blog/AllItems.aspx

 

The entry on closures is an interesting read. A quick search on google, showed me that the rest of the world seemed to have realized that C# has closures, a long time before I did.

 

 

 

Looking at closures brought back something from hazy old memory from a time when I was more ignorant:

 

Function Objects in C++

 

What is a function object?

 

An object that in some way behaves like a function, of course. Typically, that would mean an object of a class that defines the application operator - operator().

A function object is a more general concept than a function because a function object can have state that persist across several calls (like a static local variable) and can be initialized and examined from outside the object (unlike a static local variable). For example:

 

                class Sum {

                                int val;

                public:

                                Sum(int i) :val(i) { }

                                operator int() const { return val; }                  // extract value

 

                                int operator()(int i) { return val+=i; }           // application

                };

 

                void f(vector<int> v)

                {

                                Sum s = 0;             // initial value 0

                                s = for_each(v.begin(), v.end(), s); // gather the sum of all elements

                                cout << "the sum is " << s << "\n";

                               

                                // or even:

                                cout << "the sum is " << for_each(v.begin(), v.end(), Sum(0)) << "\n";

                }

 

Note that a function object with an inline application operator inlines beautifully because there are no pointers involved that might confuse optimizers. To contrast: current optimizers are rarely (never?) able to inline a call through a pointer to function.

Function objects are extensively used to provide flexibility in the standard library.

 

This is written by none other than Bjarne Stroustrup and you can see the full FAQ here:

http://www.research.att.com/~bs/bs_faq2.html

 

You might relate this to the brief discussion on method instances in the previous entry ‘The Big Deal about Iterators

 

Hopefully I will have a better understanding of how C# does closures, soon enough.

 

Friday, May 21, 2004 4:55:20 AM (Eastern Standard Time, UTC-05:00)  #    Comments [0]  | 
 Wednesday, May 19, 2004

This is an entry I have kept pending for a long time. I should have had this out much earlier.

 

Here I am going to talk about why iterators are a hard feature to implement in a conventional stack based language. This will probably help you understand some of the design decisions with respect to implementing iterators in a language and, on the whole, make you a better programmer with respect to this programming construct.

 

I am assuming that you have read Parts 1 and 2

Iterators in Ruby (Part - 1)

Warming up to using Iterators (Part 2)

 

After you have read this entry you should be in better light to grasp Parts 4 and 5.

SICP, Fiber api and ITERATORS ! (Part 4)

Implementation of Iterators in C# 2.0 (Part 5)

 

 

Method Calls and the Stack Frame

Since you have seen some samples of iterator based code in Part 2 lets take a look at how an iterator works. Before we delve into the related issues, first let us take a look at how method calls work on the C stack. (We could have taken an stack based language here, for example .Net IL, but I figured it is easier to stick to C).

 

Consider the following functions

 

void caller()

{

        int a;

        int b;

        int sum = callee(a, b);

}

 

 

int callee(int a, int b)

{

        int temp = a + b;

        return temp;

}

 

That is amazingly simple. However what happens here? When I say what happens here, I mean to ask what happens here at the system stack level.  To do that lets take a look at the dissembly of these functions. If you were to compile this code with the vc++ compiler, you could use the following switch to generate assembly code.

>cl /Facode.cpp.asm code.cpp

 

?callee@@YAHHH@Z PROC NEAR                      ; callee

; File d:\roshanj\work\cpp\iter\func.cpp

; Line 2

      push  ebp

      mov   ebp, esp

      push  ecx

; Line 3

      mov   eax, DWORD PTR _a$[ebp]

      add   eax, DWORD PTR _b$[ebp]

      mov   DWORD PTR _temp$[ebp], eax

; Line 4

      mov   eax, DWORD PTR _temp$[ebp]

; Line 5

      mov   esp, ebp

      pop   ebp

      ret   0

?callee@@YAHHH@Z ENDP                           ; callee

 

?caller@@YAXXZ PROC NEAR                        ; caller

; Line 8

      push  ebp

      mov   ebp, esp

      sub   esp, 12                             ; 0000000cH

; Line 11

      mov   eax, DWORD PTR _b$[ebp]

      push  eax

      mov   ecx, DWORD PTR _a$[ebp]

      push  ecx

      call  ?callee@@YAHHH@Z              ; callee

      add   esp, 8

      mov   DWORD PTR _sum$[ebp], eax

; Line 12

      mov   esp, ebp

      pop   ebp

      ret   0

?caller@@YAXXZ ENDP

 

 

The stack frame of the caller function looks like this:

 

 

And the when the caller() is calling the callee() then both the methods have their stack-frames mounted.

 

 

In Intel based systems the C stack grows downward in memory, which means that each item that is added to the stack causes the top of the stack (SP) to have a lesser address value. So the right way to draw these diagram would have been to draw them upside down. But that detail is not relevant here and so I have depicted them as a conventional stack that grows upwards.

 

When the callee() returns, the stack looks like the initial stack diagram and the variable ‘sum’ contains its required value.

 

 

To reiterate, what happens is that a method that is currently running uses the stack for storing its local variables – or more generally the state of the running method is preserved on the stack. When a method is called, it builds its own stack frame on top of whatever was already on the stack. The called method uses the stack frame to save its state, irrespective of what other methods are already on the stack.

 

You might have heard of this concept called a stack-overflow. That happens when methods calls happen to such an extent that there is not more free space left on the stack for the stack frame of a new method to be created.

 

Whenever a method returns, the part of the stack that is used for its variables is freed up – or so to speak, the methods stack frame is torn down. So when a method returns, its state information, that was on the stack is completely lost. This is necessary, because the parent function or method, might go on to call other methods that go on to use same stack space subsequently.

 

So let us say there is a calling pattern like this

 

function a()

        return

       

function b()

        call a()

        return

       

function caller()

        call a()

        call b()

        return

 

The stack usage will look like this –

 

 

Now that we have a reasonable idea of how the stack is used across method calls (though in reality there are so many many approaches), lets move on to iterators.

 

Iterators

Look at the following ruby snippet that shows an iterator. If you have an interest in C/Cpp/C# and couldn’t care less about Ruby, don’t throw your hands up in the air – the language is used here as an example of a language that implements iterators (and rather well at that), which I am using to communicate the idea.

 

def callee()

        for i in 1..10

                yield i

        end

end

 

def process(v)

        return (v * 10)

end

       

       

def caller()

        callee() {|value|

                value = process(value)

                print value

        }

end

 

If you recall the behavior of the iterator from Part 1 and 2, you will remember that when caller() calls callee() and the callee() invokes the yield statement, the control is back at the caller(). In this case, the value of ‘i’ is yielded from callee() which is received in caller() as value.

 

In C code, when the control returns to the calling function the assumption is that the called function is dead on the stack and that the stack frame is free for subsequent method calls, as shown in the stack diagram above.

 

If we have a similar diagram for the this ruby code, how would we draw it?

 

 

This is where we hit our first wall.

 

Assume that the caller() calls callee() and the callee() has yielded value 1. Now the stack frame of callee() is torn down in the old C way and the process() method is called which goes on to use the same area of the stack that was one used by callee(). After that caller() has done whatever it needs to do with the value it received from callee() it tries to invoke callee() again so that it can get the next value. This is where we hit the wall. The callee() cannot return the next value simply because the previous value it has for ‘i’ is lost when its stack frame got torn down.

 

In other words, on a conventional C stack, the callee() state is lost and therefore cannot resume execution from the point of a yield.

 

What work arounds do we have to this issue?

Let us assume that what Ruby does is simply a compiler hack – let us assume that it never really tears down the stack frame of the callee() at all, but instead it ensures that function returns back to the caller() but with the callee() stack frame still in place. That would look like this

 

 

While this is a nice diagram to look at, what does this mean? Where will the stack frame of the subsequent call to process() go? It cannot over-write the callee() stack frame – so it has to go above it. Like this?

 

 

Woo, now wait a minute, how does the caller() function know how much of the stack the callee() function is using to be able to do this? This is a difficult question to answer. Remember that the callee() is a proper function that can be using a variable amount of the stack at any point of time. So the caller() cannot predict the stack usage of the callee() but lets assume that the callee() passes back the information of its stack usage back along with the value that it is yielding.

 

Ok, so that would solve the problem of how the process() function uses the stack. But there is one more issue. What is the code block inside the caller() (the one that receives the yielded value) needs to push a value onto the stack?

 

If the code block inside the caller, needs to push a value onto the stack, then the pushed value will go on to overwrite the stack-frame of the callee above it. The obvious way to fix the problem is to shift the entire stack pointer to the top of the stack, above the callee() also. You can visualize it like this:

 

 

While this could work, there is one more issue – the local member variables of a function are accessed via EBP offsets. Now this is fine when local variables occur at known distances from the base of the stack frame for the function. To see this happen, you might want to refer back to the assembly code I posted towards the beginning of this article. You notice that most of the member variables are being accessed as _a$[ebp] or _b$[ebp], or similar syntax. The _a$ here does nothing but add a fixed positive offset to the EBP pointer. For the access of local variables when code is in the yield’s parameter block region of the caller() these rules would have to change, as there is an issue of adding an additional offset. The additional offset is introduced because now there is the stack frame of the callee() sitting squarely in the middle of what should have been the stack frame of the caller().

 

These issues crop up when using single iterators. If using multiple iterators or nested iterators and when used in conjunction with stack intensive operations like recursion, the picture because very complicated.

 

Alternative approaches of State maintenance and the concept of Method instances

While it should be clear that, to implement iterators in a language the must support the idea of functions maintaining their state even after they surrender control to their calling methods – it is not very clear as to how this can be implemented on a conventional C stack.

 

One approach is to go for a ‘stackless’ implementation. What that means is that function activations or stack frames are treated as allocated memory blocks on the heap and each function instance (when you call a function it needs to create a stack frame and that can be considered a function or method instance) lives on the heap like any other dynamically allocated object. These mini stacks as specific to function instances and so have no real concept of colliding with each other due to sharing a larger OS/platform maintained stack.

 

I believe languages like lisp/scheme behave in this manner with respect to the language stack. (I have been told this and I hope this is correct).

 

 

Another alternative is to approach the problem like this. Assume that the callee used only static variables. Immediately, the issue of maintaining state of variable of the callee on the stack is eliminated. The only issue then, is to resume execution of the function from the correct place (immediately past the yield) when the function is called again. This can be easily achieved by saving the resume position into an additional variable and implementing a switch case goto construct at the beginning of the function.

 

Now since it is persisting state in static variables we cannot use this recursively or in any other fashion. But we could fix this if we created a structure/class that contains member variables corresponding to the local variables of the function and use an instance of this structure in the function instead of using proper static variables.

 

Languages such as C# and Python use a similar approach to maintaining state of iterators between calls. You can read more about this in the Part 5 of this series.

 

For constructs like iterators to be supported in .Net and similar runtimes, in a native way, will require substantial changes in the way the runtime manages stack-frames and similar entities. Basically languages and runtimes need to be retro-fitted with a concept of function and code block instances the same way object oriented programming is fitted with concepts of class instances called objects.

 

Languages like Sun Microsystems’ Self build on similar concepts for methods/function instantiation.

 

If you give some of the ideas here a little thought, you should be well on your way to dreaming about new languages and fancy programming constructs.

Wednesday, May 19, 2004 9:09:21 AM (Eastern Standard Time, UTC-05:00)  #    Comments [0]  | 
 Tuesday, May 18, 2004

 

Trashbin finally has patch.

 

Before trashbin was written I used to wonder a bit out this mythical entity called metadata that is often talked about in the .Net framework. Metadata was mentioned on almost every discourse on .Net and libraries like reflection libs were based squarely on it. Metadata formed the central pillar of design of the self describing type/component system that is called the .Net framework.

 

However, I could find almost nothing that showed me actual metadata, in its physical form. As a consequence I decided to try and understand what metadata was all about. Thus trashbin was written. Trashbin first saw light of day when announced in a Bangalore User Group post sent in the wee hours of morning.

 

From: spark

Subject: metadata viewer: trashbin v0.1, src+bin release

Date: Mon, 09 Jun 2003 13:28:37 -0700

-----------------------------------------------------------

 

hi group, 

i had been spending some of my odd freetime and sometimes lost sleep into exploring the .net exe/dll format and peeking into metadata glue.

well, the glue is rather interesting to look into and gives you a small insight into where information for things like the class loader and the reflection api get their data. trashbin is small viewer for metadata info that i am releasing with src. this is the first version anywhere and so is expected to be buggy - do mail. if internals interests you, take a look:

http://www.thinkingms.com/pensieve/homepage/work/trashbin/trashbin.htm

cheers :)

rosh

 

Since its release, trashbin has more or less worked fine for me and I have been using it for about a year now.
In essence, this is what trashbin does:

 

>trashbin

Spark (?)  Managed(.Net)/Native PE-COFF file viewer. Version 0.2

May 2003, contact: rosh@mvps.org

Last update: May 2004

 

usage: trashbin [options]

 

        portable executable info:

        /dos     display dos header

        /sig     display the file signature

        /coff    display coff header

        /pe      display pe/optional header

        /dd      display data directories in pe header

        /sec     display section headers

        /exp     display export table

        /imp     display import table

        /reloc   display relocation information

        /tls     display Thread Local Storage information

 

        managed info:

        /corhdr          display the common language runtime header

        /mdhdr           display metadata headers

        /md:Strings      display metadata stream #Strings

        /md:Blob         display metadata stream #Blob

        /md:US           display metadata stream #US (user strings)

        /md:GUID         display metadata stream #GUID

        /md:#~           display optimised metadata tables stream-header

        /mdtab           display optimised metadata tables

 

        other:

        /type    indicates the type of the PE file

        /csv     enable excel compatible, CSV output

 

        ps. The name trashbin is 'inspired' from dumpbin :)

 

Since most people who are reading this entry might be interested in what metadata is and what the PE file format is like, here goes:

 

The PE file format is Microsoft’s Portable Executable File Format. Essentially most exes and dlls that you will see on a windows system have this file format. Yes Exes and DLLs have the same format. The difference largely lies in the fact that a DLL file does not necessarily have an entry point defined. Here is a little about the Exe/Dll format:

 

Once upon a time, there used to be old DOS exes that came with what was the DOS exe header. Microsoft retained the DOS exe header in all subsequent exe formats so that the executables would be compatible across their operating systems. Which is why, you can run any windows or .Net exe on any Microsoft operating system (including dos) and see it run. Of course these programs would not do anything in the dos environment other than display a message saying that the exe would run under windows. The point however is that the exe did validly execute on a 15 or twenty year old system that was built for processors that did not have a concept of memory beyond 1Mb.

 

The DOS header is known for it special signature bytes MZ. Open any exe file and notice that the very first two bytes are MZ. MZ stands for Mark Zbikowski, the person who developed the DOS exe file format. Prior to that the executable format was called the Com file format. Those of you who have had the chance to work on DOS would not have forgotten what a pleasure some of those COM files used to be. The COM format belonged to the then popular CP/M operating system of Digital of the great Gary Kildall. Gary Kildall was pioneer in a way few people were.. anyway that is story for later.

 

The following is a dump of the initial few bytes of an exe file showing the then new MZ DOS header:

 

>hexv HelloWorld.exe

 

0000:0000│ 4D 5A 90 00 03 00 00 00 │ 04 00 00 00 FF FF 00 00 │ MZÉ▒♥▒▒▒│♦▒▒▒  ▒▒

0000:0010│ B8 00 00 00 00 00 00 00 │ 40 00 00 00 00 00 00 00 │ ╕▒▒▒▒▒▒▒│@▒▒▒▒▒▒▒

0000:0020│ 00 00 00 00 00 00 00 00 │ 00 00 00 00 00 00 00 00 │ ▒▒▒▒▒▒▒▒│▒▒▒▒▒▒▒▒

 

In a sense Mr Zbikowsky has the most popular initials on the planet. The DOS exe header itself is available as a structure that is defined in the winnt.h header file that is available on almost every windows based c/cpp dev environment.

 

Now the DOS exe header did not suffice to hold a lot of the new information that the exe had to present to the operating system, when windows came along. So new structures were introduced which were the akin to the old unix based Common Object File Format (COFF). There is plenty of literature available about this on the net.

 

The PE file is denoted by the signature bytes “PE  ". If you download trashbin the source code has some embedded urls that give you information about the PE file format itself. Those may prove valuable for your understanding of the actual exe file format.

 

Just to connect all that I have been talking about to trashbin and how you can actually examine an exe file with it, these are the relevant switches.

 

        /dos     display dos header

        /sig     display the file signature

        /coff    display coff header

        /pe      display pe/optional header

 

Now that we have covered that ground, lets move on. The PE file has a data structure called the Data Directory which is displayed through the  /dd option.

 

        /dd      display data directories in pe header

 

The data directory basically contains pointers to various data structures inside the PE file. The DD has 16 entries and a dump of the DD looks like this:

 

C:\WINNT\system32>trashbin tracert.exe /dd

_IMAGE_DATA_DIRECTORY

0       VirtualAddress = 0

        Size = 0

1       VirtualAddress = 0x19ac

        Size = 0x78

2       VirtualAddress = 0x3000

        Size = 0x11b8

3       VirtualAddress = 0

        Size = 0

4       VirtualAddress = 0

        Size = 0

5       VirtualAddress = 0

        Size = 0

6       VirtualAddress = 0x1090

        Size = 0x1c

7       VirtualAddress = 0

        Size = 0

8       VirtualAddress = 0

        Size = 0

9       VirtualAddress = 0

        Size = 0

10      VirtualAddress = 0

        Size = 0

11      VirtualAddress = 0x240

        Size = 0x7c

12      VirtualAddress = 0x1000

        Size = 0x88

13      VirtualAddress = 0

        Size = 0

14      VirtualAddress = 0

        Size = 0

15      VirtualAddress = 0

        Size = 0

 

 

This is trashbin examining tracert program that comes with windows. The tracert program is a native exe (not a .net exe). Entries in the DD have a predefined meaning, these are defined in winnt.h as follows:

 

#define IMAGE_DIRECTORY_ENTRY_EXPORT          0   // Export Directory

#define IMAGE_DIRECTORY_ENTRY_IMPORT          1   // Import Directory

#define IMAGE_DIRECTORY_ENTRY_RESOURCE        2   // Resource Directory

#define IMAGE_DIRECTORY_ENTRY_EXCEPTION       3   // Exception Directory

#define IMAGE_DIRECTORY_ENTRY_SECURITY        4   // Security Directory

#define IMAGE_DIRECTORY_ENTRY_BASERELOC       5   // Base Relocation Table

#define IMAGE_DIRECTORY_ENTRY_DEBUG           6   // Debug Directory

//      IMAGE_DIRECTORY_ENTRY_COPYRIGHT       7   // (X86 usage)

#define IMAGE_DIRECTORY_ENTRY_ARCHITECTURE    7   // Architecture Specific Data

#define IMAGE_DIRECTORY_ENTRY_GLOBALPTR       8   // RVA of GP

#define IMAGE_DIRECTORY_ENTRY_TLS             9   // TLS Directory

#define IMAGE_DIRECTORY_ENTRY_LOAD_CONFIG    10   // Load Configuration Directory

#define IMAGE_DIRECTORY_ENTRY_BOUND_IMPORT   11   // Bound Import Directory in headers

#define IMAGE_DIRECTORY_ENTRY_IAT            12   // Import Address Table

#define IMAGE_DIRECTORY_ENTRY_DELAY_IMPORT   13   // Delay Load Import Descriptors

#define IMAGE_DIRECTORY_ENTRY_COM_DESCRIPTOR 14   // COM Runtime descriptor

 

You can compare these against what is present in the tracert program dump.

Now lets do a DD dump of a .net exe

 

>trashbin HelloWorld.exe /dd

_IMAGE_DATA_DIRECTORY

0       VirtualAddress = 0

        Size = 0

1       VirtualAddress = 0x2370

        Size = 0x4b

2       VirtualAddress = 0x4000

        Size = 0x340

3       VirtualAddress = 0

        Size = 0

4       VirtualAddress = 0

        Size = 0

5       VirtualAddress = 0x6000

        Size = 0xc

6       VirtualAddress = 0

        Size = 0

7       VirtualAddress = 0

        Size = 0

8       VirtualAddress = 0

        Size = 0

9       VirtualAddress = 0

        Size = 0

10      VirtualAddress = 0

        Size = 0

11      VirtualAddress = 0

        Size = 0

12      VirtualAddress = 0x2000

        Size = 0x8

13      VirtualAddress = 0

        Size = 0

14      VirtualAddress = 0x2008

        Size = 0x48

15      VirtualAddress = 0

        Size = 0

 

The interesting thing to note is that in a managed exe, the 15th entry is non zero. The 14th entry point to the Common Runtime Header or the CorHdr. The CorHdr structure is defined in corhdr.h.

 

The CorHdr is where (so to speak) the managed world starts. So .Net exe files are regular PE files which have all the .Net specific content in a particular offset in the exe file. The idea was that .Net was designed to be platform neutral. So Microsoft could assume that the facilities provided by the exe file format would be available in file formats of other operating systems where .Net would one day run. For this sake .net specific content assumes that the whatever the parent file format that is enclosing it, the .Net specific content can be made laid out in a large well defined binary blob.

 

A description of the CorHdr, Metadata layout and other intricacies of the underlying system can be found described (in varying detail) in the ECMA specification of the Common Language Infrastructure. Partition 2 that describes metadata is the one that you would want to look it, in this regard.

http://msdn.microsoft.com/net/ecma/

 

Another interesting resource for the technical mind is Serge Lidin’s book “Inside the .Net IL Assembler” . The book is available as an Indian India edition also, so it is affordable.

 

Trashbin lets you view this part of the managed exe file.

These are the relevant switches:

 

        /corhdr          display the common language runtime header

        /mdhdr           display metadata headers

        /md:Strings      display metadata stream #Strings

        /md:Blob         display metadata stream #Blob

        /md:US           display metadata stream #US (user strings)

        /md:GUID         display metadata stream #GUID

        /md:#~           display optimised metadata tables stream-header

        /mdtab           display optimised metadata tables

 

Trashbin is not a dissembler – it simply lets you view the other data that is present in the exe/dll file. I figured I don’t need to do a dissembler because ILdasm and several other tools do the job so well. ILdasm btw is written by Serge Lidin.

 

The metadata in the managed file is divided into a set of streams. A stream is like an area of memory reserved for content of a specific type. These are namely –

#Strings

#US

#Blob

#GUID

#~ and #-

 

The #Strings stream keeps all the strings in the application that include things like names of classes, methods, parameters, namespaces, assemblies etc. So this stream basically contains all kind of strings that are part of the source code itself – these are where the names that are used by the reflection library are available.

 

The #US stream is the one for user defined strings. So when you say Console.WriteLine(“Hello World”) in your program, the “Hello World” goes into #US and the Console.WriteLine goes into #Strings. The strings present in the #US set are Unicode strings – so each character is two bytes.

 

Here are some nice and friendly hex dumps of these regions from a exe file followed by the corresponding information being ripped by trashbin:

 

#Strings

 

0000:0460│ 00 00 00 00 00 00 00 00 │ 00 3C 4D 6F 64 75 6C 65 │ ▒▒▒▒▒▒▒▒│▒

0000:0470│ 3E 00 48 65 6C 6C 6F 57 │ 6F 72 6C 64 2E 65 78 65 │ >▒HelloW│orld.exe

0000:0480│ 00 6D 73 63 6F 72 6C 69 │ 62 00 53 79 73 74 65 6D │ ▒mscorli│b▒System

0000:0490│ 00 4F 62 6A 65 63 74 00 │ 43 61 6C 63 00 48 65 6C │ ▒Object▒│Calc▒Hel

0000:04A0│ 6C 6F 57 6F 72 6C 64 53 │ 61 6D 70 6C 65 00 48 65 │ loWorldS│ample▒He

0000:04B0│ 6C 6C 6F 57 6F 72 6C 64 │ 00 41 64 64 00 2E 63 74 │ lloWorld│▒Add▒.ct

0000:04C0│ 6F 72 00 4D 61 69 6E 00 │ 53 79 73 74 65 6D 2E 44 │ or▒Main▒│System.D

0000:04D0│ 69 61 67 6E 6F 73 74 69 │ 63 73 00 44 65 62 75 67 │ iagnosti│cs▒Debug

0000:04E0│ 67 61 62 6C 65 41 74 74 │ 72 69 62 75 74 65 00 61 │ gableAtt│ribute▒a

0000:04F0│ 00 62 00 61 72 67 73 00 │ 43 6F 6E 73 6F 6C 65 00 │ ▒b▒args▒│Console▒

0000:0500│ 57 72 69 74 65 4C 69 6E │ 65 00 45 78 63 65 70 74 │ WriteLin│e▒Except

0000:0510│ 69 6F 6E 00 67 65 74 5F │ 4D 65 73 73 61 67 65 00 │ ion▒get_│Message▒

 

>trashbin HelloWorld.exe /md:Strings

METADATA STREAM #Strings

        Offset : "String"

        0x1    : ""

        0xA    : "HelloWorld.exe"

        0x19   : "mscorlib"

        0x22   : "System"

        0x29   : "Object"

        0x30   : "Calc"

        0x35   : "HelloWorldSample"

        0x46   : "HelloWorld"

        0x51   : "Add"

        0x55   : ".ctor"

        0x5B   : "Main"

        0x60   : "System.Diagnostics"

        0x73   : "DebuggableAttribute"

        0x87   : "a"

        0x89   : "b"

        0x8B   : "args"

        0x90   : "Console"

        0x98   : "WriteLine"

        0xA2   : "Exception"

        0xAC   : "get_Message"

 

#US

 

0000:04B0│ 65 00 00 00 00 17 48 00 │ 65 00 6C 00 6C 00 6F 00 │ e▒▒▒▒↨H▒│e▒l▒l▒o▒

0000:04C0│ 20 00 57 00 6F 00 72 00 │ 6C 00 64 00 00 00 00 00 │  ▒W▒o▒r▒│l▒d▒▒▒▒▒

 

>trashbin test.exe /md:US

METADATA STREAM #US

0x1, (23 bytes)

    Txt: H.e.l.l.o...W.o.r.l.d..

    Hex: 48 00 65 00 6c 00 6c 00 6f 00 20 00 57 00 6f 00 72 00 6c 00 64 00 00

 

I will just skip over the #GUID and #Blob streams for now. The #~ stream is the interesting one that is the stream that actually contains the metadata tables. These is an alternate stream thaty can be present which is the #- stream. The #- stream again contains metadata tables but these are called the un-optimized tables because certain sort orders are not maintained in these tables. The Microsoft compiles always emit optimized tables and since I have not been using any other compilers (Mono too seems to emit optimized tables) I don’t have support for #- in trashbin.

 

Lets focus on #~. The #~ is the real metadata, if you would like to think of it that way, It is actually a small relational database that is compressed down to the last bit. There are a large number of tables (with predefined schemas) that can occur here. These tables provide information about the exe or dll (or precisely the assembly) that they are trying to describe.

 

These tables cross reference each other as well as reference entries in the other streams, such as the #GUID and #Strings streams. The trashbin option /md:#~ gives you the header of #~ stream, which is a kind of summary view of what the stream contains:

 

>trashbin HelloWorld.exe /md:#~

METADATA STREAM #~

        TABLES HEADER

                MajorVersion = 1

                MinorVersion = 0

                HeapSizes = 0

                        #String Index = 2 bytes wide

                        #GUID Index = 2 bytes wide

                        #Blob Index = 2 bytes wide

                Valid  = 0x00000900021547

                Sorted = 0x0002003301fa00

 

        METADATA Tables

                RID.               TableName    [No of Rows]

                0.                    Module    [1]     Row=10 bytes

                1.                   TypeRef    [4]     Row=6 bytes

                2.                   TypeDef    [3]     Row=14 bytes

                6.                    Method    [4]     Row=14 bytes

                8.                     Param    [3]     Row=6 bytes

                10.                MemberRef    [5]     Row=6 bytes

                12.          CustomAttribute    [1]     Row=6 bytes

                17.            StandAloneSig    [2]     Row=2 bytes

                32.                 Assembly    [1]     Row=22 bytes

                35.              AssemblyRef    [1]     Row=20 bytes

        Table Count = 10

 

The above listing says that helloworld.exe contains 10 tables in its metadata headers.

To give you a taste of what I am talking about lets look at some of these tables:

 

To look at the metadata tables, use the option

 

        /mdtab           display optimised metadata tables

 

This is the table called TypeDef that has a listing of all the types defined in this assembly.

 

[RID=2] Table TypeDef

     [  DATA]Flags    [STRING]Name     [STRING]Namespace [CI:64 ]Extends  [RID: 4]FieldList [RID: 6]MethodList

   1.[  0x  ]0        [  0x  ]1        [  0x  ]0         [RID: 2] 0       [RID: 4] 1        [RID: 6] 1        

   2.[  0x  ]100001   [  0x  ]30       [  0x  ]35        [RID: 1] 1       [RID: 4] 1        [RID: 6] 1        

   3.[  0x  ]100001   [  0x  ]46       [  0x  ]35        [RID: 1] 1       [RID: 4] 1        [RID: 6] 3        

 

The table has a field called name – which is the name of the type. The value of the names field are actually offsets into the #Strings stream. So if you care to compare (I have provided a listing of the #Strings stream earlier), what the table is saying is that this assembly has 3 types which are namely , Calc and HelloWorld. At this point let me drop the source code of the HelloWorld.cs file from which this assembly was compiled:

 

//HelloWorld.cs

namespace HelloWorldSample

{

        using System;

       

        public class Calc

        {

                public int Add(int a, int b)

                {

                        return a+b;

                }

        }

       

        public class HelloWorld

        {

                public static void Main(string[] args)

                {

                        try

                        {

                                Calc c = new Calc();

                                Console.WriteLine(c.Add(10,20));

                        }

                        catch(Exception e)

                        {

                                Console.WriteLine(e.Message);

                        }

                }

        }

}

 

As you can see the source does define two classes called Calc and HelloWorld. A type in .Net terms is a very strict entity that encompasses  value types and reference types and type entities like classes, structures, enums etc.

 

Here is a table that shows me what types are being referenced in this assembly:

 

[RID=1] Table TypeRef

     [CI:75 ]ResolutionScope [STRING]Name     [STRING]Namespace

   1.[RID:35] 1              [  0x  ]29       [  0x  ]22       

   2.[RID:35] 1              [  0x  ]73       [  0x  ]60       

   3.[RID:35] 1              [  0x  ]90       [  0x  ]22       

   4.[RID:35] 1              [  0x  ]A2       [  0x  ]22       

 

The types are Console, Object, Exception etc

 

Here is a list of all the methods that are defined within this assembly;

 

[RID=6] Table Method

     [  DATA]RVA      [  DATA]ImplFlags [  DATA]Flags    [STRING]Name     [  BLOB]Signature [RID: 8]ParamList

   1.[  0x  ]2050     [  0x  ]0         [  0x  ]86       [  0x  ]51       [  0x  ]A         [RID: 8] 1       

   2.[  0x  ]2064     [  0x  ]0         [  0x  ]1886     [  0x  ]55       [  0x  ]10        [RID: 8] 3       

   3.[  0x  ]2078     [  0x  ]0         [  0x  ]96       [  0x  ]5B       [  0x  ]14        [RID: 8] 3       

   4.[  0x  ]20C8     [  0x  ]0         [  0x  ]1886     [  0x  ]55       [  0x  ]10        [RID: 8] 4       

 

You can see the method name add, Main etc here, as well as auto-generated methods such as .ctor (the constructor). 

 

In short if you examine the metadata headers carefully you get an idea about the extent of meta information being stored about your program and what can be supported by any reflection based API. A complete listing of the metadata tables are provided in the ECMA specs, so that’s your best reference, Mr Lidin’s book also does a good job of this and I used both of these while I was writing trashbin.

 

So there – I have said what I wanted to say about trashbin. This might be a good starting point to explore the exe and dll file formats.

 

Before I stop, I must add, do take a look at the switches

        /exp     display export table

        /imp     display import table

 

these show the methods that are exported from an exe/dll (so that it maybe dynamically loaded and invoked from another process) and shows what method calls it invokes. This is kind of the ‘metadata’ of the old unmanaged world.

 

Typing trashbin ntdll.dll /exp in the system32 folder will give you a listing of the native NT api, which might make for good reading again (if that sort of stuff gets your interest).

 

Most of the content of this blog entry was presented at the Bangalore .Net User group as one of the UG meetings talks.

 

 

 

All that said, here is the real reason I started out on this blog entry: Recently Kaushik Srenevasan pointed out a bug in trashbin’s #Strings parsing routine. Kaushik is a ‘MS Student Ambassador’ and has a blog here: http://dotnetjunkies.com/WebLog/kaushik/

 

I had, for some reason, assumed the presence of padding 0s when I had written the original code. That more or less always worked, because the #Strings stream was expected to be padded with zeros till is a 4 byte boundary.

 

Its has been patched according to Kaushik’s suggestion, so Thank you Kaushik.

 

 

 

Trashbin can be downloaded from its homepage here:

http://www.thinkingms.com/pensieve/homepage/work/trashbin/trashbin.htm

 

 

As foot note I must say that the Microsoft Dumpbin program that ships with Visual Studio does most of the things that trashbin does and a few additional things. Dumpbin however does not display metadata information from managed PE files yet.

 

 

Tuesday, May 18, 2004 7:20:34 AM (Eastern Standard Time, UTC-05:00)  #    Comments [0]  |