Sunday, April 25, 2004

Iterators in Ruby (Part - 1)

Warming up to using Iterators (Part 2)
< I am yet to write a part 3 >
SICP, Fiber api and ITERATORS ! (Part 4)

 

I had a look at the implementation of iterators in C#. What follows is based on code generator I have seen on the C# Whidbey post PDC release. Things might have changed by now as things are moving to technology preview phase.

 

This is example code that is present in MSDN:

 

// yield-example.cs

using System;

using System.Collections;

public class List

{

    public static IEnumerable Power(int number, int exponent)

    {

        int counter =0;

        int result = 1;

        while(counter++ < exponent)

        {

            result = result * number;

            yield result;

        }

    }

 

    static void Main()

    {

        // Display powers of 2 up to the exponent 8:

        foreach(int i in Power(2, 8))

            Console.Write("{0} ", i);

    }

}

 

Notice the introduction of a nice little ‘yield’ keyword? The behavior of C# iterators in this context is a lot like ruby iterators that I have been talking about in previous articles. Knowing a little about the state management requirements for iterators and the fact that the CLR is stack based, how are iterators implemented in C#?

 

The implementation of iterators in C# is not driven by the CLR in any way, it is completely implemented in the language as a compiler construct.

 

Let me explain what the compiler tries to do – the compiler examines the function that does the yield

 

    public static IEnumerable Power(int number, int exponent)

    {

        int counter =0;

        int result = 1;

        while(counter++ < exponent)

        {

            result = result * number;

            yield result;

        }

    }

 

Lets just ignore the yield statement for now and look at the method as thought it contained only a loop. It would basically look like this:

 

    public static IEnumerable Power(int number, int exponent)

    {

        [initial code]

        while([loop condition])

        {

            [loop body]

        }

        [post loop code]

    }

 

This is then generated into a sequence of IL statement blocks that have the following jumps between them:

 

 

Now, what yield would require is that the method exit at each point a yield statement occurs. The next time the method is invoked, execution continues immediately past the yield statement with all variables preserving their values.

 

In the CLR, when a method returns its stack frame is torn down. So there is no way that the local variables can actually preserve state. The solution taken by the C# team is turn the method that implements the yield statement into a class.

 

Such a class would

 

·         Have all local variables of the method as members of the class

·         Have a special variable that hold the value that is being yielded.

·         Have a special variable to indicate where the method should continue from, the next time it is invoked.

 

When the caller of such a method runs and encounters the foreach loop that invokes the iterator an object of this class gets created. This object is maintained as long as the foreach lop is running. When the loop exits the object is disposed.

 

That’s how iterators are implemented in C#. :)

 

Now here are some details:

 

 

The method that implements the iterators generates not one but two classes. Both the classes are generated as nested/inner classes to the class that contains the method. The classes are named as
(method name)$(number )_IEnumerableImpl
(method name)$(number)_IEnumeratorImpl

 

I am not sure about the exact reasoning behind the generation of two classes. The earlier standard for writing enumerators in C# probably required, but from the standpoint of implementing iterators, I don’t understand the need.

 

The first of these, the IEnumerableImpl simply creates an instance of the second class and returns it to the caller.

 

The second class IEnumeratorImpl is the interesting one. This class has data members for all the local variables as well as our two special data members.

 

Compare the data members (the cyan colored diamond shapes) to the original  local variables of the method.

 

    public static IEnumerable Power(int number, int exponent)

    {

        int counter =0;

        int result = 1;

        while(counter++ < exponent)

        {

            result = result * number;

            yield result;

        }

    }

 

The parameters number and exponent are there as such and the local variables counter and result are there with some name mangling (I would expect this is to avoid clashes with duplicate names in nested scopes, though that is not allowed in C# (duh?)).

 

The two new members on the class are

·         $PC

·         $_current

 

$_current is the member that holds the yielded value. In the case of the above method, $_current holds the value of ‘result’. It is an ‘object’ type for there will be a nice boxing and gc overhead when moving around an int type – I don’t know why something was not done for special casing value types.

 

$PC is the interesting variable. Remember our little diagram above that showed execution through IL. In the case of the iterators, the method does not simply execute in a loop like shown there, but executes one iteration of the loop on each call. The $PC is the variable that keeps track of where the code should jump to, the next time the method is called. Understandably $PC is someone idea of program counter ;)

 

The code method called MoveNext() in the class actually does the work that the method power() originally did. This is what is look like in IL code.

 

 

Sorry I am not very good with diagrams, but if you look at it you will see that the code is simply built for repeated invocation. Each time according to $PC the code braches to a new location and executes. It then sets the value of $PC to a new location of entry before the function exits.

 

In C# the yielded value is assigned to the $_current member variable and the MoveNext() itself exits by returning a true or false. A true indicates that the method returned through a yield and the false indicates that the method has completed execution. Subsequent calls to the method, after it has returned false will simply cause the method to return false and the $_current will not be updated.

 

So how does the caller of an iterative method behave? In this case its the main. The  caller simply does the following –

 

Invoke Power$00000000__IEnumerableImpl. GetEnumerator() to get an instance of Power$00000000__IEnumeratorImpl

 

Invoke Power$00000000__IEnumeratorImpl.MoveNext()

 

If result is true, use invoke Power$00000000__IEnumeratorImpl.get_Current() which will return the $_current. The foreach loop will cause the MoveNext() to be invoked again after the reurned value is consumed.

 

If result is false, break out of foreach loop.

 

After the loop breaks out the instance of Power$00000000__IEnumeratorImpl is dispose and is available for garbage collection. 

 

I am looking forward to the beta preview to see f things have changed. There is a lot of rather redundant code generated by the compiler here that I have not mentioned. When you are reading IL you might want to skip over those parts.

 

Probably in the future the CLR will contain constructs that enable true iterators and closures. Present day processors don’t natively support such constructs and so implementation will have to be hacks on the C stack or using some kind of class-object mechanism like shown here.

 

As a foot note I would like to mention that the Python also implements iterators in a manner similar to C#. There the function that yields is also converted into a class that maintains state. I believe the MoveNext() equivalent in Python is simply next(). Python however raises an exception to signal end of iteration. C# uses a Boolean return value to indicate this.

 

If this topic holds your interest then I recommend reading:

 

Coroutines in C
by Simon Tatham

 

C# 2.0 Create Elegant Code with Anonymous Methods, Iterators, and Partial Classes
by Juval Lowy (MSDN Mag)

 

Charming Python: Iterators and simple generators - New constructs in Python 2.2
by David Mertz (developerWorks)

Wednesday, May 12, 2004 5:41:44 PM (Eastern Standard Time, UTC-05:00)
Hi There,
This is a very interesting post - this ofcourse is all on Whidbey, how about as a follow up to this on how iterators in Whidbey are different than the way Java imeplements them?

I dunno, if you were at the PDC or not, but this was one of the interesting coversations that had come up there with one of the archietects of the CLR.

Amit.
Thursday, May 13, 2004 8:53:43 AM (Eastern Standard Time, UTC-05:00)
If I am wrong here I'll take it back, but my minimal understanding of Java says that Java simply doesnt have iterators. I looked here:
http://java.sun.com/features/2003/05/bloch_qa.html
and here:
http://java.sun.com/developer/technicalArticles/releases/j2se15/

What Java has is comparable to the enumrators that C# had in v1.0.
That is to say, Java has this:

ArrayList< Integer> list = new ArrayList< Integer>();
for (Iterator i = list.iterator(); i.hasNext();) {
Integer value=(Integer)i.next();
}

This basically involves getting an iterator object and asking the object for more items. This is similar to what we used to do with the IEnumerable. Java now has an improvisation on having to write the above sort of loop by providing a 'enhanced for loop', which is:

ArrayList< Integer> list = new ArrayList< Integer>();
for (Integer i : list) { ... }

Looks nice doesnt it? What it actually does is that is simply wraps up the clunkier syntax of the earlier loop, when it comes to iterators. This is similar to the foreach loop we had in C# 1.0 where we would have written
foreach(int i in list) { ... }

The language however does not seem to have the ability to write itertors the way C# or Ruby has. It cannot do yield and such things yet. But considering the way Sun is getting serious about improving the JVM and the Java language technically, I would expect that they have such constructs as part of the language in due time.




Roshan
Friday, May 14, 2004 9:59:12 AM (Eastern Standard Time, UTC-05:00)
Roshan,
You are correct, I was not elluding that java does it better or anything, but rather to the fact as a comparison in an otherwise good post.

Basically (and as you say it), the current imeplementation of iterators is all "compiler magic", you can look at it how ATL does things - fancy macros where the compiler does the heavy lifting. In the JVM, there is no "concept" of an iterator - which is what the fundamental difference is when compared to whidbey, where the IL has the notion of an iterator and keeps its "meaning".

Amit.
Friday, September 23, 2005 9:06:31 PM (Eastern Standard Time, UTC-05:00)
One of the limitation of the current approach is that you can put a yield inside a try/catch inside a iterator, because you can jump into a try block (not veribiable). The C# compiler still manages to render a using statement because it "renders" differently this pattern (and remember enumerators are disposable in 2.0).
Wednesday, March 08, 2006 12:47:44 AM (Eastern Standard Time, UTC-05:00)
What about Part 1-4, non of the links are working??
Sanjay
Friday, August 04, 2006 1:53:49 AM (Eastern Standard Time, UTC-05:00)
Hi There,
This is a very interesting post - this ofcourse is all on Whidbey, how about as a follow up to this on how iterators in Whidbey are different than the way Java imeplements them?

I dunno, if you were at the PDC or not, but this was one of the interesting coversations that had come up there with one of the archietects of the CLR.

Amit.
Thursday, March 08, 2007 10:01:23 PM (Eastern Standard Time, UTC-05:00)
let's get it right...

while (counter++ < exponent)
{
result = result * number;
yield return result;
}

http://msdn2.microsoft.com/en-us/library/9k7k7cf0.aspx
glen
Name
E-mail
Home page

Comment (Some html is allowed: a@href@title, strike) where the @ means "attribute." For example, you can use <a href="" title=""> or <blockquote cite="Scott">.  

Enter the code shown (prevents robots):

Live Comment Preview