Tuesday, April 20, 2004

An old Time magazine article on Bill Gates:
In Search of the Real Bill Gates

http://www.time.com/time/gates/cover0.html

 

A write up on Ruby by Matz himself:
The Ruby Programming Language

http://www.informit.com/articles/article.asp?p=18225&seqNum=2
The following is the outcome of a simple longest-word search program over
      /usr/share/dict/words (409067 bytes). These were tested on my Pentium-200MHz
      Linux machine.

Program    Lines    Seconds
Ruby       14       1.046
Perl       15       0.593
Python     16       5.001

As stated before, Ruby is a bit slower than Perl because of the overhead
      for method searching; however, it's much faster than Python.

 

 

The Groovy programming language, reputed to be a lot like Ruby:
http://groovy.codehaus.org/

Groovy is a new agile dynamic language for the JVM combining lots of great features from languages like Python, Ruby and Smalltalk and making them available to the Java developers using a Java-like syntax.

(Groovy Entry)

 

Codehaus

Finally, a project site that encourages the commercial use of its projects

http://codehaus.org/

The Codehaus differentiates itself from other similar efforts in several ways. The Codehaus places a firm priority on the production of useful code, and less on non-coding exercises such as voting, committee-forming and proposal-writing. Each project is provided autonomy to organize as it wishes and to address its own customer concerns and requirements directly. Codehaus is not entirely open to any and all projects. Projects must be sponsored or introduced through an informal manner by an existing haus-member and deemed to be "interesting".

 

Codehaus aims to support commercially useful projects, and thus does not sponsor or assist with projects licensed under the GPL or other business-hostile licenses.

 

 

Structure and Interpretation of Computer Programs (SICP)

If you are a fresher, are just getting started with programming, or simply like thinking about programming, I highly recommend reading SICP.

http://mitpress.mit.edu/sicp/

"I think that it's extraordinarily important that we in computer science keep fun in computing. When it started out, it was an awful lot of fun. Of course, the paying customers got shafted every now and then, and after a while we began to take their complaints seriously. We began to feel as if we really were responsible for the successful, error-free perfect use of these machines. I don't think we are. I think we're responsible for stretching them, setting them off in new directions, and keeping fun in the house. I hope the field of computer science never loses its sense of fun. Above all, I hope we don't become missionaries. Don't feel as if you're Bible salesmen. The world has too many of those already. What you know about computing other people will learn. Don't feel as if the key to successful computing is only in your hands. What's in your hands, I think and hope, is intelligence: the ability to see the machine as more than when you were first led up to it, that you can make it more."

 

Alan J. Perlis (April 1, 1922-February 7, 1990)

Tuesday, April 20, 2004 7:56:35 AM (Eastern Standard Time, UTC-05:00)
 Monday, April 19, 2004

Last night we did it again.

We went to see a movie (50 First Dates) and came home feeling a little giddy. I was already feeling a little giddy before the movie, after nearly having my head ripped off sitting on a Torra Torra at a fair in Bangalore.

 

So after the movie and the drive back home, what do we decide to do, like the nice normal people we are? We decide that we need to drink coffee at 12am and discuss programming. So we head off to Leela Palace where there is a late night Barista.

 

Something about the way coffee affects my head when drunk late at night, especially after a movie, needs some investigation. Sidharth was my comrade in arms, or rather comrade in coffee. So what do we do? We go there, sit down and drink coffee, and I start off on SICP (Structure and Interpretation of Computer Programs), which I have been postponing for several years now.

 

I think part of why I was so adamant about starting out on SICP in the middle of the night is that I feel life (like usual) isn’t going anywhere. It turns out that a lot of smart people at various universities decided that I wasn’t smart enough to warrant a formal higher education in Computer Science, and the place I want to be the most doesn’t seem to want me around because of some technicality (for the fifth time). So since life wasn’t going anywhere, I figured I’d just have to teach myself the things I want to know, my own way.

 

A little fast-forward in time and what finally ends up happening is that Sidharth and I end up talking about a certain MSDN article.

Implementing Coroutines for .NET by Wrapping the Unmanaged Fiber API

http://msdn.microsoft.com/msdnmag/issues/03/09/CoroutinesinNET/default.aspx

We ended up in a rather (heated) philosophic discussion about how iterators could be implemented, till 4am, which is what this blog entry is about.

 

If you have been reading about iterators in my previous blog entries

Iterators in Ruby (Part - 1)

Warming up to using Iterators (Part 2)

Then the idea is probably growing on you already. What Sidharth and I did is put in some thinking about how iterators could be implemented. This entry is going to break the logical flow of these two articles, but I am letting it be. I will probably have a part 3 post that will bridge the gap between Parts 1 and 2 and what I am going to say here about iterators.

Also, like a lot of things on this blog, I am not an authority on the subject so I am just guessing at how these things actually work.

 

 

Iterators

The thing about iterators is that two functions are involved, and both have to maintain execution state at the same time. In an ordinary call, for example, the caller is frozen while the callee executes, so only the caller has to maintain its execution state during the run time of the callee.

 

def callee

      yield 1

      yield 2

      yield 3

end

 

def caller

      callee { |n|

            # parameter block to the iterator

            puts n

      }

end

 

When the callee is an iterator, control actually leaves the callee and returns to the caller whenever execution is in the parameter block of the iterator. However, we don’t see this sort of behavior in a normal C stack. Why? Because when a function on the C stack returns to its caller, the function’s activation record on the stack is destroyed.
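The hand-off described above can be made visible by logging each step. This is only a rough sketch: the trace array and the name run_caller are my own additions for illustration, not part of the snippet above.

```ruby
# Records the order in which the iterator and its parameter block run.
def callee(trace)
  [1, 2, 3].each do |n|
    trace << "callee: about to yield #{n}"
    yield n                          # control passes to the caller's block here
    trace << "callee: resumed after #{n}"
  end
end

# Named run_caller (hypothetical) to avoid clashing with Ruby's Kernel#caller.
def run_caller
  trace = []
  callee(trace) { |n| trace << "block: got #{n}" }
  trace
end

puts run_caller
```

Each "block: got n" line lands between the callee's "about to yield" and "resumed after" lines, so the callee's loop state clearly survives every trip into the caller's block.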

 

How do we do this?

The approach in the MSDN article uses an API called the fiber API.

 

Fiber Approach

Fibers can be thought of as threads that don’t have a scheduler attached to them. Unless a fiber is explicitly passed control it will not be executed, unlike a thread, which is invoked by the scheduler for a time slice.

 

What Ajai Shankar (the author of the MSDN article) does is use fibers to represent iterators. So in the above snippet, the function callee() would actually execute on a different fiber from caller(). When control needs to shift to the parameter block, which is to be executed in the caller() function, a fiber switch occurs.

 

When the parameter block has finished executing, a context switch occurs again.
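To get a feel for the explicit switching, here is a small sketch using Ruby's own Fiber class (Ruby 1.9 and later) as a stand-in for the Win32 fiber API that the article wraps. The helper names make_callee and drain are hypothetical, and this is not the article's code.

```ruby
# The callee runs on its own fiber and only executes when explicitly resumed.
def make_callee
  Fiber.new do
    Fiber.yield 1   # suspend this fiber and hand 1 to whoever resumed it
    Fiber.yield 2
    Fiber.yield 3
    nil             # the fiber body's final value signals "no more values"
  end
end

# The caller side: every resume is an explicit switch into the callee.
def drain(fiber)
  values = []
  while (v = fiber.resume)
    values << v     # this is where the caller's parameter block would run
  end
  values
end

p drain(make_callee)
```

Nothing schedules the fiber behind the caller's back. It runs only between a resume and the next Fiber.yield, which is the property the fiber approach relies on.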

 

What further happens is that the author has wrapped up all this dirty jumping around into a managed C++ class that invokes the OS API. He then goes on to write C# code (really!) that uses yield, almost the same way Ruby would use it.

 

(pasted)

class CorIter {

    public void Next() {

        object[] arr = new object[] {1, 2, 3, 4};

        for(int ndx = 0; true; ++ndx)

            Yield(arr[ndx]);

    }

}

 

If you get the general idea, then let’s move on.

 

The problems with using the fiber API, among others, are:

·         Every fiber is like a thread, which means that the more iterators there are, the more fiber-specific stacks and the like get created, and the more bloat there is for code like this.

·         Using the fiber API makes this a very OS-specific solution. Other OSes that the CLR may wish to target may not have provisions for building such an API.

·         Exceptions: exceptions in the Windows world are tied to the TLS (Thread Local Storage) of the thread of execution, so they may behave rather oddly when fibers are mixed into the picture.

 

Let’s ignore everything else and just examine the first problem, the issue of creating separate stack frames per fiber and thus bloating the system. If we could solve this one, then I think (and I might be wrong) it would lend more credit to this approach.

 

Wrapping State in a Caller Object

One other approach to supporting iterators is to ensure that one of the two functions (the caller or the callee) maintains its state using some mechanism other than the C stack.

 

Let’s take a look at the caller:

 

def caller

      callee { |n|

            puts n

      }

end

 

or maybe a C# equivalent.

 

void caller()
{

      foreach(int n in callee())

      {

            Console.WriteLine(n);

      }

}

 

This method can actually be thought of as consisting of three parts:

 

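void caller()
{

      // part 1: everything before the iterator is invoked

      foreach(int n in callee())

      {

            // part 2: the code block

            Console.WriteLine(n);

      }

      // part 3: everything after the loop

}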

 

We could create an object to hold the state of the function that would hold these three parts. Something like this:

 

class caller_object

{

      // declare all the local variables of caller() as member variables here

      void do_part1()

      {

      }

      void do_codeblock() // part 2

      {

      }

      void do_part3()

      {

      }

}

 

The idea is that we create an object whose member variables represent the local variables of the caller. We then execute the caller as three parts:

 

void caller()

{

      caller_object co = new caller_object();

      co.do_part1();

      callee(co);

      co.do_part3();

}

 

The caller method now is simply a wrapper around the class that represents the caller function as an object. When the method do_part1() is called on the class, the object will have the same state as the original caller() function when it has just run till the point where the iterator is invoked.

 

Then the callee() is invoked and the object that represents the caller’s state is passed to the callee. The callee then goes on to invoke the object’s do_codeblock() every time a yield is required.

 

Since the callee never returns till it has completed execution, it maintains its state on the runtime stack, like a normal function. The do_codeblock() method has the same code that the code block of the foreach loop had, and it can also record any state changes in the object. Finally, when callee() exits, the object’s do_part3() is invoked.
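Put together, the scheme might look like the following Ruby sketch. The class name CallerObject, its @log and @count variables and the run_caller wrapper are all hypothetical, made up here to give the three parts something to do.

```ruby
# The caller's local variables become instance variables, so they survive
# between the three parts.
class CallerObject
  attr_reader :log

  def do_part1            # code that ran before the iterator was invoked
    @log = []
    @count = 0
  end

  def do_codeblock(n)     # body of the original loop (part 2)
    @count += 1
    @log << n
  end

  def do_part3            # code that ran after the loop
    @log << "seen #{@count} values"
  end
end

# The callee stays an ordinary method with its state on the normal stack.
# Every "yield" becomes a plain method call on the caller object.
def callee(co)
  co.do_codeblock(1)
  co.do_codeblock(2)
  co.do_codeblock(3)
end

def run_caller
  co = CallerObject.new
  co.do_part1
  callee(co)
  co.do_part3
  co.log
end

p run_caller
```

No stack tricks are needed here, but the price is visible too: a whole class exists only to carry the caller's locals.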

 

This is similar to what the iterators accomplish. Here the state is stored in an object and not on the stack. However, a full managed type that represents the caller has to be created. I didn’t like that too much.

 

Wrapping State in a Callee Object

This is similar to the above approach, except that roles are reversed. We create an object that can represent the callee. The callee then returns to the caller at every yield statement.

 

The callee state is maintained in the object representing it. There is an excellent write up you can read about a similar approach here:

Coroutines in C

http://www.chiark.greenend.org.uk/~sgtatham/coroutines.html

 

The idea there is that the state of the function is retained in a state variable. The state variable is used to jump back to the point at which the function previously yielded. Code would look a little like this:

 

(pasted)

int function(void) {

    static int i, state = 0;

    switch (state) {

        case 0: /* start of function */

        for (i = 0; i < 10; i++) {

            state = 1; /* so we will come back to "case 1" */

            return i;

            case 1: /* resume control straight after the return */

        }

    }

}

 

Now this example uses static variables, but it is easy to imagine it being extended so that each variable is a member of some object.

 

(pasted)

It's a little bit ugly, because suddenly you have to use ctx->i as a loop counter where you would previously just have used i; virtually all your serious variables become elements of the coroutine context structure. But it removes the problems with re-entrancy, and still hasn't impacted the structure of the routine.
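The same state-variable trick, with the variables moved into an object, could be sketched in Ruby like this. CalleeObject, next_value and collect are names I made up for the illustration. Ruby has no switch fall-through into a loop body, so the resume point is modelled with an explicit @state symbol instead.

```ruby
# The loop counter and the resume point live in instance variables instead of
# static variables or stack slots, so each object is an independent iterator.
class CalleeObject
  def initialize(limit)
    @limit = limit
    @i = 0
    @state = :start
  end

  # Returns the next value, or nil once the sequence is exhausted.
  def next_value
    case @state
    when :start
      @state = :looping     # next call resumes inside the loop
    when :looping
      @i += 1               # resume "straight after the return"
    when :done
      return nil
    end
    if @i < @limit
      @i
    else
      @state = :done
      nil
    end
  end
end

# Drains an iterator object into an array.
def collect(gen)
  values = []
  while (v = gen.next_value)
    values << v
  end
  values
end

p collect(CalleeObject.new(10))
```

Because nothing is static, two CalleeObject instances can be advanced independently, which is the re-entrancy point the quoted passage makes.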

 

(Kudos to Pooja, for coming up with this idea at one sitting).

 

 

 

When C# announced the coming of iterators in the language and a new yield keyword, I was excited. In the mood of the MSDN coroutines article, I had expected CLR-level support for iterators.

 

It turns out that the C# team’s approach is similar to saving the callee state in an object. (I am not very sure whether it’s the caller or the callee; if I am wrong in assuming that it’s the callee, which seems the more logical choice, I will blog about it.)

 

In the Coroutines in C article, the author talks of writing macros that wrap up the behavior. In the case of C#, since the compiler creates the temporary object and hides all the mess from you, it seems like a reasonable alternative.

 

 

A modified form of the Fiber API idea

The reason I don’t really like the way C# does iterators right now is that it is a hack. They did not want to change the CLR for a feature that may not catch on, so I guess they used a less expensive approach. If I am wrong, I would like to be corrected. I would expect more serious CLR-level support for iterators to come up if the ideas introduced in Whidbey C# become popular.

 

The other reason I don’t really like the approach, the real reason, is that the .Net type system is a fairly comprehensive type system designed to propagate an idea of types as a level playing field for language agnostic components to interact. Introducing a type into the system just to retain a function’s state does not seem consistent with this philosophy.

 

Fibers, on the other hand, lend themselves more naturally to the way I would choose to think of iterators: as functions that can be frozen during execution and continued later.

 

Now this might seem like a weak argument, but it seems better to use the processor’s ability to do a context switch to actually freeze the execution of a block of code than to write that code so that it manages members of an object (only so that the object can retain the state of the code).

 

The fiber-API-like approach seemed to do this more naturally. I would expect that the CLR will in future provide some internal API similar to OS-provided fibers, so that it can do iterators and closures and probably even continuations.

 

Some basic requirements would be that implementing such features doesn’t slow down the execution of code that doesn’t require any of them, and that the features themselves are reasonably efficient in both time and space.

 

Let me try to discuss the space issues here. With the fiber API, a totally new, independent stack has to be created for each fiber. This is wasteful.

 

Would it be possible to have a modified API that behaves like fibers, shares stack space with the common C stack, and uses the processor’s context-switching abilities to freeze function execution, rather than saving state in a managed object?

 

A little bit of brainstorming last night and we had this:

 

In the .Net world, we have the luxury of being able to predict the stack usage of a function under execution with IL directives like “.maxstack”. Which is to say - we know how much space the function will use on the managed stack.

 

The stack frame for regular method calls would look like this:

  

 

This is obvious to anyone who understands how methods are laid out on the stack. The only advantage we have here is that in the .Net world we know exactly how much stack space a given method will use.

 

Now if the method calls an iterator that has a yield, we create a fiber, but a special sort that uses the main stack itself for its stack frame. So the newly created method instance (the iterator itself) will reside on the call stack, above the caller.

 

 

Now the usual semantics of stack usage are allowed on this fiber. The fiber behaves like any other thread would, owning the stack. To allow methods to keep track of their callees, we add a reference to the activation record of the callee.

 

  

 

The interesting part comes when the iterator needs to yield a value. When it does, control is switched back to the original fiber. The activation record of the iterator is still maintained on the stack. Further method calls, however, place their activation records above the iterator’s activation record and behave as though this were a normal C stack.

 

 

Thus I think it is possible to use fiber-API-like constructs to implement iterators, share stack space and still have a reasonably efficient implementation. The only real overhead introduced here is a level of indirection when activation records are torn down from the stack.

 

I feel that this is a more coroutine-like approach than the one that involves creating hidden managed objects.

 

I would like to think that this idea can be extended to implement proper continuations as well, but that is not very easy. Here the stack management is very easy because at any point a sleeping fiber will contain only one activation record on the stack. A continuation will require that activation objects live and die on the managed stack as though they were proper objects, and some sort of garbage collection routine will be required on the stack.

 

I am extremely open to opinions about this entry, because I am treading on many areas that I am not very well versed in. I am hoping that the idea of freezing execution state via fiber-like constructs is more efficient than the approach that involves creating full managed objects.

 

Monday, April 19, 2004 7:48:40 AM (Eastern Standard Time, UTC-05:00)
 Saturday, April 17, 2004

Yesterday I was giving a talk at the Bangalore User Group. There was one person in the crowd, whom I have noticed before too, who was actually listening to the talk. Listening in the sense of trying to understand how things work.

 

At least that is what I hope, because somewhere in his/her eyes there was this interest in computer science, not just in knowing what the technology can do for you, but actually caring about technology for the sake of the science that it carries.

 

 

I get to talk to a lot of professional programmers through various technical communities, and one thing I feel is that there are very few people, in most gatherings none, who really care about computing. There are folk who are passionate about their one patch of grass: some about all the software that Microsoft writes, some who will hate everything that MS writes simply because it is written by MS. Some who will praise anything that is Java, some who will praise anything that is Linux, some who will praise anything that is GNU, some who will praise anything that is Open Source. And others who will hate the same software for the very reasons that someone else likes it.

 

This dichotomy bothers me. You actually feel a little lonely in talks, when you are standing there scanning the eyes of your audience for someone who seems to understand computing, someone who seems to care about computing.

 

And you see that glimmer occasionally in people’s eyes when they pursue something and understand what it means, but that glimmer dies off very fast for most.

 

And then again occasionally you see someone in the audience who you think is genuinely interested. Not interested for the sake of fitting it into their own value perceptions (though we are all that way at some level), not interested because they want to yes-sir the speaker, but interested because they really care at some level.

 

 

The smart ones argue back for what is right, will completely refine their views if you are right, and expect you to do the same if they are. The smart ones.

 

 

And what is good, Phaedrus,

And what is not good,

Need we ask anyone to tell us these things?

Saturday, April 17, 2004 4:39:44 AM (Eastern Standard Time, UTC-05:00)

Yesterday I delivered a talk at the Bangalore .Net User Group.

 

I gave a small introduction to the concept of the Windows Scripting Host and how the scripting host allows for multiple language interpreters to be plugged in and run via the same scripting interface. I also talked a little about the object model that is exposed to scripts under windows. Most of the time I was comparing this to the general notion of shell script in Unix, and how in the windows culture the need for small discrete utilities strung together by shell-script is replaced by having a language agnostic object model and API in windows scripting world.

 

The talk then went on to introduce WMI, the Windows Management Instrumentation API. I went on to talk about how the API works, the class and namespace architecture, and how to do reflection in the WMI API. The sheer power of this API, and how little understood and appreciated it is in Windows, has always bothered me. Probably I will blog about WMI itself another day.

 

The talk finally wrapped up by showing off some basic C# code that can use WMI. Since there were several queries about the samples and the presentation, I thought I would put them up here.

 

Here is the presentation for the talk.

Windows Scripting Host.ppt

 

Code snippets:

To run these on your machine, copy them and save them under the corresponding file names. Then go to the command line and type:
cscript 

 

wshbasic.vbs

This displays all the running processes on your machine

 

set serv = GetObject("winmgmts://./root/cimv2")

set objs = serv.InstancesOf("Win32_Process")

for each obj in objs

      wscript.echo obj.name

next

 

 

wshprocess.vbs

This shows more information about each process by accessing some more properties of the process object

 

set serv = GetObject("winmgmts://./root/cimv2")

set objs = serv.InstancesOf("Win32_Process")

for each obj in objs

      wscript.echo obj.name &" "& obj.ProcessId &" "& _

            obj.ThreadCount &" "& obj.KernelModeTime

next

 

wshproperties.vbs

This shows how you can find out which properties the process has. This is generally applicable to any class, not just to Win32_Process.

 

set obj = GetObject("winmgmts://./root/cimv2:Win32_Process")

wscript.echo "------------- Properties ----------------"

for each prop in obj.Properties_

      wscript.echo prop.name

next

 

wshmethods.vbs

And this shows you how you can list the methods of an object. You could easily combine these two scripts into one that shows more or less complete information about an object.

 

set obj = GetObject("winmgmts://./root/cimv2:Win32_Process")

wscript.echo "------------- Methods ----------------"

for each prop in obj.Methods_

      wscript.echo prop.name

next

 

killprocess.vbs

This shows you how to call a method on an object. This example will close all the IE browser instances on your computer. The earlier script, wshmethods.vbs, would show you that a Terminate() method exists for the Win32_Process class.

 

set serv = GetObject("winmgmts://./root/cimv2")

set objs = serv.InstancesOf("Win32_Process")

for each obj in objs

      if obj.name = "IEXPLORE.EXE" then

            obj.Terminate()

      end if

next

 

wqlkillprocess.vbs

This shows off the usage of WQL, the WMI Query Language, in interacting with WMI. It does the same task as the above script.

 

set serv = GetObject("winmgmts://./root/cimv2")

set objs = serv.ExecQuery("select * from Win32_Process where name = 'IEXPLORE.EXE'")

for each obj in objs

      obj.terminate()

next

 

classes.vbs

This shows you how to know what other classes exist. Notice all the above examples used only the Win32_Process class. WMI provides hundreds of classes that you can use for various tasks. It is a fairly broad API. This shows you the classes that exist. You can then find out more about each class by examining its methods and properties using the scripts I have given earlier.

 

set serv = GetObject("winmgmts://./root/cimv2")

set results = serv.SubClassesOf()

for each result in results

      wscript.echo result.Path_

next

 

recclasses.vbs

In the WMI API, classes are organized into namespaces, and the above script shows only the classes available in the root/cimv2 namespace. This script will recursively traverse the namespace hierarchy and list all the classes available under each namespace. Neat?

 

dispNS ""

 

sub dispCLS(ns)

      Dim serv, classes, classobj

      Set serv = GetObject("winmgmts:\\.\root" & ns)

      Set classes = serv.SubclassesOf()

      For Each classobj In classes

            wscript.echo vbTab & classobj.Path_.Path

      Next

end sub

 

sub dispNS(ns)

      Dim serv, namespaces, namespace, child

      Set serv = GetObject("winmgmts:\\.\root" & ns)

      Set namespaces = serv.InstancesOf("__NAMESPACE")

      For Each namespace In namespaces

            ' ns carries a leading "\" so it can be appended to "root" directly

            child = ns & "\" & namespace.name

            wscript.echo "Classes in Namespace = root" & child

            dispCLS child

            ' recurse so that nested namespaces are listed as well

            dispNS child

      Next

end sub

 

That’s it for the script samples. Try running them, you might be surprised. You can look at MSDN for further documentation of the WMI classes, though not all are documented. A couple more tips:

 

If you want to connect to another computer rather than the local one, change the GetObject() call in the above scripts to GetObject(“winmgmts:\\”). Specifying “.” makes the script connect to the local computer.

 

To run a WMI script, the script needs to be run under admin privileges. Understandable considering the powerful things that these scripts can do. If you are not running as admin on your machine (which you shouldn’t be), you can use impersonation as shown in the code below.

 

I got this from:

http://www.activexperts.com/activmonitor/windowsmanagement/wmi/samples/wmiremote/

 

Sub ListShares( strComputer, strUser, strPassword )

    Dim strObject

    Dim objLocator, objWMIService, objShare

    Dim colShares

 

    Set objLocator = CreateObject( "WbemScripting.SWbemLocator" )

    Set objWMIService = objLocator.ConnectServer ( strComputer, "root/cimv2", strUser, strPassword )

    objWMIService.Security_.impersonationlevel = 3

    Set colShares = objWMIService.ExecQuery( "Select * from Win32_Share" )

        For Each objShare In colShares

            Wscript.Echo objShare.Name & " [" & objShare.Path & "]"

    Next

End Sub

 

Now finally the C# source; it’s easy to guess what this does:

 

using System.Management;

using System;

 

class CMain

{

        static void Main()

        {

                string qs = "select * from Win32_process";

                ObjectQuery q = new ObjectQuery(qs);

                ManagementObjectSearcher sr = new ManagementObjectSearcher(q);

                foreach(ManagementObject obj in sr.Get())

                {

                        Console.WriteLine(obj["Name"]);

                }

               

        }

}

 

I also have a do-all, end-all killer Perl script that does almost everything that you can think of with WMI. I will however make a post about that another day. I have an entry about this here

Saturday, April 17, 2004 4:38:49 AM (Eastern Standard Time, UTC-05:00)
 Friday, April 16, 2004

Yesterday Sajith mailed me this. It is a flash version of the old Prince of Persia game - ver 1.0.
Try it, it’s fun.

Which reminded me of the version I had done:
http://www.thinkingms.com/pensieve/homepage/old_work/prince_of_persia.htm

There were a couple of neat things my Prince did, such as having steps and complex net-like patterns that the prince could run behind (the original prince could never be covered by any complex surface). I have a couple of screenshots to prove my point.

 

Friday, April 16, 2004 4:54:16 AM (Eastern Standard Time, UTC-05:00)

In the past article Iterators in Ruby (Part - 1) I talked about the concept of iterators and how iterators are available in Ruby. In this part I will dwell on how iterators are used, so that the concept grows on you.

 

For someone used to the C/C++ world, the constructs provided by those languages suffice to express any idea of their choosing. While that is true, programming in a C-like language causes us to close our minds to other styles of programming and other constructs that might exist. Programming in C after a point is about writing the next big program optimized with lots of data-structure usage and trying to tie a new algorithm down into C. Sometimes the joy of programming, where the language lets you do your job - express ideas as code, is lost. Sometimes we spend our time servicing our language syntax and spoon-feeding our compilers. The fact that languages might actually evolve so that you can get on with your job, was alien for a long time to me.

 

If you have read the first part you might be wondering how iterators are used in Ruby. Admittedly, the idea would seem a little complex and maybe contrived to the uninitiated.

 

In Ruby, iterators are used pervasively. They are all over the place, and once you get started on Ruby you will probably end up using an iterator without realizing that you are using one. The Ruby libraries are rich with iterators of various sorts.

 

Simple Loops

When you start off on Ruby code, you might see loops of the sort:

 

10.times {

      print "hello world"

}

 

This, as you might expect, prints ‘hello world’ 10 times.

 

How does this work? Ruby is a pure object-oriented language. The number 10 is an integer, and the Integer class exposes a method called ‘times’. The times method is an iterator that yields the values from 0 up to one less than the integer itself.

 

Since it yields values, can we catch them? Yes.

 

10.times {|n|

      print n

}

 

And this prints all the values from 0 to 9.
'times' is an iterator.
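To take some of the mystery out of it, here is a rough guess at how such a method could be written with yield. The name my_times is hypothetical, and this is not how the Ruby interpreter actually implements Integer#times.

```ruby
# my_times mimics the behavior of Integer#times with a plain loop and yield.
def my_times(count)
  i = 0
  while i < count
    yield i        # hand the current index to the block
    i += 1
  end
  count
end

my_times(3) { |n| puts n }
```

The loop counter i stays alive inside my_times while the block runs, which is exactly the "two functions holding state at once" situation that makes iterators interesting.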

 

File Handling

Let’s look at some file handling in Ruby. The following code will open a file, read each line of the file and print the line along with its line number.

 

file = File.new("filename.txt")

c = 0
file.each_line {|line|

      c = c + 1

      print "#{c}: #{line}"

}

 

The code is simple. I open a file and create a file object. I ask the object to yield each line to me. As I get each line I print it out along with the line number. This is as logically expressive as I have seen in any language that I have used. All the mess stays out of your way and you get to focus on the job at hand.

 

each_line is a method of the File class and it yields each line in the file. The variable ‘line’ will hold the value of each line. Slick?

 

(If you are wondering what "#{c}: #{line}" means: in a string, #{ } is a substitution. You can write any expression inside the curly braces. Here the values of c and line get substituted into the string.)

 

Arrays / Collections

Similarly collection types expose an “each” method which yields every member of the collection. So if I had to iterate over an array I would write:

 

array = [1,2,3,4]

array.each {|m| puts m }

 

The above code creates an array of 4 elements and accesses each element using the iterator “each”.

 

In similar fashion, a lot of the Ruby library exposes functionality as iterators. So much so, that I rarely write for loops in Ruby.
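Your own classes can join in as well. As a small sketch (Countdown is a made-up class, not something from the Ruby library), defining each and including the standard Enumerable module is enough to get map, select, to_a and friends:

```ruby
# A hypothetical collection that counts down from a starting number to 1.
class Countdown
  include Enumerable   # builds map, select, to_a, ... on top of each

  def initialize(from)
    @from = from
  end

  # The one iterator we write ourselves; it yields from, from-1, ..., 1.
  def each
    @from.downto(1) { |n| yield n }
  end
end

c = Countdown.new(3)
c.each { |n| puts n }
p c.map { |n| n * 2 }
```

Everything Enumerable offers is driven by that single each method, which is a big part of why iterators show up everywhere in Ruby code.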

 

Recursive Directory Enumeration

Now let us try to write code of our own. Something you may all have written is code that finds all the text files in a folder and its subfolders. The usual approach is to write a recursive function.

 

The function will try to remember a list of the text files in the current directory and the list of subdirectories it has. It will then recursively call itself on each of the subdirectories, each call doing the same task. The problem is that if some processing has to be done every time a text file is found, things get very complicated. The usual approach is to find all the text files and create a big list of filenames, which is then processed later.

 

Here is an approach with iterators. Try and implement this in your favorite language that does not have iterators and see how it looks.

 

def textfiles(dir)

        Dir.chdir(dir)

        Dir["*"].each do |entry|

                yield dir+"\\"+entry if /^.*\.txt$/ =~ entry

                if FileTest.directory?(entry)

                        textfiles(entry){|file| yield dir+"\\"+file}

                end

        end

        Dir.chdir("..")

end

 

textfiles("c:\\"){|file|

        puts file

}

 

What the above code does is simple. I have defined a method called textfiles() that takes a directory name as a parameter.

 

The code looks exactly like you would explain it algorithmically.

  1. Go to the folder (chdir)
  2. Take a look at the contents (Dir["*"])
  3. See if an entry is a text file; if so, yield it (yield dir+"\\"+entry if /^.*\.txt$/ =~ entry)
  4. See if an entry is a directory; if so, recurse into it
    (
    if FileTest.directory?(entry)
       textfiles(entry){|file| yield dir+"\\"+file}
    end)
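The same four steps can also be sketched without changing the working directory, building paths with File.join so the separator is not tied to Windows. This is a variant, not the method above: the name each_text_file is made up, it assumes Ruby 2.5 or later for Dir.children, and it follows symbolic links to directories, so a link loop would recurse forever.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical variant of textfiles: yields the full path of every .txt file under dir.
def each_text_file(dir, &block)
  Dir.children(dir).sort.each do |entry|     # contents, without "." and ".."
    path = File.join(dir, entry)
    if File.directory?(path)
      each_text_file(path, &block)           # recurse, handing the same block down
    elsif entry.end_with?(".txt")
      yield path                             # hand the file to the caller right away
    end
  end
end

# Try it on a throwaway directory tree.
Dir.mktmpdir do |root|
  FileUtils.mkdir_p(File.join(root, "sub"))
  File.write(File.join(root, "a.txt"), "")
  File.write(File.join(root, "sub", "b.txt"), "")
  File.write(File.join(root, "sub", "c.rb"), "")
  each_text_file(root) { |file| puts file }
end
```

Passing the block down with &block means the inner call yields straight to the original caller, so no level has to re-yield the path.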

 

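Simple? Notice that the beauty of the code is that each yield passes a filename back up through the chain of recursive calls: every level re-yields what the level below it found, until the name reaches the block given by the original caller.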
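The re-yielding trick is easier to see without the file system in the way. Here is a minimal sketch on a nested array; the method name each_leaf and the sample data are made up for illustration.

```ruby
# Hypothetical example: yield every leaf of a nested array, depth first.
def each_leaf(node)
  if node.is_a?(Array)
    node.each do |child|
      # The inner call yields to this block, which re-yields to our own caller.
      each_leaf(child) { |leaf| yield leaf }
    end
  else
    yield node
  end
end

each_leaf([1, [2, [3, 4]], 5]) { |leaf| puts leaf }
```

Each leaf is processed the moment it is found, so there is no need to build the big list first.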

 

As a disclaimer, if you are using Ruby, then you might as well finish off in one line by saying:


Dir["**/*.txt"].each{|file| puts file }
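In the pattern, "**/" matches any number of directory levels, including none. A small sketch that runs the same pattern against a throwaway directory; the helper name text_files_under is made up, and the base: option assumes Ruby 2.5 or later.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical helper: .txt files under root, as paths relative to root.
def text_files_under(root)
  Dir.glob("**/*.txt", base: root).sort
end

Dir.mktmpdir do |root|
  FileUtils.mkdir_p(File.join(root, "sub"))
  File.write(File.join(root, "a.txt"), "")
  File.write(File.join(root, "sub", "b.txt"), "")
  File.write(File.join(root, "sub", "notes.md"), "")
  puts text_files_under(root)
end
```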

 

 

Friday, April 16, 2004 3:28:18 AM (Eastern Standard Time, UTC-05:00)  #    Comments [1]  | 

After smoothing out a few issues that dasBlog has, this blog is now functional. dasBlog requires that the user under whose permissions ASP.Net is running has write permissions to three folders: content, logs and siteconfig. On a win 2003 box ASP.Net runs as \NETWORK SERVICE.

So what you need to do, if you are setting up dasBlog, is to give that user write permissions to those 3 folders.

This is also a fine time to roll out some links:

Channel 9
Some folk at MS have dished out Channel 9. Channel 9 is where you can see into the big borg entity of MS and maybe come away with the feeling that they are not a big borg entity at all.
http://channel9.msdn.com/

IronPython: Python is being ported to .Net by Jim Hugunin.
He is the same person who developed Jython, the Java implementation of Python. .Net has for some time been considered a difficult platform to port dynamic languages such as Python and Ruby to. Ruby might be a tad more difficult because of all the tricks it does with continuations, closures, iterators and such.

http://www.hole.fi/jajvirta/weblog/20031210T0901.html
I'd guess that anyone who reads this weblog also reads Jeremy Hylton's weblog (which is in my opinion currently perhaps the best technical Python related weblog), but I still thought it was worthwhile to mention that the great Jim Hugunin has a new project, named IronPython, which is an implementation of Python for the Microsoft Common Language Runtime environment. The remarkable thing is that IronPython runs faster than the Python implementation in C according to the pystone benchmark. (See Hugunin's original message for full details.)

Miguel de Icaza, lead developer of the Mono framework, also comments on Hugunin's remark with delight and says that this might "stop the meme of '.NET is slow for scripting languages'".

Hugunin himself is busy for the whole January, but hopes to continue the development of IronPython after that.
Written by Jarno Virtanen at 2003-12-10 09:39

Miguel De Icaza and Nat Friedman go dancing(!) with Microsoft's CTO
http://primates.ximian.com/~miguel/archive/2004/Apr-12.html
This is a must see.

Electronic Intifada
I found this on Miguel's site and I wish more people cared.
http://electronicintifada.net/new.shtml
The Electronic Intifada (EI), found at electronicIntifada.net, publishes news, commentary, analysis, and reference materials about the Israeli-Palestinian Conflict from a Palestinian perspective. EI is the leading Palestinian portal for information about the Israeli-Palestinian conflict and its depiction in the media.

The Phoenix Research and Development Kit from MSR
This is one of the places where, I feel, the future is brewing. The Phoenix RDK is a language/compiler/runtime generation and research framework comparable in scale (from the little that I know) to the National Compiler Infrastructure (NCI) project.
The Phoenix RDK homepage: http://research.microsoft.com/phoenix/

Friday, April 16, 2004 12:02:50 AM (Eastern Standard Time, UTC-05:00)  #    Comments [3]  | 
 Thursday, April 15, 2004

Very soon I should have a blogging engine of my own up and working. I got a copy of DasBlog and with a bit of tweaking it seems to suit me rather fine. I would however like to see

·         Hierarchical comments

·         Ability to delete comments without getting into XMLs

·         Enabling description views only on certain aggregate views.

·         Where is the archives feature?

 

The blog should be going up on www.thinkingms.com, a site that is run by Pandu. The only problem is that I don’t seem to be thinking MS all the time :-)

 

Until the blog is formally up I guess you will see me manually add the date at the end of each entry.

 

(To the tune of Jingle Bells)

blogging site blogging site

blogging all the way,

oh what fun it is to send

an entry on its way ... hey !

Apr 15 2004 Thursday 12-23PM

Thursday, April 15, 2004 1:55:29 AM (Eastern Standard Time, UTC-05:00)  #    Comments [0]  |