Friday, July 20, 2007

As an indirect result of some of conversations in the past few days at Google and Microsoft, I was left thinking that I need a better way to explain to someone why basic research of languages and language constructs is necessary. This crops up every now and then, and it would be useful if I had a simple answer to this. I think I have an approximate answer now, what do you think:

I think basic research about language constructs is necessary because it helps separate concerns. It separates of the quality of the abstraction that the construct provides from the details of the implementation of the construct. In this way one can study and construct and make observation about how valuable it would be as a tool for expressing computation. Once we have a great abstraction, we can then focus our attention on the best possible implementation of the abstraction or some select component of it, now with a clear understanding of the expressive power of the end result.

As an example our present notions of concurrent programming with threads and locks/semaphores is a broken idiom - they are hard to program with, are brittle and don’t compose well. The notion that it would be nicer to write concurrent programs as memory transactions came about and gained favor because the quality of the abstractions that memory transactions provided was far greater than that of locks and semaphores. By quality I mean that memory transactions compose far better, provide guarantees about progress, fairness etc and are not prone to the sorts of pitfalls that lock-based approaches suffer from. This view of transactions matured into a set of key combinators that gave us "composable" memory transactions. Once the idea that "memory transactions are good" gained ground it spurred a cottage industry of people trying to efficiently implement these abstractions.

The underlying point is that the search for abstractions to represent computation is not hindered by details of implementation. What we refer to as an implementation is only the expression of an idea in terms of other abstractions that previous languages have imposed on us. Limiting ourself to the scope of these, baises us against ideas that dont appear natural to them. I like to think of this as a form of "the evil of premature optimisation" at the level of ideas.

Friday, July 20, 2007 12:31:34 AM (Eastern Standard Time, UTC-05:00)  #    Comments [0]  | 
 Wednesday, July 04, 2007

One of the things that I have been working on in my free time is a compiler for a subset of the Scheme programming language. This compiler is written in Scheme and is a small, relatively fast compiler.

Languages and Usage

It supports a very basic core language:

  1. fixed size integers
  2. vectors
  3. scheme pairs (via cons)
  4. quoted constants
  5. let, letrec, top level define
  6. booleans, if
  7. set! (mutable variables)
  8. lambdas, higher order functions

The core language that it supports is a higher order programming language (with first class functions) and proper tail calls.

The compiler takes input in the supported subset of scheme and emits x86 assembly code. Here is the "life-cycle" of a compilation: The emitted assembly has entry point into scheme, a sort of scheme_main(). To use the compiler one needs to compile the generated assembly into .obj using any intel-syntax-compatible assembler - I use ml.exe, which is Microsoft's assembler. The .obj then needs to be compiled and linked against a minimal bootstrap C source file which essentially allocates some amount of memory from the OS and them jumps into the scheme code. I use cl.exe, Microsoft's C++ compiler, for this purpose. The assembler and C compiler are free tools that maybe downloaded from Microsoft's website.

All of the core compilation, including the register allocator and code generator are written in Scheme. The generated code is rather efficient (read: not terrible), though a lot can be done improve its quality. In fact the quality of code it generates substantially better than one or two compilation systems that I know of that are used commercially. The code generator is a macro-based instruction selector and the register allocator is a linear allocator as opposed to a graph-based allocator.

The compiler code base is purely functional (for the most part ...) - the minor variations are control operators like yield, recursive functions etc. which are borderline-functional depending upon the school of thought that you belong to.

Organisation

At the point of this writing, the bulk of the compiler source is limited to one file called compiler.ss. This file contains all of the traditional compiler passes as well as the register allocator and the code generator. This file is a little above two thousand lines of scheme code. While that is pretty small for a compiler, I believe it could be made smaller and prettier still.

The compiler is a collection of "passes". The input to the first pass is the source language. The output from the last pass is x86 assembly. Each pass does a small transformation on the source tree - for example a pass could remove one-armed ifs and replace them with ifs that have a do-nothing else branch. The compiler has a fairly large number of passes (greater than 15 or so), each doing a simple and relatively specific transformation. One could think of a pass as a mini-compiler that turns a specific input grammar into a specific output grammar. I believe that it was Kent Dybvigs original observation that the traditional wisdom about expensive-compiler-passes is essentially misguided. Hence his own production quality compiler has many many passes, is blazingly fast, produces efficient code and has a manageable code base.

In compiler.ss, passes are written as a collection of a "visitors". A visitor is a small code transformer that takes one syntactic form and does a transformation on it. For example a visitor could be one that takes a one-armed if and returns a proper if that has an else branch. A pass is constucted by composing several visitors together. The infrastructure that enables us to write the passes as a collection of visitors has most of the basic elements of a term rewriting system. The advantage of writing the compiler as a collection of visitors makes things easy -  I can completely isolate each code transformation from the rest of the system and be free to focus on solving that little problem well.

The source tree is represented as a scheme s-expression and hence requires no parsing and none of the traditonal noise that gets in the way of compilation. At some point towards the middle of the passes the intermediate language that is output is referred to as an "IL". This IL is rich enough to represent all of the details of the scheme program while not have any scheme specific components to it. It is very much like a machine-independant assembly language - almost like C. The hope is that I maybe able to compile other languages to this IL and them share the backend of the compiler - the code generator, register allocator etc.

Most of the other files are essentially infrastructure pieces (sometimes rather generic) and helpers. The source walking visitor infrastructure is in micro-pass.ss. The yield control operator is implemented via callcc and set! in yield.ss. The super handy match macro is implemented in match.ss (I am not the original author of the match pattern matcher).

Usage and Installation notes

You will need the following to use the compiler

  1. The compiler sources
  2. Petite Chez Scheme (Just get the latest from www.scheme.com)
  3. Microsoft's C++ compiler suite or something compatible (You need the compiler and assembler, these are free downloads) 
  4. Ruby (Only needed if you wish the run the test suite)

Install Petite and the Microsoft tools. Makes sure that you have petite, cl.exe and ml.exe in your command line path. Compile a little hello-world with cl and see if things are installed and working fine. Extract the compiler sources to some folder. Thats it you are done. If you want to run the test suite, have ruby also in your path - this maybe a good idea if you want to play around with the sources.

There are two batch files for your convenience:

test-all.cmd: Runs the test suite. The test suite consist of the files in the tests folder.

compile.cmd: This compiles a specific file for you.

Consider a little program that recursively removes a value from a list:

(define foldr (lambda (op unit xs)
(if (null? xs)
unit
(op (car xs) (foldr op unit (cdr xs))))))
(define remove (lambda (xs x)
(foldr (lambda (v ls)
(if (pair? v)
(cons (remove v x) ls)
(if (eq? x v)
ls
(cons v ls))))
()
xs)))



(let* ((vals '(0 1 2 0 0 0 3
(1 2 0 0 3 4)
(1 2 0
(1 2 0 3 4) 4)
4 5 0
(1 2 0 0 3 4 0)
7 8 0))
(res (remove vals 0)))
(cons vals (cons res ())))

We compile foo.ss as follows:
> compile foo.ss
produces foo.asm and foo.exe, among other things. foo.exe when run produces:
((0 1 2 0 0 0 3 (1 2 0 0 3 4) (1 2 0 (1 2 0 3 4) 4) 4 5 0 (1 2 0 0 3 4 0) 7 8 0) (1 2 3 (1 2 3 4) (1 2 (1 2 3 4) 4) 4 5 (1 2 3 4) 7 8))

The generated foo.asm is the result of the compilation. Its the .asm file that you want to eyeball to make observations about how inifintely retarded the quality of my output asm is in comparison to your-favourite-compiler.

Source Language

The best way to find the supported subset of the language is to grep the tests folder. However, here is an attempt:

  • 30-bit numbers, ascii characters, vectors, lists, #t, #f, (), (void)
  • quoted constants composed of the above
  • cons, car, cdr, set-car!, set-cdr!
  • make-vector, vector-ref, vector-set!, vector-length
  • let, letrec, let*, set!
  • fx+, fx-, fx*, fxadd1, fxsub1, fxlognot, fxlogor, fxlogand
  • fixnum->char, char->fixnum,
  • pair?, null?, fixnum?, char?, vector?, boolean?, fxzero?, procedure?
  • fx=, fx<, fx<=, fx>, fx>=, char=?, char<=?, char>?, char>=?, eq?, not
  • lambda, if, begin, define (top level only)

If you are unfamiliar with these operators you can use Chez Scheme's index as a reference. Here are the disappointing bits: there is no quasiquote, no input-output, no forgein function interface, no file loads or includes ... yet. Note that while this is only the core subset the compiler supports, this is mostly rich enough to build a library of commonly used stuff.

Since the supported language has no output statements, the result of of program execution (the final result) is merely printed out at the console. This can be of any of the supported data type, including any s-expression (Note: I dont do anything to handle circular expression so brace yourself to see lots of parenthesis if thats the case).

XEmacs nonsense:
The best way to read the sources would be to use xemacs with folding mode enabled to ;{{{ and ;}}}. To do this, you need to add the following to your init.el file:
(folding-add-to-marks-list 'scheme-mode "{{{" "}}}" nil t)

The most useful folding mode commands are folding-mode, folding-context-next-action. It maybe nice to bind them to key combinations like so:
(global-set-key (kbd "C-c C-m") 'folding-mode)
(global-set-key (kbd "C-c C-f") 'folding-context-next-action)

Future Work and What it lacks

There are many things I would like to add to the compiler. Since I am writing this compiler entirely for the fun of it, it is one of my objectives to make choices that are interesting as opposed to useful. I believe most programming languages today are designed with the aim of being useful which is partly why they all suck; they are crammed with hasty decisions substantiated by all sorts of reasons. Eventually I suspect my compiler will fall prey to usefulness and/or boredom, until then I hope to have fun.

Most of the items in the menu below haven't been implemented yet because of time or because I am not compelled enough by the best ideas I have.

GC: The topmost on my list of things to add is a GC. While it is fairly easy to do a copy-collector in the C-runtime system, I have been toying with the idea of writing the GC in the subset of scheme that I compile. It is interesting to see the minimal set of primitives that will have to be added to the language and how cleanly this can be expressed.

Threading: It would be interesting to write a compiler that supports a mutlithreaded language. There are easy and not so nice ways to accomplish in the present compiler, and there are deeper more elegant ways to this as well.

Errors: The compiler assumes good input programs. This is never the case in "real" life and an early error checking pass is required. Traditionally good error reporting is considered hard in compilers and with good reason. However I may never get down to doing this because this leans more towards useful than interesting in my present world view.

STM: I have been doing some work on STM over the past year or so and I have some pet algorithms of my own that need compiler support to implement well. Once I have STM, like Satnam Singh shows, it maybe possible to express Join patterns and such in terms of them.

Large Objects and Strings: The compiler doesnt have any native support for strings. One may treat vectors of characters of strings, but I wonder if I can do better - for example have a fast string-copy. Similarly the compiler works with a fixed size heap at this point. Allocation requests that overshoot this heap size would simply run into a guard page and segfault the system. Support extensible and large object heaps would be nice.

Pattern Language: I would really like to add compilation support for a pattern language. At some level I suspect that pattern languages are more basic than lambdas. The existence of Barry Jay's Pattern calculus strengthens this view. Scheme's best pattern matcher 'match' is added as a macro to the language. It would be nice if the compiler to primitively support a generic enough pattern matcher.

Continuations: I would like to natively support delimited continuations in some form - most likely my yield operator or a more traditional shift-reset. These can however be added nicely into the compilation once the GC and the threading issues have been addressed because those will affect the runtime stack architecture a lot.

Module System: Some sort of a module system.

Macros: Though this is strictly not an addition to the compiler, it would be intersting to support a pattern language of some sort. Some system where the compiler itself was part if the runtime system would be awesome - maybe some approximation of Walid Taha's MetaML. The thing is that MetaML does not have an catamorphism operator and is hence not very useful for describing syntax extensions: maybe this can be coupled with a pattern system in some way.

Types: At some point I would be intersted in adding a type inferencer to the system. The compilation at this point is rather blind and fast. If one attempts to do a primitive addition between a cons cell and a number, the compiler provides absolutely no security - not even a runtime exception. Scheme solution, as is most "dynamic" languages, is to verify the types of things at runtime before doing primitive operations. These checks are expensive and it would be interesting to see to what extent I can substitute these with compile time assurances.

Parser: With all of these additions, the input language is no longer Scheme. Atleast no self respecting Schemer would allow me the pleasure of their company if I said so. This calls for a different input syntax, one that is more amenable to type decorations as well. While writing a parser is a easy, I am fairly stumped by what a good syntax would be. The syntax should also lend itself to macro generation and possibly tot a Haskell-like do-synatx so that I can hide the set! behind monadic state - maybe ST or some such state monad. If you have any good ideas about syntax design, let me know.

As a matter of fact if you are interested in toying with any part of the compiler let me know. Chances are that I am going to disagree with you and reject your code in the first place, but all the same it would be fun to know someone else think.

Credits

I started work on this compiler in the course of Prof. Kent Dybvig's famous compiler class, in Spring 2007. Kent is the author of the amazing Chez Scheme compiler. This compiler would not have happened without Kent and his awesome grad student Abdulaziz Ghulom.

Why is the compiler written in Scheme? The compiler is written in Scheme because Scheme is a beautiful, expressive language. It is my opinion that writting your system in the most expressive medium that you have leads to better and more manageable code. A direct conseuence of having mangeable code is that it encourages one to implement and experiment with more interesting compilation strategies - in time the extra abstraction privded by an expressive language pays off in spades. I have seen too many systems die the death of complexity by being prematurely optimised into a C-like language - in time the code becomes expert-only and eventually just unmanageable.

Download Compiler Source

Wednesday, July 04, 2007 4:50:44 PM (Eastern Standard Time, UTC-05:00)  #    Comments [2]  | 
 Tuesday, July 03, 2007
  Spirit 

Which part of the lack of something can you point at and say, there should have been something here? This is what is missing.

Tuesday, July 03, 2007 11:43:18 AM (Eastern Standard Time, UTC-05:00)  #    Comments [1]  | 
 Saturday, May 05, 2007

Despite my many many associations with Microsoft over the past years, I haven't ever really been to the holy land of Redmond. Over the years I have had one too many missed oppurtunities to go to Seattle/Redmond, each planned visit getting stalled for some reason or the other. So every once in while when some one hears about some association I have had with Redmond and asks me about Seattle I have to politely tell them that I have never really been there. All that is about to change.

But not quiet in the way I expected it would - tomorrow morning I fly to Seattle from where on Monday, the day after, I fly to my host company's HQ at MountainView California. After a week of orientation in sunny California, I fly back to Seattle where I will be spending all summer... with Google.

Strange as it may be, my first trip to the holy land is to work at a different altar. I will be working as an intern at Google, Kirkland this summer. Though Google's is generally secretive about its work, I believe I will be working on something very close to my interests in programming languages. :)

Saturday, May 05, 2007 8:48:43 AM (Eastern Standard Time, UTC-05:00)  #    Comments [7]  | 
 Saturday, April 21, 2007

This semester I decided to do something very different from my usual Programming Languages work and decided to do something AI-ish to relax. The class I chose was Biomorphic Computation by Mike Gasser. The empahsis in this course was on algorithms/techniques for generally "intelligent" behaviour that are heavily biased in favor of what is described as "biological plausibility". It was a fun course and Mike Gasser is a great lecturer. One of the things we are required to do for teh course is to work on some sort of a project.

A colleague (Craig Michaud) and I teamed up to do something fun - we decided to explore the idea of multiple sexes. In nature we mostly encouter creatures that reproduce assexually or reproduce by mating. The argument in favor of mating from an evolutionary standpoint is that mating involves mixing of genes from both parents and hence offspring have qualities of both parents. Thereby the the entire state-space of genomes is more effectively explored by mating as opposed to assexual repoduction. Assexual reproduction relies only on freak-mutation to explore the genome-space. In nature however mating always involves two sexes - if the argument in favor of mating is that that 2 parents means better gene mixing, then why don't we have creatures with larger numbers of sexes? Say 3 sexes or 4.. or even larger numbers.

As caveat note that we are not intersted in the nature of gender roles - we are not interested in males and females and such distinctions, we are only interested in looking at the effect on evolution when a species has n-sexes, without any specific gender roles. In other words an individual of this n-sex species can be either the first parent or the second parent or the third parent and such.

To avoid biasing the experiment heavily in favor of lower numbers of sexes we decided to create a system of egg laying and egg fertilisation to facilitate mating. When its time to mate, a critter looks for an egg in its vicinity to fertilise. If its finds one that has not been parented by it previously, it fertlises the egg. If there are not eggs in the vicinity, the critter lays an egg. That's the basic idea. When I was runnign the experiment today after tunign some parameters, I noticed something that I didnt expect.

I had built in an energy scale up factor for egg-hatching. If a new born critter is required to have x-units of energy, the sum of energy lost my the contributing parents need be only k*x, where k is a scale factor between 0 and 1. What this means is that whenever an egg hatches, the total gain of energy in the system is more than the energy lost. This scale up factor was added with the thought that some eggs might be lost and anyway the energy utilisaion of a full grown parent is usually less than that of a new born, so to compensate a scale-up would be nice.

In the above screen shot the critters are the little pac-man like things. Their relative health is indicated by their opacity. The more transparent ones are really weak. The eggs are the little cirlces. The eggs progress from a bluish-violet color to red, when they are ready to hatch.

The unexpected thing was that creatures started to organise themselves into little breeding clusters consisting of several parents each. I have circled these in a screen shot above. What the clusters do is that they simply sit around and lay eggs and fertilise them. By the time a parent is too exhausted and dies, a child is born taking advanatge of the scale up energy factor. Hence the cluster on the whole survives and even grows, even though the individuals die out. Spooky!

Saturday, April 21, 2007 5:07:49 PM (Eastern Standard Time, UTC-05:00)  #    Comments [0]  | 
 Tuesday, April 17, 2007

'Twas in another lifetime, one of toil and blood
When blackness was a virtue and the road was full of mud
I came in from the wilderness, a creature void of form.
"Come in," she said,
"I'll give you shelter from the storm."

And if I pass this way again, you can rest assured
I'll always do my best for her, on that I give my word
In a world of steel-eyed death, and men who are fighting to be warm.
"Come in," she said,
"I'll give you shelter from the storm."

Not a word was spoke between us, there was little risk involved
Everything up to that point had been left unresolved.
Try imagining a place where it's always safe and warm.
"Come in," she said,
"I'll give you shelter from the storm."

I was burned out from exhaustion, buried in the hail,
Poisoned in the bushes an' blown out on the trail,
Hunted like a crocodile, ravaged in the corn.
"Come in," she said,
"I'll give you shelter from the storm."

Suddenly I turned around and she was standin' there
With silver bracelets on her wrists and flowers in her hair.
She walked up to me so gracefully and took my crown of thorns.
"Come in," she said,
"I'll give you shelter from the storm."

Now there's a wall between us, somethin' there's been lost
I took too much for granted, got my signals crossed.
Just to think that it all began on a long-forgotten morn.
"Come in," she said,
"I'll give you shelter from the storm."

Well, the deputy walks on hard nails and the preacher rides a mount
But nothing really matters much, it's doom alone that counts
And the one-eyed undertaker, he blows a futile horn.
"Come in," she said,
"I'll give you shelter from the storm."

I've heard newborn babies wailin' like a mournin' dove
And old men with broken teeth stranded without love.
Do I understand your question, man, is it hopeless and forlorn?
"Come in," she said,
"I'll give you shelter from the storm."

In a little hilltop village, they gambled for my clothes
I bargained for salvation an' they gave me a lethal dose.
I offered up my innocence and got repaid with scorn.
"Come in," she said,
"I'll give you shelter from the storm."

Well, I'm livin' in a foreign country but I'm bound to cross the line
Beauty walks a razor's edge, someday I'll make it mine.
If I could only turn back the clock to when God and her were born.
"Come in," she said,
"I'll give you shelter from the storm."

Shelter from the Storm
Bob Dylan, 1974

Monday, April 16, 2007 11:44:24 PM (Eastern Standard Time, UTC-05:00)  #    Comments [2]  | 
 Wednesday, April 04, 2007

This one of gems that come across mailing lists every now and then:
http://www.haskell.org/mailman/listinfo/haskell

------------

Asked how she had come to choose GHC as the topic for her award-nominated PhD dissertation, freshly graduated doctor of software archeology Simone Tolduso revealed:

"At first, there were a few small curiosities that triggered my interest, like why were darcs patches sent to the cvs-ghc mailinglist, or why did GHC releases traditionally bundle the predecessor of the current Cabal version when the missing libraries depended on its successor?

But then I looked into the repository, with its layers on layers of build systems, source formats, deprecation warnings, directory structure fragments, todo logs, broken builds resulting either from OS-tools advancing and playing havoc with the built-in assumptions of fragile build configurations or from multiple, partially completed, mutually incompatible heart-liver-and-lung transplants supporting the newest language extensions (which of course were all needed to build the compiler branch supporting said features, and whose documentation tended to be spread over user manual, API comments, mailing list threads, research papers, plus half a dozen different Wikis and ticket trackers), supported by often outdated documentation in a never-ending variety of formats, and I knew I had stumbled onto a goldmine.

Not to mention remains of earlier projects (what were fptools, or libraries?), a variety of test and compilation languages (including Haskell, C, Perl, Python, alongside the usual scripting suspects), or the proliferation of sediment layers into user space by the simple, but ingeneous, means of binary incompatibility. In spite of its comparatively small size, the project was beginning to rival the complexities of other Microsoft products of the same period.

In what seems to have been an attempt to push open source ideas to their logical conclusion, you actually had to guess at the right combination of versions for a number of independently evolving toolchains, libraries, OSes, and use those to bootstrap from a consistent snapshot of the compiler, library, and sometimes even tool sources, or nothing works - a situation which was later increasingly exacerbated by the dispersion of the Haskell Cabal replacing coordinated releases. Preliminary mining of the relevant mailinglist and bug tracker archives suggests that binary releases were mainly public data points used to indicate intermediate states of GHC _not_ suitable for specific applications (apart from the obligatory Cabal pre-version lacking the new features needed for installing the extra libraries, other examples include versions of Data.ByteString _not_ based on the famous paper, _not_ supporting essential optimisations, or _not_ supporting API safety fixes). So there seemed to be no way to avoid direct access to the source repositories with their associated build processes and toolchains.

And let us not forget that, unlike the programmers at the time, we are in the fortunate situation of already having complete repositories for the pieces and dependencies involved. Finding matching versions is a non-trivial, but essentially combinatorial exercise, while for them, the process of building GHC would often have involved developing and submitting the patches that make up our repositories of all the pieces of software GHC builds depended on.


We still haven't found the key that enabled the ancients to navigate this labyrinth and to keep their toolchains up to date while still making any progress in their daily work, not to mention recording such progress via darcs (in itself written in Haskell, and not free of troubles). Agent-based simulations of developer communities at arbitrary slices through the repositories show the majority of agents getting stuck in a recursive cycle of installing, debugging, and updating dependency chains without ever reaching a productive state, so we do know that we are missing some crucial information.

Several of my correspondents have come to favour the somewhat controversial theory that the general programmer in those days must have been substantially more intelligent than people are today. And it does make sense, in a way - I mean, if anyone had been the slightest bit bothered by all this complexity, surely someone would have tried to simplify things?

Of course, my work has not all been happy progress: for instance, while there really was an 'evil mangler', the equally persistent rumour that GHC was named after some scottish town has turned out to be a wild goose chase (cf Appendix GC); my colleagues in dirt archeology assure me there was no town called 'glorious'. The 'real' archeologists, as they call themselves, had a field day laughing about my gullability there. Still, there are so many burried treasures in this area - just waiting to be investigated."

Dr Tolduso is currently working on a follow-on project, "Haskell by committee - design and syntax through the ages".

Dept. of Software Archeology, University of New Atlantis
(for immediate release)

Wednesday, April 04, 2007 10:09:10 AM (Eastern Standard Time, UTC-05:00)  #    Comments [0]  | 
 Saturday, March 24, 2007

Ruby's yield has been traditionally credited to CLU and Icon. I finally found a refrence for CLU's yield statement. The defintion of yield is closely related to their definition of iterator. This is from the CLU manual available at http://www.pmg.lcs.mit.edu/CLU.html

 

A.7.11 Yield Statement

Yield statements may occur only in the body of an iterator. The form of a yield statement is

yield [ ( expression , ... ) ]

It has the effect of suspending operation of the iterator and returning con­trol to the invoking for statement. The values obtained by evaluating the expressions (from left to right) are passed to the for statement to be as­ signed to the corresponding list of identifiers. The type of each expression must be included in the corresponding yield type of the iterator. After the body of the for loop has been executed, execution of the iterator is resumed at the statement following the yield statement.

 

A.8.2 Iterators

An iterator computes a sequence of items, one at a time, where each item is a group of zero or more objects. It has the form

idn = iter [ parms ] args [ yields ] [ signals ] [ where ] 
         routine body
         end idn

where

yields ::= yields ( type spec , ... )

The idn following the end of the iterator must be the same as the idn naming the iterator. In this section we discuss nonparameterized iterators, in which the parms and where clauses are missing. Parameterized modules are discussed in section A.8.4; own variables are discussed in section A.8.5. The form of an iterator is very similar to the form of a procedure. There are only two differences:

1. An iterator has a yields clause in its heading in place of the returns clause of a procedure. The yields clause specifies the number, order, and types of objects yielded each time the iterator produces the next item in the sequence. If zero objects are yielded, the yields clause is omitted.

2. Within the iterator body, the yield statement is used to present the caller with the next item in the sequence. An iterator terminates in the same manner as a procedure, but it may not return any results. An iterator is an object of some iterator type. For a nonparameterized iterator, this type is derived from the iterator heading by removing the iterator name, rewriting the formal argument declarations with one idn per decl, deleting the formal argument names, and, finally, replacing iter by itertype.

An iterator can be invoked only by a for statement. The execution of iterators is described in section A.7.6.

 

A.7.6 For Statement

The only way an iterator can be invoked is by use of a for statement. The iterator produces a sequence of items (where an item is a group of zero or more objects) one item at a time; the body of the for statement is executed for each item in the sequence. The for statement has the form

for [ idn , ... ] in invocation do body end

or

for [ decl , ... ] in invocation do body end

The invocation must be an iterator invocation. The idn form uses previ­ously declared variables to serve as the loop variables, while the decl form introduces new variables, local to the for statement, for this purpose. In either case, the type of each variable must include the corresponding yield type of the invoked iterator. Execution of the for statement proceeds as follows. First the iterator is invoked, and it either yields an item or terminates. If the iterator yields an item, its execution is temporarily suspended, the objects in the item are assigned to the loop variables, and the body of the for statement is executed. The next cycle of the loop is begun by resuming execution of the iterator from its point of suspension. Whenever the iterator terminates, the entire for statement terminates. If the for statement terminates, this also terminates the iterator.

Saturday, March 24, 2007 8:13:31 PM (Eastern Standard Time, UTC-05:00)  #    Comments [0]  |