Wednesday, July 13, 2005

After several months, and several ISPs, TMS is back up and stable. Which means that our blogs are live again.
Thanks Pandu

:)

Wednesday, July 13, 2005 1:03:15 AM (Eastern Standard Time, UTC-05:00)  #    Comments [9]  | 
 Friday, May 27, 2005

sky5.jpg

"I am not sure I want to be perfect and finished. Talk about boredom..."
"Look at the sky," Don said.
"Well, it is always a perfect sky, Don."
"Are you telling me that even though it's changing every second, the sky is always a perfect sky?"

sky1.jpg

sky3.jpg

sky4.jpg

Friday, May 27, 2005 12:28:11 AM (Eastern Standard Time, UTC-05:00)  #    Comments [5]  | 
 Saturday, May 07, 2005

For a long time I have been using this little program misnamed netshortcut. It’s a little command line that pops up when you press a key combination; the commands it supports are configured in a rules file using a rudimentary declarative language that evolved ‘as-appropriate’. Read more about it here

http://www.thinkingms.com/pensieve/homepage/work/netshortcut/netshortcut.htm

 

While I had been using NS for a while, there was this growing frustration that it could be made to do more things – if it had a better designed language underlying it. This frustration continued for a long time, until sometime early last year I got to working it out. I had lots of help from Sid, against whom I would bounce the creations of my genius.

 

Today the notebook in which we had worked out our plans resurfaced after a year. I didn’t want to lose this stuff again, though it was going to be an embarrassment for the future – and hence this blog entry.

 

I remember Sid saying that in South America there is a sort of monkey with a really long tail that hangs down from the branch when it sits. And I say “so?”. And he says ‘So maybe you can use that idea in your language’. I ponder that for a while and then figure that it won’t be so useful. In hindsight, maybe Sid was right after all…

 

So here is the language, in wonderfully non formal description -

 

<pattern> {

      do ( <args> ) { <code> }

      is { <value> }

 

      comment <string>

 

      def <ident> {

            render <password | list | text>

            include <filename>

            alias <pattern> <name>

            code { <code> }

            <one or more pattern blocks>

           

}

}

 

That’s it. All programs have to begin in some scope, so this language’s global scope is the same as the scope corresponding to the inside of a def <ident> { <this is the global scope> }.

 

Remember that this language is fundamentally declarative and is used primarily as metadata for a command line. The standing idea was that the code blocks inside do() {} and is {} would be handled by a regular programming language that allowed interpreter hosting – the idea was to go with Ruby then.

 

There is also the notion that braces can be eliminated except for do and is blocks. If a var does not have a do or is block, its value is simply returned back up the tree.

 

Here is a definition for supporting invoking the browser to do a google search. The user at the NS command line would type something like this –

gg <what to search for>

 

Imagine we have a file called library.rb that has

def browser(url)

      # invoke browser with specified url

end

def urlEncode(value)

      #return a urlEncoded version of value

end

 

This will be the NS rules file –

 

include "library.rb"

 

gg {

       do (arg) { browser("www.google.com?q=#{arg}") }

       comment "Search Google"

       def arg {

              * {

                     is { urlEncode(value) }

              }

       }

}

 

Note that ‘value’ that is passed to the urlEncode() is a special variable that holds the value of the current match (* matches to anything).

 

Now this does look a little clunky (and yes unintuitive) but with the defaults and with some braces reduction it would look like this –

 

include "library.rb"

 

gg do(arg){ browser("www.google.com?q=#{urlEncode(arg)}") } comment "Search Google"

 

Neat?

 

The language actually allows for a LOT of flexibility in controlling what gets executed for what the user types, what comments are displayed, how the UI gets rendered etc. I can’t really describe all of that here, but this is the essence -

-          The lowest do() {} block that is satisfied is executed wrt a command.

-          The is{} block at any level computes a value that may serve as an argument value to a do block higher up the tree.

-          The code {} blocks are executed in pre-order (when going down the tree)

-          The do(){} and is{} blocks are executed in post-order after the command is completed.

-          Any {} other than for do, is and code can be skipped. Doing so reduces the rest of the line to the scope of the starting instruction’s block.
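
To make the evaluation order a little more concrete, here is a rough Ruby sketch of how the gg rule from earlier might be interpreted. This is only an illustration under my own assumptions: the RULES hash and run_command are hypothetical names, not part of NS, and the do block just returns the URL string instead of launching a browser.

```ruby
require "cgi"

# Hypothetical in-memory form of the gg rule: the 'is' block turns the
# matched text into a value, and the 'do' block consumes that value.
RULES = {
  "gg" => {
    comment: "Search Google",
    is: ->(value) { CGI.escape(value) },
    do: ->(arg) { "www.google.com?q=#{arg}" }
  }
}

# Splits the typed line into a pattern name and the rest, runs 'is' on
# the rest (the * match), then hands the result up to the 'do' block.
# Returns nil when no rule matches the first word.
def run_command(line)
  name, rest = line.split(" ", 2)
  rule = RULES[name]
  return nil if rule.nil?
  value = rule[:is].call(rest.to_s)
  rule[:do].call(value)
end

puts run_command("gg monkeys with long tails")
```

The is block runs first and the do block last, which is the post-order behaviour described in the list above, squeezed into a single level of the tree.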

 

The new NS is also to support a notion of setting scope. You use

@ <rest of command here>

to set scope. For example, if you are going to be doing a lot of google searches, you do a

@ gg

And from that point on whatever you type at the command line will be the value of the arg parameter of the do(){} of the gg pattern.
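
A small Ruby sketch of how the scope prefix could behave. ScopedPrompt and expand are hypothetical names, and treating a bare @ as "clear the scope" is my assumption, since nothing above says how a scope gets reset.

```ruby
# Hypothetical command line wrapper: a line starting with "@" sets the
# scope, and every later line is prefixed with that scope before dispatch.
class ScopedPrompt
  def initialize
    @scope = ""
  end

  # Returns the full command that would be handed to the rule engine,
  # or nil when the line only changed the scope.
  def expand(line)
    if line.start_with?("@")
      # Assumption: a bare "@" leaves the scope empty, i.e. clears it.
      @scope = line[1..-1].strip
      return nil
    end
    @scope.empty? ? line : "#{@scope} #{line}"
  end
end

prompt = ScopedPrompt.new
prompt.expand("@ gg")
puts prompt.expand("scheme compilers")  # prints "gg scheme compilers"
```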

 

I have not really got down to implementing this yet.

Someday…

Saturday, May 07, 2005 12:32:28 PM (Eastern Standard Time, UTC-05:00)  #    Comments [9]  | 
 Wednesday, April 13, 2005

Refers:

Part 1 - The weekend ‘Scheme’ compiler in C#

Part 2 - Compiling Scheme style function dispatch to IL

 

This is fun. I added variable argument support. So everything is of type void foo(object[] args) now.

 

Effectively I can have the full set of lambda signatures -

(lambda () <code>)

(lambda (a) <code>)

(lambda (a b) <code>)

(lambda a <code>)

(lambda (a . b) <code>)

 

Lambdas now have a preamble that maps formal parameters and does arity checking. For each lambda I create one local variable per parameter and populate it from the object[]. When there is a variable number of arguments I create a Scheme list out of the remaining section of the object[] and assign the list to the local variable.
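
Here is roughly what that preamble does, sketched in Ruby rather than IL. bind_params is a hypothetical helper, a Hash stands in for the local variables, and a plain Ruby array stands in for the Scheme list; the real compiler emits this logic inline.

```ruby
# Sketch of the preamble: 'fixed' lists the named parameters, 'rest' is
# the name that collects extra arguments (nil when there is none).
#   (lambda (a b) ...)   => fixed: [:a, :b], rest: nil
#   (lambda a ...)       => fixed: [],       rest: :a
#   (lambda (a . b) ...) => fixed: [:a],     rest: :b
def bind_params(fixed, rest, args)
  # Arity check: too few arguments always fails; too many fails only
  # when there is no rest parameter to absorb them.
  if args.length < fixed.length || (rest.nil? && args.length > fixed.length)
    raise ArgumentError, "expected #{fixed.length}#{rest ? '+' : ''} arguments, got #{args.length}"
  end
  locals = {}
  fixed.each_with_index { |name, i| locals[name] = args[i] }
  # A Ruby array stands in for the Scheme list built from the tail.
  locals[rest] = args[fixed.length..-1] if rest
  locals
end

p bind_params([:a], :b, [1, 2, 3])
```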

 

I hope all this is consistent with scheme.

 

Download

Wednesday, April 13, 2005 9:49:55 AM (Eastern Standard Time, UTC-05:00)  #    Comments [0]  | 

Refers:

Part1 - The weekend ‘Scheme’ compiler in C#

 

The compiler as of now does not do functions with variable arguments. In scheme functions like + can take any number of arguments.

 

Function Definition and Variables

You write a function – more correctly a lambda in Scheme – like this (lambda <arguments> <code>) and you bind it to a variable like so

(define foo (lambda <args> <code>))

 

Effectively foo is a variable that can hold any type – at this point of time it is bound to the lambda. The way I represent this in the compiler is by creating foo to be of type object and then assigning to it a delegate instance which will invoke the function generated corresponding to the lambda.

 

The equivalent of

object lambda1(<args>)

{

}

delegate object call_lambda1(<args>);

 

object foo = new call_lambda1(lambda1);

 

This means that every function call via foo basically causes a type cast from object to the corresponding delegate type and then an invocation via indirection. This is certainly slower than direct function calls, but it is at least a close enough mapping to the semantics of Scheme.

 

Function Dispatch

Looking at the above it’s easy to see that I will need delegate types of various arities, and function dispatch will happen accordingly. So when compiling

(foo 1 2)

I know that foo is some value of a delegate type that takes two arguments. I don’t really have to know the function that it’s bound to; I just need a guarantee of the arity of the function and then I can invoke it.

 

The above call will get compiled to the following in IL

ldsfld     object Program::foo

castclass  [SchemeLibs]SchemeLibs.call2

ldc.i4     0x1

box        [mscorlib]System.Int32

ldc.i4     0x2

box        [mscorlib]System.Int32

callvirt   instance object [SchemeLibs]SchemeLibs.call2::Invoke(object, object)

- where call2 is a predefined delegate that takes two parameters.
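
The same load, cast and invoke sequence can be mimicked in Ruby, which may make the IL easier to follow. GLOBALS and this Ruby call2 are hypothetical stand-ins for the static field and the predefined delegate; a Proc plays the role of the delegate instance.

```ruby
# Hypothetical global table standing in for static fields like Program::foo.
GLOBALS = {}

# Mirrors the emitted IL: load the variable, "cast" it to a two-argument
# callable, then invoke it indirectly.
def call2(name, a, b)
  target = GLOBALS.fetch(name)
  unless target.is_a?(Proc) && target.arity == 2
    raise TypeError, "#{name} is not a function of two arguments"
  end
  target.call(a, b)
end

GLOBALS[:foo] = ->(a, b) { a + b }
puts call2(:foo, 1, 2)  # prints 3

# Rebinding foo to a plain number makes the same call site fail at
# runtime, just as the castclass instruction would.
GLOBALS[:foo] = 10
```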

 

Problems with variable argument dispatch

The problem with functions with variable arguments is this. When the compiler sees (foo 1 2)

what can it conclude about the arity of the function? With variable arguments, it can only say that foo is a delegate that points to a function that requires no more than 2 mandatory arguments.

 

But that’s not too good in the IL world because the opcodes that you need to call functions with variable arguments are different from the opcodes that take a fixed count of arguments.

 

In the CLR world there are two approaches to passing variable arguments to functions that I am aware of.

 

params array

Firstly what C# officially does which is

void foo(params object[] args)

{

      foreach(object arg in args)

      {

      }

}

 

foo(1, 2, 3, 4);

 

If you look at this in IL the compiler basically creates an array in the caller, fills it with the arguments and then calls the function with an array as the parameter.

 

arglist

The second approach - what C# does unofficially - is the real way of supporting variable arguments at the CIL level, using the IL instruction arglist. Here is an example from Vijay Mukhi’s book on IL –

.method public hidebysig static vararg void abc(int32 i) il managed

{

       .locals (value class [mscorlib]System.ArgIterator V_0)

       ldloca.s   V_0

       //create the arglist object

       arglist

       call       instance void [mscorlib]System.ArgIterator::.ctor

(value class [mscorlib]System.RuntimeArgumentHandle)

       br.s       IL_001d

       //get one argument at a time

       IL_000b:  ldloca.s   V_0

       call       instance typedref [mscorlib]System.ArgIterator::GetNextArg()

       refanyval  [mscorlib]System.Int32

       //do something with the value here

       ldind.i4

       call       void [mscorlib]System.Console::WriteLine(int32)

       //get the next ones index

       IL_001d:  ldloca.s   V_0

       call       instance int32 [mscorlib]System.ArgIterator::GetRemainingCount()

       ldc.i4.0

       bgt.s IL_000b

       ret

}

 

This does have an equivalent in C# via an undocumented (yes!! <insert swear word here>) keyword called __arglist. For more on this take a look at Vijay Mukhi’s discussion on Arrays and Undocumented C# Types and Keywords by Peter Bromberg (a C# MVP).

 

If you look at both of these, you see that the code generator of the compiler has to be able to look at the call (foo 1 2) and determine what instruction to generate. There is really no way I can determine the runtime type of foo. (Yes, there are optimizations possible in certain limited cases, but in a general sense, no).

 

This has me thinking that the only way to solve the problem might be to always treat all function calls as calls with variable arguments. Yes, I know that’s bad. So now function dispatch will have to go through the overhead of

-          type casting an object to a delegate type

-          creating a data structure for the variable arguments (for both the params and arglist case)

-          calling indirectly via a delegate

 

I am tempted to go with params instead of arglist for implementing variable arguments for a bunch of reasons. Both create some sort of GC managed list structure for extra parameters. It’s just that params has a more logical mapping in the C# world – so it would be easier to implement some library functions in C# that take only one argument of the form object[].
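
As a sketch of what "everything takes one array" looks like in practice, here it is in Ruby. LIBRARY and apply_function are made-up names for illustration; the point is only that the call site packs its arguments and never needs to know the callee's arity.

```ruby
# Every function takes exactly one parameter: an array of arguments,
# in the spirit of a C# method that takes a single object[] args.
LIBRARY = {
  "+" => ->(args) { args.sum },
  "=" => ->(args) { args.uniq.length <= 1 },
  "display" => ->(args) { print(args.first); nil }
}

# A call site no longer needs to know the arity: it packs whatever
# arguments it has into an array and invokes indirectly.
def apply_function(name, *args)
  fn = LIBRARY.fetch(name)
  raise TypeError, "#{name} is not callable" unless fn.respond_to?(:call)
  fn.call(args)
end

puts apply_function("+", 1, 2, 3, 4)  # prints 10
puts apply_function("+")              # prints 0
```

The cost is visible here too: every call allocates an array, even for functions like car that only ever take one argument.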

 

If you have a better idea about how to do function dispatch that accommodates for variable arguments, let me know.

 

 

Wednesday, April 13, 2005 4:53:37 AM (Eastern Standard Time, UTC-05:00)  #    Comments [0]  | 
 Monday, April 11, 2005

This blog isn’t dead, it just smells that way. Actually, we have been having trouble with our ISP – some configuration has changed, which causes the worker process in whose context our web application runs to not have write permissions to the web folder where the entries and comments are stored. So that means that I cannot make blog entries and no one can post comments. We need to have this sorted out or move to another ISP real quick. For now my solution is to generate the blog entry xml file on my local computer and upload it via ftp. That’s how you are seeing this entry.

 

Now to the real topic of this post - I have been toying around with Scheme for some time now – I would like to say a long time – but considering how long scheme has been around (as opposed to say C#), my time is not that long.

 

Over the weekend (literally) I have made for myself a compiler that compiles a language that has a lot of braces. I would really like to say that it is scheme – it is not – but it is very scheme-like. Why it is not scheme -

-          It is a subset of the language

-          It has some differences in semantics even in the subset that I managed to implement

 

Sometime last Thursday I watched a 90 minute video that shows how one can write a Scheme to C compiler. Sidharth originally pointed me to this. Well, I didn’t watch all 90 minutes – shortly after the 60th minute or so I jumped up all charged up, ready to write a scheme compiler.

 

What the video showed me is how to solve some hard problems in Scheme compilation – basically how to get rid of closures and continuations. The video shows you how to do a closure conversion and a CPS conversion. Those were the very two issues about Scheme compilation that had been nagging me for some time – watching the video took them away.

 

This is the subset of Scheme that I have implemented so far. It has support for lambda, define, if and lists. The library, which is written in C# and is extensible, has +, =, car, cdr, cons, display, remainder, newline etc.

 

So effectively I can compile code like this

(define print (lambda (s)

      (display "Value is ")

      (display s)

      (newline)))

 

(define gcd (lambda (a b)

      (if (= b 0)

            a

            (gcd b (remainder a b)))))

 

(print (gcd 50 70))

 

and compile it to generate a .Net exe that actually runs :)

 

This has been really good fun and a rather creative thinking exercise. I currently plan to take it forward and implement a few more things I am curious about. It’s good fun to see what code like this does to your thinking about things.

 

Right now the code flow of the compiler is structured like this –

Scanner -> Parser -> [Abstract Syntax Tree] -> Code generator -> exe

 

Then, once I have a stable code generator, I am expecting that instead of focusing on being able to compile every construct in Scheme, I can translate constructs down into the simpler constructs which the code generator already handles.

 

For example I don’t intend to compile a ‘cond’ statement – instead I would take the AST that has the ‘cond’ and transform it into an AST which has the ‘cond’ replaced with a nested ‘if’. Similarly, closure conversion and CPS conversion (continuation passing style conversion) would be other AST transforms.
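
A minimal sketch of such a cond-to-if transform, written in Ruby with nested arrays standing in for the AST. expand_cond and clauses_to_if are hypothetical names; the sketch assumes one expression per clause and uses nil as the else branch when no clause says else.

```ruby
# S-expressions are modelled as nested Ruby arrays, e.g.
# [:cond, [test1, expr1], [test2, expr2], [:else, expr3]].
def expand_cond(ast)
  return ast unless ast.is_a?(Array)
  # Transform children first so nested cond forms are rewritten too.
  ast = ast.map { |node| expand_cond(node) }
  return ast unless ast.first == :cond
  clauses_to_if(ast.drop(1))
end

# Folds the clause list into nested ifs; an :else clause ends the chain.
# Assumption: with no :else clause the final else branch is nil.
def clauses_to_if(clauses)
  return nil if clauses.empty?
  condition, expr = clauses.first
  return expr if condition == :else
  [:if, condition, expr, clauses_to_if(clauses.drop(1))]
end

p expand_cond([:cond, [[:<, :x, 0], -1], [[:>, :x, 0], 1], [:else, 0]])
```

The code generator only ever sees if nodes after this pass, so it needs no knowledge of cond at all.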

 

So the code flow would be like this –

Scanner -> Parser -> [AST] -> AST Transforms -> [AST] -> Code generator -> exe

 

Once I have the present code a little stabilized and I have a framework for doing tree transforms, I should have a fairly complete implementation of a scheme like language.

 

 

Things I can’t solve -  

Why I say scheme-like language is because I doubt if I will ever actually implement a whole standards compliant system – simply for the sheer effort. That aside I presently don’t know enough to be able to compile some aspects of the language.

 

In Scheme not many things have special status. As an example, a variable name is simply a binding to some value.

 

So I can say -

(define x (lambda (a) (+ a 10)))

which binds variable x to a lambda (a function) that takes a parameter and adds 10 to it and returns it.

 

Now I can call it and assign the result to another variable ‘a’ -

(define a (x 10))

 

Now in the same program I can define x to be something else, say just a number…

So typing of variables goes for a toss. Similarly I cannot really hardcode any function calls in the generated code – I have to indirect them through a variable that holds a delegate, because at runtime I cannot really know which function that variable is bound to. But those are solvable problems.

 

The hard problem comes up when you deal with what scheme calls special forms. As an example ‘if’ is a special form.

(if (= a b) (display a) (display b))

Here either value of ‘a’ is displayed, or value of b is displayed.

Now it is not possible to implement ‘if’ using a function call. Because whenever you call a function, all the arguments to the function are fully evaluated. So if there was a function called ‘if’ the condition, then and else parts would be evaluated before the ‘if’ function is called. So in this case the user would see both ‘a’ and ‘b’ being displayed. So ‘if’ has to be treated differently by the compiler and I have to generate test and branch instructions for the ‘if’.
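
The eager evaluation problem is easy to reproduce in any language that evaluates arguments before a call. Here is a sketch in Ruby; show, if_function and if_lazy are names I made up for the demonstration, and show records values in LOG instead of printing so the effect is easy to inspect.

```ruby
LOG = []

# Stand-in for (display x): records the value instead of printing it.
def show(value)
  LOG << value
  value
end

# An ordinary function: both branches are already evaluated by the time
# the body runs, so it can only pick between two finished results.
def if_function(condition, then_value, else_value)
  condition ? then_value : else_value
end

# Wrapping the branches in lambdas delays evaluation, which is roughly
# what the compiler's test-and-branch code achieves for the special form.
def if_lazy(condition, then_branch, else_branch)
  condition ? then_branch.call : else_branch.call
end

a = 1
b = 2
if_function(a == b, show(a), show(b))
p LOG  # prints [1, 2] : both branches ran

LOG.clear
if_lazy(a == b, -> { show(a) }, -> { show(b) })
p LOG  # prints [2] : only the chosen branch ran
```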

 

The problem comes in the fact that the user can perfectly well say something like this –

(define if 10)

So from now on ‘if’ is 10.. that’s crazy because that’s a runtime condition, whereas my compiler has already generated test/branch code for the if.

 

When writing a compiler, if I see ‘if’ I emit intermediate code which tests a condition and branches to the right code block for execution. If at runtime the meaning of the ‘if’ itself changes, then the branching and the code generated at compile time have no meaning.

 

You may argue that in the case of ‘if’, I can solve the problem by having variable that indicates the current value of ‘if’ and at compile time I can generate code for an additional check with this variable. However that’s a limited hack… and it doesn’t explain what I will do when at runtime the meaning of ‘define’ changes or the meaning of ‘lambda’ changes.

 

I can’t solve this problem right now, I don’t know how to – so I have given define, lambda and if a special ‘keyword’ like status – similar to keywords in other languages where keywords cannot be redefined to mean something else at runtime.

I don’t know if this is a problem I will be solving also.

 

Things I hope I can implement

-          closures

-          continuations

-          more special forms (list, quote)

-          a better function dispatch mechanism (generate the delegates types)

-          tail calls

-          cond, let, letrec

 

Here is my early ‘over-the-weekend’ quality scheme compiler for download. Use with caution.

This also requires the elusive .Net 2.0. I am not releasing source just yet, but it will come.

 

However, if you are looking for an actual full-fledged Scheme compiler for .Net, GotDotNet lists the following -

-          Northwestern University Hotdog Scheme

-          Tachy (Scheme-like) language

-          Indiana University Scheme.NET

You may have more luck with these.

 

You may not be able to leave comments on my blog right now.

Do send me mail for now roshan -dot- james -at- gmail -dot- com.

 

Monday, April 11, 2005 9:29:27 AM (Eastern Standard Time, UTC-05:00)  #    Comments [2]  | 
 Tuesday, February 01, 2005

In the past weeks I have been around Hyderabad a bit – traveling mostly when I can take some time out in the weekends. I have also been playing around with my new Sony P150. I just thought that I should take some time and put out pictures.

 

 

This is a close-up of breakfast – sausages with lots of veggies – blurred through the steam droplets under the glass lid.

 

 

A shot of the famous Charminar.

 

 

Pigeons fly about one of the minarets of the Mecca Masjid.

 

 

The Mecca Masjid again. Lovely place. Peaceful.

We spent a sunset there this weekend. Sid had been visiting us from Bangalore.

 

 

Speaking of pigeons this is a shot of pigeons flying over one of the old minarets (of which you find several) in the old Hyderabad area.

 

 

 

Microsoft shifted to its new campus in Hyderabad this weekend. I now have a new cubicle. The new campus area is lovely. Open barren land with a few large software shops littering the rocky landscape.

 

 

 

 

The above shots are from the Nagarjun Sagar dam – a 3 hour drive from Hyderabad in one of the AP tourism buses. Had been fun.

 

 

One of the marble lions that are on guard outside the Salarjung museum.

 

Tuesday, February 01, 2005 4:38:02 AM (Eastern Standard Time, UTC-05:00)  #    Comments [5]  | 
 Monday, January 17, 2005

I recently had some C# code that had to be made localizable. Most articles about localization/internationalization that you find on the web would talk about how nice Visual Studio is for code internationalization and would show nice examples of how many ways the forms-designer would extract code out into a resx file. I am perfectly ok with studio doing all the work for you. However, there are very often strings in your actual code that studio does not externalize to resx files.

 

Strings.rb is a ruby script that will parse your C# code base and identify literal string definitions in the code base and will move them to your resx file. The code was hacked up to fill out a personal need so your mileage on this may vary. The tool certainly isn’t fool proof and there are certain cases that it doesn’t handle too well. If you are however on the smart-scripter side of things then you may find it useful.

 

The script needs to be setup for your specific project. Once done you can run it several times on your code base and it can incrementally catch strings and externalize them for you. This is handy to have while your code is still undergoing changes so new strings can be identified as they pop up and can be moved out.

 

Getting Started

 

Downloads

1) First thing download the script (strings.rb) and put it in your project folder.

 

2) Download and install ruby from here – http://rubyforge.org/frs/?group_id=167. It’s about 12 MB and the installation happens in a snap.

 

3) Download and install the REXML library for XML handling in Ruby from here –

http://www.germane-software.com/archives/rexml_3.1.2.zip

http://www.germane-software.com/software/rexml/docs/tutorial.html

 

 

Patching Strings.rb for your project

1) You need to patch the script file to have the correct path to your resx file and the path to your wrapper class that will be used to read strings from your resx file.

 

Open the script file in a text editor. (If you have ruby installed you should find this editor called scite in the ruby installation folder – that’s a nice editor. Alternately you might want to try installing scite - http://scintilla.sourceforge.net/SciTEDownload.html - about 600k).

 

In your project identify your resx file. It will usually be in Properties\Resources.resx.

Change the following line in the rb file to reflect the path to your resx file.

strings.rb:4:$resx_fn = "properties/Resources.resx"

(The actual line number might change a bit)

 

2) Now create a new class in your project called Strings. VS should typically create an empty class definition file that looks like this.

 

#region Using directives

 

using System;

using System.Collections.Generic;

using System.Text;

 

#endregion

 

namespace <Some Namespace>

{

    public class Strings

    {

 

 

    }

}

 

Patch the file with the following additions

- Add a using directive for your ‘Properties’ namespace.

- Add a comment that says //start and one that says //stop. These act as delimiters between which the script will generate the string definitions.

 

 

#region Using directives

 

using System;

using System.Collections.Generic;

using System.Text;

using <Some namespace>.Properties;

 

#endregion

 

namespace <Some Namespace>

{

    public class Strings

    {

 

//start

//stop

 

    }

}

 

3) This is the wrapper class into which the script will generate string definitions. You need to patch the script with the path to this class file. Basically patch this line –

strings.rb:5:$stringsclass_fn = "helper/Strings.cs"

 

Done

If you have got this far then your installation is done and you are ready to go.

For sake of completeness let me just list out things again –

1) download the script and put it into the project folder

2) install ruby

3) install the REXML library for Ruby

4) patch the script with the path to the resx file of the project

5) create an empty Strings class and add the namespace directive and comment markers to it

6) patch the script to have the correct path to your Strings.cs file.

 

What does the script do?

The script does a few basic things.

1) it parses your *.cs files in all subdirectories and looks for strings.

2) when it finds a string it prompts the user for an action

3) if it is a string that should be localized the user can provide a pseudonym for the string. On getting this name the script will -

            1) add the string and the name to the resx file

            2) add a property to the Strings class that will read the string from the resx file

            3) replace the string literal in the code with a call to the property.
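
The last two of those steps boil down to simple text generation. Here is a rough Ruby sketch of them; property_for and externalize are hypothetical helper names of mine, not the functions in strings.rb, though the property text follows the generated Strings.cs shown further down.

```ruby
# Builds the C# property text for one externalized string.
def property_for(name)
  "public static string #{name} { get { return Resources.ResourceManager.GetString(\"#{name}\"); } }"
end

# Replaces one string literal (quotes included) on a source line with a
# reference to the generated property. The block form of sub keeps the
# replacement text literal.
def externalize(line, literal, name)
  line.sub(literal) { "Strings.#{name}" }
end

line = '            string a = "hello world";'
puts externalize(line, '"hello world"', "HelloString")
puts property_for("HelloString")
```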

 

Running the script

To run the script after all the previous setup, simply go to the command line and type strings.rb

 

Here is a sample run of the Strings.rb script

Let me take up a simple project and show you how the internationalization script works.

 

Here is a project that has only one Program.cs file –

#region Using directives

 

using System;

using System.Collections.Generic;

using System.Text;

 

#endregion

 

namespace ConsoleApplication1

{

    class Program

    {

        static void Main(string[] args)

        {

            string a = "hello world";

            string x = "skip this line";

            string b = "escape sequences  \n\r\t\\\"";

            string c = @"cant handle this one";

        }

    }

}

 

The resx file looks like this –

<?xml version="1.0" encoding="utf-8"?>

<root>

  <resheader name="resmimetype">

    <value>text/microsoft-resx</value>

  </resheader>

  <resheader name="version">

    <value>2.0</value>

  </resheader>

  <resheader name="reader">

    <value>System.Resources.ResXResourceReader, System.Windows.Forms, Version=2.0.3600.0, Culture=neutral, PublicKeyToken=b77a5c561934e089</value>

  </resheader>

  <resheader name="writer">

    <value>System.Resources.ResXResourceWriter, System.Windows.Forms, Version=2.0.3600.0, Culture=neutral, PublicKeyToken=b77a5c561934e089</value>

  </resheader>

</root>

(I have removed some unnecessary details from the original resx file here)

 

I created this Strings class –

#region Using directives

 

using System;

using System.Collections.Generic;

using System.Text;

using ConsoleApplication1.Properties;

 

#endregion

 

namespace ConsoleApplication1

{

    public class Strings

    {

 

//start

//stop

 

    }

}

 

This is what happens when you run the strings.rb script –

C:\work\vcsexpress\Sample1\Sample1>strings

Error reading skip data! continuing with no skip data.

HelloString = hello world

EscString = escape sequences  \n\n\t\\\"

Program.cs:0:n++#region Using directives

Program.cs:1:

Program.cs:2:using System;

Program.cs:3:using System.Collections.Generic;

Program.cs:4:using System.Text;

Program.cs:5:

Program.cs:6:#endregion

Program.cs:7:

Program.cs:8:namespace ConsoleApplication1

Program.cs:9:{

Program.cs:10:    class Program

Program.cs:11:    {

Program.cs:12:        static void Main(string[] args)

Program.cs:13:        {

Program.cs:14:            string a = "hello world";

"hello world">?

Help ----------

        =<name> = the string will be externalised as <name>

        sf = skip file : file will not processed on next run

        if = ignore file : file will be processed on next run

        sl = skip line : line will be processed on next run

        il = ignore line : line will be processed on next run (default)

        x, exit = exit script

        all skip information in stored in "skip_list.txt"

Program.cs:14:            string a = "hello world";

"hello world">=HelloString

            string a = Strings.HelloString;

Program.cs:15:            string x = "skip this line";

"skip this line">sl

Program.cs:16:            string b = "escape sequences  \n\r\t\\\"";

"escape sequences  \n\r\t\\\"">=EscString

            string b = Strings.EscString;

Program.cs:17:            string c = @"cant handle this one";

Program.cs:18:        }

Program.cs:19:    }

Program.cs:20:}

Writing Resource File "properties/Resources.resx" : done

Writing Strings class "Strings.cs" : done

Writing Skip data "skip_list.txt" : done

 

Effectively you can see the script run through the source file (actually it runs through all the cs files) and prompt you with each string. It also shows a little help on the actions possible.

 

To replace a string, you need to give it a name. Simply type =<name> and the string will get replaced.

 

If you don’t want to do anything about a particular line, type ‘sl’ for skip line and it will skip that line. It also adds the line to a file called skip_list.txt so that in subsequent runs of strings.rb it will not keep prompting you to patch the same line.

 

You can similarly choose to skip a file using the ‘sf’ option. You may typically want to skip the *.designer.cs files, the strings.cs file etc.

 

All skip information is human readable and is stored in a text file called skip_list.txt.

 

Strings.rb is designed to be run multiple times over the same project through its development so that it can catch new strings as they appear in your code base, incrementally. The resx and strings.cs files are recreated at each run.

 

To show you the output of the process, this is what happened.

 

This is the new Program.cs file –

#region Using directives

 

using System;

using System.Collections.Generic;

using System.Text;

 

#endregion

 

namespace ConsoleApplication1

{

    class Program

    {

        static void Main(string[] args)

        {

            string a = Strings.HelloString;

            string x = "skip this line";

            string b = Strings.EscString;

            string c = @"cant handle this one";

        }

    }

}

 

This is the new resx file –

<?xml version="1.0"?>

<root>

  <resheader name="resmimetype">

    <value>text/microsoft-resx</value>

  </resheader>

  <resheader name="version">

    <value>2.0</value>

  </resheader>

  <resheader name="reader">

    <value>System.Resources.ResXResourceReader, System.Windows.Forms, Version=2.0.3600.0, Culture=neutral, PublicKeyToken=b77a5c561934e089</value>

  </resheader>

  <resheader name="writer">

    <value>System.Resources.ResXResourceWriter, System.Windows.Forms, Version=2.0.3600.0, Culture=neutral, PublicKeyToken=b77a5c561934e089</value>

  </resheader>

  <data name="HelloString">

    <value xml:space="preserve">hello world</value>

  </data>

  <data name="EscString">

    <value xml:space="preserve">escape sequences 

 

       \"</value>

  </data>

</root>

 

Notice that the two strings have appeared here.

 

And this is the new Strings.cs file –

#region Using directives

 

using System;

using System.Collections.Generic;

using System.Text;

using ConsoleApplication1.Properties;

 

#endregion

 

namespace ConsoleApplication1

{

    public class Strings

    {

 

//start

              // "escape sequences  \n\r\t\\\""

              public static string EscString { get { return Resources.ResourceManager.GetString("EscString"); } }

 

              // "hello world"

              public static string HelloString { get { return Resources.ResourceManager.GetString("HelloString"); } }

 

//stop

 

    }

}

 

Also, if you are interested in seeing the skip data, this is the skip_list.txt that got created –

Program.cs:::string x = "skip this line";

 

Limitations

1) The string matching that is done by the script is fairly limited. Basically it identifies strings in the C# code by comparing with the following regex –

strings.rb:15:$string_pattern = /[^@]("(\\.|[^\\"])*")/

This does not cleanly cover all sorts of escape sequences that a string can have. It also does not support @"" strings. But... well... this covers a large number of the strings that you would face, so it’s good enough to get along. Also, if you can get me a better pattern match, I would be happy.

 

The script iterates over all strings on a line of cs code using –

      line.scan($string_pattern).each {|str,e1|

            # str is the string

      }
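
To see what the pattern does and does not catch, here is a small standalone Ruby sketch that applies the same regex to a few lines. STRING_PATTERN and string_literals are names chosen for this example only.

```ruby
# The pattern quoted above: a non-@ character, then a double-quoted
# literal whose body is escape pairs or ordinary characters.
STRING_PATTERN = /[^@]("(\\.|[^\\"])*")/

# Returns the literals (quotes included) found on one line of C# code.
# scan yields one array per match: the literal, then the inner group.
def string_literals(line)
  line.scan(STRING_PATTERN).map { |str, _last| str }
end

p string_literals('string a = "hello world";')
p string_literals('Console.WriteLine("a", "b");')
p string_literals('string c = @"cant handle this one";')
```

The first two lines yield their literals and the verbatim string yields nothing. Because the pattern needs one character in front of the opening quote, a literal that starts in the very first column of a line is also missed.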

 

 

2) The resx file tags that are generated by the script are those that are valid for the Visual C# Express Edition Beta 1 format. I don’t know if this resx format is valid for other versions of studio. I would expect that it is. Even if it is not, you can easily patch it for your version of studio. This is how –

 

The resx file has a tag added for each string definition that looks like this –

  <data name="HelloString">

    <value xml:space="preserve">Hello world</value>

  </data>

 

If your studio generates tags like this, then you are ok. If it does not, just patch the following block of ruby code to generate your tags. It’s fairly easy –

            el = doc.root.add_element "data"

            el.add_attribute("name", key)

            val = el.add_element("value")

            val.add_attribute("xml:space","preserve")

            val.text = remove_esc_seq($map[key])

This is part of the writeresx() function.

 

3) The escape sequence handling in the script is a hack – it’s funny – it’s limited. It’s actually a little sad:

def add_esc_seq(str)

       str.gsub("\\", "<double_back_slash>").gsub("\"", "\\\"").gsub("\n", "\\n").gsub("\t", "\\t").gsub("\r", "\\r").gsub("<double_back_slash>", '\\\\\\')

end

 

def remove_esc_seq(str)

       str.gsub("\\\\","<back_slash>").gsub("\\n", "\n").gsub("\\t", "\t").gsub("\\r", "\r").gsub("\\\"", "\"").gsub("<back_slash>","\\")

end

 

These are however good enough for \r \n \t \\ \" etc.
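
For anyone who wants something less sad, one possible alternative is a single pass over the string with a lookup table, which removes the need for placeholder tokens. This is a sketch and not what strings.rb does; ESCAPES, escape_text and unescape_text are hypothetical names, and it covers only the same five sequences.

```ruby
# Maps each raw character to its C# escape sequence.
ESCAPES = {
  "\\" => "\\\\",
  "\"" => "\\\"",
  "\n" => "\\n",
  "\t" => "\\t",
  "\r" => "\\r"
}.freeze
UNESCAPES = ESCAPES.invert.freeze

# Raw text -> C# literal body, handling every character in one pass.
def escape_text(str)
  str.gsub(/[\\"\n\t\r]/) { |ch| ESCAPES[ch] }
end

# C# literal body -> raw text; unknown sequences are left untouched.
def unescape_text(str)
  str.gsub(/\\[\\"ntr]/) { |seq| UNESCAPES[seq] }
end

raw = "tab\there, backslash-n: \\n"
puts escape_text(raw)
puts unescape_text(escape_text(raw)) == raw  # prints true
```

Since each position in the input is visited once, an escaped backslash followed by the letter n can never be mistaken for a newline, which is the case the placeholder trick exists to protect.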

 

4) The resx XML doesn’t look too nice. It works, however. This is because the REXML library produces badly formatted XML. You can download the XML Pretty Printing program of mine and run it on the output resx file for pretty XML formatting.

 

5) “The setup is a little contrived and all this requires me to know ruby programming “

If you actually said that then this script is not for you. For the simple reason that this is something home-grown and not meant to be a polished product in any way. You don’t need to know ruby much to just get it working. You need to know ruby only if you need to extend it in non-obvious ways. Secondly the setup isn’t that contrived if you have been using ruby. You would, most likely, have most of the tools in place already.

 

Finally, Why Ruby?

My only real answer to the question is that I wanted to get the job done. For an example take a look at the engine code and peaceful separation that it gives me from the prompt/ui code.  

 

That’s it. So if you are geeky enough and consider it below your dignity to get down to doing a menial job of looking through source files and copying out strings to the resx files – then this script might help you.

 

Download Strings.rb

 

Ps. It’s a lot of effort documenting any ruby program that is more than 200 lines. It just does too many things.

 

Monday, January 17, 2005 8:40:18 AM (Eastern Standard Time, UTC-05:00)  #    Comments [9]  |