Monday, January 17, 2005

I recently had some C# code that had to be made localizable. Most articles about localization/internationalization on the web talk about how nice Visual Studio is for internationalizing code, and show the many ways the forms designer extracts strings into a resx file. I am perfectly OK with Studio doing all the work for you. Very often, however, there are strings in your actual code that Studio does not externalize to resx files.

 

Strings.rb is a Ruby script that parses your C# code base, identifies literal string definitions, and moves them to your resx file. The code was hacked up to fill a personal need, so your mileage may vary. The tool certainly isn’t foolproof and there are certain cases that it doesn’t handle too well. If, however, you are on the smart-scripter side of things, you may find it useful.

 

The script needs to be set up for your specific project. Once that is done, you can run it several times on your code base and it will incrementally catch strings and externalize them for you. This is handy while your code is still undergoing changes, so new strings can be identified as they pop up and moved out.

 

Getting Started

 

Downloads

1) First, download the script (strings.rb) and put it in your project folder.

 

2) Download and install Ruby from here – http://rubyforge.org/frs/?group_id=167. It’s about 12 MB and the installation happens in a snap.

 

3) Download and install the REXML library for XML handling in Ruby from here –

http://www.germane-software.com/archives/rexml_3.1.2.zip

http://www.germane-software.com/software/rexml/docs/tutorial.html

 

 

Patching Strings.rb for your project

1) You need to patch the script file to have the correct path to your resx file and the path to your wrapper class that will be used to read strings from your resx file.

 

Open the script file in a text editor. (If you have Ruby installed, you should find an editor called SciTE in the Ruby installation folder; it’s a nice editor. Alternatively, you can install SciTE separately from http://scintilla.sourceforge.net/SciTEDownload.html, about 600 KB.)

 

In your project identify your resx file. It will usually be in Properties\Resources.resx.

Change the following line in the rb file to reflect the path to your resx file.

strings.rb:4:$resx_fn = "properties/Resources.resx"

(The actual line number might change a bit)

 

2) Now create a new class in your project called Strings. VS should typically create an empty class definition file that looks like this.

 

#region Using directives

 

using System;

using System.Collections.Generic;

using System.Text;

 

#endregion

 

namespace <Some Namespace>

{

    public class Strings

    {

 

 

    }

}

 

Patch the file with the following additions

- Add a using directive for your ‘Properties’ namespace.

- Add a comment that says //start and one that says //stop. These act as delimiters between which the script will generate the string definitions.

 

 

#region Using directives

 

using System;

using System.Collections.Generic;

using System.Text;

using <Some namespace>.Properties;

 

#endregion

 

namespace <Some Namespace>

{

    public class Strings

    {

 

//start

//stop

 

    }

}

 

3) This is the wrapper class into which the script will generate string definitions. You need to patch the script with the path to this class file. Basically patch this line –

strings.rb:5:$stringsclass_fn = "helper/Strings.cs"

 

Done

If you have got this far then your installation is done and you are ready to go.

For sake of completeness let me just list out things again –

1) download the script and put it into the project folder

2) install ruby

3) install the REXML library for Ruby

4) patch the script with the path to the resx file of the project

5) create an empty Strings class and add the using directive and comment markers to it

6) patch the script to have the correct path to your Strings.cs file.

 

What does the script do?

The script does a few basic things.

1) it parses your *.cs files in all subdirectories and looks for strings.

2) when it finds a string, it prompts the user for an action

3) if it is a string that should be localized the user can provide a pseudonym for the string. On getting this name the script will -

            1) add the string and the name to the resx file

            2) add a property to the Strings class that will read the string from the resx file

            3) replace the string literal in the code with a call to the property.
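The three steps above can be sketched in a few lines of Ruby. This is only an illustration, not the actual code from strings.rb: the helper names property_for and externalize are hypothetical, and the property text is modeled on the generated Strings.cs shown in the sample run further down.

```ruby
# Hypothetical helpers illustrating the externalization steps;
# these names are not taken from strings.rb.

# Builds the C# property that reads the named string from the resx file.
def property_for(name)
  %(public static string #{name} { get { return Resources.ResourceManager.GetString("#{name}"); } })
end

# Replaces the first occurrence of a string literal on a source line
# with a reference to the generated property.
def externalize(line, literal, name)
  # Block form avoids backslash interpretation in the replacement text.
  line.sub(literal) { "Strings.#{name}" }
end

puts externalize('string a = "hello world";', '"hello world"', 'HelloString')
puts property_for('HelloString')
```

Adding the name/value pair to the resx file is the remaining step; that part depends on the XML library and is not shown here.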

 

Running the script

To run the script after all the previous setup, simply go to the command line and type strings.rb

 

Here is a sample run of the Strings.rb script

Let me take up a simple project and show you how the internationalization script works.

 

Here is a project that has only one Program.cs file –

#region Using directives

 

using System;

using System.Collections.Generic;

using System.Text;

 

#endregion

 

namespace ConsoleApplication1

{

    class Program

    {

        static void Main(string[] args)

        {

            string a = "hello world";

            string x = "skip this line";

            string b = "escape sequences  \n\r\t\\\"";

            string c = @"cant handle this one";

        }

    }

}

 

The resx file looks like this –

<?xml version="1.0" encoding="utf-8"?>

<root>

  <resheader name="resmimetype">

    <value>text/microsoft-resx</value>

  </resheader>

  <resheader name="version">

    <value>2.0</value>

  </resheader>

  <resheader name="reader">

    <value>System.Resources.ResXResourceReader, System.Windows.Forms, Version=2.0.3600.0, Culture=neutral, PublicKeyToken=b77a5c561934e089</value>

  </resheader>

  <resheader name="writer">

    <value>System.Resources.ResXResourceWriter, System.Windows.Forms, Version=2.0.3600.0, Culture=neutral, PublicKeyToken=b77a5c561934e089</value>

  </resheader>

</root>

(I have removed some unnecessary details from the original resx file here)

 

I created this Strings class –

#region Using directives

 

using System;

using System.Collections.Generic;

using System.Text;

using ConsoleApplication1.Properties;

 

#endregion

 

namespace ConsoleApplication1

{

    public class Strings

    {

 

//start

//stop

 

    }

}

 

This is what happens when you run the strings.rb script –

C:\work\vcsexpress\Sample1\Sample1>strings

Error reading skip data! continuing with no skip data.

HelloString = hello world

EscString = escape sequences  \n\n\t\\\"

Program.cs:0:n++#region Using directives

Program.cs:1:

Program.cs:2:using System;

Program.cs:3:using System.Collections.Generic;

Program.cs:4:using System.Text;

Program.cs:5:

Program.cs:6:#endregion

Program.cs:7:

Program.cs:8:namespace ConsoleApplication1

Program.cs:9:{

Program.cs:10:    class Program

Program.cs:11:    {

Program.cs:12:        static void Main(string[] args)

Program.cs:13:        {

Program.cs:14:            string a = "hello world";

"hello world">?

Help ----------

        =<name> = the string will be externalised as <name>

        sf = skip file : file will not processed on next run

        if = ignore file : file will be processed on next run

        sl = skip line : line will be processed on next run

        il = ignore line : line will be processed on next run (default)

        x, exit = exit script

        all skip information in stored in "skip_list.txt"

Program.cs:14:            string a = "hello world";

"hello world">=HelloString

            string a = Strings.HelloString;

Program.cs:15:            string x = "skip this line";

"skip this line">sl

Program.cs:16:            string b = "escape sequences  \n\r\t\\\"";

"escape sequences  \n\r\t\\\"">=EscString

            string b = Strings.EscString;

Program.cs:17:            string c = @"cant handle this one";

Program.cs:18:        }

Program.cs:19:    }

Program.cs:20:}

Writing Resource File "properties/Resources.resx" : done

Writing Strings class "Strings.cs" : done

Writing Skip data "skip_list.txt" : done

 

Effectively you can see the script run through the source file (actually it runs through all the cs files) and prompt you with each string. It also shows a little help on the actions possible.

 

To replace a string, you need to give it a name. Simply type =<name> and the string will get replaced.

 

If you don’t want to do anything about a particular line, type ‘sl’ for skip line and the script will skip that line. It also adds the line to a file called skip_list.txt so that subsequent runs of strings.rb will not keep prompting you to patch the same line.

 

You can similarly choose to skip a file using the ‘sf’ option. You will typically want to skip the *.designer.cs files, the Strings.cs file, and so on.

 

All skip information is human readable and is stored in a text file called skip_list.txt.

 

Strings.rb is designed to be run multiple times over the same project throughout its development, so that it can incrementally catch new strings as they appear in your code base. The resx and Strings.cs files are recreated at each run.

 

To show you the output of the process, this is what happened.

 

This is the new Program.cs file –

#region Using directives

 

using System;

using System.Collections.Generic;

using System.Text;

 

#endregion

 

namespace ConsoleApplication1

{

    class Program

    {

        static void Main(string[] args)

        {

            string a = Strings.HelloString;

            string x = "skip this line";

            string b = Strings.EscString;

            string c = @"cant handle this one";

        }

    }

}

 

This is the new resx file –

<?xml version="1.0"?>

<root>

  <resheader name="resmimetype">

    <value>text/microsoft-resx</value>

  </resheader>

  <resheader name="version">

    <value>2.0</value>

  </resheader>

  <resheader name="reader">

    <value>System.Resources.ResXResourceReader, System.Windows.Forms, Version=2.0.3600.0, Culture=neutral, PublicKeyToken=b77a5c561934e089</value>

  </resheader>

  <resheader name="writer">

    <value>System.Resources.ResXResourceWriter, System.Windows.Forms, Version=2.0.3600.0, Culture=neutral, PublicKeyToken=b77a5c561934e089</value>

  </resheader>

  <data name="HelloString">

    <value xml:space="preserve">hello world</value>

  </data>

  <data name="EscString">

    <value xml:space="preserve">escape sequences 

 

       \"</value>

  </data>

</root>

 

Notice that the two strings have appeared here.

 

And this is the new Strings.cs file –

#region Using directives

 

using System;

using System.Collections.Generic;

using System.Text;

using ConsoleApplication1.Properties;

 

#endregion

 

namespace ConsoleApplication1

{

    public class Strings

    {

 

//start

              // "escape sequences  \n\r\t\\\""

              public static string EscString { get { return Resources.ResourceManager.GetString("EscString"); } }

 

              // "hello world"

              public static string HelloString { get { return Resources.ResourceManager.GetString("HelloString"); } }

 

//stop

 

    }

}

 

Also, if you are interested in seeing the skip data, this is the skip_list.txt that got created –

Program.cs:::string x = "skip this line";
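Since the skip data is plain text, it is easy to read from your own scripts. Here is a minimal sketch, assuming each entry has the form "file:::trimmed source line" as in the sample above. How strings.rb records whole-file skips is not shown in this article, so that case is not modeled, and the names SkipEntry, parse_skip_entry and skipped? are hypothetical.

```ruby
# Assumption: each skip entry is "<file>:::<trimmed source line>",
# as in the sample skip_list.txt; whole-file entries are not modeled.
SkipEntry = Struct.new(:file, :text)

# Splits one line of skip data into its file name and source text.
def parse_skip_entry(entry)
  file, text = entry.chomp.split(':::', 2)
  SkipEntry.new(file, text)
end

# True when the given source line of the given file was skipped earlier.
def skipped?(entries, file, source_line)
  entries.any? { |e| e.file == file && e.text == source_line.strip }
end

raw = ['Program.cs:::string x = "skip this line";']
entries = raw.map { |line| parse_skip_entry(line) }
puts skipped?(entries, 'Program.cs', '            string x = "skip this line";')
```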

 

Limitations

1) The string matching done by the script is fairly limited. Basically, it identifies strings in the C# code by matching against the following regex –

strings.rb:15:$string_pattern = /[^@]("(\\.|[^\\"])*")/

This does not cleanly cover every escape sequence that a string can have. It also does not support verbatim @"" strings. But, well, this covers a large number of the strings that you will face, so it’s good enough to get along. Also, if you can get me a better pattern, I would be happy.

 

The script iterates over all strings on a line of cs code using –

      line.scan($string_pattern).each {|str,e1|

            # str is the string

      }
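As one possible improvement, here is a sketch of a pattern that also recognizes verbatim strings (@"...", where a doubled "" stands for a quote). It is not the pattern used by strings.rb, and the names STRING_PATTERN and find_string_literals are made up for this example. It still assumes that quotes never appear inside comments or char literals such as '"', which a real C# tokenizer would have to handle.

```ruby
# Sketch only: matches either a verbatim string (@"..." with "" as an
# escaped quote) or a regular string with backslash escapes.
# Assumption: comments and char literals are not treated specially.
STRING_PATTERN = /@"(?:[^"]|"")*"|"(?:\\.|[^\\"])*"/

# Returns every string literal found on one line of C# source.
def find_string_literals(line)
  line.scan(STRING_PATTERN)
end

puts find_string_literals('string a = "hello world"; string c = @"say ""hi""";')
```

Unlike the original pattern, this one does not need a character in front of the opening quote, so a literal at the very start of a line is also found. Because it has no capture groups, scan returns the matched literals directly.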

 

 

2) The resx tags generated by the script are those valid for the Visual C# Express Edition Beta 1 format. I don’t know if this resx format is valid for other versions of Studio; I would expect that it is. Even if it is not, you can easily patch it for your version of Studio. This is how –

 

The resx file has a tag added for each string definition that looks like this –

  <data name="HelloString">

    <value xml:space="preserve">Hello world</value>

  </data>

 

If your Studio generates tags like this, then you are OK. If not, just patch the following block of Ruby code to generate your tags. It’s fairly easy –

            el = doc.root.add_element "data"

            el.add_attribute("name", key)

            val = el.add_element("value")

            val.add_attribute("xml:space","preserve")

            val.text = remove_esc_seq($map[key])

This is part of the writeresx() function.
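To experiment with the tag layout outside the script, the same REXML calls can be wrapped in a small standalone helper. This is a sketch under assumptions: add_string_resource is a hypothetical name, the document here is a bare root element rather than a real resx file, and the value is passed in directly instead of going through remove_esc_seq.

```ruby
require 'rexml/document'

# Hypothetical helper; the real writeresx() in strings.rb may differ.
# Appends <data name="key"><value xml:space="preserve">value</value></data>.
def add_string_resource(doc, key, value)
  el = doc.root.add_element('data')
  el.add_attribute('name', key)
  val = el.add_element('value')
  val.add_attribute('xml:space', 'preserve')
  val.text = value
  el
end

doc = REXML::Document.new('<root></root>')
add_string_resource(doc, 'HelloString', 'hello world')
puts doc.to_s
```

If your version of Studio expects different element or attribute names, this is the only place that would need to change.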

 

3) The escape sequence handling in the script is a hack: it’s funny and it’s limited. It’s actually a little sad:

def add_esc_seq(str)

       str.gsub("\\", "<double_back_slash>").gsub("\"", "\\\"").gsub("\n", "\\n").gsub("\t", "\\t").gsub("\r", "\\r").gsub("<double_back_slash>", '\\\\\\')

end

 

def remove_esc_seq(str)

       str.gsub("\\\\","<back_slash>").gsub("\\n", "\n").gsub("\\t", "\t").gsub("\\r", "\r").gsub("\\\"", "\"").gsub("<back_slash>","\\")

end

 

These are, however, good enough for \r \n \t \\ \" etc.
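A less fragile alternative is to handle every escape in a single pass with a lookup table, which avoids the placeholder trick. The sketch below is not what strings.rb does; escape_csharp and unescape_csharp are hypothetical names, and only the same five escapes are covered (no \uXXXX, \0 and so on).

```ruby
# Sketch: single-pass escaping and unescaping with lookup tables.
# Assumption: only \n, \t, \r, \\ and \" need to be supported.
UNESCAPES = { 'n' => "\n", 't' => "\t", 'r' => "\r", '\\' => '\\', '"' => '"' }.freeze
ESCAPES = { "\n" => '\\n', "\t" => '\\t', "\r" => '\\r', '\\' => '\\\\', '"' => '\\"' }.freeze

# Turns C# source text such as a\tb (backslash, t) into the real characters.
def unescape_csharp(str)
  str.gsub(/\\(.)/) do
    ch = Regexp.last_match(1)
    # Unknown escapes are left untouched.
    UNESCAPES.fetch(ch) { "\\#{ch}" }
  end
end

# Turns real characters back into C# source escapes.
def escape_csharp(str)
  str.gsub(/[\\"\n\t\r]/) { |ch| ESCAPES.fetch(ch) }
end

original = "tab\there, \"quoted\", back\\slash"
puts unescape_csharp(escape_csharp(original)) == original
```

Because each backslash pair is consumed exactly once, a source sequence like \\n (an escaped backslash followed by the letter n) is not mistaken for a newline.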

 

4) The resx XML doesn’t look too nice, though it works. This is because the REXML library produces badly formatted XML. You can download the XML Pretty Printing program of mine and run it on the output resx file for pretty XML formatting.

 

5) “The setup is a little contrived and all this requires me to know Ruby programming”

If you actually said that, then this script is not for you, for the simple reason that it is home-grown and not meant to be a polished product in any way. You don’t need to know much Ruby just to get it working; you need to know Ruby only if you want to extend it in non-obvious ways. Secondly, the setup isn’t that contrived if you have been using Ruby. You would, most likely, have most of the tools in place already.

 

Finally, Why Ruby?

My only real answer to the question is that I wanted to get the job done. For an example take a look at the engine code and peaceful separation that it gives me from the prompt/ui code.  

 

That’s it. So if you are geeky enough and consider it below your dignity to get down to doing a menial job of looking through source files and copying out strings to the resx files – then this script might help you.

 

Download Strings.rb

 

P.S. It’s a lot of effort documenting any Ruby program that is more than 200 lines. It just does too many things.

 

Monday, January 17, 2005 8:40:18 AM (Eastern Standard Time, UTC-05:00)
 Thursday, December 02, 2004

Here is another command line tool. Strangely, a couple of quick web searches did not turn up a command line tool for resizing images, so I wrote my own. If you have photographs from a digital camera that you want to mail out and the images are too large for email, you usually end up opening each image in some image editing software and resizing it by hand.

 

Image Manipulation Utility v0.1

(c) Roshan James, Dec 1 2004

 Img v0.1 is built on the .Net 2.0 GDI+ API and supports only creation

 of JPG image files. Exif/Iptc metadata are lost during convertions.

 

Syntax:

     imgmanip [/S] <filepattern> [additional patterns] <image size>

         /S               - recurse subdirs

         <filepattern>    - any wildcard combination

         <image size>     - format <Width>x<Height>, Ex: 800x600

 

(Don’t tell me it looks cheesy – I know it does – but it solves the problem)

A part of this source I found on the web, so appropriate mention is given to the original article.

 

Here are a few usage examples

 

> img *.jpg 800x600

File1.800x600.jpg

File2.800x600.jpg

This basically converts all jpg files to images of 800 x 600 resolution.
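The argument handling is simple enough to sketch. The tool itself is written in C#, so the Ruby below is only an illustration: parse_size and output_name are hypothetical names, and the naming rule (insert the size before the extension) is inferred from the example output above rather than taken from the source.

```ruby
# Sketch of the size argument and output naming seen in the example.
# Assumption: the size is always "<Width>x<Height>" in whole pixels.
def parse_size(arg)
  m = /\A(\d+)x(\d+)\z/.match(arg)
  raise ArgumentError, "expected <Width>x<Height>, got #{arg.inspect}" unless m

  [m[1].to_i, m[2].to_i]
end

# Inserts the size before the extension: File1.jpg -> File1.800x600.jpg
def output_name(path, size)
  ext = File.extname(path)
  "#{path.chomp(ext)}.#{size}#{ext}"
end

width, height = parse_size('800x600')
puts "#{width} by #{height}"
puts output_name('File1.jpg', '800x600')
```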

 

To convert recursively through subdirectories:

> img /S *.jpg 800x600

File1.800x600.jpg

File2.800x600.jpg

Simple?

 

If the original images have any metadata, it is not retained in the new ones. What is this metadata? Well, most cameras insert information about the camera into the image file. You can also add your own information, such as a title, description or comments, to the image. To see this information (on WinXP), simply right-click the image file and look at the Properties -> Summary tab. Also, if you tinker with the column settings of Explorer in detail view, you can display some of this info directly in Explorer.

 

I can think of a bunch of simple, useful things to add: format conversions, cropping, borders, grayscale etc. Let’s see…

 

The code is a simple usage of the .Net GDI+ API. The downloadable exe is compiled for .Net 2.0, but you can recompile from source for the version you want. To compile, run the following from a .Net SDK 2.0 command line –

>csc img.cs

 

Download

 

Speaking of image metadata, if you are a Ruby programmer, take a look at the exif library available. EXIF is a metadata tagging standard for image files.

http://raa.ruby-lang.org/list.rhtml?name=rexif

Thursday, December 02, 2004 1:14:33 AM (Eastern Standard Time, UTC-05:00)
 Saturday, June 19, 2004

A few days back I found what seemed to be a book about Ruby. This was being discussed on the Ruby mailing list. It’s called “A Little Ruby” or more precisely “A Little Ruby. A Lot of Objects”. You can find it here:

http://web.archive.org/web/20030618203059/visibleworkings.com/little-ruby/

(Someday it will be available here: http://www.visibleworkings.com/little-ruby/ )

 

Instead of writing the whole thing myself or copy-pasting it, I ask you to simply go read the book. That is my blog entry for the day.

 

The “Little Ruby” book is a conversation between two people in which some sublime ideas about the design philosophy of the Ruby language are discussed. The book itself is a pleasure to read and, more importantly, to think about. (It is an incomplete book, only 3 chapters; the author, Brian Marick, said on the Ruby list that he hopes to complete it sometime.)

 

Reading “Little Ruby” put a phrase into my thinking: “Model of Computation”. I don’t know if this sounds sober, but I think this is what I am really looking for.

In all my tinkering around languages, compilers, runtimes and other things – I am looking for a Model of Computation, a fundamental set of programmatic thought abstractions that are beautiful and can encompass various forms of programming.

 

The Little Ruby book talks about a model of computation where all computation is simply built around the idea of passing messages to objects. It is a simple, concrete idea on which the rest of the Ruby world is built (apart from syntactic sugar). I don’t know if you are used to thinking in this way, but it is a powerful form of thought.
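To make the idea concrete, here is a small Ruby illustration of my own (it is not taken from the book). Even arithmetic is a message sent to an object, and an object can decide at run time how to answer a message it has no method for. The Greeter class is invented for this example.

```ruby
# Ordinary syntax is sugar for sending a message to an object.
sum = 1.send(:+, 2)            # same as 1 + 2
shout = 'ruby'.send(:upcase)   # same as 'ruby'.upcase

# Invented example class: it answers any say_* message
# even though no such methods are defined.
class Greeter
  def method_missing(name, *args)
    text = name.to_s
    text.start_with?('say_') ? text.sub('say_', '') : super
  end

  def respond_to_missing?(name, include_private = false)
    name.to_s.start_with?('say_') || super
  end
end

puts sum
puts shout
puts Greeter.new.say_hello
```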

 

Let me quote from one of the conversations toward the end of the third chapter (the last chapter that is written so far):

 

“A language that provides lots of features

will always be missing that one feature you

need.”

 

“But a language that chooses the right

simple rules for you to combine lets you

build the features you need.”

 

This is the basic idea of composition: small integral units that compose to produce powerful behavioral entities. Have you ever wondered why a Unix command shell user never really thought much of a Win/DOS user? It is because the shell forces you to think in terms of composing small do-one-thing-well tools into powerful meta-tools, and that is a greater thought pattern.

 

You might have heard this being said about tools in the old unix culture (I say ‘old’ because I have different opinions of ‘unix’ culture as it is now)

 

"This is the Unix philosophy. Write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because that is a universal interface."

--Doug McIlroy

 

The “Little Ruby” book is inspired by the old book “The Little LISPer”, which is now on my reading list, though I can’t seem to get a copy of it anywhere. The present edition of the book is called “The Little Schemer”. The book is co-written by Prof Daniel P Friedman of Indiana University and Prof Matthias Felleisen of Rice University. “The Little Schemer” discusses a different model of computation from the one “Little Ruby” describes.

 

I did not know this then, but sometime last year I was in email correspondence with Prof Friedman. Had I known at the time that he is the author of a respected LISP textbook, I might have been frightened off the prospect of asking this, but in one of the mails I had asked “why Lisp?”

 

Roshan,

 

The most fundamental building block of computation is composition. If the language does not support composition in a trivial way, then I have no use for it.  ML, Haskell, LISP, and Scheme each give a kind of composition.  Composition is the building block of Category Theory, which is a unifying tool that helps clarify much of mathematics. and logic.  So, thinking that it would be okay to use a language that does not support composition is impossible for me.

 

(I quote this here presently without his permission, I believe he would be ok though).

I didn’t understand him then. But now after a year, I think I am closer to understanding him.
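As a small illustration of what composition looks like in code (my own sketch, not something from Prof Friedman's mail), Ruby 2.6 and later let you compose procs with the >> operator, much like piping small Unix tools together. The lambdas below are invented for the example.

```ruby
# Three small do-one-thing functions (requires Ruby 2.6+ for Proc#>>).
strip = ->(s) { s.strip }
downcase = ->(s) { s.downcase }
words = ->(s) { s.split }

# Compose them left to right into one new function, like a shell pipeline.
normalize = [strip, downcase, words].reduce(:>>)

puts normalize.call('  Do One Thing Well ').inspect
```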

 

What would a unified model of computation be? Can such a thing exist? Can we think of all computation using a minimal and powerful set of abstractions such that every other form of computation can be built out of it? Can it be easy and fun enough to use that we could interact with it on a day-to-day basis?

 

And what forms the underlying foundation for computation then might also form the underlying basis for other systems of organized thought as well. This is like the dream of Grand Unified Field Theory in physics. Can something like that exist in the computational systems as well?

 

I don’t know enough to guess. However, I believe that as long as we keep pursuing computing in a way that is fun and simple, we are probably on the right track.

 

 

To end this entry I want to quote from the preface of “Little Ruby”:

 

Welcome to my little book. In it, my goal is to teach you a way to think about computation, to show you how far you can take a simple idea: that all computation consists of sending messages to objects. Object-oriented programming is no longer unusual, but taking it to the extreme - making everything an object - is still supported by only a few programming languages.

 

Can I justify this book in practical terms? Will reading it make you a better programmer, even if you never use "call with current continuation" or indulge in "metaclass hackery"? I think it might, but perhaps only if you're the sort of person who would read this sort of book even if it had no practical value.

 

The real reason for reading this book is that the ideas in it are neat. There's an intellectual heritage here, a history of people building idea upon idea. It's an academic heritage, but not in the fussy sense. It's more a joyous heritage of tinkerers, of people buttonholing their friends and saying, "You know, if I take that and think about it like this, look what I can do!"

 

As a closing note, sometime last year I was looking to do research under someone working with the SSCLI code base and work on virtual machines and runtimes. I wanted to do my Masters.

 

At that time the best way I could describe what I wanted to do was to say that I was looking at runtimes and virtual machines research with a specific interest in SSCLI. Now, maybe I can describe myself a little better.

 

The only way I could think of doing this at that time was to ask around in online forums and mailing lists about universities doing work with Rotor, accompanied by a barrage of mails to everyone who I thought might know or could point me in the right direction. One name that came up was that of Prof Ralph Johnson of UIUC. Just now I was looking for Brian Marick (author of “Little Ruby”) on Google: Brian is a research student doing his PhD under Prof. Johnson.

 

Saturday, June 19, 2004 2:50:25 AM (Eastern Standard Time, UTC-05:00)
 Wednesday, April 21, 2004

This is a Wish List for Ruby. Ruby is an excellent language; however, here are some small things that I would like to see added to Ruby:

 

  • Threading
    I wish Ruby had real threads. The threading support currently provided is really sad. It would be awesome if Rite could have OS threads as Ruby threads, as in the .Net framework, instead of doing them as interpreter threads. Right now, writing any sort of meaningful multithreaded application in Ruby is pointless.

  • C/C++ style operators
    I wish ruby had ++, -- operators. They really do not contribute to unmanageable code and on the whole are nice things to have.
  • Use of Curly Braces { }
    I wish that Ruby would let the usage of cur