I recently had some C# code that had to be made localizable. Most articles about localization and internationalization that you find on the web talk about how nice Visual Studio is for code internationalization and show the many ways the forms designer can extract strings into a resx file. I am perfectly OK with Studio doing all that work for you. Very often, however, there are strings in your actual code that Studio does not externalize to resx files.
Strings.rb is a Ruby script that parses your C# code base, identifies literal string definitions, and moves them to your resx file. The code was hacked up to fill a personal need, so your mileage may vary. The tool certainly isn’t foolproof, and there are certain cases that it doesn’t handle too well. If, however, you are on the smart-scripter side of things, you may find it useful.
The script needs to be set up for your specific project. Once that is done, you can run it as many times as you like on your code base, and it will incrementally catch strings and externalize them for you. This is handy while your code is still undergoing changes, because new strings can be identified and moved out as they pop up.
Getting Started
Downloads
1) First, download the script (strings.rb) and put it in your project folder.
2) Download and install Ruby from here – http://rubyforge.org/frs/?group_id=167. It’s about 12 MB, and the installation happens in a snap.
3) Download and install the REXML library for XML handling in Ruby from here –
http://www.germane-software.com/archives/rexml_3.1.2.zip
http://www.germane-software.com/software/rexml/docs/tutorial.html
Patching Strings.rb for your project
1) You need to patch the script file to have the correct path to your resx file and the path to your wrapper class that will be used to read strings from your resx file.
Open the script file in a text editor. (If you have ruby installed you should find this editor called scite in the ruby installation folder – that’s a nice editor. Alternately you might want to try installing scite - http://scintilla.sourceforge.net/SciTEDownload.html - about 600k).
In your project identify your resx file. It will usually be in Properties\Resources.resx.
Change the following line in the rb file to reflect the path to your resx file.
strings.rb:4:$resx_fn = "properties/Resources.resx"
(The actual line number might change a bit)
2) Now create a new class in your project called Strings. VS should typically create an empty class definition file that looks like this.
#region Using directives
using System;
using System.Collections.Generic;
using System.Text;
#endregion
namespace <Some Namespace>
{
    public class Strings
    {
    }
}
Patch the file with the following additions
- Add a using directive for your ‘Properties’ namespace.
- Add a comment that says //start and one that says //stop. These act as delimiters between which the script will generate the string definitions.
#region Using directives
using System;
using System.Collections.Generic;
using System.Text;
using <Some namespace>.Properties;
#endregion
namespace <Some Namespace>
{
    public class Strings
    {
        //start
        //stop
    }
}
3) This is the wrapper class into which the script will generate string definitions. You need to patch the script with the path to this class file. Basically patch this line –
strings.rb:5:$stringsclass_fn = "helper/Strings.cs"
Done
If you have got this far then your installation is done and you are ready to go.
For the sake of completeness, let me list the steps again –
1) download the script and put it into the project folder
2) install ruby
3) install the REXML library for Ruby
4) patch the script with the path to the resx file of the project
5) create an empty Strings class and add the using directive and comment markers to it
6) patch the script to have the correct path to your Strings.cs file.
What does the script do?
The script does a few basic things.
1) it parses your *.cs files in all subdirectories and looks for strings.
2) when it finds a string it prompts the user for an action
3) if it is a string that should be localized the user can provide a pseudonym for the string. On getting this name the script will -
1) add the string and the name to the resx file
2) add a property to the Strings class that will read the string from the resx file
3) replace the string literal in the code with a call to the property.
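The steps above can be sketched in a few lines of Ruby. This is a minimal sketch, not the actual strings.rb source: the pattern is the one quoted in the Limitations section, while the helper names literals_in, externalize, and each_literal are made up for illustration. Prompting the user and writing the resx file are left out.

```ruby
# Pattern quoted in the Limitations section of this article.
STRING_PATTERN = /[^@]("(\\.|[^\\"])*")/

# Return every string literal (quotes included) found in one line of C# source.
def literals_in(line)
  line.scan(STRING_PATTERN).map { |str, _last_char| str }
end

# Replace one literal in a line with a call to the generated property.
# The name is assumed to be a plain identifier such as HelloString.
def externalize(line, literal, name)
  line.sub(literal, "Strings.#{name}")
end

# Walk every .cs file under root and yield [path, zero-based line index, literal].
def each_literal(root)
  Dir.glob(File.join(root, "**", "*.cs")).sort.each do |path|
    File.readlines(path).each_with_index do |line, index|
      literals_in(line).each { |literal| yield path, index, literal }
    end
  end
end
```

A caller would pass a block to each_literal, ask the user what to do with each literal, and call externalize when a name is supplied.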
Running the script
To run the script after all the previous setup, simply go to the command line and type strings.rb
Here is a sample run of the Strings.rb script
Let me take up a simple project and show you how the internationalization script works.
Here is a project that has only one Program.cs file –
#region Using directives
using System;
using System.Collections.Generic;
using System.Text;
#endregion
namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            string a = "hello world";
            string x = "skip this line";
            string b = "escape sequences \n\r\t\\\"";
            string c = @"cant handle this one";
        }
    }
}
The resx file looks like this –
<?xml version="1.0" encoding="utf-8"?>
<root>
<resheader name="resmimetype">
<value>text/microsoft-resx</value>
</resheader>
<resheader name="version">
<value>2.0</value>
</resheader>
<resheader name="reader">
<value>System.Resources.ResXResourceReader, System.Windows.Forms, Version=2.0.3600.0, Culture=neutral, PublicKeyToken=b77a5c561934e089</value>
</resheader>
<resheader name="writer">
<value>System.Resources.ResXResourceWriter, System.Windows.Forms, Version=2.0.3600.0, Culture=neutral, PublicKeyToken=b77a5c561934e089</value>
</resheader>
</root>
(I have removed some unnecessary details from the original resx file here)
I created this Strings class –
#region Using directives
using System;
using System.Collections.Generic;
using System.Text;
using ConsoleApplication1.Properties;
#endregion
namespace ConsoleApplication1
{
    public class Strings
    {
        //start
        //stop
    }
}
This is what happens when you run the strings.rb script –
C:\work\vcsexpress\Sample1\Sample1>strings
Error reading skip data! continuing with no skip data.
HelloString = hello world
EscString = escape sequences \n\n\t\\\"
Program.cs:0:n++#region Using directives
Program.cs:1:
Program.cs:2:using System;
Program.cs:3:using System.Collections.Generic;
Program.cs:4:using System.Text;
Program.cs:5:
Program.cs:6:#endregion
Program.cs:7:
Program.cs:8:namespace ConsoleApplication1
Program.cs:9:{
Program.cs:10: class Program
Program.cs:11: {
Program.cs:12: static void Main(string[] args)
Program.cs:13: {
Program.cs:14: string a = "hello world";
"hello world">?
Help ----------
=<name> = the string will be externalised as <name>
sf = skip file : file will not processed on next run
if = ignore file : file will be processed on next run
sl = skip line : line will be processed on next run
il = ignore line : line will be processed on next run (default)
x, exit = exit script
all skip information in stored in "skip_list.txt"
Program.cs:14: string a = "hello world";
"hello world">=HelloString
string a = Strings.HelloString;
Program.cs:15: string x = "skip this line";
"skip this line">sl
Program.cs:16: string b = "escape sequences \n\r\t\\\"";
"escape sequences \n\r\t\\\"">=EscString
string b = Strings.EscString;
Program.cs:17: string c = @"cant handle this one";
Program.cs:18: }
Program.cs:19: }
Program.cs:20:}
Writing Resource File "properties/Resources.resx" : done
Writing Strings class "Strings.cs" : done
Writing Skip data "skip_list.txt" : done
Effectively you can see the script run through the source file (actually it runs through all the cs files) and prompt you with each string. It also shows a little help on the actions possible.
To replace a string, you need to give it a name. Simply type =<name> and the string will get replaced.
If you don’t want to do anything about a particular line, type ‘sl’ for skip line and it will skip that line. It also adds the line to a file called skip_list.txt so that subsequent runs of strings.rb will not keep prompting you to patch the same line.
You can similarly choose to skip a whole file using the ‘sf’ option. You may typically want to skip the *.designer.cs files, the Strings.cs file, etc.
All skip information is human readable and is stored in a text file called skip_list.txt.
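The prompt commands can be read as a small dispatch table. The sketch below follows the help text printed in the sample run. It is not taken from strings.rb: the function name parse_reply and the action symbols are hypothetical, and treating "?" as a request for help is an inference from the sample run.

```ruby
# Interpret one reply typed at the prompt.
# Returns [action, argument]; the action symbols are illustrative names only.
def parse_reply(reply)
  text = reply.strip
  case text
  when /\A=(\w+)\z/ then [:externalize, $1] # =<name> externalizes the string
  when "sf" then [:skip_file, nil]
  when "if" then [:ignore_file, nil]
  when "sl" then [:skip_line, nil]
  when "x", "exit" then [:exit, nil]
  when "?" then [:help, nil]                # assumption based on the sample run
  else [:ignore_line, nil]                  # "il" is the documented default
  end
end
```

Falling through to ignore_line keeps an accidental Enter harmless: the line is left alone and will be offered again on the next run.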
Strings.rb is designed to be run multiple times over the same project throughout its development, so that it can incrementally catch new strings as they appear in your code base. The resx and Strings.cs files are recreated at each run.
To show you the output of the process, this is what happened.
This is the new Program.cs file –
#region Using directives
using System;
using System.Collections.Generic;
using System.Text;
#endregion
namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            string a = Strings.HelloString;
            string x = "skip this line";
            string b = Strings.EscString;
            string c = @"cant handle this one";
        }
    }
}
This is the new resx file –
<?xml version="1.0"?>
<root>
<resheader name="resmimetype">
<value>text/microsoft-resx</value>
</resheader>
<resheader name="version">
<value>2.0</value>
</resheader>
<resheader name="reader">
<value>System.Resources.ResXResourceReader, System.Windows.Forms, Version=2.0.3600.0, Culture=neutral, PublicKeyToken=b77a5c561934e089</value>
</resheader>
<resheader name="writer">
<value>System.Resources.ResXResourceWriter, System.Windows.Forms, Version=2.0.3600.0, Culture=neutral, PublicKeyToken=b77a5c561934e089</value>
</resheader>
<data name="HelloString">
<value xml:space="preserve">hello world</value>
</data>
<data name="EscString">
<value xml:space="preserve">escape sequences
\"</value>
</data>
</root>
Notice that the two strings have appeared here.
And this is the new Strings.cs file –
#region Using directives
using System;
using System.Collections.Generic;
using System.Text;
using ConsoleApplication1.Properties;
#endregion
namespace ConsoleApplication1
{
    public class Strings
    {
        //start
        // "escape sequences \n\r\t\\\""
        public static string EscString { get { return Resources.ResourceManager.GetString("EscString"); } }
        // "hello world"
        public static string HelloString { get { return Resources.ResourceManager.GetString("HelloString"); } }
        //stop
    }
}
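The region between the //start and //stop markers can be regenerated along these lines. This is a rough sketch under a few assumptions, not the script's real code: the helper names property_lines and regenerate are invented here, the hash is assumed to map each property name to the literal exactly as it appeared in the source, and generated lines are written without indentation.

```ruby
# Build a comment line and a property line per name, sorted by name
# as in the sample Strings.cs above.
def property_lines(strings)
  strings.keys.sort.flat_map do |name|
    ["// #{strings[name]}",
     "public static string #{name} { get { return Resources.ResourceManager.GetString(\"#{name}\"); } }"]
  end
end

# Return class_source with everything between //start and //stop replaced.
def regenerate(class_source, strings)
  out = []
  inside = false
  class_source.each_line(chomp: true) do |line|
    if line.strip == "//start"
      out << line
      out.concat(property_lines(strings))
      inside = true                 # drop old generated lines from here on
    elsif line.strip == "//stop"
      inside = false
      out << line
    elsif !inside
      out << line                   # hand-written code outside the markers
    end
  end
  out.join("\n") + "\n"
end
```

Because the old region is discarded and rebuilt from the full map, running it twice with the same map gives the same file, which fits a class file that is recreated at each run.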
Also, if you are interested in seeing the skip data, this is the skip_list.txt that got created –
Program.cs:::string x = "skip this line";
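Judging from the entry above, each skipped line is stored as the file name, a ":::" separator, and the text of the line. A reader for that format might look like the sketch below. The helper names are hypothetical. Two further assumptions are made: that the stored text has its leading whitespace removed, and that an entry with no line text would stand for a skipped file (the sample shows no such entry).

```ruby
SKIP_SEPARATOR = ":::"

# Parse the text of skip_list.txt into a hash of file name => skipped line texts.
def parse_skip_list(text)
  skips = Hash.new { |hash, key| hash[key] = [] }
  text.each_line do |entry|
    entry = entry.chomp
    next if entry.empty?
    file, line = entry.split(SKIP_SEPARATOR, 2)
    skips[file] << line.to_s
  end
  skips
end

# True when this source line of this file was skipped on an earlier run.
def skipped?(skips, file, source_line)
  skips.key?(file) && skips[file].include?(source_line.strip)
end
```

Matching on the line text instead of the line number means a skip entry survives edits elsewhere in the file, but is lost if the skipped line itself changes.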
Limitations
1) The string matching that is done by the script is fairly limited. Basically it identifies strings in the C# code by matching each line against the following regex –
strings.rb:15:$string_pattern = /[^@]("(\\.|[^\\"])*")/
This does not cleanly cover all the escape sequences that a string can have. It also does not support verbatim strings of the form @"". But, well, it covers a large number of the strings that you would face, so it’s good enough to get along. Also, if you can get me a better pattern match, I would be happy.
The script iterates over all strings on a line of cs code using –
line.scan($string_pattern).each {|str,e1|
  # str is the string literal, including its surrounding quotes
}
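One possible direction for a better pattern is to try a verbatim alternative first and fall back to the regular form. The pattern below is a suggestion, not part of strings.rb, and the names CSHARP_STRING and csharp_literals are made up for this example. It still knows nothing about comments, character literals such as '"', or interpolated strings, so treat it as a starting point.

```ruby
# Verbatim literal: @"..." where a doubled "" stands for one quote.
# Regular literal:  "..."  where a backslash escapes the next character.
CSHARP_STRING = /@"(?:[^"]|"")*"|"(?:\\.|[^\\"])*"/

# Return every literal on the line, including the @ prefix of verbatim strings.
def csharp_literals(line)
  line.scan(CSHARP_STRING)
end
```

With no capture groups in the pattern, scan returns the whole match for each literal, so the caller can tell the two kinds apart by checking whether the match starts with @.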
2) The resx file tags that are generated by the script are those that are valid for the Visual C# Express Edition Beta 1 format. I don’t know if this resx format is valid for other versions of Studio. I would expect that it is. Even if it is not, you can easily patch it for your version of Studio. This is how –
The resx file has a tag added for each string definition that looks like this –
<data name="HelloString">
<value xml:space="preserve">Hello world</value>
</data>
If your Studio generates tags like this, then you are OK. If it does not, just patch the following block of Ruby code to generate your tags. It’s fairly easy –
el = doc.root.add_element "data"
el.add_attribute("name", key)
val = el.add_element("value")
val.add_attribute("xml:space","preserve")
val.text = remove_esc_seq($map[key])
This is part of the writeresx() function.
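To experiment with the tag layout outside the script, the same REXML calls can be run against a tiny in-memory document. This is a standalone sketch: the function name add_resx_string and the bare <root> document are assumptions for illustration, and the value is written as-is without the script's escape sequence handling.

```ruby
require "rexml/document"

# Add one <data> entry to a resx document, using the same calls as the block above.
def add_resx_string(doc, name, value)
  el = doc.root.add_element("data")
  el.add_attribute("name", name)
  val = el.add_element("value")
  val.add_attribute("xml:space", "preserve")
  val.text = value
  el
end

# A bare root element stands in for a real resx file here.
doc = REXML::Document.new("<root></root>")
add_resx_string(doc, "HelloString", "hello world")

# Serialize into a string and print it to inspect the generated tags.
output = +""
doc.write(output)
puts output
```

If your version of Studio expects different attributes or child elements, the body of add_resx_string is the only place that would need to change.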
3) The escape sequence handling in the script is a hack. It’s funny, and it’s limited. It’s actually a little sad:
def add_esc_seq(str)
str.gsub("\\", "<double_back_slash>").gsub("\"", "\\\"").gsub("\n", "\\n").gsub("\t", "\\t").gsub("\r", "\\r").gsub("<double_back_slash>", '\\\\\\')