Precompiled Or Inline Regular Expressions



What to Do
Consider using precompiled or inline regular expressions when the same set of regular expressions is used frequently in an application.
  • Precompiled regular expressions are created by passing the RegexOptions.Compiled option when constructing the regular expression.
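  • Inline regular expressions are created with the Regex.CompileToAssembly method, which compiles them into a separate assembly. This method is available on .NET Framework; on .NET Core and .NET 5 or later it throws PlatformNotSupportedException.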
Why
  • Interpreted regular expressions have the shortest start-up time but the slowest matching at runtime. If a regular expression is used rarely in the application, use an interpreted regular expression.
  • Precompiled regular expressions take a little more time at start-up but perform better at runtime. Use them for the regular expressions that are used most often in the application. Precompiled regular expressions give 30% better runtime performance than interpreted regular expressions.
  • Inline regular expressions are compiled into a separate regular expression assembly. Their start-up time is comparable to that of interpreted regular expressions, and they also provide the runtime performance of precompiled regular expressions.
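The trade-off described above can be observed with a small timing harness. The following is a minimal sketch, not a rigorous benchmark: the class name RegexTimingSketch, the sample inputs, and the pass count are all made up for illustration, and the numbers it prints will vary with the machine and the runtime version.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

public static class RegexTimingSketch
{
    // Builds a Regex with the given options, then counts matches over the inputs.
    // Construction time (start-up cost) and matching time are reported separately.
    public static int CountMatches(string pattern, RegexOptions options, string[] inputs, int passes)
    {
        Stopwatch build = Stopwatch.StartNew();
        Regex regex = new Regex(pattern, options);   // start-up cost is paid here
        build.Stop();

        int matches = 0;
        Stopwatch run = Stopwatch.StartNew();
        for (int pass = 0; pass < passes; pass++)
        {
            foreach (string input in inputs)
            {
                if (regex.IsMatch(input)) matches++;
            }
        }
        run.Stop();

        Console.WriteLine("{0,-12} build: {1,8:F2} ms   match: {2,8:F2} ms",
            options, build.Elapsed.TotalMilliseconds, run.Elapsed.TotalMilliseconds);
        return matches;
    }

    public static void Main()
    {
        // Hypothetical sample data; a real test would use representative input.
        string[] zipCodes = { "98052", "98052-1234", "9805", "abcde" };
        string pattern = @"\d{5}(-\d{4})?";

        CountMatches(pattern, RegexOptions.None, zipCodes, 100000);      // interpreted
        CountMatches(pattern, RegexOptions.Compiled, zipCodes, 100000);  // precompiled
    }
}
```

Both calls return the same match count; only the timings differ. Measuring with the application's own patterns and data is the safest way to decide whether the extra start-up cost of RegexOptions.Compiled pays off.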
When
  • Select a traditional interpreted regular expression if it is used rarely in the application.
  • Select precompiled regular expressions if a limited set of regular expressions is used repeatedly in the application.
  • Select inline regular expressions if the regular expressions are shared across applications and used many times.
How
  • Interpreted Regular Expressions can be created as follows:
...
Regex r = new Regex("xyz*");                      // instance form, no options: interpreted
Match m = Regex.Match("9876foo", @"(\d*)foo");    // static form: also interpreted
...
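  • Precompiled Regular Expressions can be created as follows:
...
Regex r = new Regex("xyz*", RegexOptions.Compiled);
// The static form is only compiled if the option is passed to it as well.
Match m = Regex.Match("9876foo", @"(\d*)foo", RegexOptions.Compiled);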
...
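  • Inline Regular Expressions can be created as follows:
...
// Arguments: pattern, options, type name, namespace, isPublic. AssemblyName is in System.Reflection.
RegexCompilationInfo regexInfo = new RegexCompilationInfo("xyz*", RegexOptions.None, "standard", "XYZ.Regex", true);
// The assembly name is given without the .dll extension; the output file is bar.dll.
Regex.CompileToAssembly(new RegexCompilationInfo[] { regexInfo }, new AssemblyName("bar"));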
...
Problem Example
A data cleansing application reads data from a database, performs data cleansing operations, and saves the corrected data to a new database. It reads fields such as email address, zip code, and phone number, and validates their structure using regular expressions. The application uses interpreted regular expressions and constructs them again for every row. Because it must process millions of database rows, its runtime performance is slow.
...
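// rowCount and the group names "user" and "host" are assumed; the originals were lost from this listing.
for (i = 0; i < rowCount; i++)
{
 //Validate Email Address (a new interpreted Regex is constructed on every iteration)
 Regex reg = new Regex(@"(?<user>[^@]+)@(?<host>.+)");
 Boolean match = reg.IsMatch(emailAddress);
 if (!match)
 {
   //process the data
 }

 //Validate Zip Code
 reg = new Regex(@"\d{5}(-\d{4})?");
 match = reg.IsMatch(zipCode);
 if (!match)
 {
   //process the data
 }

 //Validate Phone Number
 reg = new Regex(@"((\(\d{3}\) ?)|(\d{3}-))?\d{3}-\d{4}");
 match = reg.IsMatch(phoneNumber);
 if (!match)
 {
   //process the data
 }
}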
...
Solution Example
The same data cleansing application, rewritten to use precompiled regular expressions. The three Regex objects are constructed once, before the loop, with RegexOptions.Compiled and are reused for every row, which gives better runtime performance:
...
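// Group names "user" and "host" and the loop bound rowCount are assumed; the originals were lost.
Regex reg1 = new Regex(@"(?<user>[^@]+)@(?<host>.+)", RegexOptions.Compiled);
Regex reg2 = new Regex(@"\d{5}(-\d{4})?", RegexOptions.Compiled);
Regex reg3 = new Regex(@"((\(\d{3}\) ?)|(\d{3}-))?\d{3}-\d{4}", RegexOptions.Compiled);

// The loop body was truncated in the original listing; it presumably mirrors the problem example.
for (i = 0; i < rowCount; i++)
{
 if (!reg1.IsMatch(emailAddress)) { /* process the data */ }
 if (!reg2.IsMatch(zipCode)) { /* process the data */ }
 if (!reg3.IsMatch(phoneNumber)) { /* process the data */ }
}
...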
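As a self-contained illustration of the precompiled approach, the sketch below keeps the three validators in static readonly fields so that each is compiled once and reused for every row. The class name RowValidator, the method IsValidRow, the sample rows, and the group names "user" and "host" are all assumptions made for this example; they do not come from the original application.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RowValidator
{
    // Compiled once, when the class is first used, and reused for every row.
    // The group names "user" and "host" are assumed for illustration.
    private static readonly Regex Email =
        new Regex(@"(?<user>[^@]+)@(?<host>.+)", RegexOptions.Compiled);
    private static readonly Regex Zip =
        new Regex(@"\d{5}(-\d{4})?", RegexOptions.Compiled);
    private static readonly Regex Phone =
        new Regex(@"((\(\d{3}\) ?)|(\d{3}-))?\d{3}-\d{4}", RegexOptions.Compiled);

    // Returns true only if all three fields match their patterns.
    public static bool IsValidRow(string email, string zip, string phone)
    {
        return Email.IsMatch(email) && Zip.IsMatch(zip) && Phone.IsMatch(phone);
    }

    public static void Main()
    {
        // Hypothetical rows standing in for data read from the database.
        string[][] rows =
        {
            new[] { "alice@example.com", "98052", "(425) 555-0123" },
            new[] { "not-an-email", "98052-1234", "555-0123" },
            new[] { "bob@example.com", "9805", "425-555-0123" },
        };

        foreach (string[] row in rows)
        {
            if (!IsValidRow(row[0], row[1], row[2]))
            {
                // Stand-in for the "process the data" step.
                Console.WriteLine("Needs cleansing: " + string.Join(", ", row));
            }
        }
    }
}
```

With these sample rows, the second row is flagged because the email has no "@", and the third because the zip code has only four digits. The patterns are unanchored, as in the listings above, so IsMatch succeeds when any substring matches; stricter validation would likely add ^ and $ anchors.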