

Fabian Schmied edited this page Jul 15, 2010 · 1 revision

Published on July 2nd, 2010 at 19:31

Support for VB-specific string comparisons in re-linq

In a previous post of mine, I discussed the VB.NET compiler’s habit of emitting very special expressions when a string comparison is performed in a LINQ query. Instead of emitting a simple BinaryExpression over the two strings, as C# would, VB.NET emits a MethodCallExpression to the Microsoft.VisualBasic.CompilerServices.Operators.CompareString method and compares its integer result with zero. The reason for this is that VB.NET has some special features regarding string comparison, and the standard BinaryExpressions cannot represent those features.
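To make the difference concrete, here is a small C# sketch that produces both tree shapes with System.Linq.Expressions. The VB tree is built by hand to mimic what the VB.NET compiler emits for `s = "abc"`; the class and method names are made up for illustration, and on the .NET Framework the code assumes a reference to Microsoft.VisualBasic.dll.

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;
using Microsoft.VisualBasic.CompilerServices;

public static class ComparisonShapes
{
    // C# compiler output for s == "abc": a plain BinaryExpression (Equal)
    // whose operands are the two strings.
    public static Expression<Func<string, bool>> CSharpShape()
    {
        return s => s == "abc";
    }

    // Hand-built stand-in for VB.NET's output for s = "abc":
    // Equal(Call(Operators.CompareString(s, "abc", False)), 0).
    public static Expression<Func<string, bool>> VbShape()
    {
        ParameterExpression s = Expression.Parameter(typeof(string), "s");
        MethodInfo compareString = typeof(Operators).GetMethod(
            "CompareString", new[] { typeof(string), typeof(string), typeof(bool) });
        Expression call = Expression.Call(
            compareString, s, Expression.Constant("abc"), Expression.Constant(false));
        return Expression.Lambda<Func<string, bool>>(
            Expression.Equal(call, Expression.Constant(0)), s);
    }
}
```

A provider that only expects the first shape finds a MethodCallExpression where it assumed a string operand, which is where the bug reports tend to come from.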

Many LINQ providers are written in C# and therefore never learn of that peculiarity, until one day a VB.NET user submits a bug report about string comparisons leading to an exception in the LINQ provider. Therefore I thought it sensible to provide support for this VB.NET quirk directly in re-linq’s front-end.

It has taken some time (there was a lot of other work to do), but starting with re-linq 1.13.65, there is now built-in support for VB.NET string comparisons.

(Unfortunately, the release notes of that version describe the feature in a different form than was implemented. The release notes for version 1.13.68 will contain an updated version of the feature description that is consistent with my explanations below.)

There were a few options for handling this issue, and choosing among them meant balancing two conflicting goals:

  • make things as simple as possible for the standard LINQ provider (one that simply needs to know that two strings were compared, without any VB-specific quirks), and
  • provide all necessary information to those LINQ providers that want to simulate VB’s string comparison semantics.

The first goal’s trivial implementation would be to have re-linq detect the MethodCallExpressions and replace them with standard BinaryExpressions [1]. That way, the LINQ provider would never even notice that VB.NET produces different expression trees; VB.NET’s comparisons would be handled just like C#’s ones.
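As a rough illustration of that trivial approach, the following sketch uses the .NET 4.0 ExpressionVisitor to rewrite `CompareString(a, b, textCompare) = 0` into `a == b`. It is not re-linq’s actual code: the class name is hypothetical, it only handles the equality case, and it matches the method by name so that it does not need a reference to the VB runtime assembly.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical rewriter, not re-linq's implementation: turns
// CompareString(a, b, textCompare) == 0 into a == b.
public sealed class CompareStringRewriter : ExpressionVisitor
{
    protected override Expression VisitBinary(BinaryExpression node)
    {
        if (node.NodeType == ExpressionType.Equal
            && node.Left is MethodCallExpression call
            && IsCompareString(call)
            && node.Right is ConstantExpression zero
            && Equals(zero.Value, 0))
        {
            // The textCompare argument (call.Arguments[2]) is dropped here.
            return Expression.Equal(Visit(call.Arguments[0]), Visit(call.Arguments[1]));
        }
        return base.VisitBinary(node);
    }

    private static bool IsCompareString(MethodCallExpression call)
    {
        return call.Method.Name == "CompareString"
            && call.Method.DeclaringType != null
            && call.Method.DeclaringType.FullName
               == "Microsoft.VisualBasic.CompilerServices.Operators";
    }
}
```

Note the comment inside the rewrite: the third argument simply disappears from the resulting tree.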

On the other hand, this couldn’t possibly fulfill the second goal. Once the MethodCallExpression has been replaced, vital information (the textCompare parameter to the CompareString method, for example) is lost. To fulfill the second goal, the most sensible thing would therefore be to leave the MethodCallExpression in the expression tree – or to replace it with a specific custom expression, let’s call it VBStringComparisonExpression, which would hold all the required metadata about the comparison. Then, the LINQ provider could recognize this expression and deal with the expression the VB way. On the downside, this would require every LINQ provider to deal with those VBStringComparisonExpressions. And back to square one.

So, the question was, how to reach both goals at once?

One possibility would have been to make it configurable. I didn’t like this option for a number of reasons. First, as a library user, I like APIs that just work, without my having to configure anything, read through documentation, and so on. Second, re-linq is not currently designed to be configurable at this point in the front-end, and while it would have been possible to add this feature, it seemed strange to add a configuration infrastructure just for this one option of turning automatic VB support on and off. Third, I knew there had to be a better way.

And find one I did: The .NET BCL 4.0 made some changes to the LINQ Expression classes, which were required because the Dynamic Language Runtime uses the same classes to represent its abstract syntax trees. There were a few features I liked so much that I re-implemented them in re-linq (which also has to run on .NET 3.5), in the ExtensionExpression base class. That class contains some boilerplate code for the Visitor pattern, and it also has a feature for reducing expressions. If an extension expression returns true from its CanReduce property, this means that its Reduce method can be used to transform the expression into a simplified version with the same (more or less) semantics. And this is the feature I used to implement the VBStringComparisonExpression.
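The reduction mechanism can be sketched with the .NET 4.0 Expression base class itself, which is where CanReduce and Reduce originate. The class below is a hypothetical stand-in and not re-linq’s VBStringComparisonExpression (which derives from re-linq’s own ExtensionExpression); it merely shows how a node can carry extra metadata while still offering a simpler equivalent.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical stand-in built on the .NET 4.0 Expression base class,
// not on re-linq's ExtensionExpression.
public sealed class VbComparisonSketch : Expression
{
    public VbComparisonSketch(Expression comparison, bool textCompare)
    {
        Comparison = comparison ?? throw new ArgumentNullException(nameof(comparison));
        TextCompare = textCompare;
    }

    public Expression Comparison { get; }   // the simplified comparison
    public bool TextCompare { get; }        // VB-specific metadata kept alongside

    public override ExpressionType NodeType => ExpressionType.Extension;
    public override Type Type => Comparison.Type;

    // Announce that a simpler equivalent exists, and hand it out on request.
    public override bool CanReduce => true;
    public override Expression Reduce() => Comparison;
}
```

Consumers that do not know the node type can call Reduce and carry on with the plain comparison; the BCL’s lambda compiler, for example, reduces extension nodes this way before generating code.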

Here are the facts:

  • When parsing an expression tree into a QueryModel, re-linq’s frontend will now automatically detect calls to VB.NET’s CompareString operator.
  • It will simplify that comparison by using a BinaryExpression [1]. The BinaryExpression will be wrapped into a VBStringComparisonExpression, which also contains the textCompare flag defining how VB.NET would compare the string.
  • VBStringComparisonExpression is implemented so that its VisitChildren method will automatically visit the simplified BinaryExpression. This means that ordinary expression visitors (derived from the ExpressionTreeVisitor base class) can simply choose to ignore the VBStringComparisonExpressions. The simplified BinaryExpressions will automatically be visited.
  • VBStringComparisonExpression.Reduce is implemented so that it returns the simplified BinaryExpression. Visitors that are defined to throw an exception on any expression whose type they don’t know should try to reduce such expressions before giving up. That way, they can deal with complex, unknown expressions as long as they reduce to something simpler. ThrowingExpressionTreeVisitor, the base class that implements this behavior, does this automatically. This means that throwing expression visitors (derived from the ThrowingExpressionTreeVisitor base class) can also ignore the VBStringComparisonExpressions. The BinaryExpression will automatically be used instead.
  • If a visitor wants to explicitly deal with VBStringComparisonExpressions, it can implement the IVBSpecificExpressionVisitor interface. (Of course, such a visitor has to visit the expression before any ThrowingExpressionTreeVisitor reduces the nodes. Ordinary ExpressionTreeVisitors won’t reduce, so they are safe to be used at any time.)
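The three kinds of visitors in the list above can be imitated with the .NET 4.0 ExpressionVisitor. This is only a sketch of the idea: every type name below is hypothetical, and re-linq’s own classes (ExpressionTreeVisitor, ThrowingExpressionTreeVisitor, IVBSpecificExpressionVisitor) may differ in their details.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// Hypothetical counterpart of a VB-specific visitor interface.
public interface IVbAwareVisitor
{
    Expression VisitVbComparison(VbComparisonNode node);
}

// Hypothetical wrapper node: simplified comparison plus the textCompare flag.
public sealed class VbComparisonNode : Expression
{
    public VbComparisonNode(Expression comparison, bool textCompare)
    {
        Comparison = comparison;
        TextCompare = textCompare;
    }

    public Expression Comparison { get; }
    public bool TextCompare { get; }
    public override ExpressionType NodeType => ExpressionType.Extension;
    public override Type Type => Comparison.Type;
    public override bool CanReduce => true;
    public override Expression Reduce() => Comparison;

    // VB-aware visitors get a dedicated callback; all others fall through.
    protected override Expression Accept(ExpressionVisitor visitor)
    {
        return visitor is IVbAwareVisitor vbAware
            ? vbAware.VisitVbComparison(this)
            : base.Accept(visitor);
    }

    // Ordinary visitors reach the simplified comparison through here.
    protected override Expression VisitChildren(ExpressionVisitor visitor)
    {
        Expression visited = visitor.Visit(Comparison);
        return visited == Comparison ? this : new VbComparisonNode(visited, TextCompare);
    }
}

// Ordinary visitor: knows nothing about VB, still sees the inner comparison.
public sealed class BinaryCollector : ExpressionVisitor
{
    public List<ExpressionType> Seen { get; } = new List<ExpressionType>();

    protected override Expression VisitBinary(BinaryExpression node)
    {
        Seen.Add(node.NodeType);
        return base.VisitBinary(node);
    }
}

// Throwing-style visitor: reduces unknown extension nodes before giving up.
public sealed class StrictVisitor : ExpressionVisitor
{
    protected override Expression VisitExtension(Expression node)
    {
        if (node.CanReduce)
            return Visit(node.Reduce());
        throw new NotSupportedException("Unknown expression: " + node.GetType().Name);
    }
}

// VB-aware visitor: inspects the flag instead of ignoring the wrapper.
public sealed class TextCompareDetector : ExpressionVisitor, IVbAwareVisitor
{
    public bool SawTextCompare { get; private set; }

    public Expression VisitVbComparison(VbComparisonNode node)
    {
        SawTextCompare |= node.TextCompare;
        return node;
    }
}
```

Running all three over the same wrapper node shows the intended division of labor: the collector records the inner Equal node and leaves the wrapper in place, the strict visitor returns the plain comparison, and the detector sees the flag.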

And here is a picture:

[Figure: VBStringComparisonExpression]

The concept works quite well. I’ve tested it with re-linq’s own SQL backend, which does not handle VB comparisons in a special way. So far, the backend works nicely with VB expressions without explicitly dealing with them in any way.

It seems it was indeed possible to have the cake and eat it, too.


[1] This is actually a simplification. VB.NET’s CompareString is also produced for the <, <=, >, and >= operators. BinaryExpression can’t represent those operators for strings, so the “standard” expressions for them involve a MethodCallExpression to String.CompareTo.
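For illustration, one way such a “standard” relational expression could be built in C# is sketched below. The helper class is hypothetical; it relies on the fact that Expression.LessThan cannot be applied to two string operands directly, because System.String defines no `<` operator.

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;

public static class RelationalStringComparison
{
    // Builds left.CompareTo(right) < 0 as a stand-in for left < right on
    // strings (hypothetical helper, for illustration only).
    public static BinaryExpression LessThan(Expression left, Expression right)
    {
        MethodInfo compareTo = typeof(string).GetMethod("CompareTo", new[] { typeof(string) });
        return Expression.LessThan(
            Expression.Call(left, compareTo, right),
            Expression.Constant(0));
    }
}
```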

- Fabian

Comments

Jorge Vargas - July 16th, 2010 at 10:17

Hey Fabian, thank you for this post.

With it I solved the bug in NHibernate by reducing the VBStringComparisonExpression to a BinaryExpression (the lazy way :P).

So, once again, thank you very very much for your help man. I already sent a mail to the NHibernate google group to let them know about this and my fix. I hope they approve it.

Thanks again man, cheers from Mexico :)
