
Another Pattern Matching in C#

Recently, while working in C#, I have felt an increasingly acute need for a pattern matching mechanism, which is present in many modern multi-paradigm languages (F#, Scala, etc.) but not in C#. The implementations I found in half an hour of searching (example) build match expressions through fluent interfaces, which in my opinion is syntactically rather cumbersome. The second drawback, more significant in real use, is the overhead of iterating over the predicates in a loop, which happens "under the hood" in such matchers. So I set out to write my own implementation of pattern matching, following two basic principles: the syntax should not be cumbersome, and matching should not pay for a loop over the predicates on every call.





The result is below the cut.





Appearance



Initially, I wanted the construction of a matcher to resemble “real” pattern matching and to avoid chained method calls. I decided to use a list of "predicate, function" pairs. To allow the collection initializer syntax, the Matcher class implements IEnumerable and has an Add method. For ease of use (for example, to pass a matcher to Select), the class also defines an implicit conversion to Func<>.


Here is how it looks when used:

    Func<string, int> match = new Matcher<string, int>
    {
        {s => string.IsNullOrEmpty(s), s => 0},
        {s => true, s => s.Length}
    };

    int len1 = match(null);  // 0
    int len2 = match("abc"); // 3
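To make the mechanics concrete, here is a minimal sketch of a naive matcher that supports this syntax. It is an assumption of how such a class could look, not the library's actual code: the name NaiveMatcher, the use of KeyValuePair to store the cases, and InvalidOperationException for the unmatched case are all illustrative choices.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical naive matcher; names are illustrative, not the library's actual code.
public class NaiveMatcher<TIn, TOut> : IEnumerable<KeyValuePair<Predicate<TIn>, Func<TIn, TOut>>>
{
    private readonly List<KeyValuePair<Predicate<TIn>, Func<TIn, TOut>>> _cases =
        new List<KeyValuePair<Predicate<TIn>, Func<TIn, TOut>>>();

    // IEnumerable plus a public Add method is what enables the { {p, f}, ... } initializer.
    public void Add(Predicate<TIn> predicate, Func<TIn, TOut> function)
    {
        _cases.Add(new KeyValuePair<Predicate<TIn>, Func<TIn, TOut>>(predicate, function));
    }

    // Tries the predicates in order on every call; the first match wins.
    public TOut Match(TIn value)
    {
        foreach (var c in _cases)
        {
            if (c.Key(value)) return c.Value(value);
        }
        throw new InvalidOperationException("Provided value was not matched with any case");
    }

    // Lets a matcher be used wherever a Func<TIn, TOut> is expected.
    public static implicit operator Func<TIn, TOut>(NaiveMatcher<TIn, TOut> matcher)
    {
        return matcher.Match;
    }

    public IEnumerator<KeyValuePair<Predicate<TIn>, Func<TIn, TOut>>> GetEnumerator()
    {
        return _cases.GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}
```

With this sketch, the usage above works unchanged if Matcher is replaced by NaiveMatcher. The loop inside Match is exactly the kind of per-call overhead that a compiled expression is meant to remove.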




Implementation



The first implementation, written while I was still searching for the syntax, was “naive”: like the ones I had found, it evaluated the predicates one after another against the argument passed to Match. Once the code satisfied the first point (not looking cumbersome), I rewrote the matcher using Expression<>:



    using System;
    using System.Collections;
    using System.Collections.Generic;
    using System.Linq;
    using System.Linq.Expressions;

    // Pair<,> and MatchException are helper types not shown here.
    public class ExprMatcher<TIn, TOut> : IEnumerable<Pair<Expression<Predicate<TIn>>, Expression<Func<TIn, TOut>>>>
    {
        private Func<TIn, TOut> _matcher;

        // Compiled lazily on first use.
        private Func<TIn, TOut> Matcher
        {
            get { return _matcher ?? (_matcher = CompileMatcher()); }
        }

        private readonly List<Pair<Expression<Predicate<TIn>>, Expression<Func<TIn, TOut>>>> _caseList =
            new List<Pair<Expression<Predicate<TIn>>, Expression<Func<TIn, TOut>>>>();

        public void Add(Expression<Predicate<TIn>> predicate, Expression<Func<TIn, TOut>> function)
        {
            _caseList.Add(new Pair<Expression<Predicate<TIn>>, Expression<Func<TIn, TOut>>>(predicate, function));
        }

        private Func<TIn, TOut> CompileMatcher()
        {
            var reverted = Enumerable.Reverse(_caseList).ToList();
            var arg = Expression.Parameter(typeof(TIn));
            var retVal = Expression.Label(typeof(TOut));

            // Innermost block: reached only when no case matched.
            // Expression.Default keeps the label value typed as TOut even when default(TOut) is null.
            var matcher = Expression.Block(
                Expression.Throw(Expression.Constant(new MatchException("Provided value was not matched with any case"))),
                Expression.Label(retVal, Expression.Default(typeof(TOut)))
            );

            // Wrap the cases from last to first so that the first case ends up outermost.
            foreach (var pair in reverted)
            {
                retVal = Expression.Label(typeof(TOut));
                var condition = Expression.Invoke(pair.First, arg);
                var action = Expression.Return(retVal, Expression.Invoke(pair.Second, arg));
                matcher = Expression.Block(
                    Expression.IfThenElse(condition, action, Expression.Return(retVal, matcher)),
                    Expression.Label(retVal, Expression.Default(typeof(TOut)))
                );
            }

            return Expression.Lambda<Func<TIn, TOut>>(matcher, arg).Compile();
        }

        public TOut Match(TIn value)
        {
            return Matcher(value);
        }

        public static implicit operator Func<TIn, TOut>(ExprMatcher<TIn, TOut> matcher)
        {
            return matcher.Match;
        }

        public IEnumerator<Pair<Expression<Predicate<TIn>>, Expression<Func<TIn, TOut>>>> GetEnumerator()
        {
            return _caseList.GetEnumerator();
        }

        IEnumerator IEnumerable.GetEnumerator()
        {
            return GetEnumerator();
        }
    }




On the first call to Match(), whether directly or through the delegate obtained from the conversion to Func, a chained expression is built and compiled; it throws a MatchException if the argument does not satisfy any of the predicates. As a result, the only overhead is the one-time cost of compiling the Expression.
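The shape of such a chained expression can be illustrated with a smaller sketch. Note that it is a simplification under stated assumptions: it uses Expression.Condition instead of the labels and Return expressions used in ExprMatcher, it throws InvalidOperationException rather than MatchException, and the names ChainSketch, Case, and Build are hypothetical.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical helper showing how cases nest into one conditional chain.
public static class ChainSketch
{
    // Builds: predicate(arg) ? function(arg) : fallback
    public static Expression Case<TIn, TOut>(
        ParameterExpression arg,
        Expression<Predicate<TIn>> predicate,
        Expression<Func<TIn, TOut>> function,
        Expression fallback)
    {
        return Expression.Condition(
            Expression.Invoke(predicate, arg),
            Expression.Invoke(function, arg),
            fallback);
    }

    public static Func<string, int> Build()
    {
        var arg = Expression.Parameter(typeof(string), "arg");

        // Innermost branch: reached only when no predicate matched.
        // The throw is typed as int so that both branches of the conditional agree.
        Expression body = Expression.Throw(
            Expression.New(
                typeof(InvalidOperationException).GetConstructor(new[] { typeof(string) }),
                Expression.Constant("Provided value was not matched with any case")),
            typeof(int));

        // Cases are wrapped in reverse order so that the first case ends up outermost.
        body = Case<string, int>(arg, s => s.Length < 5, s => s.Length, body);
        body = Case<string, int>(arg, s => string.IsNullOrEmpty(s), s => 0, body);

        // Compiled once; later calls run the compiled delegate without any loop.
        return Expression.Lambda<Func<string, int>>(body, arg).Compile();
    }
}
```

Calling the delegate returned by ChainSketch.Build() with null hits the outermost case, a short string such as "abc" falls through to the second case, and a string of five or more characters reaches the innermost throw.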



Algebraic types



Another shortcoming of C# for me was its lack of union types. I wanted to add them while keeping their use as safe as possible with respect to NPEs (null reference exceptions).

To begin with, a union of two types was implemented:

    public sealed class Union<T1, T2>
    {
        public object Value { get; private set; }
        public T1 Value1 { get; private set; }
        public T2 Value2 { get; private set; }

        public Union(T1 value)
        {
            Value1 = value;
            Value = value;
        }

        public Union(T2 value)
        {
            Value2 = value;
            Value = value;
        }

        public static explicit operator T1(Union<T1, T2> value)
        {
            return value.Value1;
        }

        public static explicit operator T2(Union<T1, T2> value)
        {
            return value.Value2;
        }

        public static implicit operator Union<T1, T2>(T1 value)
        {
            return new Union<T1, T2>(value);
        }

        public static implicit operator Union<T1, T2>(T2 value)
        {
            return new Union<T1, T2>(value);
        }
    }


Depending on which parameter is passed to the constructor, either Value1 or Value2 is initialized in the instance, and Value is initialized as well. This makes it possible to check the type of Value in a predicate with is, without worrying that the value could be of any type other than T1 and T2. Using a T4 template, Union overloads for up to 17 types were generated.
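A small sketch of that kind of check is shown below. To keep it self-contained it repeats a trimmed copy of Union<T1, T2> without the explicit operators; the UnionDemo class and its Describe method are hypothetical and exist only for illustration.

```csharp
using System;

// Trimmed copy of Union<T1, T2>, repeated only to keep the sketch self-contained.
public sealed class Union<T1, T2>
{
    public object Value { get; private set; }
    public T1 Value1 { get; private set; }
    public T2 Value2 { get; private set; }

    public Union(T1 value) { Value1 = value; Value = value; }
    public Union(T2 value) { Value2 = value; Value = value; }

    public static implicit operator Union<T1, T2>(T1 value) { return new Union<T1, T2>(value); }
    public static implicit operator Union<T1, T2>(T2 value) { return new Union<T1, T2>(value); }
}

// Hypothetical helper, not part of the library.
public static class UnionDemo
{
    public static string Describe(Union<int, string> u)
    {
        // Value holds whichever alternative was passed to the constructor,
        // so testing it with "is" tells the two cases apart.
        if (u.Value is int) return "int: " + u.Value1;
        if (u.Value is string) return "string: " + u.Value2;
        return "empty"; // e.g. a null string was wrapped
    }
}
```

For example, assigning 42 to a Union<int, string> variable and passing it to Describe takes the first branch, while assigning "abc" takes the second. Checking Value rather than Value1 matters here because Value1 is simply default(int) when the union was built from a string.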

Also, to simplify the initialization of matchers, subclasses of Matcher and ExprMatcher were written:

 public class ExprMatcher<T1, T2, T3> : ExprMatcher<T1, Union<T2, T3>> {} 




To complete the picture, a rather trivial Option type was also written.
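The Option type itself is not shown in this article. Purely as an assumption of what a trivial version might look like, here is a hypothetical sketch; the members None, Some, HasValue, Value, and GetOrElse are illustrative names and may differ from the library's actual API.

```csharp
using System;

// Hypothetical Option type; not the library's actual API.
public sealed class Option<T>
{
    private readonly T _value;

    public bool HasValue { get; private set; }

    private Option(T value, bool hasValue)
    {
        _value = value;
        HasValue = hasValue;
    }

    // The single "no value" instance for a given T.
    public static readonly Option<T> None = new Option<T>(default(T), false);

    // Refuses null so that a Some never hides a null reference.
    public static Option<T> Some(T value)
    {
        if (value == null) throw new ArgumentNullException("value");
        return new Option<T>(value, true);
    }

    // Throws instead of silently returning default(T) when there is no value.
    public T Value
    {
        get
        {
            if (!HasValue) throw new InvalidOperationException("Option has no value");
            return _value;
        }
    }

    public T GetOrElse(T fallback)
    {
        return HasValue ? _value : fallback;
    }
}
```

The point of such a type is the same as with Union: the absence of a value is made explicit in the type, so callers check HasValue (or supply a fallback) instead of running into a null reference.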



I hope that my matcher will be useful to someone:

Project on Bitbucket

NuGet package



Thanks for your attention!

Source: https://habr.com/ru/post/222979/


