In this article I will share my experience with binary serialization of types between assemblies that do not reference each other. As it turned out, there are real and "legitimate" cases when you need to deserialize data without a reference to the assembly in which its types are declared. I will describe the scenario in which this was required, the solution, and the intermediate mistakes made along the way.
Introduction. Formulation of the problem
We work with a large corporation in the field of geology. Historically, the corporation has accumulated very diverse software for working with data from different types of equipment, plus data analysis and forecasting. Alas, these programs do not always get along with each other, and more often they do not get along at all. To consolidate the information somehow, a web portal is now being created, to which the different programs upload their data as XML. The portal then tries to build a more or less complete picture. An important caveat: since the portal developers are not experts in the subject area of each application, each team provides a parser/converter module that maps its XML to the portal's data structures.
I work in a team that develops one of these applications, and we wrote the export mechanism for our part of the data fairly easily. But then the business analyst decided that the central portal needed one of the reports that our program builds. This is where the first problem appeared: the report is rebuilt every time, and its results are not saved anywhere.
“So save them!” the reader will probably think. I thought so too, but I was seriously disappointed by the requirement that the report be built from the data already uploaded. Nothing to be done: the logic had to be moved.
Stage 0. Refactoring. No signs of trouble
It was decided to extract the report-building logic into a separate class (the report itself is just a table with 4 columns, but there is a whole lot of logic behind it) and to include the file with this class as a link in the parser assembly. This way we:
- Avoid direct copying
- Guard against version discrepancies
Extracting logic into a separate class is not a difficult task. But from there on things were not so rosy: the algorithm was based on business objects, and transferring them did not fit into our concept. I had to rewrite the methods so that they accepted and operated only on simple types. It was not always simple, and in places it required decisions whose elegance remains in question, but overall we obtained a reliable solution without obvious hacks.
One detail remained, the kind that, as is well known, often serves as a cozy shelter for the devil: previous generations of developers left us a strange approach, in which some of the data required for building the report is stored in the database as binary-serialized .NET objects (the questions "why?", "hooow?" and so on will, alas, remain unanswered, since there is no one left to ask). As input to the calculations we naturally have to deserialize them.
These types, which we could not get rid of, were also included "by link", especially since they were fairly simple.
Stage 1. Deserialization. Remember the full name of the type
Having done all of the above and started a test run, I unexpectedly received a runtime error:
[A]Namespace.TypeA cannot be cast to [B]Namespace.TypeA. Type A originates from 'Assembley.Application, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null' in the context 'Default' at location '...'. Type B originates from 'Assembley.Portal, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null' in the context 'Default' at location ''.
The very first Google results explained that BinaryFormatter writes not only the data but also type information to the output stream, which is logical. And since the full name of a type includes the assembly in which it is declared, from the point of view of .NET I was trying to deserialize a completely different type.
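To make the point about full type names concrete, here is a minimal sketch (not part of the original project; the class name QualifiedNameDemo is made up for illustration) that prints the assembly-qualified names the runtime uses to identify types. The exact assembly names in the output depend on the runtime, so none are shown here.

```csharp
using System;
using System.Collections.Generic;

public static class QualifiedNameDemo
{
    // Returns the full identity of a type: namespace, type name and the
    // display name of the declaring assembly (name, version, culture, key token).
    public static string Describe(Type type) => type.AssemblyQualifiedName;

    public static void Main()
    {
        Console.WriteLine(Describe(typeof(string)));

        // For generic types, every type argument carries its own assembly name too.
        Console.WriteLine(Describe(typeof(Dictionary<string, byte[]>)));
    }
}
```

Two types with the same namespace and name but different assembly parts are different types to the runtime, which is exactly what the cast error above reports.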
Scratching the back of my head, I made the obvious but, alas, flawed decision to replace the specific TypeA type with dynamic when deserializing. Everything worked. The report results matched exactly, and the tests on the build server passed. With a sense of accomplishment, we sent the task to the testers.
Stage 2. The main one. Serialization between assemblies
Payback came quickly in the form of bugs filed by the testers: the parser on the portal side crashed with an exception saying that it could not load the Assembley.Application assembly (the assembly from our application). First thought: I had not cleaned up the references. But no, everything was fine, nothing referenced it. I ran it again in the sandbox, and everything worked. I began to suspect a build error, but then an unpleasant thought occurred to me: I changed the parser's output path to a separate folder instead of the application's common bin directory. And voila, I got the described exception. A closer look confirmed the vague suspicion: deserialization was failing.
The realization was quick and painful: replacing the specific type with dynamic changed nothing. BinaryFormatter still created the type from the external assembly. When the assembly with the type was lying next to the parser, the runtime naturally loaded it; when the assembly was absent, we got an error.
There was reason to be sad. But googling gave hope in the form of the SerializationBinder class. As it turned out, it lets you control which type the data is deserialized into. To do this, create a derived class and override the following method in it:
public abstract Type BindToType(String assemblyName, String typeName);
In it you can return any type for the given arguments. The BinaryFormatter class has a Binder property, where you can inject your implementation.
It would seem the problem is solved. But again, the details remain (see above).
First, you must handle requests for all types (standard ones too).
A rather interesting implementation was found on the Internet, but it tries to use the default binder from BinaryFormatter with a construct like
var defaultBinder = new BinaryFormatter().Binder
But in fact the Binder property is null by default. Analysis of the source code showed that BinaryFormatter checks whether Binder is set: if it is, its methods are called; if not, internal logic is used, which ultimately boils down to
var assembly = Assembly.Load(assemblyName);
return FormatterServices.GetTypeFromAssembly(assembly, typeName);
Without further ado, I reproduced the same logic in my own binder.
Here is what the first implementation looked like.
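using System;
using System.Reflection;
using System.Runtime.Serialization;

public class MyBinder : SerializationBinder
{
    public override Type BindToType(string assemblyName, string typeName)
    {
        if (assemblyName.Contains("<ObligatoryPartOfNamespace>"))
        {
            // One of our own types: resolve it by name alone, ignoring the
            // assembly recorded in the stream (Type.GetType searches the
            // calling assembly, then the core library).
            return Type.GetType(typeName);
        }

        // Any other type: load it from the assembly recorded in the stream.
        return LoadTypeFromAssembly(assemblyName, typeName);
    }

    // Mirrors what BinaryFormatter does internally when no binder is set.
    private Type LoadTypeFromAssembly(string assemblyName, string typeName)
    {
        if (string.IsNullOrEmpty(assemblyName) || string.IsNullOrEmpty(typeName))
            return null;

        var assembly = Assembly.Load(assemblyName);
        return FormatterServices.GetTypeFromAssembly(assembly, typeName);
    }
}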
That is, we check whether the assembly name belongs to the project: if so, we return the type from the current domain; if it is a system type, we load it from the corresponding assembly.
It looks logical. We start testing: our type arrives, we substitute it, and it is created. Hooray! A string arrives, and we take the branch that loads from the assembly. It works! Time to open the virtual champagne...
But then a Dictionary arrives, with elements of custom types. Since Dictionary is a system type, we obviously try to load it from its assembly; but its type arguments are our types, again written with full qualification (assembly, version, key), so we fail again. (There should be a sad smiley here.)
Clearly, the incoming type name has to be changed by substituting references to the desired assembly. I really hoped that there was an analogue of the AssemblyName class for type names, but I did not find anything similar. Writing a universal parser with replacement is not an easy task. After a series of experiments I came up with the following solution: in a static constructor I read the types to be replaced, then I search for their names in the name string of the type being created, and when I find one, I replace the name of the assembly that follows it.
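The original listing for this step is not reproduced here, so what follows is only a minimal sketch of the idea under stated assumptions. The class name TypeNameRewriter, the hard-coded list of replaced types and the target assembly string are all hypothetical; the author read the list of types in a static constructor, which is simplified here to a static array.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TypeNameRewriter
{
    // Hypothetical: display name of the assembly that now contains the linked types.
    private const string TargetAssembly =
        "Assembley.Portal, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null";

    // Hypothetical: full names of the types whose assembly must be substituted.
    private static readonly string[] ReplacedTypes = { "SomeNamespace.CustomType" };

    // Replaces the assembly part that follows each known type name inside a
    // (possibly generic) type name string.
    public static string Rewrite(string typeName)
    {
        foreach (var name in ReplacedTypes)
        {
            // "<type>, <assembly display name>" up to and including the
            // PublicKeyToken value, which is assumed to come last.
            var pattern = @"(?<![\w.])" + Regex.Escape(name)
                        + @",\s*[^\[\]]*?PublicKeyToken=\w+";
            typeName = Regex.Replace(typeName, pattern, name + ", " + TargetAssembly);
        }
        return typeName;
    }
}
```

A binder could call Rewrite on the typeName argument before resolving the type. Names that contain none of the listed types pass through unchanged.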
The approach relies on the fact that PublicKeyToken comes last in the assembly part of a type description. It may not be 100% reliable, but in my tests I did not find any cases where this was not so.
So a string like
"System.Collections.Generic.Dictionary`2[[SomeNamespace.CustomType, Assembley.Application, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null],[System.Byte[], mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]]"
turns into
"System.Collections.Generic.Dictionary`2[[SomeNamespace.CustomType, Assembley.Portal, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null],[System.Byte[], mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]]"
Now everything finally worked "like clockwork." Small technical details remained: if you remember, we included the files by link from the main application. But the main application does not need all these dances, so conditional compilation was applied, like this:
BinaryFormatter binForm = new BinaryFormatter();
#if EXTERNAL_LIB
binForm.Binder = new MyBinder();
#endif
Accordingly, we define the EXTERNAL_LIB symbol in the portal assembly, but not in the main application.
"Non-retreat"
In fact, while coding, in order to verify the idea quickly, I made one miscalculation that probably cost me a certain number of nerve cells: to begin with, I simply hard-coded the type replacement for Dictionary. As a result, deserialization produced an empty Dictionary, which also "crashed" when I tried to perform some operations on it. I was already beginning to think that BinaryFormatter could not be fooled, and I started desperate experiments with writing a class derived from Dictionary. Fortunately, I stopped almost in time and went back to writing a universal substitution mechanism. Having implemented it, I realized that to create a Dictionary it is not enough to redefine its own type: you also need to take care of the types for KeyValuePair<TKey, TValue> and Comparer, which are also requested from the binder.
Such were my adventures in binary serialization. I would be grateful for feedback.