
How I searched for (and found) the difference between two byte-identical files.

We have a .NET application that can load and use plugins. Plugins are a good thing: you can extend the functionality, update them quickly from your site, and even give users an SDK so they can write their own. We did all of that. Our plugins were ordinary .NET assemblies that had to be dropped into a specific folder, from which the main application loaded them. You can probably imagine how: Assembly.Load(), then look for a class that implements the required interface, create an instance of it, and so on. All this had worked stably for a long time, and nothing foreshadowed trouble. But at some point we needed a plugin consisting of several files. So we decided that a plugin would no longer be a single .NET assembly (one file) but a zip archive that can contain either one assembly or several files. I had to teach the build server to pack plugins into archives, and the main application to unpack them to the right place. A task for about 10 lines of code. And so I download the archive with the plugin from the build server, unzip it into the right folder, launch the application, and ... it doesn't work! Wait, what do you mean it doesn't work? It's the same plugin!

It gets better. I ask a colleague to repeat the same procedure on his computer. He tries, and everything works for him! How come? The same version of the application, the same file from the build server. Some difference in the environment? I sit down at his computer and try again: it doesn't work! He tries on mine: it works! So it turns out that the file "remembers" who unzipped it! We call a third colleague to watch this circus. One after another, on the same computer, we perform the same actions: download the archive with the plugin, unzip it into the right folder, start the application. When I do it, the program doesn't see the plugin; when my colleague does it, everything works. In the third round of these fascinating experiments we finally notice a difference in our actions: I was unzipping the plugin with the standard Windows tools, and my colleague with 7-Zip. Both are invoked from the archive's context menu, so at first nobody noticed that we were clicking different items. Well, OK. So a file extracted from a zip archive with 7-Zip differs from the same file extracted from the same archive with the standard Windows archiver?

By the way, before reading further, answer this question for yourself: can the contents of the files in a valid zip archive differ depending on whether they were unzipped with 7-Zip or with Windows Explorer?

Well, let's not guess; let's compare the files using WinMerge, which reports them as identical.


So the files are identical and should be loaded and processed in the same way? Not at all! WinMerge is lying. The files are different, and .NET loads them differently, too.

And now for the terrible truth.

When a file is downloaded from the Internet, Windows puts a special “flag” on it that records the trust zone of the site it came from. I think many people have seen the warning shown when launching a freshly downloaded executable: maybe you shouldn't run this, think it over, look at the certificate, and say what to do. Depending on the security policies and the file's origin, the paranoia level of these warnings varies, from no warning at all (running as administrator, UAC disabled, the file is signed) to blocking the launch (corporate environment, unsigned file). In between there are several stages where you have to say “yes, run it” one or more times. But all this applies only to exe files, right? No! A dll or an archive downloaded from the Internet gets this flag too! Technically, it is an NTFS alternate data stream, which can be viewed, for example, with the AlternateStreamView utility or with the command:

more < Plugin.dll:Zone.Identifier 
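
The same check can be scripted. Below is a minimal Python sketch, not the code from our application. It assumes the stream holds INI-style text with a [ZoneTransfer] section and a ZoneId key (the layout browsers usually write), and it only reads real streams on NTFS under Windows, where `path:Zone.Identifier` addresses the alternate stream. The function names parse_zone_id and read_zone_id are made up for illustration.

```python
import configparser
from typing import Optional

# Name of the alternate stream that holds the zone information.
ZONE_STREAM = ":Zone.Identifier"


def parse_zone_id(stream_text: str) -> Optional[int]:
    """Return the ZoneId found in the text of a Zone.Identifier stream, or None.

    Assumption: the text is INI-like, e.g. "[ZoneTransfer]\\nZoneId=3".
    """
    # interpolation=None keeps '%' in URL values from being treated specially.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    try:
        parser.read_string(stream_text)
        return parser.getint("ZoneTransfer", "ZoneId")
    except (configparser.Error, ValueError):
        return None


def read_zone_id(path: str) -> Optional[int]:
    """Read the zone of a file; None if the stream is absent or unreadable."""
    try:
        with open(path + ZONE_STREAM, "r", encoding="utf-8", errors="replace") as stream:
            return parse_zone_id(stream.read())
    except OSError:
        return None
```

A ZoneId of 3 corresponds to the Internet zone. For a file that has no such stream, read_zone_id returns None.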

And here we have the confluence of the following circumstances:

  1. When downloading, the browser creates an alternate stream “Zone.Identifier” for the downloaded archive and writes into it the ID of the zone the file came from.
  2. When unzipping, the standard Windows Explorer archiver reads not only the archive's main stream but also its alternate ones, and attaches them to each extracted file. (7-Zip does not do this.)
  3. The WinMerge utility compares only the main streams, so it reports the files created by 7-Zip and by Explorer as identical.
  4. In .NET, the Assembly.Load() method also takes the alternate streams into account: it finds a zone identifier with lower trust and refuses to load the file! Meanwhile, the familiar prompts asking the user to confirm running an untrusted application are not shown, and so we get our bug.
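
To see the difference that a main-stream-only comparison misses, the zone stream has to be compared separately. Here is a rough Python sketch of such a comparison, under the assumption that it runs on NTFS under Windows, where `path:StreamName` opens an alternate stream. The helper names read_stream and compare_with_zone are hypothetical, and this is not how WinMerge or .NET work internally.

```python
from typing import Optional, Tuple


def read_stream(path: str, stream_name: str = "") -> Optional[bytes]:
    """Return the bytes of the main stream (empty stream_name) or of a named
    alternate stream; None if it cannot be opened."""
    target = path + (":" + stream_name if stream_name else "")
    try:
        with open(target, "rb") as handle:
            return handle.read()
    except OSError:
        return None


def compare_with_zone(path_a: str, path_b: str) -> Tuple[bool, bool]:
    """Return (main_equal, zone_equal) for two files.

    main_equal is what a byte-by-byte diff of the file contents reports;
    zone_equal additionally compares the Zone.Identifier streams
    (two missing streams count as equal).
    """
    main_equal = read_stream(path_a) == read_stream(path_b)
    zone_equal = (read_stream(path_a, "Zone.Identifier")
                  == read_stream(path_b, "Zone.Identifier"))
    return main_equal, zone_equal
```

For the two copies of the plugin described above, one would expect the first value to be True and the second to be False.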

Dealing with the problem is quite simple: you need to check for the stream and delete it. In Windows, you can open the file's properties and click the Unblock button there (or do the same programmatically).
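
As an illustration of the programmatic route, here is a small Python sketch. It assumes NTFS under Windows, where deleting `path:Zone.Identifier` removes only the alternate stream and leaves the file itself intact. The function name unblock is made up for this example. PowerShell's Unblock-File cmdlet is a ready-made alternative.

```python
import os


def unblock(path: str) -> bool:
    """Delete the Zone.Identifier stream of a file.

    Returns True if a stream was removed and False if there was none.
    The main contents of the file are not touched.
    """
    try:
        os.remove(path + ":Zone.Identifier")
        return True
    except FileNotFoundError:
        return False
```

Calling unblock("Plugin.dll") a second time simply returns False, so it is safe to run it on every file in the plugin folder.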



If you do this for the archive before extracting files from it, the files extracted afterwards will not get the zone identifier.

Perhaps I have described banal and well-known things here, but the fact that different archivers can extract different files from the same archive, and different in such a cunning way that WinMerge doesn't see the difference while .NET does, was an interesting discovery for me personally.

Source: https://habr.com/ru/post/274183/

