Comparing Different Versions of the CLR Using Reflection (Part 1: Assemblies)

Mar 4, 07:42 pm

Introduction

This article is based on the .NET 2.0 Beta, but the principles should carry over to the released version.

Even though VS .NET 2005 Beta 2 is supposed to be feature-complete for version 2.0 of the Microsoft .NET Framework, it's not easy to find all the changes and additions to the namespaces and types in this version compared to the previously shipped versions. If you don't want to wait for your compiler to complain, you can read the list of breaking changes between versions 1.1 and 2.0 of the CLR, provided by Microsoft (see Related Links section), and try to figure out how they affect you. Even better, the source code of the tool used to generate this information is also available, under the name LibCheck. LibCheck focuses mainly on detecting breaking changes between two versions of the same assembly, not new types or compatible changes. Its other main limitation is that it cannot compare binary-incompatible versions of an assembly, which is exactly the situation between .NET 2.0 and 1.x.

I use a similar approach to finding the changes and additions, which I call the "brute force" approach: using reflection, I dump the contents of the assemblies found in the CLR folders of both versions 1.1 and 2.0 Beta 2, and then compare the dumps. The interesting implementation details of this dump-and-compare story are the core of this article trilogy. This first article focuses on the following topics:

  • Detecting which version of the CLR is installed
  • Locating where a version of the CLR is stored on the file system
  • Knowing which method of the reflection API to use in order to load the assembly you want

Note that, in several places throughout this article, I use the term .NET 2005 to mean VS .NET 2005 (Beta 1 or 2) and the new version of the CLR itself.

System Requirements

To run the code for this sample, you should have:

  • The .NET Framework versions 1.1 and 2.0 (Beta 1 or 2)
  • VS .NET 2003 and VS .NET 2005 Beta 2

Installing and Compiling the Sample Code

The C# sample download for this series contains several VS .NET solutions for C# and C++ projects. Note that the download for this article is identical to the downloads for the other two articles of the series.

  • LoadClrAssembly11: A console application test to load mscorlib.dll and system.dll compiled with CLR version 1.1.
  • LoadClrAssembly: A console application test to load mscorlib.dll and system.dll compiled with CLR v2.0 Beta 2.
  • LoadedAssemblies: The sample code for the version 1.1 and .NET 2005 Beta 1 and 2 executables, which allows you to see which assemblies are loaded. To compile it on your machine, open a VS command prompt and type csc /out:AssembliesXXX program.cs, where XXX is the version of the CLR you're using to compile.
  • CLRDump11: A tool used to dump assemblies compiled with CLR 1.1.
  • CLRDumpBeta1: A tool used to dump assemblies compiled with CLR 2.0 Beta 1.
  • CLRDump: A tool used to dump assemblies compiled with CLR 2.0 Beta 2.

Useful Tools for Listing the Contents of Assemblies

Before using the code presented in this article, you can already take advantage of some existing tools. Provided either by VS, the Microsoft .NET SDK, or generous developers, some valuable tools have already been written on top of the reflection API, each with its own set of features.

You can find WinCV.exe in the bin folder of the Microsoft .NET Framework SDK installation in VS .NET 2003, but it has been removed from this folder in VS .NET 2005 Beta 2. As Figure 1 shows, WinCV is useful for locating a type from the main assemblies of the CLR (more on the exact list later).

Figure 1. Locating a CLR type using WinCV

Once you select a type in the list on the left, the right panel is populated with its declaration in textual format, which is easily copied to the clipboard.

Figure 2. Easy-to-read type declarations with WinCV

Oftentimes I find it handy to get the declaration of an interface from which I need to derive, so I can know what I'm working with. It's worth noting that WinCV allows you to update the list of assemblies used during searches. In the WinCV.exe.config configuration file, each assembly is listed with an <assembly> entry containing its complete name, such as the following:

System.Windows.Forms, Version=1.0.5000.0, Culture=neutral, 
   PublicKeyToken=b77a5c561934e089

This is an easy way to enlarge the scope of your searches if need be. Moreover, combined with the /nostdlib+ switch, you can add two versions of the same assembly and immediately get a quick-and-dirty comparison. Unfortunately, WinCV doesn't show the assembly version in the left pane, so you have to select each type individually to read its assembly version in the right pane.

Also installed within the same folder is ILDASM, with its semi-documented /ADV command-line argument. ILDASM takes advantage of undocumented unmanaged interfaces such as IMDInternalImport. It lets you browse types from any assembly through a graphical, hierarchical representation of the namespaces, types, and their members, as shown in Figure 3.

Figure 3. Using ILDASM to show the members of a type

How do I know that undocumented interfaces are used under the covers of ILDASM? Well, thanks to the Internet, it's possible to download Rotor, the Microsoft multiplatform implementation of a CLR subset (see Related Links section for the link). If you take the time to browse the contents of the sscli\clr\src\ildasm folder, you can read the source code and see for yourself.

Lutz Roeder regularly updates Reflector, a graphical managed tool that's much more user-friendly than ILDASM (see the Related Links section for download). In addition to the standard assembly and CLR metadata browsing, you get C# or VB .NET syntax instead of IL code, resource visualization, and dependency exploration (see "Under the Hood," by Matt Pietrek, in the Related Links section). It's also possible to add your own extension to Reflector through a free SDK (again, see the Related Links section).

To finish my journey through existing tools, I shouldn't forget Anakrino, the graphical tool and ancestor of Reflector written by Jay Freeman. Its UI may not be as rich or as smooth as Reflector's, but Anakrino allows you to get C++-like formatting for a type's method implementation. Links to all of these great tools can be found in the Related Links section.

Even with all these tools on my desk, however, I still have some good reasons to build my own, as the final goal is to compare two versions of the same assembly, as opposed to merely browsing one version.

Loading the Right Assemblies

In order to dump the namespaces and types from a version of the CLR, it's mandatory to know where the assemblies are located. However, this isn't always an easy task. I first need to come back to the basics of reflection. Let's assume you ask a .NET-aware compiler to build an assembly containing your types. In addition to the MSIL-compiled code for the methods you write in VB .NET or C#, the compiler will also add metadata that describes the contents of the assembly precisely and entirely. From the types and their members, to the resources, everything is compiled in an efficient binary format. There's no need to build your own metadata decompiler because Microsoft has already written two APIs dedicated to browsing the metadata: an unmanaged and a managed reflection API. I'll focus on the latter in Part 2 of this series, and I'll describe how to use the former to decipher P/Invoke definitions in Part 3.

Where are the CLR Assemblies?

Before loading any assemblies, I'd like to talk about the location of the CLR assemblies. In the next section, you'll see why this is important when the time comes to load them through reflection. The first step is to find out where the different versions of the CLR are installed on the hard drive. One solution is to guess—you write a console application and dump all the loaded assemblies. Since mscorlib.dll is always needed, the corresponding Assembly instance should provide all the needed information:

using System;
using System.Reflection;

class EntryPoint
{
  // Lists every assembly loaded in the current AppDomain.
  static void DumpLoadedAssemblies()
  {
    Assembly[] assemblies = AppDomain.CurrentDomain.GetAssemblies();

    foreach(Assembly assembly in assemblies)
    {
      // Print the full path and flag the assemblies that come from the GAC.
      Console.WriteLine("   {0}{1}",
        assembly.Location,
        assembly.GlobalAssemblyCache ? " - GAC" : "");
    }
  }

  [STAThread]
  static void Main(string[] args)
  {
    DumpLoadedAssemblies();
  }
}

If you don't know the reflection API, remember that the code uses the AppDomain class first to get the current AppDomain and then to list the loaded assemblies through the GetAssemblies() method. From an Assembly reference, you have access to a great deal of information, such as the full path from which it has been loaded with the Location property and whether it comes from the GAC (Global Assembly Cache) with the GlobalAssemblyCache Boolean property.
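To give an idea of what else an Assembly reference exposes, here is a minimal sketch that prints the short name, the version, and the version of the CLR each loaded assembly was compiled against. The AssemblyInfoDump class and its Describe() helper are names made up for this illustration; they are not part of the sample download.

```csharp
using System;
using System.Reflection;

public static class AssemblyInfoDump
{
    // Hypothetical helper: builds a one-line description of an assembly
    // using only standard members of Assembly and AssemblyName.
    public static string Describe(Assembly assembly)
    {
        AssemblyName name = assembly.GetName();
        return string.Format("{0} {1} (runtime {2}) from {3}",
            name.Name,                      // short name, such as mscorlib
            name.Version,                   // assembly version
            assembly.ImageRuntimeVersion,   // CLR version stored in the metadata
            assembly.Location);             // full path of the loaded file
    }

    public static void Main()
    {
        foreach (Assembly assembly in AppDomain.CurrentDomain.GetAssemblies())
        {
            Console.WriteLine(Describe(assembly));
        }
    }
}
```

The ImageRuntimeVersion property is handy for this series because it tells you which CLR an assembly targets without relying on the folder it was loaded from.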

Compiling the previous code snippet with .NET 1.1 and .NET 2005 Beta 1/Beta 2 results in the following outputs:

.NET 1.1:

   c:\windows\microsoft.net\framework\v1.1.4322\mscorlib.dll
   ...\Assemblies11.exe

.NET 2005 Beta 1:

   C:\WINDOWS\Microsoft.NET\Framework\v2.0.40607\mscorlib.dll - GAC
   …\AssembliesBeta1.exe

.NET 2005 Beta 2:

   C:\WINDOWS\Microsoft.NET\Framework\v2.0.50215\mscorlib.dll - GAC
   …\AssembliesBeta2.exe

It seems that mscorlib.dll is stored in the GAC in .NET 2005, but not in version 1.1.

You can double-check it using gacutil.exe -l mscorlib in both environments.

.NET 1.1:

The Global Assembly Cache contains the following assemblies:
The cache of ngen files contains the following entries:
        mscorlib, Version=1.0.5000.0, Culture=neutral, 
   PublicKeyToken=b77a5c561934e089, 
   Custom=5a00410050002d004e0035002e0031002d0038004600
   53002d00300030003500340034003400310043000000
        mscorlib, Version=1.0.5000.0, Culture=neutral, 
   PublicKeyToken=b77a5c561934e089, 
   Custom=5a00410050002d004e0035002e0031002d0038004600
   440053002d00300030003900340032003500310034000000
Number of items = 2

In .NET 1.1, there's nothing in the GAC itself; only the ngen cache contains two precompiled versions of mscorlib. Let's check with ngen /show mscorlib.dll.

mscorlib, Version=1.0.5000.0, Culture=neutral, 
   PublicKeyToken=b77a5c561934e089 <domain neutral>
mscorlib, Version=1.0.5000.0, Culture=neutral, 
   PublicKeyToken=b77a5c561934e089 <debug> <domain neutral>

I have both release and debug versions of mscorlib on my machine, with the MSIL precompiled to x86 by ngen. It's not a big deal, but you should also note that mscorlib is domain neutral, which means that the CLR will make some optimizations when it's time to load it in an AppDomain.

.NET 2005 Beta 1:

The Global Assembly Cache contains the following assemblies:
  mscorlib, Version=2.0.3600.0, Culture=neutral, 
   PublicKeyToken=b77a5c561934e089, ProcessorArchitecture=x86
Number of items = 1

.NET 2005 Beta 2:

The Global Assembly Cache contains the following assemblies:
  mscorlib, Version=2.0.0.0, Culture=neutral, 
   PublicKeyToken=b77a5c561934e089, ProcessorArchitecture=x86
Number of items = 1

In .NET 2005, mscorlib is now stored in the GAC. As we move forward, you should see that the algorithm used to locate other assemblies of the CLR has also changed.

Let's go back to my first question: how can you know where the assemblies are stored? As you've seen, in all cases, mscorlib.dll is loaded from a subfolder of the Windows directory. Instead of hard-coding this path, you can rely on a registry key. How did I find out about this key? I used a free tool called RegMon, provided by Sysinternals (see Related Links for a download), which logs every access to the Windows registry. I compiled a C# console application that does nothing, and ran it after RegMon was started. The resulting log shows that the value of HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\.NETFramework\InstallRoot is fetched and, not surprisingly, it contains the same C:\WINDOWS\Microsoft.NET\Framework\ root folder under which mscorlib.dll is stored. If you need to know which versions are installed under this root, you can easily list them as subkeys of the Policy key, as shown in Figure 1.

Here's an example of how to get the root of the CLR subfolders:

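// Requires: using Microsoft.Win32;
// NET2005_BETA2 is a conditional compilation symbol to define when building
// with .NET 2005 Beta 2, for example with csc /define:NET2005_BETA2.
private string GetClrInstallRoot()
{
#if NET2005_BETA2
  // New in .NET 2005: read a registry value in a single call.
  string path =
    Registry.GetValue(
      @"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\.NETFramework",
      "InstallRoot",
      null
    ) as string;
#else
  string path = null;
  RegistryKey key =
    Registry.LocalMachine.OpenSubKey(
      @"SOFTWARE\Microsoft\.NETFramework",
      false
    );
  // Guard against a missing key before reading the value.
  if (key != null)
  {
    path = key.GetValue("InstallRoot") as string;
    key.Close();
  }
#endif
  return(path);
}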

The first branch of the previous code uses a new feature of the Registry class in .NET 2005: the static GetValue() method retrieves a registry value in a single call. It's not a killer feature, but it makes your source code easier to read.
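Once the install root is known, the installed versions can also be found without touching the registry. The following is only a sketch under one assumption taken from the outputs shown earlier: each version of the CLR lives in a subfolder whose name starts with "v" and which contains mscorlib.dll. The ClrFolderScanner class and its FindClrVersions() method are hypothetical names, not part of the sample download, and this scan is an alternative to reading the Policy key, not the same technique.

```csharp
using System;
using System.Collections;
using System.IO;

public static class ClrFolderScanner
{
    // Hypothetical helper: returns the names of the subfolders of installRoot
    // that look like a CLR version (name starts with "v" and the folder
    // contains mscorlib.dll). Returns an empty array if the root is missing.
    public static string[] FindClrVersions(string installRoot)
    {
        ArrayList versions = new ArrayList();
        if (Directory.Exists(installRoot))
        {
            foreach (string folder in Directory.GetDirectories(installRoot, "v*"))
            {
                if (File.Exists(Path.Combine(folder, "mscorlib.dll")))
                {
                    versions.Add(Path.GetFileName(folder));
                }
            }
        }
        versions.Sort();
        return (string[])versions.ToArray(typeof(string));
    }

    public static void Main(string[] args)
    {
        // The default root is the one seen in the outputs above; pass another
        // root as the first argument, such as the value of InstallRoot.
        string root = args.Length > 0
            ? args[0]
            : @"C:\WINDOWS\Microsoft.NET\Framework";

        foreach (string version in FindClrVersions(root))
        {
            Console.WriteLine(version);
        }
    }
}
```

Checking for mscorlib.dll filters out folders that carry a version-like name but do not hold a complete CLR.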

The Related Links section has links to several articles on exploring the GAC. In particular, you may want to take a look at "Undocumented Fusion," by John Renaud. You'll learn how to programmatically access the GAC through managed code. For the unmanaged side of the story, the Microsoft knowledge base article "Global Assembly Cache (GAC) APIs Are Not Documented in the .NET Framework Software Development Kit (SDK) Documentation" is also available. Last but not least, Rotor comes with the source code of GACUTIL in sscli/clr/src/tools/gac.

While working on dumping and comparing assemblies, I also found the RuntimeEnvironment class in the System.Runtime.InteropServices namespace of mscorlib. If you call its GetRuntimeDirectory() static method, you get the full path of the version of the CLR you're using, as a string with a \ at the end. Sometimes, spending more time on the documentation saves time in the long run.
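Here is a small sketch of how RuntimeEnvironment can replace the registry lookup when you only care about the CLR that runs your code. The RuntimeLocation class and its GetCurrentClrAssemblyPath() helper are hypothetical names used for illustration only.

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;

public static class RuntimeLocation
{
    // Hypothetical helper: full path of a file stored in the folder
    // of the CLR that is currently running this code.
    public static string GetCurrentClrAssemblyPath(string filename)
    {
        return Path.Combine(RuntimeEnvironment.GetRuntimeDirectory(), filename);
    }

    public static void Main()
    {
        Console.WriteLine("Runtime directory: {0}",
            RuntimeEnvironment.GetRuntimeDirectory());
        Console.WriteLine("Runtime version:   {0}",
            RuntimeEnvironment.GetSystemVersion());
        Console.WriteLine("mscorlib path:     {0}",
            GetCurrentClrAssemblyPath("mscorlib.dll"));
    }
}
```

Unlike the InstallRoot registry value, this only gives you the folder of the running version, so it cannot be used on its own to locate the folder of another version of the CLR.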

Several Ways to Load an Assembly

Using reflection, the Assembly class provides four methods to load an assembly: Load(), LoadWithPartialName(), LoadFrom(), and LoadFile(). I recommend reading Suzanne Cook's blog entries (listed in the Related Links section) for a detailed explanation of these arcane methods. In this article, I simply want to point out changes in the behavior of these methods in .NET 2005. First, LoadWithPartialName() is tagged as Obsolete and has been deprecated in favor of Load(). (If you want to know all the types and members decorated with the ObsoleteAttribute and discover further information, read Christophe Nasarre's blog, which is listed in the Related Links section.)
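As an illustration of the recommended replacement, here is a minimal sketch that calls Load() with a complete display name (name, version, culture, and public key token) instead of a short name. To stay independent of the versions installed on your machine, the display name is taken from the assembly that defines System.Object; the FullNameLoader class is a hypothetical name used only for this example.

```csharp
using System;
using System.Reflection;

public static class FullNameLoader
{
    // Hypothetical helper: loads an assembly from its complete display name,
    // which is what Load() expects, unlike LoadWithPartialName().
    public static Assembly LoadByFullName(string fullName)
    {
        return Assembly.Load(fullName);
    }

    public static void Main()
    {
        // Use the display name of the core library of the running CLR.
        string fullName = typeof(object).Assembly.FullName;
        Console.WriteLine("Requesting: {0}", fullName);

        Assembly assembly = LoadByFullName(fullName);
        Console.WriteLine("Loaded:     {0}", assembly.FullName);
        Console.WriteLine("Location:   {0}", assembly.Location);
    }
}
```

With a complete display name, the version you get is the one you asked for (subject to binding policy), whereas a partial name leaves the choice to the CLR.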

Second, the way the CLR locates an assembly before loading it from the file system has changed (see the Related Links section for related blog entries). When you provide a full path name as a parameter to LoadFrom() or LoadFile(), you're not guaranteed to get the Assembly instance that corresponds to this particular file. Let's run some code in order to demonstrate the differences.

// Requires: using System; using System.IO; using System.Reflection;
private void LoadClrAssembly(string version, string filename)
{
  Console.WriteLine(
    "\r\nDumping {0} {1}\r\n------------------------------------",
    filename, version);
  string path = GetClrAssemblyPathname(version, filename);
  Console.WriteLine("             {0}", path);
  try
  {
    // Load the same file in three different ways and report where each
    // Assembly instance really comes from.
    Assembly assembly = Assembly.LoadFile(path);
    Console.WriteLine("LoadFile --> {0}{1}", assembly.Location,
      assembly.GlobalAssemblyCache ? " - GAC" : "");

    assembly = Assembly.LoadFrom(path);
    Console.WriteLine("LoadFrom --> {0}{1}", assembly.Location,
      assembly.GlobalAssemblyCache ? " - GAC" : "");

    assembly =
      Assembly.LoadWithPartialName(Path.GetFileNameWithoutExtension(filename));
    Console.WriteLine("LoadWPN  --> {0}{1}", assembly.Location,
      assembly.GlobalAssemblyCache ? " - GAC" : "");
  }
  catch(BadImageFormatException x)
  {
    Console.WriteLine("Error loading {0}: {1}", x.FileName, x.Message);
  }
}

The different load methods are called to load mscorlib.dll and system.dll from the folders corresponding to the different versions of the CLR:

LoadClrAssembly("v1.0.3705", "mscorlib.dll");
LoadClrAssembly("v1.1.4322", "mscorlib.dll");
LoadClrAssembly("v2.0.50215", "mscorlib.dll");
LoadClrAssembly("v1.0.3705", "system.dll");
LoadClrAssembly("v1.1.4322", "system.dll");
LoadClrAssembly("v2.0.50215", "system.dll");

The full path name is computed based on the CLR version, using the file name of the assembly and the GetClrInstallRoot() helper method presented earlier:

private string GetClrAssemblyPathname(string version, string filename)
{
  string path = 
    Path.Combine(
      GetClrInstallRoot(),
      string.Format(@"{0}\{1}", version, filename)
    );
  return(path);
}

Here are the results.

.NET 1.1:

Dumping mscorlib.dll v1.0.3705
------------------------------------
             C:\WINDOWS\Microsoft.NET\Framework\v1.0.3705\mscorlib.dll
LoadFile --> c:\windows\microsoft.net\framework\v1.0.3705\mscorlib.dll
LoadFrom --> c:\windows\microsoft.net\framework\v1.0.3705\mscorlib.dll
Dumping mscorlib.dll v1.1.4322
------------------------------------
             C:\WINDOWS\Microsoft.NET\Framework\v1.1.4322\mscorlib.dll
LoadFile --> c:\windows\microsoft.net\framework\v1.1.4322\mscorlib.dll
LoadFrom --> c:\windows\microsoft.net\framework\v1.1.4322\mscorlib.dll
Dumping mscorlib.dll v2.0.50215
------------------------------------
             C:\WINDOWS\Microsoft.NET\Framework\v2.0.50215\mscorlib.dll
Version 2.0 is not a compatible version.

For versions 1.x of the CLR, mscorlib is loaded from a subdirectory of the Windows root folder, as expected. I didn't show the result for LoadWithPartialName(), but it provides the same results.

The binary file format used to store the managed metadata has changed in .NET 2005, and it's not possible to load a version 2.0 assembly using 1.1 reflection. Don't forget to catch BadImageFormatException in your version 1.1 code if there is any chance it will try to load a version 2.0 assembly. However, you shouldn't expect the exception's FileName property to return something helpful, as it's always null. As such, you have to write a version 2.0 program if you want to use managed reflection to dump the types declared in a version 2.0 assembly.
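The defensive pattern can be sketched as follows. Since I can't assume which versions of the CLR are installed on your machine, this example provokes the exception with a temporary file that isn't an assembly at all, as a stand-in for a version 2.0 assembly loaded by version 1.1 code. The SafeLoader class and its TryLoadFile() method are hypothetical names, and the message uses the path given by the caller because FileName can't be trusted.

```csharp
using System;
using System.IO;
using System.Reflection;

public static class SafeLoader
{
    // Hypothetical helper: returns the loaded assembly, or null when the
    // file is not an image that the running CLR is able to read.
    public static Assembly TryLoadFile(string path)
    {
        try
        {
            return Assembly.LoadFile(path);
        }
        catch (BadImageFormatException x)
        {
            // Report the path we were given instead of x.FileName.
            Console.WriteLine("Cannot load {0}: {1}", path, x.Message);
            return null;
        }
    }

    public static void Main()
    {
        // Stand-in for an incompatible assembly: a file with no valid image.
        string path = Path.GetTempFileName();
        try
        {
            using (StreamWriter writer = new StreamWriter(path))
            {
                writer.Write("this is not a managed assembly");
            }
            Assembly assembly = TryLoadFile(path);
            Console.WriteLine(assembly == null ? "Not loaded" : "Loaded");
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

Returning null keeps the dump loop going when one file can't be read, at the price of a null check in the caller.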

.NET 2005 Beta 2:

Dumping mscorlib.dll v1.0.3705
------------------------------------
             C:\WINDOWS\Microsoft.NET\Framework\v1.0.3705\mscorlib.dll
LoadFile --> C:\WINDOWS\Microsoft.NET\Framework\v2.0.50215\mscorlib.dll - GAC
LoadFrom --> C:\WINDOWS\Microsoft.NET\Framework\v2.0.50215\mscorlib.dll - GAC
Dumping mscorlib.dll v1.1.4322
------------------------------------
             C:\WINDOWS\Microsoft.NET\Framework\v1.1.4322\mscorlib.dll
LoadFile --> C:\WINDOWS\Microsoft.NET\Framework\v2.0.50215\mscorlib.dll - GAC
LoadFrom --> C:\WINDOWS\Microsoft.NET\Framework\v2.0.50215\mscorlib.dll - GAC
Dumping mscorlib.dll v2.0.50215
------------------------------------
             C:\WINDOWS\Microsoft.NET\Framework\v2.0.50215\mscorlib.dll
LoadFile --> C:\WINDOWS\Microsoft.NET\Framework\v2.0.50215\mscorlib.dll - GAC
LoadFrom --> C:\WINDOWS\Microsoft.NET\Framework\v2.0.50215\mscorlib.dll - GAC

Whatever I try, it always loads the version 2.0 mscorlib assembly. Perhaps mscorlib is a special assembly, so I'll try the same with system.dll to verify my results. With the 1.x versions, only LoadWithPartialName() fetches the assembly from the GAC:

Dumping system.dll v1.0.3705
------------------------------------
             C:\WINDOWS\Microsoft.NET\Framework\v1.0.3705\system.dll
LoadFile --> c:\windows\microsoft.net\framework\v1.0.3705\system.dll
LoadFrom --> c:\windows\microsoft.net\framework\v1.0.3705\system.dll
LoadWPN  --> c:\windows\assembly\gac\system\1.0.5000.0__b77a5c561934e089\system.dll - GAC

Dumping system.dll v1.1.4322
------------------------------------
             C:\WINDOWS\Microsoft.NET\Framework\v1.1.4322\system.dll
LoadFile --> c:\windows\microsoft.net\framework\v1.1.4322\system.dll
LoadFrom --> c:\windows\microsoft.net\framework\v1.1.4322\system.dll
LoadWPN  --> c:\windows\assembly\gac\system\1.0.5000.0__b77a5c561934e089\system.dll - GAC

But .NET 2005 behaves differently:

Dumping system.dll v1.0.3705
------------------------------------
             C:\WINDOWS\Microsoft.NET\Framework\v1.0.3705\system.dll
LoadFile --> C:\WINDOWS\assembly\GAC_MSIL\System\2.0.0.0__b77a5c561934e089\System.dll - GAC
LoadFrom --> C:\WINDOWS\assembly\GAC_MSIL\System\2.0.0.0__b77a5c561934e089\System.dll - GAC
LoadWPN  --> C:\WINDOWS\assembly\GAC_MSIL\System\2.0.0.0__b77a5c561934e089\System.dll - GAC
Dumping system.dll v1.1.4322
------------------------------------
             C:\WINDOWS\Microsoft.NET\Framework\v1.1.4322\system.dll
LoadFile --> C:\WINDOWS\assembly\GAC_MSIL\System\2.0.0.0__b77a5c561934e089\System.dll - GAC
LoadFrom --> C:\WINDOWS\assembly\GAC_MSIL\System\2.0.0.0__b77a5c561934e089\System.dll - GAC
LoadWPN  --> C:\WINDOWS\assembly\GAC_MSIL\System\2.0.0.0__b77a5c561934e089\System.dll - GAC
Dumping system.dll v2.0.50215
------------------------------------
             C:\WINDOWS\Microsoft.NET\Framework\v2.0.50215\system.dll
LoadFile --> C:\WINDOWS\assembly\GAC_MSIL\System\2.0.0.0__b77a5c561934e089\System.dll - GAC
LoadFrom --> C:\WINDOWS\assembly\GAC_MSIL\System\2.0.0.0__b77a5c561934e089\System.dll - GAC
LoadWPN  --> C:\WINDOWS\assembly\GAC_MSIL\System\2.0.0.0__b77a5c561934e089\System.dll - GAC

Oops! Even system.dll is loaded from the GAC, and it seems that the GAC has been extended in .NET 2005 with a lot of new subfolders, mostly to support 32- and 64-bit applications. If you're interested in this topic, see the Related Links section for more details.

Even though LoadWithPartialName() is deprecated, its behavior in version 1.1 is exactly what every load method now exhibits in Beta 2: the current version of the assembly with the given short name, such as mscorlib or system, is always loaded (see the LoadClrAssembly solution in the source code). Even the Load() method succeeds in loading mscorlib if you provide mscorlib as a parameter.
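Because the CLR may silently hand back another file than the one you asked for, a dumper should verify what it really got. The following sketch compares the requested path with the Location of the returned assembly. LoadVerifier and IsSameFile() are hypothetical names, and the case-insensitive comparison is an assumption that matches Windows file systems, as the outputs above mix upper and lower case for the same folders.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Reflection;

public static class LoadVerifier
{
    // Hypothetical helper: true when the loaded assembly comes from the
    // requested file. Paths are normalized and compared ignoring case.
    public static bool IsSameFile(string requestedPath, string loadedLocation)
    {
        if (loadedLocation == null || loadedLocation.Length == 0)
        {
            return false;   // no file on disk backs this assembly
        }
        return string.Compare(
            Path.GetFullPath(requestedPath),
            Path.GetFullPath(loadedLocation),
            true,
            CultureInfo.InvariantCulture) == 0;
    }

    public static void Main(string[] args)
    {
        // Pass the full path of an assembly, or default to the core library.
        string path = args.Length > 0 ? args[0] : typeof(object).Assembly.Location;

        Assembly assembly = Assembly.LoadFrom(path);
        Console.WriteLine("Requested: {0}", path);
        Console.WriteLine("Loaded:    {0}", assembly.Location);
        Console.WriteLine(IsSameFile(path, assembly.Location)
            ? "The requested file was loaded."
            : "Another copy of the assembly was loaded instead.");
    }
}
```

Running it against a system.dll stored in a version 1.1 folder from a version 2.0 process is a quick way to spot the redirection to the GAC described above.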

Conclusion

This first article of the series has focused on the location of the CLR assemblies and on how to load the ones whose contents you want to list. In version 1.1, it's possible to use LoadFile() or LoadFrom() to load any version 1.0/1.1 assembly, but not a version 2.0 assembly. In version 2.0, it's impossible to load a CLR assembly from anywhere other than the GAC, even if a full path name is given to LoadFile() or LoadFrom(). For more details on this, see Suzanne Cook's or Alan Shi's blogs (listed in the Related Links section). At this time, I haven't found a solution to load two assemblies of different versions of the CLR in the same process using managed reflection. I have to compile the same dumper application in version 1.1 and version 2.0 to be able to extract the metadata of the corresponding CLR assemblies. In the next two articles, I'll explain how to perform this extraction, and I'll also show you some specific code to handle generics when compiling specifically for .NET 2005.
