Pro C# 2008 and the .NET 3.5 Platform: A Sample

Nov 20, 09:42 pm

Given that .NET defines two major categories of types, you may occasionally need to represent a variable of one category as a variable of the other category. To do so, C# provides a very simple mechanism, termed boxing, to convert a value type to a reference type. Assume that you have created a variable of type short:


// Make a short value type.
short s = 25;

If during the course of your application you wish to represent this value type as a reference type, you would box the value as follows:


// Box the value into an object reference.
object objShort = s;

Boxing can be formally defined as the process of converting a value type into a corresponding reference type by storing the value in a System.Object. The conversion is implicit: as the previous assignment shows, no cast is required. When you box a value, the CLR allocates a new object on the heap and copies the value type’s value (in this case, 25) into that instance. What is returned to you is a reference to the newly allocated object. Using this technique, .NET developers have no need for a separate set of wrapper classes just to temporarily treat stack data as heap-allocated objects.
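Because boxing copies the value into a new heap object, later changes to the original variable do not reach the box. The following is a minimal sketch of that behavior; the class and method names (BoxingCopyDemo, BoxedValueAfterChange) are made up for illustration and are not part of the original text.

```csharp
using System;

static class BoxingCopyDemo
{
    // Boxes a short, then changes the original variable and
    // returns the value still held in the box.
    public static short BoxedValueAfterChange()
    {
        short s = 25;
        object objShort = s;   // box: 25 is copied into a new heap object
        s = 99;                // only the stack-based variable changes
        return (short)objShort;
    }

    static void Main()
    {
        Console.WriteLine(BoxedValueAfterChange());   // prints 25

        // The box remembers the exact value type it was created from.
        object objShort = (short)25;
        Console.WriteLine(objShort.GetType());        // prints System.Int16
    }
}
```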

The opposite operation is also permitted through unboxing. Unboxing is the process of converting the value held in the object reference back into a corresponding value type on the stack. The unboxing operation begins by verifying that the receiving data type is equivalent to the boxed type, and if so, it copies the value back into a local stack-based variable. For example, the following unboxing operation works successfully, given that the underlying type of objShort is indeed a short:


// Unbox the reference back into a corresponding short.
short anotherShort = (short)objShort;

Again, it is mandatory that you unbox into an appropriate data type. Thus, the following unboxing logic throws an InvalidCastException:


// Illegal unboxing.
static void Main(string[] args)
{
  short s = 25;
  object objShort = s;

  try
  {
    // The type contained in the box is NOT an int, but a short!
    int i = (int)objShort;
  }
  catch(InvalidCastException e)
  {
    Console.WriteLine("OOPS!\n{0} ", e.ToString());
  }
}
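When the boxed type is not known in advance, one way to avoid the exception is to test the box with the is operator and unbox to the exact type before converting. The sketch below assumes a hypothetical helper, ToInt32OrDefault, that is not part of the original text; it accepts a boxed short or int and falls back to a caller-supplied value otherwise.

```csharp
using System;

static class SafeUnboxDemo
{
    // Hypothetical helper: returns the boxed value as an int if the box
    // holds a short or an int; otherwise returns the fallback.
    public static int ToInt32OrDefault(object boxed, int fallback)
    {
        if (boxed is short)
            return (short)boxed;   // unbox to the exact type, then widen to int
        if (boxed is int)
            return (int)boxed;
        return fallback;
    }

    static void Main()
    {
        short s = 25;
        object objShort = s;
        Console.WriteLine(ToInt32OrDefault(objShort, -1));   // prints 25
        Console.WriteLine(ToInt32OrDefault("text", -1));     // prints -1
    }
}
```

The key point is that (short)boxed unboxes to the type actually stored in the box; the widening from short to int happens afterward as an ordinary numeric conversion.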

At first glance, boxing/unboxing may seem like a rather uneventful language feature that is more academic than practical. In reality, the (un)boxing process is very helpful in that it allows us to assume everything can be treated as a System.Object, while the CLR takes care of the memory-related details on our behalf.

To see a practical use of this technique, assume you have created a System.Collections.ArrayList to hold numeric (stack-allocated) data. If you were to examine the members of ArrayList, you would find they are typically prototyped to receive and return System.Object types:


public class System.Collections.ArrayList : object,
  System.Collections.IList,
  System.Collections.ICollection,
  System.Collections.IEnumerable,
  ICloneable
{
...
  public virtual int Add(object value);
  public virtual void Insert(int index, object value);
  public virtual void Remove(object obj);
  public virtual object this[int index] {get; set; }
}

However, rather than forcing programmers to manually wrap the stack-based integer in a related object wrapper, the runtime will automatically do so via a boxing operation:


static void Main(string[] args)
{
  // Value types are automatically boxed when
  // passed to a member requesting an object.
  ArrayList myInts = new ArrayList();
  myInts.Add(10);
  Console.ReadLine();
}

If you wish to retrieve this value from the ArrayList object using the type indexer, you must unbox the heap-allocated object into a stack-allocated integer using a casting operation:


static void Main(string[] args)
{
...
  // Value is now unboxed.
  int i = (int)myInts[0];

  // Now it is reboxed, as WriteLine() requires object types!
  Console.WriteLine("Value of your int: {0}", i);
  Console.ReadLine();
}

When the C# compiler translates a boxing operation into CIL code, it emits the box opcode. Likewise, an unboxing operation is translated into the CIL unbox opcode. Here is the relevant CIL code for the previous Main() method (which can be viewed using ildasm.exe):


.method private hidebysig static void  Main(string[] args) cil managed
{
...
  box  [mscorlib]System.Int32
  callvirt  instance int32 [mscorlib]System.Collections.ArrayList::Add(object)
  pop
  ldstr  "Value of your int: {0}"
  ldloc.0
  ldc.i4.0
  callvirt  instance object [mscorlib]
    System.Collections.ArrayList::get_Item(int32)
  unbox  [mscorlib]System.Int32
  ldind.i4
  box  [mscorlib]System.Int32
  call  void [mscorlib]System.Console::WriteLine(string, object)
...
}

Note that the stack-allocated System.Int32 is boxed prior to the call to ArrayList.Add() in order to pass in the required System.Object. Also note that the System.Object is unboxed back into a System.Int32 once retrieved from the ArrayList using the type indexer (which maps to the hidden get_Item() method), only to be boxed again when it is passed to the Console.WriteLine() method, as this method is operating on System.Object types.

The Problem with (Un)Boxing Operations

Although boxing and unboxing are very convenient from a programmer’s point of view, this simplified approach to stack/heap memory transfer comes with the baggage of performance issues (in both speed of execution and code size) and a lack of type safety. To understand the performance issues, ponder the steps that must occur to box and unbox a simple integer:

  1. A new object must be allocated on the managed heap.
  2. The value of the stack-based data must be transferred into that memory location.
  3. When unboxed, the value stored on the heap-based object must be transferred back to the stack.
  4. The now unused object on the heap will (eventually) be garbage collected.

Although the current Main() method won’t cause a major bottleneck in terms of performance, you could certainly feel the impact if an ArrayList contained thousands of integers that are manipulated by your program on a somewhat regular basis.
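To make the allocation cost concrete, the sketch below adds the same integer to an ArrayList twice and checks whether the two stored elements are the same heap object. They are not: every Add() of a value type produces its own box. The class and method names here are made up for illustration.

```csharp
using System;
using System.Collections;

static class BoxAllocationDemo
{
    // Adds the same int to an ArrayList twice and reports whether the two
    // stored elements refer to the same heap object.
    public static bool SameBoxForEqualValues()
    {
        ArrayList myInts = new ArrayList();
        int value = 10;
        myInts.Add(value);   // first box allocated
        myInts.Add(value);   // second, separate box allocated
        return object.ReferenceEquals(myInts[0], myInts[1]);
    }

    static void Main()
    {
        Console.WriteLine(SameBoxForEqualValues());   // prints False
    }
}
```

An ArrayList holding thousands of integers therefore implies thousands of small heap allocations, each of which the garbage collector must eventually reclaim.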

Now consider the lack of type safety regarding unboxing operations. As previously explained, to unbox a value using the syntax of C#, you make use of the casting operator. However, the success or failure of a cast is not known until runtime. Therefore, if you attempt to unbox a value into the wrong data type, you receive an InvalidCastException:


static void Main(string[] args)
{
  ArrayList myInts = new ArrayList();
  myInts.Add(10);

  // Runtime exception! The box holds an int, not a short.
  short i = (short)myInts[0];

  // Now it is reboxed, as WriteLine() requires object types!
  Console.WriteLine("Value of your int: {0}", i);
  Console.ReadLine();
}

In an ideal world, the C# compiler would be able to resolve these illegal unboxing operations at compile time, rather than at runtime. On a related note, in a really ideal world, we could store sets of value types in a container that did not require boxing in the first place. Generics are the solution to each of these issues.
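As a brief preview, here is a minimal sketch of the same scenario using System.Collections.Generic.List<int>. The class name GenericListDemo and the Sum helper are hypothetical, chosen only for illustration. Elements are stored and retrieved as int, so no cast or boxing is involved when working with the list itself, and a mismatched assignment is rejected by the compiler rather than failing at runtime.

```csharp
using System;
using System.Collections.Generic;

static class GenericListDemo
{
    // Sums a List<int>; elements are read directly as int, with no casts.
    public static int Sum(List<int> values)
    {
        int total = 0;
        foreach (int v in values)
            total += v;
        return total;
    }

    static void Main()
    {
        List<int> myInts = new List<int>();
        myInts.Add(10);
        myInts.Add(20);

        int first = myInts[0];            // no cast, no unboxing
        // short bad = myInts[0];         // would be rejected at compile time
        // myInts.Add("text");            // would also be rejected at compile time

        Console.WriteLine(first);         // prints 10
        Console.WriteLine(Sum(myInts));   // prints 30
    }
}
```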


This excerpt was from Pro C# 2008 and the .NET 3.5 Platform, Fourth Edition by Andrew Troelsen.
