Author: Robin Dewson
Editor: Simon Robinson
Introduction
After several years of the prospect being talked about, SQL Server finally has the CLR runtime the included in its functionality. This gives you as a developer a massive boost in the development of systems in that you can now build .NET stored procedures. Although T-SQL remains the best method to work with data, for those times that you need non T-SQL functionality it is now very easy to achieve. In the past it was necessary to write Extended Stored Procedures that were in C++, hook them in to the database and pray that there were no memory leaks and they were well written. For a developer, and a database administrator, these worries can now be managed.
This inclusion of support for the .NET framework is in my mind the greatest advancement within SQL Server 2005. I don't think anything has excited database developers as much as this for a long time. This doesn't mean T-SQL is dead, not by a long shot. But having the ability to build objects that can use .NET technology, use .NET stored procedures, or .NET user-defined-types expands the horizons of development, functionality and performance considerably over previous versions of SQL Server.
The greatest challenge for Microsoft has been the inclusion of Microsoft's .NET Framework Version 2.0 into SQL Server 2005. It is a challenge that I am glad they have taken and I can only see huge advancements in development processes.
Benefits of .NET Framework Integration
Up to now the only programming language available within SQL Server was T-SQL. At best, you could extend that via extended stored procedures written in C++. T-SQL is very fast for working with data, especially sets of data but very limited in functionality. If it was necessary to do anything T-SQL could not cope with, you had to find a different solution. Typically, if you were writing the business logic code in, say, C# and you needed to perform some processing on data that was not available through T-SQL you may well have written a .NET assembly that would connect to SQL Server, run some T-SQL code to get the relevent data back to the C# program, process that data and then return the results. The performance implications of this kind of approach are no doubt obvious and bad.
With the inclusion of the .NET framework it is now possible to have your code sit very close to the data. .NET integration not only lets you write .NET programs, like stored procedures, but also allows assemblies to exist as triggers or user-defined aggregations that expand the limits of SQL functions like SUM, AVG, etc, or that supply user defined types, thus increasing the data types available from the choice offered by T-SQL (int, nvarchar, datetime, etc.).
I should stress that T-SQL is not in any way considered bad or deprecated. On the contrary, T-SQL is still the fastest method for processing set based data, but if you need that extra bit of functionality or even clarity by writing a .NET stored procedure instead of a cumbersome T-SQL one, then .NET integration will be the solution that you want.
How it Works
Unlike the case for T-SQL code, when building managed objects using .NET integration you don't use SQL Server Management Studio to write the code. What happens is that you write a managed DLL assembly and compile that as normal using the appropriate compiler (eg. C# or VB). You must implement certain interfaces and apply certain attributes so that the assembly and the classes contained in it can work with SQL Server. Once the assembly is created, it is loaded in to SQL Server via a SQL command, CREATE ASSEMBLY . More details of these can be found in a forthcoming Apress book Professional SQL Server 2005 Assemblies.
Security Implications
One worry you, and especially your database administrators, may have about switching on CLR support in SQL Server is whether there are any security concerns associated with allowing CLR code to run in SQL Server. By default CLR integration is switched off, and you may well find your database administrators reticent to switch it on, as they may have been bitten in the past with rogue C++ Extended Procedures, or perhaps because the code you will be submitting is not T-SQL they will be uncomfortable assessing whether the code is safe. CLR-based assemblies will when executing still have a system process id, ( spid ), and therefore the processes can be killed just like any other erroneous process. Another worry will be that code may not free up memory once resources are complete as occasionally happened with badly written C++ Extended Stored Procedures in prior releases of SQL Server. This shouldn't normally be a problem because .NET has, thanks to garbage collection, a very efficient process for ensuring memory is not wasted. Thirdly your administrators may be worried that CLR based assemblies can access resources outside of SQL Server and this will have security implications or could be used to access areas of the company infrastructure that could bring down SQL Server. Microsoft have thought of this and they have applied 3 levels of security to assemblies, SAFE , EXTERNAL_ACCESS and UNSAFE . These levels of security are defined when the assembly is created within SQL Server. A database administrator will then know immediately what the risk level is that they are taking on.
- SAFE - the assembly is not accessing any resources outside of SQL Server. Although it may be executing other .NET assemblies there will be no attempt to access system resources like the Event Log, file system files, or Internet Explorer
- EXTERNAL_ACCESS means that the assembly still has garbage collection and is running safe code but is accessing resources outside of SQL Server so there is a potential that your server is at risk. For example, you would have this setting if you needed to write to the Event Log. Your server could stall if the event log became full
- UNSAFE is the highest risk and you really will need to convince a DBA that the risk is justifiable. I see UNSAFE as comparable to running a C++ Extended Stored Procedure. You may be executing code that access .NET assemblies that run unmanaged code for example (which brings the risk of memory problems if the code is badly written).
SQL .NET Features
But you may wonder what you can do with this .NET integration. There are a number of types of objects that can be written and I'll briefly cover those now
.NET Stored Procedures
Stored procedures can be called from any T-SQL code in the same way as T-SQL based stored procedures. A stored procedure comprises a single unit of work. In the main you would use a .NET Stored Procedures for those times that T-SQL code does not have the necessary functionality required, for example some string processing, math or security function for example. A .NET stored procedure can access data back in SQL Server so it is still possible for it in turn to call a T-SQL stored procedure. You can even use a new ADO.NET connection to access some data that is provided from outside the server the .NET stored procedure has been called from. .NET stored procedures are most useful when you need to either perform row at a time processing or processing that requires functionality provided by existing .NET assemblies.
The code below is the basics of a .NET Stored Procedure assembly created by a DLL called Sproc1.dll.
namespace Apress.CallASproc
{
public partial class StoredProcedures
{
public static void RunASproc(string InputParamFromSQLServer)
{
SqlCommand sqlCmd = SqlContext.GetCommand();
sqlCmd.CommandText = "AnySproc";
sqlCmd.CommandType = CommandType.StoredProcedure;
sqlCmd.Parameters.Add
("@Param1", SqlDbType.NVarChar, 255).Value =
InputParamFromSQLServer;
SqlPipe pipe = SqlContext.GetPipe();
pipe.Execute(sqlCmd);
}
To then load this in to SQL Server and link it with running a SQL Server stored procedure then you would need the following code to run in a query pane
CREATE ASSEMBLY NETSproc AUTHORIZATION [dbo] FROM 'C:\Program Files\Microsoft SQL Server\Assemblies\Sproc1.dll' WITH PERMISSION_SET=SAFE CREATE PROCEDURE MySproc (@TheInputParm nvarchar(255)) AS EXTERNAL NAME Sproc1.[ Apress.CallASproc.StoredProcedures].RunASproc
Then when you wished to run the .NET stored procedure you would have in your code
MySproc "TheIngoingParm"
User Defined Data Types
Up until SQL Server 2005, user-defined data types (UDT's) had to be derived from base SQL Server data types like nvarchar , int , datetime . This is no longer the case. For example, it is possible to have a data type just for time, i.e. HH:MM without having the extra information of a "dormant" date as you have with the datetime data type. Or perhaps you wish a data type of a square on a chess board. In the past this might be CHAR(2) , one item for the row number and one item for the column letter. Or you may define this as two different columns. Now you could write a UDT where the first part would be a letter for the column and the second an integer for the row.
The following code snippet demonstrates how a UDT called ClientName would receive the information in to it when it is called from T-SQL Code. It would then split the information up in to first name and last name which could then be accessed separately via their own properties.
[SqlMethodAttribute(IsDeterministic = true)]
public static ClientName Parse(SqlString fullDetails)
{
if (fullDetails.IsNull)
{
return null;
}
else
{
ClientName FullName = new ClientName();
string FullNameCv = Convert.ToString(fullDetails);
string[] nameDets = FullNameCv.Split(new char[] { ' ' }, 2);
FullName.FirstName = nameDets [0];
FullName.LasttName = nameDets [1];
FullName.bIsNull = false;
return (FullName);
}
}
User Defined Functions
Unlike current T-SQL functions, .NET ones these can use .NET functionality to expand the possibilities of what is possible. A user defined function is a piece of code that can run against a single column for each row of data within your T-SQL. A stored procedure is a whole program. An example of a function would be one for "SalesTax". It would have passed in to it the total sale value and return from it the calculated local sales tax for that item.
Using .NET functionality it is now possible as an example, to use pre-written mathematical functions from the Math .NET assembly to provide smaller, more concise and easier to understand functions. Or perhaps you wish to perform some string or numerical manipulation. Again via a user-define function, or UDF this can be done.
The following code snippet shows an extremely simple function where we take in two parameters, SalesTotal and the tax rate and return the tax amount. Normally you would do this in T-SQL as it is so straight forward but the code shows the principle of how you define a UDF.
[Microsoft.SqlServer.Server.SqlFunction]
public static SqlMoney SalesTax(SqlMoney SalesTotal, SqlMoney TaxRate)
{
// Put your code here
return new SqlMoney(SalesTotal * (TaxRate / 100));
}
User Defined Aggregations
SQL Server has good, but limited aggregations. For many years, COUNT , AVG , MIN , MAX , etc have served developers and database administrators well. But have you ever wished that there were other aggregations? Science based applications especially lack aggregations that could be used to work out scientific results. User-defined aggregations, or UDAGGs, which work in the same fashion as existing aggregations are now possible.
The following is pseudo-code of an aggregation that calculates interest. I haven't shown the implementation of the interest calculation, but have shown the main method stubs that illustrate the kinds of things you need to implement if you are writing an aggregation - it's not a simple step as running an aggregation involves a number of steps. In particular, when the first row of data is found, the Init() method will be automatically called by SQL Server and when you reach the end of your set of data then the Terminate() method is auto called and returns the results of your aggregation. Every row in between will automatically call the Accumulate() method.
private float runningInterest = 0;
private SqlConnection sqlConn = SqlContext.GetConnection();
private EventLog evl = new EventLog("UDA", ".", "InterestCalc");
public void Init()
{
}
public void Accumulate(float Value, DateTime asAt)
{
//Here you work out your interest
}
public void Merge(Interest Group)
{
// Put your code here if you want to merge any data
}
public float Terminate()
{
// Put your code here
return runningInterest;
}
.NET Triggers
The final area where you can use managed code is .NET based triggers. These can either be Data Modification Language (DML) triggers or Data Definition Language, (DDL), triggers. If you write a .NET based DML trigger that is invoked when a row insertion, update or deletion takes place, you will still have access to the tables inserted and deleted as you do at present. However using triggers using T-SQL finding the column that has been updated requires a cumbersome set of code checking bit settings and the even more cumbersome process when you have a large number of columns. Using .NET it is possible through checking a collection which columns have been updated and is a much simpler process. DDL triggers, that is to say triggers that fire on an action like adding a stored procedure, dropping a table, etc, can be used for many different scenarios. My favorite scenario is one that would be used by database administrators. It would be possible to have a DDL trigger fire when a stored procedure is created. The DDL trigger could then make checks such as ensuring that the action is not happening during core hours, and if it was, then the action could be rolled back with an email being sent informing the database administrator that this was being attempted.
But you may now be thinking that this is the end of T-SQL code, or perhaps, when should I use T-SQL or .NET? Time to take look at this briefly now.
T-SQL or Managed code
This will be the question many people will be asking and want an instant answer to: When do you use T-SQL and when do you use managed code? Unfortunately there is no single universal answer to fits all situations. Obviously you will use managed code if T-SQL simply cannot do what you want to do. Also, .NET code in the main will run within the SQL Server process therefore it will be executing "close to the data", in otherwords, there is no network connection for commands and data to pass over. However, there will be times that ADO.NET will be invoked and this will imply a performance degradation on your .NET code as opposed to similar T-SQL code.
As a developer, that last thing you want to do though is have to write a solution twice, once in T-SQL and once in .NET to see which performs the better so that the best option is selected. The best rule of thumb I can give you is this. Set based data processing is best done within T-SQL. Complex logical processing or processing that requires .NET classes should or has to be completed within a .NET assembly. But where possible, try to keep to T-SQL for data processing as this is where SQL Server 2005 can tune queries.
Web Services
SQL Server Web Services are not new. A SQL Server Web service was very easy to write by simply having some XML output produced via SQLXML and having a Web service written in .NET or other code take and return the data. However within SQL Server 2005 it's even easier. Native XML Web Services are provided via SQL Server through either HTTP, SOAP or WSDL and replace the use of SQLXML which required IIS to be installed.
The HTTP option does require Service Pack 2 on XP to be installed, or for SQL Server to be running on a Win2k3 server.
Through the use of HTTP and SOAP, it is no longer necessary for IIS and MDAC to be present; these were mandatory technologies for Web Services built for SQL Server 2000in fact, the Windows operating system is also not required either.
The removal of the need of the components such as IIS and MDAC give SQL Server Web Services greater flexibility and allow technologies such as Java that may be running on a Unix box easier access to connect to SQL Server via these Web Services.
ADO.NET Notification Support
Virtually every application will have tables that remain relatively static. These could be tables that hold addresses based on post codes, product information tables, or a table holding a set of your suppliers. There may be times that the data will change, and you will want to pick up those changes, but these changes may be rare. In the past you would either read in the table at the start of your application and store the results and ignore any changes until the application restarted or have to read the data each time and take the performance hit. It was a case of deciding which scenario was best for you. Tables could be held within SQL Server cached memory so retrieval would still be fast but your query would still consume resources. Although in principle triggers exist in SQL Server to solve this kind of problem, up to now ADO.NET hasn't supplied the infrastructure to make implementing this kind of solution very easy. Now with ADO.NET 2.0 Notification support it is possible to send a command to SQL Server and to receive a notification if executing the same command again would give different results. The notification is sent through to your application via a Service Broker queue that can be polled.
Multiple Active Results Sets (MARS)
MARS are really a new feature of SQL Server itself and not directly related to .NET integration but I've chosen to discuss this topic here because it has the potential to hugely simplify ADO.NET coding for some problems. Until now, when you have sent a query to SQL Server asking for the results in a data reader, you have had to loop round in code checking whether all the resulting rows from your T-SQL query have been returned before being able to issue a second command. Or perhaps you wished to execute a second query for each row returned. This would require two connections to SQL Server, one connection for the outer loop and the second connection for the inner loop. This is demonstrated in the code below taken from an ASP.NET C# web page with a Table control on it.
string salesOrderNo = "43659,43661,43663";
string sqlServerConn = "Data Source=localhost;Integrated Security=SSPI;" +
"Initial Catalog=AdventureWorks";
SqlConnection sqlHeaderConn = new SqlConnection(sqlServerConn);
SqlConnection sqlPiecesConn = new SqlConnection(sqlServerConn);
SqlCommand headerComm = new SqlCommand("SELECT SalesOrderID,OrderDate," +
"SalesOrderNumber,SubTotal, TaxAmt, Freight,TotalDue" +
" FROM Sales.SalesOrderHeader" +
" WHERE SalesOrderId IN (" + salesOrderNo + ")",sqlHeaderConn);
SqlCommand detailComm = new SqlCommand("SELECT p.Name,OrderQty,UnitPrice," +
"LineTotal" +
" FROM Sales.SalesOrderDetail sod " +
" JOIN Production.Product p ON p.ProductID = sod.ProductID" +
" WHERE SalesOrderId = @sodId",sqlPiecesConn);
headerComm.CommandType = CommandType.Text;
detailComm.CommandType = CommandType.Text;
sqlHeaderConn.Open();
sqlPiecesConn.Open();
using (SqlDataReader salesHeader = headerComm.ExecuteReader())
{
while (salesHeader.Read())
{
string sodId = salesHeader["SalesOrderID"].ToString();
TableRow rowAdd = new TableRow();
tbSales.Rows.Add(rowAdd);
TableCell cellAdd1 = new TableCell();
cellAdd1.Text = salesHeader["SalesOrderID"].ToString();
rowAdd.Cells.Add(cellAdd1);
TableCell cellAdd2 = new TableCell();
cellAdd2.Text = salesHeader["OrderDate"].ToString();
rowAdd.Cells.Add(cellAdd2);
TableCell cellAdd3 = new TableCell();
cellAdd3.Text = salesHeader["SalesOrderNumber"].ToString();
rowAdd.Cells.Add(cellAdd3);
TableCell cellAdd4 = new TableCell();
cellAdd4.Text = salesHeader["SubTotal"].ToString();
rowAdd.Cells.Add(cellAdd4);
TableCell cellAdd5 = new TableCell();
cellAdd5.Text = salesHeader["TaxAmt"].ToString();
rowAdd.Cells.Add(cellAdd5);
TableCell cellAdd6 = new TableCell();
cellAdd6.Text = salesHeader["Freight"].ToString();
rowAdd.Cells.Add(cellAdd6);
TableCell cellAdd7 = new TableCell();
cellAdd7.Text = salesHeader["TotalDue"].ToString();
rowAdd.Cells.Add(cellAdd7);
detailComm.Parameters.AddWithValue("@sodId",Convert.ToInt32(sodId));
using (SqlDataReader salesDetail = detailComm.ExecuteReader())
{
while (salesDetail.Read())
{
TableRow rowAdd2 = new TableRow();
tbSales.Rows.Add(rowAdd2);
TableCell cellAdd8 = new TableCell();
cellAdd8.Text = salesDetail["Name"].ToString();
rowAdd2.Cells.Add(cellAdd8);
TableCell cellAdd9 = new TableCell();
cellAdd9.Text = salesDetail["OrderQty"].ToString();
rowAdd2.Cells.Add(cellAdd9);
TableCell cellAdd10 = new TableCell();
cellAdd10.Text = salesDetail["UnitPrice"].ToString();
rowAdd2.Cells.Add(cellAdd10);
TableCell cellAdd11 = new TableCell();
cellAdd11.Text = salesDetail["LineTotal"].ToString();
rowAdd2.Cells.Add(cellAdd11);
}
}
detailComm.Parameters.RemoveAt("@sodId");
}
}
With MARS the relevent code above can be changed to the following
string sqlServerConn = "Data Source=localhost;Integrated Security=SSPI;" +
"Initial Catalog=AdventureWorks";
SqlConnection oneConn = new SqlConnection(sqlServerConn);
SqlCommand headerComm = new SqlCommand("SELECT SalesOrderID,OrderDate," +
"SalesOrderNumber,SubTotal, TaxAmt, Freight,TotalDue" +
" FROM Sales.SalesOrderHeader" +
" WHERE SalesOrderId IN (" + salesOrderNo + ")", oneConn);
SqlCommand detailComm = new SqlCommand("SELECT p.Name,OrderQty,UnitPrice," +
"LineTotal" +
" FROM Sales.SalesOrderDetail sod " +
" JOIN Production.Product p ON p.ProductID = sod.ProductID" +
" WHERE SalesOrderId = @sodId", oneConn); headerComm.CommandType = CommandType.Text; detailComm.CommandType = CommandType.Text;
oneConn.Open();
If you had code similar to the first set of code, within an ASP.NET, or any application, you are immediately reducing the number of connections available for use by half. In the past one alternative solution would have been to use a stored procedure and complete cursor processing for each row. By bringing the data client side with two connections may have been a solution you employed to avoid using cursors on the server. With MARS now and the ability of .NET stored procedures, you may, instead of having the above MARS enabled code snippet within your ASP.NET application, you may find its now better to have it within a .NET stored procedure.
Its not just reading data that MARS can do, it is also possible to read and update data at the same time as well.
MARS is enabled via the connection string by default but it can be switched off by including the option MultipleActiveResultsSets=False in your connection string or using the StringConnectionBuilder class, similar to the code below.
string sqlServerConn = "Data Source=localhost;Integrated Security=SSPI;" +
"Initial Catalog=AdventureWorks";
SqlConnectionStringBuilder connBuilder = new
SqlConnectionStringBuilder(sqlServerConn);
connBuilder.MultipleActiveResultSets = false;
SqlConnection oneConn = new SqlConnection(connBuilder.ToString());
StringConnectionBuilder is a new class in ADO.NET 2.0 and provides a method to create and manage the contents of connection strings. Reasons why you might disable MARS include that they are not thread safe or you need true parallel execution of commands. There are also processing and memory overheads of having MARS enabled as up to 10 sessions could be cached in memory.
Incidentally you might wonder what happens if you try to execute code that uses MARS if MARS is disabled. Figure 1 illustrates this with the MARS code I presented earlier.

Figure 1. Attempting to use MARS when MARS is disabled
This figure has been reduced in size to fit in the text. To view the full image Click here
SQL Server Integration Services (SSIS)
Formerly known as DTS, this area of SQL Server has been rewritten from the ground up and renamed as SQL Server Integration Services (SSIS). An article such as this just cannot do SSIS any sort of justice and it really does need a whole book.
The first main point is that you build SSIS packages out of the Business Intelligence Development Studio and not the SQL Server Management Studio. SQL Server Management Studio though is the tool that would execute the packages.
To access SSIS you start up the Business Intelligence Development Studio from the Programs | Microsoft SQL Server menu options. Prior to SQL Server 2005, DTS had improved from a simple bulk copy program to something a bit more sophisticated. It was possible for example to link several operations together using a simple success of fail option. It was also possible to perform transformations of the data using VBA. However SSIS has moved on to the point where complex decisions via control of flow actions can determine how a package will proceed from one step to another. There are a great deal more pre-built transformation tasks and actions that can be performed as part of the package.
To build an SSIS solution, you still need to have data sources to have the data retrieval from. However the possible sources have changed and the data source destinations now are under their own toolbox items as well. For example to connect to a SQL Server instance you would now use a DataReader Source. There are also many more transformations that are possible each built around more defined and less generic transformations like previous versions. This will make your SSIS package less reliant on VBA code to complete transformations that can now be selected instead.
Instead of there just being one design area where every DTS object was placed, SSIS now consists of three design surfaces, one explorer tab, and one runtime surface. The first tab is the control flow design surface. This is used like the workflow items that you had prior to SQL Server 2005 and will show the order in which tasks are performed. This tab also shows when there are repeated workflow items to perform via the FOR loop and the FOREACH loop container. In this tab you will also see the more familiar tasks that you used in SQL Server 2000
The Data Flow design tab is used for data specific functionality, as you would have used for transformations, VBA code etc. It is also here you define where the resultant data will be loaded to
The third tab is the Event Handlers design tab is used to say what happens when a task succeeds or fails for example or actions on many different events. In the past you could only have a Completed or Failed route and then you had to execute a task on that event. With so many different possible events such as OnProgess, OnPreExecute, and so on, there are many more possibilities for recording and completing actions at times that will offer a developer or someone monitoring the system to know what is happening.
There is a fourth tab that is the Package Explorer tab. This is like the options within a DTS package that you had where you would define pre-created global variables like the package name, version, etc. It also shows in a tree view details about the package.
When the package runs in "design mode" then a progress tab is added. This details what is happening within the package at a specific time.
It is impossible to show an example of an SSIS package within this article, it just would take too long, but this section should give you a flavor of how much has altered.
Conclusion
There are a plethora of changes as you might expect as there has been 5 years since the last major release. SQL Server 2005 is not only a natural progression for a major database player but also a powerful upgrade. For me, the inclusion of the .NET Framework is the greatest leap in functionality and potential area of explosion of power that comes with this release. There are a large number of other improvements, re-writes and changes that have to be taken on board and I have just scratched the surface in this article. I think the biggest problem associated with inclusion of .NET that will occur will be the reluctance of database administrators to switch this facility on, until they learn how safe it is, and how it can help their lives as well as those of developers. .NET integration is where I feel as a SQL Server developer you now need to concentrate your learning.
But .NET is not the only area. The rewrite of Data Transformation Services, known as SSIS, to a vastly more flexible and powerful data import, export and manipulation tool is also something that will get many of you excited about. XML improvements including areas such as the xml data type and XQuery expand the horizons in the area. The embracing of SQL Server Notification Services as a full part of the product, and new technology like Service Broker all make this product more powerful, functionally rich and will make it harder to choose competitor products like Oracle. The next year or two will in the database arena be the most exciting and challenging times for a developer and database developer.
Related Links:
AUTHOR: Robin Dewson
TITLE: Forthcoming Apress book, Pro SQL Server 2005 Assemblies
URL: http://www.apress.com/book/bookDisplay.html?bID=10032
