Stored Procedures and SQL Server Compact – the Great Debate

Much has been debated about the need for stored procedures in SQL Server Compact.  We often hear customers say, “Without stored procedures, how can you call yourself a real database?”  followed by, “Is this just a temporary situation, or will SQLce add sprocs in a future release?”  The simple answer is that we don’t intend to support stored procedures the way our bigger brother, SQL Server, does.  Why?  Read on… 

If you really do want the exact features of the server on the client, Microsoft already has a product for that: SQL Server Express.  The thing to keep in mind is where you prioritize your needs on the client.  Are you looking for developer simplicity, so your developers can use the exact programming model of the server on the client?  Or do you need a consistent programming model across all clients, from device to desktop?  Do you prioritize the end user experience, with simple deployment and configuration and a small working set?  With SQL Server Compact and SQL Server Express, Microsoft delivers a choice for where you want your consistency. 

  • SQL Server Express – consistency with the server programming model is the priority, with the assumption that deployment and working set are less of a concern. 
  • SQL Server Compact – consistency across all Microsoft clients, with a focus on simplicity of deployment and a minimal overall working set.  A common subset of the server features is available, but full consistency isn’t the priority.

SQL Server delivers from the smallest device to the largest data center

Finding SQL Server Compact’s Center of Value

At the heart of the issue is that we target SQL Server Compact as the “compact” version of SQL Server.  SQL Server Compact (SQLce) uses a single-file, code free format, which lets the database behave like a document.  Every feature we consider is gated on how it impacts size, simplicity of deployment, the code free, doc centric model, and the ability to work across all Microsoft clients (from the smallest devices up to the desktop, and in some cases even the server).  SQLce is designed to support an individual’s set of data and become part of the application, and it can take advantage of the Windows document security system.  Because we scope the requirements this way, we have the freedom to do some very interesting features yet still remain compact, so developers never have to think twice about including SQLce within their application.  Some of those features include the updateable, scrollable, databindable SqlCeResultSet; others include the private deployment model and the doc centric model unique to SQLce.
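To make that doc centric, code free model concrete, here’s a minimal sketch (the file name and path are hypothetical) of an application creating and opening its own single-file SQLce database.  Nothing is installed or registered; the .sdf simply ships with the app:

using System;
using System.IO;
using System.Data.SqlServerCe;

class PrivateDeploymentSketch
{
    static void Main()
    {
        // Hypothetical file name; an SQLce database is a single .sdf file that
        // lives right next to the application binaries.
        string dbPath = Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "Sales.sdf");
        string connectionString = "Data Source=" + dbPath;

        // Create the database file on first run.
        if (!File.Exists(dbPath))
        {
            using (SqlCeEngine engine = new SqlCeEngine(connectionString))
            {
                engine.CreateDatabase();
            }
        }

        using (SqlCeConnection connection = new SqlCeConnection(connectionString))
        {
            // The engine loads in-proc on the first open; the .sdf travels
            // with the application like any other document.
            connection.Open();
        }
    }
}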

Simplicity of moving code from my central database to the client

The most popular reason developers ask for SQLce to support sprocs is so they can simply copy/paste their server data access layer code to the client, or replace their MSDE/Express install with SQLce.  It’s a fair ask, and on the surface it seems like a simple one that would deliver a huge advantage.  We all want to leverage the code we write, but how likely is it that your cached client has the same schema, or the same amount of data?  It may have only some of the rows, some of the columns, or may even de-normalize the database for client side simplicity.  On the server, we can assume nearly endless resources, from data to additional services.  The client is typically an emissary to the application.  It’s there to gather information and make the end user productive, but it’s typically not the “authority” for all changes.  For a related blogicle, see: Empowering your users with reference data and knowledge

If we supported sprocs, how much would you want?

Let’s say SQLce did support stored procedures.  Would you expect all the features of SQL Server to be supported?  SQL Server supports an extensive T-SQL syntax, including intrinsic functions, types, DML, and more.  If we supported everything, SQLce would lose its primary goal of being compact and easy to deploy.  If we support a subset, what’s the appropriate subset?  Is it just the set of query operations?  That seems like a logical place to draw the line.  If that’s the case, do you really need stored procedures, or do you just need a central place to put your queries?

The Value of a Sproc

Before we go further into why we’re not looking to add stored procedures, it helps to review the reasons developers and DBAs typically use them.  The table below captures some of the common requirements and whether they typically apply to a local store.

Requirement: Abstraction – By placing all the queries in sprocs, developers and DBAs can maintain a “public contract”, enabling changes to the underlying database while keeping the same query input parameters and returned results.
Applies to a local store: X
For multi-user databases, this is an important feature, as new applications that come online may require the underlying data model to change.  Sprocs are a great way to shield app2 from breaking app1.  On the client, do we need to abstract multiple applications from a single database?  Or are the application and database coupled together on the client?

Requirement: Security – Certain applications/users are granted rights to groupings of sprocs.  These sprocs return only the information pertinent to that user.  Columns within the database, such as salary, commissions, etc., may never be exposed to the general user.
Applies to a local store: X
Multi-user databases require multi-level security models.  Each query could come from a different user, so different security models may apply.  In a local store scenario, the database is assumed to be the user’s partition of data.  If the user doesn’t require salary information, it’s never brought to the client in the first place.  For columns that are required on the client for processing but not directly exposed to the user, can the application provide security abstractions through its user interface for how the data can be viewed and modified?

Requirement: Performance – By placing queries in sprocs, the database can “compile” them, locking in a query plan and optimizing them for the next execution.
There are some definite benefits to “locking in” your query.  The reality is that SQL Server has matured quite a bit and no longer gains the same benefit from caching stored procedures.  It’s not that stored procedures got slower; SQL Server got faster and smarter about building a plan.  The opposite can also be true: there are many situations where a sproc locked in a query plan when the source tables held a small amount of data, and as the tables grew, the locked plan was wrong, hurting performance.  While a local store doesn’t have nearly the same performance requirements, nobody likes a slow query, so we still want to benefit from a caching solution.

Requirement: Centralization of all T-SQL – DBAs like one place to go.  They don’t need to cull through lots of developer code to find an offending query.
Until the database can participate in compile time verification at all levels, it will continue to be helpful to have one place to view, edit, tweak and tune any queries that apply to the database.  It’s interesting to note that LINQ queries would likely follow this model as well.

Requirement: Ability to incorporate business rules
Many DBAs, and sometimes even developers, attempt to use sprocs as their business layer.  With the SQL CLR, one might assume SQL Server got even closer.  But combining the data layer and business layer can be one of the biggest bottlenecks to performance in an application.  Ironically, this is where SQLce can be easier to use. 

Does one size fit all?
While stored procedures may be overly complicated for the client, that doesn’t mean they shouldn’t be used on the server.  Regardless of how you protect your central database (sprocs, views, services), any changes from any client should always be re-validated before they’re applied to the center of truth in your company. 

Layers of Complexity Justified by their Benefit

When making a cake for a party of 20, a single small cake is impractical.  Instead, the cake is baked in layers, with icing in between each layer.  However, if all we want is an easy way to give our kids a treat in their lunch bag, the good old cupcake comes in handy.  Same batter, same icing, but I’m not sure it’s practical to bake five separate 2-inch cakes and layer them with icing. 

On the server we can justify the multitude of layers because of the added complexity of requirements.  Tables have views.  Views are exposed through sprocs.  Sprocs, views and tables have table and column level permissions.  On the server, we need to serve a multitude of applications, protect corporate assets, and scale to thousands of concurrent users.  On the client, we don’t have these complex requirements, and all these layers become difficult to manage.  If the client were an exact mirror copy of the server, it might not be that bad, but as we discussed, that’s just not the reality of most applications.  Five-layer cupcakes are interesting, but not very practical.

Mapping the Server Requirements to the Client
In the table above, we discussed a few of the features that do apply to the client, so how can we achieve them with SQLce?

Abstraction
Abstraction is essentially a way to encapsulate a set of SQL statements to isolate the underlying store from the programming model.  This can easily be achieved by using a common dll for data access to the database.  If we place the queries in a common Queries.dll, then two desktop applications and a device application can all share the same data access layer.  This looks a lot like what we’d do anyway, as it’s common practice to place all the ADO.NET commands in a common Data Access Layer (DAL) dll.  The benefit with SQLce is that the same Queries.dll can be shared across devices and desktops, because SQLce is consistently available across all Microsoft client environments.
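The examples that follow call a GetConnection() helper inside that shared dll; it isn’t shown here, so below is a minimal sketch of what it might look like (the class name and connection string are assumptions, not part of the original):

using System.Data.SqlServerCe;

namespace Sales.DataAccess
{
    // Hypothetical skeleton of the shared data access layer.  The same assembly
    // can be referenced from desktop and device projects because SQLce is
    // available on both.
    public partial class SalesDal
    {
        // Assumed connection string; the .sdf file ships with the application.
        private const string ConnectionString = "Data Source=Sales.sdf";

        // Each query method asks for a fresh, unopened connection and opens it
        // itself, mirroring the cmd.Connection.Open() calls in the examples below.
        private SqlCeConnection GetConnection()
        {
            return new SqlCeConnection(ConnectionString);
        }
    }
}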

What’s the difference?
Assuming we did support sprocs within SQLce, we’d likely write code similar to the following.  We’d have a method that takes some parameters for the query, and in this case the type of ResultSet to return.  Within the method we construct an ADO.NET command object and set the CommandText to the name of the stored procedure.  We’d then have to go to the database, create that stored procedure, and manage it separately.

public SqlCeResultSet GetCustomersByName(string customerName, ResultSetOptions resultSetOptions)
{
    // Hypothetical sproc version: the CommandText is simply the name of a
    // stored procedure that is created and managed separately in the database.
    SqlCeCommand cmd = new SqlCeCommand("CustomerGetByName", GetConnection());
    // Pass the customerName parameter
    cmd.Parameters.Add("@companyName", customerName);
    cmd.Connection.Open();
    return cmd.ExecuteResultSet(resultSetOptions);
}

Centralization of all T-SQL
One benefit of sprocs is that all their definitions live in a single location; there’s no need to search through all the code to find queries that must be tuned.  Building on our abstraction layer, we can store all our T-SQL statements in a common location using the Resource designer introduced in Visual Studio 2005.  You might consider using the Settings designer instead, but it isn’t available in device projects, so we’ll use the Resource designer, which is available across all project types.

I’ll add a resource file named SQLQueries to a project named Sales.DataAccess.  Within it, I can add an entry named CustomerSelectAll with a value that contains the following query:

SELECT CustomerID, CompanyName, ContactName,
       ContactTitle, Address, City, Region,
       PostalCode, Country, Phone, Fax
  FROM Customers

In the Queries dll, I can use the following code to return a SqlCeResultSet.  This is the same as the sproc example above, except that the CommandText is now the strongly typed resource rather than a sproc name.  We still have our queries as strings, but instead of placing them in the database, we place them in the project’s resources file.  By using the strongly typed reference (Sales.DataAccess.SQLQueries.CustomerSelectAll) in the code below, we get a compile time check on the resource name.  While we can’t validate the query text at build time, we can validate that all our named resources are valid.  If we ever wanted to know whether a query was still being used, we could simply rename it in the Resource designer, do a build, and find any broken references.  That’s not something we could do with sprocs, since the CommandText is simply the name of the sproc.  Sure, you could put the sproc name in a resource, but then we’re back to five-layer cupcakes. 

There are other advantages to using the resources model as well.  If we were to place the query directly in the code below, we’d have to wrap it in quotes, and we’d likely get hung up on formatting, with line wrappings and quotes all over the place.  Using the Resource designer, we can simply copy/paste the query directly into SQL Server Management Studio to test it, and run the showplan to understand how indexes are used.

public SqlCeResultSet GetAllCustomers(ResultSetOptions resultSetOptions)
{
    SqlCeCommand cmd = new SqlCeCommand(
        Sales.DataAccess.SQLQueries.CustomerSelectAll,
        GetConnection());
    cmd.Connection.Open();
    return cmd.ExecuteResultSet(resultSetOptions);
}

Of course, this could be expanded to support parameters as well.  Assuming I only wanted customers matching a given company name, within the Resource designer I add another resource named CustomerGetByName with the following query:

SELECT CustomerID, CompanyName, ContactName,
       ContactTitle, Address, City, Region,
       PostalCode, Country, Phone, Fax
  FROM Customers
 WHERE CompanyName LIKE @companyName

I then add a method to my data access dll similar to the following:

public SqlCeResultSet GetCustomersByName(string customerName, ResultSetOptions resultSetOptions)
{
    SqlCeCommand cmd = new SqlCeCommand(
                    Sales.DataAccess.SQLQueries.CustomerGetByName,
                    GetConnection());
    // Pass the customerName parameter
    cmd.Parameters.Add("@companyName", customerName);
    cmd.Connection.Open();
    return cmd.ExecuteResultSet(resultSetOptions);
}
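To show the shared dll in use, here’s a hypothetical caller (the SalesDal class and the LIKE pattern are assumptions carried over from the earlier sketch); the same call works in desktop and device projects:

using System;
using System.Data.SqlServerCe;

namespace Sales.DataAccess
{
    // Hypothetical consumer of the shared data access layer.
    public static class CustomerListing
    {
        public static void PrintCustomersStartingWithA()
        {
            SalesDal dal = new SalesDal();

            // Ask for a scrollable, updatable result set.
            using (SqlCeResultSet customers = dal.GetCustomersByName(
                "A%", ResultSetOptions.Scrollable | ResultSetOptions.Updatable))
            {
                while (customers.Read())
                {
                    // SqlCeResultSet derives from SqlCeDataReader, so the
                    // familiar reader API applies.
                    Console.WriteLine(
                        customers.GetString(customers.GetOrdinal("CompanyName")));
                }
            }
        }
    }
}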

Turkey or Turducken: Ability to Incorporate Business Rules

Once you have access to the full framework, why would you limit your business rules to T-SQL?  Sure, you could host the CLR within SQL Server, but then you hit some interesting scalability questions.  On the server, you’re communicating with an external service; with SQLce on the client, you’re hosting the database in-proc with your app.  Placing business logic within the database could be viewed as a turducken.  With the above model of using a common data access dll, you can place your business logic within your managed “sprocs” and still centrally manage the queries.  We’ll just call this the classic stuffed turkey.

public int CustomerInsert(string customerID, string companyName, string contactName,
                          string contactTitle, string address, string city, string region,
                          string postalCode, string country, string phone, string fax)
{
    DataValidationErrorCollection dataValidationErrors = new DataValidationErrorCollection();

    if (companyName.Trim().Length == 0)
        dataValidationErrors.Add(new DataValidationErrorItem("CompanyName",
            companyName,
            "A value for CompanyName must be provided"));
    if (contactName.Trim().Length == 0)
        dataValidationErrors.Add(new DataValidationErrorItem("ContactName",
            contactName,
            "A value for ContactName must be provided"));
    if (phone.Trim().Length < 10)
        dataValidationErrors.Add(new DataValidationErrorItem("Phone",
            phone,
            "Phone numbers must be at least 10 digits long, including the area code"));

    if (dataValidationErrors.Count > 0)
        throw new DataChangeException(
            "Validation Errors when attempting to create a Customer row",
            dataValidationErrors);

    using (SqlCeCommand cmd =
            new SqlCeCommand(Sales.DataAccess.SQLQueries.CustomerInsert,
                             GetConnection()))
    {
        cmd.Parameters.Add("@customerID", customerID);
        cmd.Parameters.Add("@companyName", companyName);
        cmd.Parameters.Add("@contactName", contactName);
        cmd.Parameters.Add("@contactTitle", contactTitle);
        cmd.Parameters.Add("@address", address);
        cmd.Parameters.Add("@city", city);
        cmd.Parameters.Add("@region", region);
        cmd.Parameters.Add("@postalCode", postalCode);
        cmd.Parameters.Add("@country", country);
        cmd.Parameters.Add("@phone", phone);
        cmd.Parameters.Add("@fax", fax);
        cmd.Connection.Open();
        return cmd.ExecuteNonQuery();
    }
}

Security
For a local database, you have data on the client.  There are some interesting ways to secure that data, but the best security model is to limit your exposure in the first place.  Since a local database typically holds caches of data from one or more remote sources, the first line of defense is to leverage sprocs on the server to download only the data relevant to the particular user.  With technologies like Sync Services for ADO.NET and the Sync Framework, you no longer need to worry about publications that must be replicated across all your clients.  You have a lot of flexibility to simply issue sync-aware queries that can be sprocs, views, functions, or any other SQL Server construct that returns data.
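As a rough sketch of that idea using Sync Services for ADO.NET’s SqlSyncAdapterBuilder (the table, column and parameter names below are illustrative assumptions, not from this post), the server-side sync adapter can filter what each user’s client ever receives:

using System.Data;
using System.Data.SqlClient;
using Microsoft.Synchronization.Data;
using Microsoft.Synchronization.Data.Server;

public static class SyncSetup
{
    // Build a server-side sync adapter that downloads only the current user's
    // rows, so the local .sdf never contains data the user shouldn't see.
    public static SyncAdapter BuildCustomersAdapter(SqlConnection serverConnection)
    {
        SqlSyncAdapterBuilder builder = new SqlSyncAdapterBuilder(serverConnection);
        builder.TableName = "Customers";
        builder.SyncDirection = SyncDirection.DownloadOnly;

        // Hypothetical filter: only rows assigned to this sales rep leave the server.
        builder.FilterClause = "SalesRepID = @salesRepId";
        builder.FilterParameters.Add(new SqlParameter("@salesRepId", SqlDbType.Int));

        return builder.ToSyncAdapter();
    }
}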

Performance
A client database has a different performance profile than a server.  On the client, a given application executes some number of queries per second.  On the server, that per-application load is multiplied by the number of concurrent users, so the client’s performance demands are a small subset of the server’s.

The other pattern to recognize is that the server requires a stateless environment in order to scale, while the client can benefit from maintaining open connections.  For SQLce, the first connection per database loads the engine, while SQL Server loads the engine at service startup.  On the client we still want to benefit from cached query plans; SQL Server Compact caches its query plans per SqlCeCommand object.  Since we’re not typically trying to share a database across multiple applications, this model works well, and the database doesn’t bloat with cached plans.  On SQL Server, a significant amount of memory can be consumed by cached plans and data.  That’s not a problem on the server, which is typically dedicated to these sorts of loads and amortized across many clients.  On the client, we need to work within a more constrained memory model: memory is shared by the entire application, not just the data layer, and the machine is also running Outlook, PowerPoint, Media Player/Zune, IE, and everything else on your computer.  So we still benefit from cached plans, just in a slightly different model.
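Since plans are cached per SqlCeCommand, a client app can hold on to its command objects and reuse them.  A minimal sketch of that pattern (the query text and class shape are assumptions) might look like this:

using System;
using System.Data;
using System.Data.SqlServerCe;

// Because SQLce caches the query plan per SqlCeCommand, keeping the command
// (and the in-proc connection) alive lets repeated lookups reuse the plan.
public class CustomerLookup : IDisposable
{
    private readonly SqlCeConnection connection;
    private readonly SqlCeCommand lookupCommand;

    public CustomerLookup(string connectionString)
    {
        connection = new SqlCeConnection(connectionString);
        connection.Open();

        // Hypothetical query; a resource entry could supply the text, as shown earlier.
        lookupCommand = new SqlCeCommand(
            "SELECT CompanyName FROM Customers WHERE CustomerID = @customerID",
            connection);
        lookupCommand.Parameters.Add("@customerID", SqlDbType.NChar);
        lookupCommand.Prepare();  // prepare once; repeated executions reuse the cached plan
    }

    public string GetCompanyName(string customerID)
    {
        // Only the parameter value changes between executions.
        lookupCommand.Parameters["@customerID"].Value = customerID;
        return (string)lookupCommand.ExecuteScalar();
    }

    public void Dispose()
    {
        lookupCommand.Dispose();
        connection.Dispose();
    }
}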

Summing it up
If we consider all the reasons developers and DBAs like sprocs, the majority either don’t apply to the client or are simply managed in a slightly different way.  There are lots of analogies here, ranging from motorcycles and cars to SUVs and tractor trailers.  You drive each of these with a base set of common concepts: motorcycles have a throttle, clutch, brake and gears.  You may have to learn the subtleties of how these apply to cars, trucks and motorcycles, but anyone who rides a motorcycle will tell you that trying to use a steering wheel would destroy the experience of the ride.  Once you adjust to a few things, riding a bike enables a freedom like no other vehicle.  Likewise, if I were trying to move my family from New York to Redmond, Washington, my motorcycle wouldn’t be appropriate. 

Now, if I still haven’t convinced you there are reasons to leverage both Compact and Express for their given tasks, then you can certainly use SQL Server Express as your local store.  But when we go out to dinner downtown, I’ll probably just meet you there, since I’ll assume you’ll be driving a truck and will have a hard time finding a parking spot.

Steve