Doc storage in SQLce and the 4gb limit

We’ve been getting a number of questions regarding Doc/blob storage and the current 4gb limit. 
For doc storage, I’m hearing two concerns. 

  • Perf when SQLce is used for blob storage.
  • Removing the 4gb limit so that you can store more docs in the database. 

On devices, space was confined, so these just weren’t mainline scenarios.  On desktop, we realize it will be.  For SQLce 3.1, we didn’t do a ton of work as the goal was to get the existing SQL Server Mobile product out and available for general desktop usage.  Most of the work went into MSI packaging, private deployment and of course, the several name changes the team had to deal with along the way. <g>

Sravanth, our Program Manager running our Perf lab for SQLce has done a lot of great perf work.  His team identified several opportunities to increase perf, such as bumping the .sdf file allocation up when more space is needed.  The team incorporated several of their findings into the 3.1 release, and will incorporate additional enhancements in the 3.5 release.  However, the team hasn’t yet done extensive testing of blob storage for desktop scenarios but will be soon.  They’re also working on a perf white paper based on their findings.  I’ll post some snippets as a preview to the doc soon. 
Regarding blob storage; I don’t want readers to think the team doesn’t consider these important scenarios for desktop usage.  Just being upfront that on devices this wasn’t the case so it may not be optimal with the current release.  For the desktop scenarios where SQLce has been used for products like Media Center PC, the data didn’t incorporate large amounts of storage, or blobs.  As Sravanth’s team has more perf data, we’ll post the findings, and let you know what we plan to do about it.  The obvious thing to start with is to keep blobs in separate tables.  Having a single table with an additional blob will certainly slow things down for any number of reasons.  Beyond that, …please stand by… and we’ll have real numbers soon.

For the 4gb limit, we recognize the need, but because the product was optimized around this large size for devices, it’s not a simple thing for the team to change.  On devices, 4gb is beyond the normal size scenario.  Anyone seen a Pocket PC device with more than 4gb of user storage?  I’m sure there are some, but certainly not the common.  The desktop, or laptop comparison would likely be a 1 terabyte 7200rpm drive.  Unlike Express Edition, SQLce isn’t meant to be the free database with the constrained set of features to get people started.  So, we will either remove this constraint, or put it in the range where it just won’t be an issue.  As for when, well, that’s a great question.  We have lots of requests.  It’s all a matter of prioritization.  If it were as simple as setting a bit somewhere, we’d run some perf tests to make sure it didn’t topple over and just change the bit.  Because the engine was optimized to assume a 4gb limit, it will just take a bit more work and the team just needs to prioritize this cost with several other features.

I’ll see if we can post the list of features we’re considering so you can post your comments on our priority list.  But, we’ve heard 64bit, XML, Eventing (something to fill the Trigger gap), and of course the sync runtime.  The 3.5 release is mostly focused on enabling the Sync Services (OCS) features I’ve been talking about here.  When talking to customers, as soon as they become comfortable with using SQLce as their local data store, most of the time, the very next question is how to synchronize it with their central server, and how to sync via services.  As part of the 3.5 release we’ll also have some additional features to round out the product, such as the TimeStamp datatype for better, and consistent concurrency checking. 

So, we hear you on the blob storage and 4gb limit.  What would help us is to get a feel for your specifics.

  • Do you need blobs in the same table?  Or can they be stored in a related table, 1:1, or 1:many?  Today, we don’t support column level synchronization in the new Sync Services.  Is storing objects in separate tables a reasonable solution?
  • What’s the range of file sizes do you think you’ll be storing on average?
  • How many objects do you think you’ll need to store on average?
  • Are these blobs mostly reference data originated from the server?  Created on the client and synchronized up?  Yes (ie both). 
  • Will these blobs have lots of changes locally before or after they are synchronized with the server?