Windows Azure Table Storage vs. Windows SQL Azure

时间:2022-03-15 16:43:44

http://www.intertech.com/Blog/post/Windows-Azure-Table-Storage-vs-Windows-SQL-Azure.aspx

 

By Jim White (Director of Training and Instructor)

 

Last week, my fellow Intertech colleague and Microsoft MVP, Tim Star, and I presented the Windows Azure Bootcamp
for the Twin Cities.  According to Microsoft reps, we broke the record
for the most attended bootcamp in the US with nearly a hundred people at
the event.

 

A common question that I received during and after
the bootcamp was "Why would I want to use Windows Azure Table Storage
versus Windows SQL Azure to store my application data?"  A good question
with the answer dependent on your application and data needs.

 

Table Storage and SQL Azure Defined

 

First,
allow me to backup and set the stage a little for this discussion.  SQL
Azure is essentially SQL Server in the Microsoft cloud computing
environment known as Azure.  That is not quite true in that SQL Azure
currently has a number of limitations and unsupported features that SQL
Server 2008 has (here is a starter list of limitations:  http://msdn.microsoft.com/en-us/library/ee336245.aspx). 
I like to say that SQL Azure is either SQL Server 2008 minus or SQL
Express plus depending on how you want to view it.  Table Storage is one
of three alternate storage mechanisms built into Azure that is
collectively called Windows Azure Storage Services.  The other two being
queues and blobs.  Table Storage allows you to store serialized
entities in a table, but the term table here  is not a
relational database table.  To provide people with some analogy they
can use to get their arms around Table Storage, I like to tell people to
think of Table Storage as a fancy spreadsheet.  You can store the state
of your entities in the columns of the spreadsheet.  However, there is
no linkage or relationship (therefore joins) between entities - at least
none that is automatically managed and maintained by Azure.  There are
no custom indexes - at least not today.

 

Interestingly, when
Azure was first introduced in 2008, SQL Server was not part of the
original picture.  Because of developer negative reactions, SQL Azure
was added to the next preliminary release in 2009.  There is a growing
faction that is trying to get the software community to look at SQL
alternatives.  The "No-SQL" community (see here and here),
to some extent, has influenced part of the Azure cloud computing
platform through the Table Storage option, but not enough to eliminate
it from the Microsoft cloud.

 

While both SQL Azure and Azure
Table Storage provide data persistence via table structure, there are a
number of differences between them.  The sections below outline some of
the key differences and factors you want to weigh before building an
application for Azure that requires some form of table persistence.

 

Scale and Performance

 

When
looking at sheer volume, Table Storage is today far more scalable than
SQL Azure.  Given a storage account (storage accounts hold blobs, queues
and tables) is allowed to be 100TB in size, in theory your table could
consume all 100TB.  At first glance, a 100TB chunk of data may seem
overwhelming.  However, Table Storage can be partitioned.  Each
partition of Table Storage can be moved to a separate server by the
Azure controller thereby reducing the load on any single server.  As
demand lessens, the partitions can be reconsolidated.  Reads of Azure
Table Storage are load balanced across three replicas to help
performance.

 

Entities in Table Storage are limited to 1MB each
with no more than 255 properties (3 of which are required partition key,
row key, and timestamp).  That seems like an absurd number, and it is,
but remember that there are no relationships and joins in Table
Storage.  Therefore, you might need some wide tables to handle
associated data.

 

Today, SQL Azure databases are limited to 1GB
or 10GB.  However, sometime this month (June 2010), a 50GB limit is
supposed to be available.  What happens if your database is larger than
10GB today (or 50GB tomorrow)?  Options include repartitioning your
database into multiple smaller databases or sharding (Microsoft's
generally recommended approach).  Without getting into the database
details of both of these database design patterns, both of these
approaches are not without issue and complexity, some of which must be
resolved at the application level.

 

Data Access

 

Data in
the cloud, be it in SQL Azure or Azure Table Storage, can be accessed
from in or out of the cloud.  To access data in SQL Azure, all the
standard tools and APIs apply that work with SQL Server.  Meaning, your
existing .NET/SQL Server knowledge and experience can be heavily
leveraged.  ADO.NET and ODBC APIs can be used by application code to
access the SQL Azure database.  Tools like SQL Server Management Studio
and Visual Studio can be pointed to the SQL Azure instance and
manipulate the schema and data just as you do today with SQL Server. 

 

Access
to Azure Table Storage is accomplished either via REST API or Storage
Client Library provided with the Windows Azure SDK.  Using the REST API
allows client applications to communicate and use data from Table
Storage without having detailed and specific knowledge of an Azure API,
but it is more complex and difficult to work with.  The Storage Client
Library (which leverages LINQ to Objects) provides a layer of
convenience but requires the application reference the Storage Client
Library APIs.  REST and the Storage Client Library incur a learning
curve that is typically not there when using SQL Azure.

 

Portability

 

As
mentioned, data in SQL Azure and Table Storage can be accessed from
applications in and out of the cloud.  That means applications can be
moved in or out of the cloud and still deal with the data in the cloud
in the same way.  However, one question that may need to be considered
is whether the data must always live in the cloud? 
Applications generally view data in SQL Azure similar enough to data in a
normal SQL Server database as to allow the data to migrate back and
forth between the cloud and on-premise databases.  In fact, there are
even migration tools to help move data between instances of SQL Server
and SQL Azure. 

 

However, given the unique nature and access
APIs of Table Storage, portability of the data is not as straight
forward.  Table Storage tightly couples your data to the cloud.  Moving
the data out of the cloud would require an on-premise data storage
alternative, a data migration strategy, and likely require application
code changes.

 

Transactions and Concurrency

 

SQL Azure
supports typical ACID transactions for work within the same database. 
Transactions across databases are not supported.  SQL Azure allows for
typical optimistic and pessimistic concurrency strategies.

 

Table
Storage supports transactions for entities in the same table and table
partition, but not across tables or partitions.  Additionally, only 100
operations or less (what is called a batch in Azure Table Storage) can
be part of the transaction.  Only one operation can be performed on each
entity in the batch, and the batch must be limited to 4MB.  Table
Storage abides strictly by an optimistic concurrent strategy.  If, on
commit of the transaction, data has been changed by another process the
whole transaction must be rolled back and retried.  Due to this single
concurrency strategy, a built-in retry option is provided with the
Storage Client Library.

 

Queries

 

Using Table Storage,
queries are limited to 1000 entities by default.  If more than 1000
entities are found, a continuation token is returned and must be used by
the application to retrieve the next set of entities.  Queries that
take longer than 5 seconds also return a continuation token.  Queries
that take longer than 30 seconds are cancelled.  Data in Table Storage
is organized by partition key and indexed by row key.  Because there are
no custom indexes in tables, queries by partition key and row key are
fast, but queries that do not use partition key and row key are slow.

 

Generally
speaking, SQL Azure has no limitations, issues or special programming
requirements to work with large queries.  Good database and index design
can help improve performance of queries; especially large ones.

 

Column types

 

Columns in Table Storage are limited to the types in the table below.

 
                                                                                                                                                                                                      
byte[]
bool
DateTime
double
Guid
Int32 or int
Int64 or long
String
 

Cost

 

Perhaps
the most unique aspect to designing and architecting applications for
the cloud is that it requires developers to think like businessmen. 
Each technical choice often has direct costs associated with it when
developing for the cloud.  The choice in data storage can have a huge
impact on the cost of running an application.

 

Azure Table
Storage costs 15? per GB of storage per  month.  Additionally, you pay 1
cent per 10,000 transactions with Table Storage.  SQL Azure costs are
$9.99 for 1 GB of storage per month ($99.99 for 10GB). 

 

See Microsoft's sight for more details and specifics on costs here.

 

Bottom Line

 

Chris Hay and Brian Prince in their forth coming book Azure in Action (published by Manning - see here)
provide a synopsis of the SQL Azure vs. Table Storage in a few
paragraphs.  "If size is the issue, that would be the first flag that
you might want to consider Azure Tables. As long as the support Tables
has for transactions and queries meets your needs. The size limit surely
will, at 100TB."  Further they suggest sophisticated transactions, or a
complex authorization model might require the services of SQL Azure. 
And as shown by the cost table above, "The final consideration is cost. I
can store a lot of data in Azure Tables for a lot less money than I can
in SQL Azure. SQL Azure is giving me a lot more features to use (joins,
relationships, etc.), but it does cost more."

 

Future

 

We
are given every indication by Microsoft that SQL Azure will have far
more capability in the future - akin to the SQL Server you might find in
your data centers today.   So some of the comparison above may be moot
or less important over time.  Additional functionality is also being
proposed to Table Storage as well.  For example, support of secondary
(non-key) indexes is already been suggested for a future release (see here). 
However, key architectural differences between SQL Azure and Table
Storage will remain and leave application designers having to pick the
best option for their systems.  Welcome to cloud computing.  There is a
lot of ROI to be had by running in the cloud, but only with proper
application design and architecture.

 

If Intertech can help you  negotiate the issues of cloud computing, please contact Ryan McCabe at ryan.mccabe@intertech.com.