SQL Server 2012 and the new columnstore index

Posted: 10.6.2012 15.57.18 (EET/GMT+2)

SQL Server 2012 includes a new indexing feature called the columnstore index, or in more detail an xVelocity memory optimized columnstore index. Shortly put, columnstore indexing allows much faster data retrieval in especially larger datasets than traditional tables and indexes. It is aimed in data-warehousing situations.

The main change behind columnstore indexes is that it effectively turns data storage 90 degrees: instead of using the traditional row-based storage ("rowstore format") it uses column-based storage ("columnar data format"). To visualize, think a traditional customer table and how it is stored as rows:

1234	CustomerA	CA	$20,000
2345	CustomerB	CA	$30,000
3456	CustomerC	FL	$40,000

In the column format, this data might be stored as follows:

1234	2345	3456
CustomerA	CustomerB 	CustomerC 
CA	CA	FL
$20,000	$30,000	$40,000

This format can bring many benefits such as faster data retrieval, improved compression, lesser network traffic, and so on. Microsoft says that in the best cases, querying large datasets could see tenhold improvements in query performance, which is great indeed.

There are, however, some major limitations in the current implementation of columnstore indexed tables. For one thing, such tables cannot be updated without first removing (dropping) the index. Thus, columnstore indexed tables are effectively read-only. Also, columnstore indexes cannot be unique, and they are not effective when used in table joins.

I haven't yet had the chance to run real-world performance tests, but after studying this feature, it sounds really good. As it is currently the first version of SQL Server where this feature is supported, there are certain (major) limitations on its use, but I'm pretty sure in couple of years such limitations can be lifted.

For more information, see this article on MSDN.