23 Jun New Clustered Columnstore Indexes in SQL 2014
New Clustered Columnstore Indexes in SQL 2014
In SQL Server 2012, Microsoft introduced the new memory optimized Columnstore Indexes (based on xVelocity). These group and store data for each column and then join these columns together to complete the index. The down side of this was that once the Columnstore index had been created, the underlying table became read-only. Essentially making this a great feature for a reporting system but not for read/write system.
The new storage engine in SQL Server 2014 overcomes this limitation. Known as a ‘Clustered Columnstore Index’ this allows us to utilize the highly performant memory optimized Columnstore indexes, but with the ability to write to the tables using the normal DML operations (INSERT, UPDATE and DELETE).
In the same fashion as a normal SQL clustered index, a clustered Columnstore index is used to define how the underlying data is stored on the physical disks. A table with a Columnstore index is first broken down into segments and organised into row groups. Each of these row groups can hold from 102,400 to 1,048,576 rows. Once this row group is identified it is broken into column segments, these are then compressed and inserted into the actual Columnstore.
If we look at how SQL handles smaller amounts of data (under 100,000 records), this data is staged in what is known as a Deltastore. Once the Deltastore reaches the maximum size this can then be drained and the data processed into a new row group.
The Deltastore in question is closed while this operation is taking place but the operation itself is not blocking for the whole table. Additional Deltastore will be generated for the table while the current Deltastore is drained and if the table itself is partitioned then each partition has its own set of Deltastores.
NOTE: Unlike previous version of Columnstore indexes, this new clustered version in SQL 2014 must include all columns on the table. This is because there is no other heap or clustered index to fall back on for the rest of the row. So this means that clustered Columnstore indexes cannot be combined with other types of indexes.