In this article we will conduct a simple experiment utilizing this new technology and discuss its benefits. The experiment is organized in the following steps:
1. Create the memory-enabled database
--Creates a memory-enabled database by adding a memory-optimized filegroup with a specified directory.
--Note: This assumes that directory c:\tmp exists (the subdirectory is created by SQL Server).
CREATE DATABASE InMemDB;
GO
ALTER DATABASE InMemDB
ADD FILEGROUP [inMemOLTP_fg] CONTAINS MEMORY_OPTIMIZED_DATA;
GO
ALTER DATABASE InMemDB
ADD FILE ( name = [inMemOLTP_fg_dir],
filename = 'c:\tmp\InMemOLTPDir')
TO FILEGROUP [inMemOLTP_fg];
GO
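To verify that the memory-optimized filegroup is in place, a quick catalog query like the following (an optional check, not required for the experiment) will do:
--Optional check: list memory-optimized data filegroups (type 'FX')
SELECT name, type_desc
FROM InMemDB.sys.filegroups
WHERE type = 'FX';
GO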
2. Create the tables (disk-based and memory-optimized)
--Create sample tables representing a scenario involving 200000 Products
USE InMemDB;
GO
--Create traditional table (disk-based)
CREATE TABLE tblDiskBased(
id INT PRIMARY KEY,
code VARCHAR(50),
descr VARCHAR(250),
price FLOAT
);
GO
--Create a durable (data will be persisted) memory-optimized table
--Alternative choice: SCHEMA_ONLY = Data will be lost if the server turns off unexpectedly
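--Sizing note: BUCKET_COUNT=400000 is roughly 2x the 200K expected unique keys,
--in line with the general guideline of 1-2x the number of distinct key values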
CREATE TABLE tblMemOptimized(
id INT NOT NULL PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT=400000),
code VARCHAR(50),
descr VARCHAR(250),
price FLOAT
)WITH (MEMORY_OPTIMIZED=ON, DURABILITY = SCHEMA_AND_DATA);
GO
--Create a durable (data will be persisted) memory-optimized table
--Alternative choice: SCHEMA_ONLY = Data will be lost if the server turns off unexpectedly
--Note: This table will be used by the natively-compiled stored procedure
CREATE TABLE tblMemOptimized2(
id INT NOT NULL PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT=400000),
code VARCHAR(50),
descr VARCHAR(250),
price FLOAT
)WITH (MEMORY_OPTIMIZED=ON, DURABILITY = SCHEMA_AND_DATA);
GO
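As a quick sanity check, a catalog query along these lines (optional; any equivalent check works) confirms which tables are memory-optimized and durable:
--Optional check: confirm memory-optimized tables and their durability
SELECT name, is_memory_optimized, durability_desc
FROM sys.tables;
GO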
So far: we have created a memory-enabled database and the data structures. Now it's time to populate the tables with sample data.
3. Populate the tables with sample data
--Populate the tables with sample data (200K records)
--tblMemOptimized
DECLARE @i INT = 0;
WHILE (@i < 200000)
BEGIN
INSERT INTO tblMemOptimized
SELECT @i AS id,
'Code ' + CAST(@i AS varchar(6)) AS code,
'Description ' + CAST(@i AS varchar(6)) AS descr,
10 + (500 - 10) * RAND() AS price;
SET @i = @i + 1;
END
GO
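Side note: the WHILE loop inserts one row at a time, which is the slowest way to load 200K rows; it is used here for simplicity. A set-based load along the following lines (a sketch; run it instead of the loop, otherwise the primary key will reject the duplicate ids) produces the same data much faster:
--Alternative set-based load of the same 200K rows
--RAND(CHECKSUM(NEWID())) reseeds RAND per row so each price differs
INSERT INTO tblMemOptimized (id, code, descr, price)
SELECT TOP (200000)
n.id,
'Code ' + CAST(n.id AS varchar(6)),
'Description ' + CAST(n.id AS varchar(6)),
10 + (500 - 10) * RAND(CHECKSUM(NEWID()))
FROM (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1 AS id
FROM sys.all_objects a CROSS JOIN sys.all_objects b) AS n
ORDER BY n.id;
GO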
--Populate the other two tables with the same sample data (200K records)
--tblDiskBased and tblMemOptimized2 are filled by copying from tblMemOptimized
INSERT INTO tblDiskBased
SELECT * FROM tblMemOptimized;
GO
INSERT INTO tblMemOptimized2
SELECT * FROM tblMemOptimized;
GO
All three tables now contain the same data (200K rows each). You can easily verify this.
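For example, a quick row-count query (any equivalent check works):
--Confirm that all three tables contain 200K rows
SELECT 'tblDiskBased' AS table_name, COUNT(*) AS row_count FROM tblDiskBased
UNION ALL
SELECT 'tblMemOptimized', COUNT(*) FROM tblMemOptimized
UNION ALL
SELECT 'tblMemOptimized2', COUNT(*) FROM tblMemOptimized2;
GO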
4. Prepare the processing code for three scenarios (execution modes):
We will wrap each code block in a stored procedure for easier execution.
The scenario is to decrease each product's price by 5%.
a. Disk-based processing
--Disk-based processing
CREATE PROCEDURE updateDiskBased AS
BEGIN
UPDATE dbo.tblDiskBased
SET price = price - (price * 0.05);
END
GO
b. In-Memory optimization
--Memory-Optimized
CREATE PROCEDURE updateMemoryOptimized AS
BEGIN
UPDATE dbo.tblMemOptimized
SET price = price - (price * 0.05);
END
GO
c. In-Memory optimization with natively compiled stored procedure
--Natively-Compiled Stored Procedure
CREATE PROCEDURE updateMemoryOptimizedNatively
WITH NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS OWNER
AS
BEGIN ATOMIC WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'English')
UPDATE dbo.tblMemOptimized2
SET price = price - (price * 0.05);
END
GO
So far: we have created a memory-enabled database and the data structures, populated the tables with sample data (all three tables hold identical data, for the consistency of the experiment), and prepared the stored procedures that target each of the three tables and perform the processing, that is, the update of the 'price' column by decreasing it by 5%.
Now it's time to run each stored procedure and measure the execution times.
But before executing the stored procedures, let's check a sample (TOP 5) from each table:
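For example, a simple query like the following returns such a sample (the ORDER BY matters, since hash indexes do not return rows in key order):
--Sample the first 5 products from each table
SELECT TOP 5 * FROM tblDiskBased ORDER BY id;
SELECT TOP 5 * FROM tblMemOptimized ORDER BY id;
SELECT TOP 5 * FROM tblMemOptimized2 ORDER BY id;
GO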
Figure 1: Top 5 records from each table before the update.
5. Run all three execution modes and measure execution times
OK, now it's time to execute the three stored procedures, thus updating all three tables, and to measure the execution times.
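One simple way to capture the CPU and elapsed times (a minimal sketch; any measurement method, such as Extended Events, works just as well) is to enable time statistics around the calls:
--Measure CPU and elapsed time for each execution mode
SET STATISTICS TIME ON;
GO
EXEC updateDiskBased;
GO
EXEC updateMemoryOptimized;
GO
EXEC updateMemoryOptimizedNatively;
GO
SET STATISTICS TIME OFF;
GO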
Figure 2: Execution times for the three modes.
Additionally, let's check some records once more, just to confirm that the output is the same in all three tables:
Figure 3: Top 5 records from each table after the update.
6. Discussion
First of all, from Figure 3 we can see that the output of the three update operations is exactly the same, which validates the correctness of the three different execution modes.
As you can see from the screenshot with the execution times (Figure 2), the slowest execution was the one involving the disk-based table (elapsed time: 1924 ms), which also had the highest CPU time (328 ms).
The fastest execution was the one against the memory-optimized table with the natively compiled stored procedure, with an elapsed time of 329 ms and a CPU time of just 93 ms! This is expected: native compilation translates the procedure into machine code, so the update executes far fewer CPU instructions than interpreted T-SQL.
The memory-optimized execution, at 404 ms elapsed time and 250 ms CPU time, was faster than the disk-based one, but not as fast as the memory-optimized scenario combined with the natively compiled stored procedure.
In-Memory optimization in SQL Server is a great new technology which can significantly improve performance in many cases. Especially for certain workloads, such as staging processes in Data Warehouse systems, high data-insert rates (e.g. smart metering), and low-latency execution scenarios (e.g. investment trading), In-Memory optimization can boost performance. Of course, before applying it to production systems, a proper analysis is required, along with testing.
During the last six months I have been working on developing a simulator for In-Memory optimization in SQL Server. I expect to make it publicly available by the end of September 2014, so make sure you check my official website often for updates!