Microsoft introduced in-memory OLTP in SQL Server 2014, advertising that queries can be up to 100 times faster. In this blog series I will walk through the design principles of the in-memory OLTP engine and show why it has the potential to be significantly faster than the traditional on-disk engine.
When a request reaches the query engine, the traditional storage engine must first read the required data pages from disk if they are not yet in the buffer pool. Reading from disk is still one of the most time-consuming operations and therefore largely determines query execution time. Reading data from memory is much faster, but because memory used to be expensive, SQL Server loads data into the buffer pool only on demand. Memory is much cheaper nowadays, so the on-disk engine's core design goal of carefully rationing memory allocation is no longer as relevant. In-memory OLTP does not organize records into data pages at all; it reads and writes records directly in memory. This avoids the complexity of managing data pages in the buffer pool and skips physical reads entirely, which can make query execution much faster.
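As a preview of the syntax (the details will follow later in this series), a memory-optimized table is declared roughly as follows. This is only an illustrative sketch: the table and column names are made up for this example, and it assumes the database already contains a MEMORY_OPTIMIZED_DATA filegroup.

```sql
-- Sketch only: assumes a MEMORY_OPTIMIZED_DATA filegroup already exists in the database.
CREATE TABLE dbo.SalesOrderDetailInMemory
(
	SalesOrderID       INT NOT NULL
  , SalesOrderDetailID INT NOT NULL
  , OrderQty           SMALLINT NOT NULL
	-- A memory-optimized table always needs at least one index; here a hash index
  , CONSTRAINT PK_SalesOrderDetailInMemory
		PRIMARY KEY NONCLUSTERED HASH (SalesOrderID, SalesOrderDetailID)
		WITH (BUCKET_COUNT = 8388608)
)
WITH (
	MEMORY_OPTIMIZED = ON
	-- SCHEMA_AND_DATA makes the data durable across restarts;
	-- SCHEMA_ONLY would keep only the table definition
  , DURABILITY = SCHEMA_AND_DATA
);
```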
The following examples compare the on-disk and the in-memory OLTP engine with regard to IO reads and memory allocation.
For the following analysis, we need to create a clustered table with 5,000,000 records:
USE AdventureWorks2017;
GO

DROP TABLE IF EXISTS Sales.SalesOrderDetailMedium;
GO

-- Create a table with 5,000,000 records
SELECT TOP 5000000
	*
INTO Sales.SalesOrderDetailMedium
FROM
	Sales.SalesOrderDetailBig;
GO

-- Add a clustered key
-- Later we will see that an in-memory table always needs an index.
-- Since we are comparing the size of in-memory to an on-disk table, it is
-- appropriate to create a similar index.
CREATE CLUSTERED INDEX CL_SalesOrderDetailMedium_SalesOrderID_SalesOrderDetailID
ON Sales.SalesOrderDetailMedium (
									SalesOrderID
								  , SalesOrderDetailID
								);
GO
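To sanity-check the setup, sp_spaceused reports the row count and the reserved size of the new table (the exact numbers depend on your copy of the AdventureWorks sample data):

```sql
-- Report row count and allocated size of the on-disk table
EXEC sp_spaceused N'Sales.SalesOrderDetailMedium';
GO
```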

To ensure that the example is deterministic, we flush the buffer cache and enable STATISTICS IO and TIME to measure query execution.

-- Flush the buffer cache 
CHECKPOINT;
DBCC DROPCLEANBUFFERS;
GO
SET STATISTICS IO, TIME ON;
GO

Now let’s count the number of records in the table.

-- Costs: 25.8056
-- Index Scan (Estimated I/O Costs) : 23.652
-- Scan count 9, logical reads 32437, read-ahead reads 32082
-- CPU time = 2016 ms,  elapsed time = 504 ms.
SELECT COUNT(*)
FROM
	Sales.SalesOrderDetailMedium;
GO

[Screenshot: record count of the on-disk table]
The query is executed in parallel because its cost is higher than the Cost Threshold for Parallelism on my SQL Server instance. 97% of the total query cost is caused by the clustered index scan, because the storage engine reads 32,437 data pages from disk. The elapsed time is 504 ms, but SQL Server spent more than two seconds of CPU time because of the parallel plan.
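If you want to check why the plan goes parallel on your own instance, the relevant server settings can be inspected with sp_configure (these are documented configuration options; the values you see depend on your instance, and changing them is not implied here):

```sql
-- Advanced options must be visible before the parallelism settings can be queried
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
GO

-- Default cost threshold is 5; plans costing more are eligible for parallelism
EXEC sp_configure 'cost threshold for parallelism';
EXEC sp_configure 'max degree of parallelism';
GO
```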
It is important to understand that all pages SQL Server reads are cached in the buffer pool, which allocates memory.

-- Identify the objects in the buffer cache 
-- SalesOrderDetailMedium : 250MB
SELECT
	DB_NAME(DB_ID()) AS db_name
  , OBJECT_SCHEMA_NAME(obj.object_id, DB_ID()) AS schema_name
  , OBJECT_NAME(obj.object_id) AS object_name
  , page_type
  , COUNT(*) AS cached_pages
  , (COUNT(*) * 8.0) / 1024 AS cached_pages_in_mb
  , SUM(row_count) AS row_count
  , SUM(free_space_in_bytes) / 1024.0 / 1024 AS free_space_in_mb
  , SUM(read_microsec) AS read_microsec
FROM
	sys.dm_os_buffer_descriptors AS bd
INNER JOIN
	(
		SELECT
			object_id, index_id, allocation_unit_id
		FROM
			sys.allocation_units AS au
		INNER JOIN
			sys.partitions AS p
		ON
			au.container_id = p.hobt_id
			AND
				(
					au.type = 1
					OR au.type = 3
				)
		UNION ALL
		SELECT
			object_id, index_id, allocation_unit_id
		FROM
			sys.allocation_units AS au
		INNER JOIN
			sys.partitions AS p
		ON
			au.container_id = p.partition_id
			AND au.type = 2
	) AS obj
ON
	bd.allocation_unit_id = obj.allocation_unit_id
WHERE
	bd.database_id = DB_ID()
	AND OBJECT_NAME(obj.object_id) = 'SalesOrderDetailMedium'
GROUP BY
	OBJECT_SCHEMA_NAME(obj.object_id, DB_ID())
  , OBJECT_NAME(obj.object_id)
  , page_type
ORDER BY (COUNT(*) * 8.0) / 1024 DESC;
GO

[Screenshot: buffer pool allocation for the on-disk table]
If we execute the same query again, we will see that the query plan and the costs are the same, but this time the storage engine does not have to read the pages from disk; it reads them from the buffer pool instead. The estimated IO cost of 23.652 for the clustered index scan is based on the assumption that the data pages must be read from disk on every execution of this query. This assumption does not hold as long as the data pages are still available in the buffer pool. The query optimizer does not distinguish between physical and logical reads because it simply does not know whether the pages the query needs are in the buffer pool or only on disk. As long as the lazy