{"id":67662,"date":"2016-09-05T15:15:44","date_gmt":"2016-09-05T15:15:44","guid":{"rendered":"https:\/\/www.simple-talk.com\/?p=67662"},"modified":"2021-08-24T13:39:29","modified_gmt":"2021-08-24T13:39:29","slug":"memory-oltp-row-visibility-sql-servers-mvcc","status":"publish","type":"post","link":"https:\/\/www.red-gate.com\/simple-talk\/databases\/sql-server\/database-administration-sql-server\/memory-oltp-row-visibility-sql-servers-mvcc\/","title":{"rendered":"In-Memory OLTP: Row Visibility in SQL Server&#8217;s MVCC"},"content":{"rendered":"<h2>Introduction<\/h2>\n<p>In the multi-valued concurrency control (MVCC) model within the in-memory OLTP engine, a row going through updates may have multiple versions in memory simultaneously.\u00a0 Rows inserted and deleted in pre- and post-commit states are also present.\u00a0 Transactions running concurrently and serially may access different row versions at once, depending on rules, toward correct\u2014or sometimes incorrect\u2014outcomes.<\/p>\n<p>In this article I\u2019ll try to give you a solid understanding of what row versions a transaction may see under various conditions.\u00a0 We\u2019ll start with groundwork concepts and an access rule for in-memory rows.\u00a0 Then I\u2019ll expose a timing issue inherent in the model.\u00a0\u00a0 With this understanding, we can explore further concepts and constructs that fit together to make row visibility consistent in the MVCC.<\/p>\n<h2><b style=\"mso-bidi-font-weight: normal;\"> Row Visibility: Basics<\/b><\/h2>\n<p>In-memory rows are structured differently than those in disk-based tables.\u00a0 In lieu of the page structure and its need for logical locking and physical latching, and other differences, each row has a metadata header and payload:<\/p>\n<p><span style=\"mso-no-proof: yes;\"> <img loading=\"lazy\" decoding=\"async\" src=\" https:\/\/www.simple-talk.com\/wp-content\/uploads\/2016\/09\/clip_image002.png\" width=\"356\" height=\"106\" border=\"0\" \/><\/span><\/p>\n<p>The payload comprises the actual table columns and their values for the row.<\/p>\n<p>I\u2019ve left out most of the header fields to focus on the two directly involved in row visibility. Both Begin Timestamp (<b style=\"mso-bidi-font-weight: normal;\">Begin-T<\/b>s) and End Timestamp (<b style=\"mso-bidi-font-weight: normal;\">End-Ts<\/b>) are monotonically increasing database-wide numbers that serialize the time at which transactions that create and delete rows commit their work.\u00a0 I\u2019ll refer to row versions only by their transaction serial numbers: &lt;<b style=\"mso-bidi-font-weight: normal;\">Begin-T<\/b>s, <b style=\"mso-bidi-font-weight: normal;\">End-Ts<\/b>&gt;; the payload does not play a role in visibility.<\/p>\n<p>We\u2019ll soon see where these numbers come from, but for now let\u2019s try an example.\u00a0 I\u2019ll also use the terms <i style=\"mso-bidi-font-style: normal;\">row<\/i> and <i style=\"mso-bidi-font-style: normal;\">row version<\/i> interchangeably throughout.<\/p>\n<p style=\"margin: 0; margin-left: 30px;\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 &lt;10, <span style=\"'font-size: medium;\">\u221e<\/span>&gt;<\/p>\n<p style=\"margin: 0; margin-left: 30px;\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 &lt;31, 240&gt;<\/p>\n<p>The upper row was inserted by a transaction that committed at timestamp 10, and the infinity symbol for the end timestamp means that no transaction has deleted this row; this row version is currently in the table.\u00a0 The lower row was inserted by a transaction that committed at a later point in time\u2014at transaction serial time 31\u2014and was deleted by a transaction having commit timestamp 240.<\/p>\n<p>The payload in an in-memory row can never change.\u00a0 To update column values, the <b style=\"mso-bidi-font-weight: normal;\"> End-Ts<\/b> for the current row version (remember, <b style=\"mso-bidi-font-weight: normal;\">End-Ts<\/b> is <span style=\"'font-size: medium;\">\u221e<\/span>) is marked with the transaction\u2019s commit timestamp, and a new row with the updated payload columns is inserted with <b style=\"mso-bidi-font-weight: normal;\">Begin-T<\/b>s having the same commit timestamp and infinity for the <b style=\"mso-bidi-font-weight: normal;\">End-Ts<\/b>.\u00a0 These two operations are done in one atomic step.<\/p>\n<p style=\"margin: 0; margin-left: 30px;\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 &lt;31467, <span style=\"'font-size: medium;\">\u221e<\/span>&gt;<\/p>\n<p style=\"margin: 0; margin-left: 30px;\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 &lt;20000, 31467&gt;<\/p>\n<p style=\"margin: 0; margin-left: 30px;\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 &lt;10, 20000&gt;<\/p>\n<p>In this sample, if all three rows have the same primary key\u2014which, in in-memory tables, are immutable \u2014then the lowest row would be the row\u2019s insertion, and those above it resulted from two updates in separate transactions committing at timestamps 20000 and 31467 respectively. \u00a0If the keys are different, then the bottom row was inserted by an early transaction, then deleted by a later transaction, which also inserted the middle row, and the top row was inserted by a transaction with the latest commit timestamp, and this transaction also deleted the middle row.<\/p>\n<p>The samples imply that although the in-memory OLTP engine works with <b style=\"mso-bidi-font-weight: normal;\">INSERT<\/b>, <b style=\"mso-bidi-font-weight: normal;\">UPDATE<\/b>, and <b style=\"mso-bidi-font-weight: normal;\">DELETE<\/b> modification statements, they reduce to operations insert <b style=\"mso-bidi-font-weight: normal;\">&lt;C-Ts, <span style=\"'font-size: medium;\">\u221e<\/span>&gt;<\/b> and (logical) delete <b style=\"mso-bidi-font-weight: normal;\">&lt;original C-Ts, C-Ts&gt;<\/b>, where C-Ts is the commit timestamp of the transaction doing the work.<\/p>\n<p>Timestamps <b style=\"mso-bidi-font-weight: normal;\">Begin-T<\/b>s and <b style=\"mso-bidi-font-weight: normal;\">End-Ts<\/b> are gotten from a counter of type <b style=\"mso-bidi-font-weight: normal;\">bigint<\/b> known as the Global Transaction Timestamp (GTTs).\u00a0 Its value at any moment was the one given to the latest transaction that issued a commit.\u00a0 (For implicit and explicit transactions, the <b style=\"mso-bidi-font-weight: normal;\">COMMIT TRAN<\/b> statement is part of the code, and for single-statement autocommit and atomic block statements used with natively compiled procedures, it is written in automatically.)<\/p>\n<p>Here are its rules:<\/p>\n<ul>\n<li>The counter value is persisted through instance restarts to maintain transaction serial order and correct row version visibility.<\/li>\n<li>At execution start, the transaction is assigned the GTTs current value\u2014the logical read time.\u00a0 This determines its single point-in-time row version visibility.\u00a0 Any number of concurrent transactions can have the same read timestamp.<\/li>\n<li>At commit, the transaction is assigned the post-incremented GTTs value\u2014the commit timestamp\u2014making the number always one or more higher than any currently active transaction\u2019s logical read time.\u00a0 This value is written into Begin-Ts and End-Ts fields in row headers as was discussed.\u00a0 Commit timestamps are unique across all committing transactions.<\/li>\n<\/ul>\n<p>We now have enough information to understand row visibility.span style=&#8221;mso-spacerun:yes&#8221;&gt;\u00a0 The &lt;<b style=\"mso-bidi-font-weight: normal;\">Begin-T<\/b>s, <b style=\"mso-bidi-font-weight: normal;\">End-Ts<\/b>&gt; pair is the validity interval.\u00a0 A transaction may access a row version if its position in the transaction serial order is at least as great as that of the transaction that created the row, and less than the transaction that deleted it, if any; i.e. its order falls within the validity interval of the row:<\/p>\n<p style=\"margin: 0; margin-left: 30px;\"><b style=\"mso-bidi-font-weight: normal;\">Begin-T<\/b>s &lt;= logical read time &lt; <b style=\"mso-bidi-font-weight: normal;\">End-Ts<\/b>.\u00a0<\/p>\n<p>In the three-row sample above, validity intervals for all row versions are disjoint, so a transaction with logical read time of at least 10 could see exactly one of the rows.\u00a0 By contrast, given intervals <b style=\"mso-bidi-font-weight: normal;\">&lt;50, 80&gt;<\/b>, <b style=\"mso-bidi-font-weight: normal;\">&lt;60, <span style=\"'font-size: medium;\">\u221e<\/span>&gt;<\/b>, a transaction with logical read time from 60 to 79 could see both rows (and the versions can\u2019t share the same primary key).<\/p>\n<h3>A Timing Issue<\/h3>\n<p>These concepts are sufficient for row visibility\u2014but only in a static database.\u00a0 In-memory OLTP is meant to increase performance in (highly) transactional systems, though, so let\u2019s look at behavior of transactions running together.<\/p>\n<p>For the examples, rows<b style=\"mso-bidi-font-weight: normal;\"> x <\/b>and<b style=\"mso-bidi-font-weight: normal;\"> y <\/b>have these <b style=\"mso-bidi-font-weight: normal;\">begin<\/b> and <b style=\"mso-bidi-font-weight: normal;\">end<\/b> timestamps in their current versions before the transactions start:<\/p>\n<p style=\"margin: 0; margin-left: 30px;\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 X: &lt;10, <span style=\"'font-size: medium;\">\u221e<\/span>&gt;<\/p>\n<p style=\"margin: 0; margin-left: 30px;\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 Y: &lt;676, <span style=\"'font-size: medium;\">\u221e<\/span>&gt;<\/p>\n<p>I\u2019ll use a modified schedule representation to show interleaving of operations:<\/p>\n<p><span style=\"mso-no-proof: yes;\"> <img loading=\"lazy\" decoding=\"async\" src=\" https:\/\/www.simple-talk.com\/wp-content\/uploads\/2016\/09\/clip_image004.png\" width=\"280\" height=\"201\" border=\"0\" \/><\/span><\/p>\n<p>When transaction i&#8211;<b style=\"mso-bidi-font-weight: normal;\">txn-i<\/b> for short\u2014commits, say it gets timestamp 12345; these are the new row versions:<\/p>\n<p style=\"margin: 0; margin-left: 30px;\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 X: &lt;10, 12345&gt;<\/p>\n<p style=\"margin: 0; margin-left: 30px;\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 &#8212;&#8212;&#8212;&#8212;&#8212;&#8211;<\/p>\n<p style=\"margin: 0; margin-left: 30px;\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 Y: &lt;12345, <span style=\"'font-size: medium;\">\u221e<\/span>&gt;<\/p>\n<p style=\"margin: 0; margin-left: 30px;\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 \u00a0\u00a0\u00a0 &lt;676, 12345&gt;<\/p>\n<p>If no other transactions commit between the time <b style=\"mso-bidi-font-weight: normal;\">txn-i <\/b>commits and <b style=\"mso-bidi-font-weight: normal;\">txn-k <\/b>starts, the latter will get the logical read time 12345; else it will be greater.\u00a0 In either case, by the formula <b style=\"mso-bidi-font-weight: normal;\">txn-k <\/b>reads the newly inserted<b style=\"mso-bidi-font-weight: normal;\"> y <\/b>row version having interval <b style=\"mso-bidi-font-weight: normal;\">&lt;12345, <span style=\"'font-size: medium;\">\u221e<\/span>&gt;<\/b> and<b style=\"mso-bidi-font-weight: normal;\"> x <\/b>is not visible.<\/p>\n<p>Now let\u2019s interleave the transactions\u2019 operations:<\/p>\n<p><span style=\"mso-no-proof: yes;\"> <img loading=\"lazy\" decoding=\"async\" src=\" https:\/\/www.simple-talk.com\/wp-content\/uploads\/2016\/09\/clip_image006.png\" width=\"391\" height=\"180\" border=\"0\" \/><\/span><\/p>\n<p>If <b style=\"mso-bidi-font-weight: normal;\">txn-i<\/b> gets the same commit timestamp, then this time the logical read time for <b style=\"mso-bidi-font-weight: normal;\">txn-k<\/b> must be less than 12345.\u00a0 In this case, for the<b style=\"mso-bidi-font-weight: normal;\"> y <\/b>row it reads the deleted version, <b style=\"mso-bidi-font-weight: normal;\">&lt;676, 12345&gt;.<\/b>\u00a0 It had already read the<b style=\"mso-bidi-font-weight: normal;\"> x <\/b>version whose header values were <b style=\"mso-bidi-font-weight: normal;\">&lt;10, <span style=\"'font-size: medium;\">\u221e<\/span>&gt;.\u00a0 <\/b>If it were to re-read the row later, it would access the same version, now with interval <b style=\"mso-bidi-font-weight: normal;\">&lt;10, 12345&gt;.<\/b>\u00a0 Which is also correct:<\/p>\n<p style=\"margin-left: .5in;\">All rows [re-]read by a transaction remain consistent as of a single point in time\u2014at transaction start.\u00a0 Providing consistent read sets (snapshots) is the rationale for row versioning.<\/p>\n<p>Now imagine that in either schedule <b style=\"mso-bidi-font-weight: normal;\">txn-k<\/b> reads both<b style=\"mso-bidi-font-weight: normal;\"> x <\/b>and<b style=\"mso-bidi-font-weight: normal;\"> y <\/b>before <b style=\"mso-bidi-font-weight: normal;\">txn-i<\/b> commits; for example:<\/p>\n<p><span style=\"mso-no-proof: yes;\"> <img loading=\"lazy\" decoding=\"async\" src=\" https:\/\/www.simple-talk.com\/wp-content\/uploads\/2016\/09\/clip_image008.png\" width=\"319\" height=\"172\" border=\"0\" \/><\/span><\/p>\n<p>Because <b style=\"mso-bidi-font-weight: normal;\">txn-i<\/b> doesn\u2019t have a commit timestamp while it inserts and deletes rows, affected <b style=\"mso-bidi-font-weight: normal;\">Begin-T<\/b>s and <b style=\"mso-bidi-font-weight: normal;\">End-Ts<\/b> values are unknown when <b style=\"mso-bidi-font-weight: normal;\">txn-k<\/b> reads the rows, which it can do immediately in an optimistic, non-blocking system.\u00a0 The basic row visibility formula cannot be applied.<\/p>\n<p>Even the original schedules can have this problem.\u00a0 Though <b style=\"mso-bidi-font-weight: normal;\">txn-i<\/b> does have the commit timestamp, there is a lag time, as we\u2019ll see, before the in-memory OLTP engine can write commit timestamps to headers.\u00a0 In this case, if we imagine <b style=\"mso-bidi-font-weight: normal;\">txn-i<\/b> affecting thousands of rows, <b style=\"mso-bidi-font-weight: normal;\">txn-k<\/b> can read some headers with complete interval values and some having unknown values.\u00a0 To make it worse, <b style=\"mso-bidi-font-weight: normal;\">txn-i<\/b> can issue a rollback or be aborted in a post-commit phase.<\/p>\n<p>When validity intervals are incomplete, how does a reader know which schedule is in effect and what action to take?<\/p>\n<h3>Two Puzzle Pieces<\/h3>\n<p>When a transaction starts, it gets the timestamp value of the last transaction issuing a commit from the GTTs counter, which is its logical read time.\u00a0 It also gets, from the in-memory OLTP engine, a unique, serial transaction identifier.\u00a0 For a transaction <b style=\"mso-bidi-font-weight: normal;\">k<\/b>, I\u2019ll refer to it as <b style=\"mso-bidi-font-weight: normal;\">txn-k<\/b> ID.\u00a0\u00a0\u00a0 The identifier counter is of type <b style=\"mso-bidi-font-weight: normal;\">bigint<\/b>, incrementing for each new transaction involving in-memory structures.\u00a0 Unlike the GTTs, it isn\u2019t persisted between instance startups (because there is no need to identify non-existent transactions).<\/p>\n<p>Our second puzzle piece is the lifecycle of an MVCC transaction.\u00a0 At the 30,000 feet elevation, imagine it as having three phases:<\/p>\n<p><span style=\"mso-no-proof: yes;\"> <img loading=\"lazy\" decoding=\"async\" src=\" https:\/\/www.simple-talk.com\/wp-content\/uploads\/2016\/09\/clip_image010.png\" width=\"306\" height=\"122\" border=\"0\" \/><\/span><\/p>\n<h4>The Active Phase<\/h4>\n<p>The Active phase starts with the transaction, and ends when the COMMIT (or ROLLBACK) command is reached.<\/p>\n<p>As each insert and delete operation is performed during the Active phase, affected rows immediately receive the transaction\u2019s unique identifier in the appropriate header timestamp fields, along with a bit flag signaling that the value is an identifier not a timestamp.\u00a0 This is a significant placeholder as we\u2019ll see\u2014the unknown from the previous section\u2014that will later be replaced by the commit timestamp upon successful completion.<\/p>\n<p>For row<b style=\"mso-bidi-font-weight: normal;\"> x &lt;10, <span style=\"'font-size: medium;\">\u221e<\/span>&gt;<\/b> consider this schedule and timeline for reader <b style=\"mso-bidi-font-weight: normal;\">txn-k<\/b> and writer <b style=\"mso-bidi-font-weight: normal;\">txn-i<\/b>:<\/p>\n<p><span style=\"mso-no-proof: yes;\"> <img loading=\"lazy\" decoding=\"async\" src=\" https:\/\/www.simple-talk.com\/wp-content\/uploads\/2016\/09\/clip_image012.png\" width=\"361\" height=\"288\" border=\"0\" \/><\/span><\/p>\n<p>The timeline is meant to show <b style=\"mso-bidi-font-weight: normal;\">txn-k<\/b> starting before <b style=\"mso-bidi-font-weight: normal;\">txn-i<\/b>, at the same time, or later but before <b style=\"mso-bidi-font-weight: normal;\"> txn-i<\/b> reaches the COMMIT command.\u00a0 Here are the versions of<b style=\"mso-bidi-font-weight: normal;\"> x <\/b> after update:<\/p>\n<p style=\"margin: 0; margin-left: 30px;\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 &lt;<b style=\"mso-bidi-font-weight: normal;\">txn-i<\/b> ID, <span style=\"'font-size: medium;\">\u221e<\/span>&gt;<\/p>\n<p style=\"margin: 0; margin-left: 30px;\">\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 &lt;10, <b style=\"mso-bidi-font-weight: normal;\">txn-i<\/b> iD&gt;<\/p>\n<p>If <b style=\"mso-bidi-font-weight: normal;\">txn-i<\/b> successfully commits, then its commit timestamp must be greater than the logical read time for <b style=\"mso-bidi-font-weight: normal;\">txn-k<\/b>.\u00a0 Therefore, in each read, <b style=\"mso-bidi-font-weight: normal;\">txn-k<\/b> accesses the same original row version: first with interval <b style=\"mso-bidi-font-weight: normal;\">&lt;10, <span style=\"'font-size: medium;\">\u221e<\/span>&gt;,<\/b> then <b style=\"mso-bidi-font-weight: normal;\">&lt;10, txn-i ID&gt;<\/b>, then <b style=\"mso-bidi-font-weight: normal;\">&lt;10, C-Ts&gt;<\/b> if the commit timestamp replaces the transaction identifier in time, or <b style=\"mso-bidi-font-weight: normal;\">&lt;10, txn-i ID<\/b>&gt; again if not.\u00a0 Were <b style=\"mso-bidi-font-weight: normal;\">txn-i<\/b> to rollback or fail, internal code will reset the <b style=\"mso-bidi-font-weight: normal;\">End-Ts<\/b> to<b style=\"mso-bidi-font-weight: normal;\"> <span style=\"'font-size: medium;\">\u221e<\/span><\/b> in the original row.\u00a0 In either case, reader <b style=\"mso-bidi-font-weight: normal;\">txn-k<\/b> is unaffected by concurrently running writer <b style=\"mso-bidi-font-weight: normal;\">txn-i<\/b>.<\/p>\n<p>Let\u2019s add this rule to our basic visibility rule:<\/p>\n<p style=\"margin-left: .5in;\">For <b style=\"mso-bidi-font-weight: normal;\">txn-k<\/b> and <b style=\"mso-bidi-font-weight: normal;\">txn-i<\/b>: <b style=\"mso-bidi-font-weight: normal;\">txn-k<\/b> can read <b style=\"mso-bidi-font-weight: normal;\">&lt;C-Ts, txn-i ID&gt;<\/b> when C-Ts &lt;= <b style=\"mso-bidi-font-weight: normal;\">txn-k<\/b>\u2019s logical read time and their Active phases overlap.<\/p>\n<p>Then this must also be true:<\/p>\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 <b style=\"mso-bidi-font-weight: normal;\">txn-k<\/b> cannot read <b style=\"mso-bidi-font-weight: normal;\">&lt;txn-i ID, <span style=\"'font-size: medium;\">\u221e<\/span>&gt; <\/b>when their Active phases overlap.<\/p>\n<p>Logical read time for <b style=\"mso-bidi-font-weight: normal;\">txn-k<\/b> must be before the newly inserted row\u2019s validity interval when <b style=\"mso-bidi-font-weight: normal;\"> txn-i<\/b> commits.\u00a0 If the writer rolls back or fails for any reason, row <b style=\"mso-bidi-font-weight: normal;\"> &lt;txn-i ID, <span style=\"'font-size: medium;\">\u221e<\/span>&gt; <\/b>will be inaccessible\u2014marked for garbage collection.<\/p>\n<h4>The Validate Phase<\/h4>\n<p>Now let\u2019s consider row visibility when readers start during writers\u2019 <b style=\"mso-bidi-font-weight: normal;\">Validate<\/b> phase.\u00a0 We adjust the concurrency timeline thus:<\/p>\n<p><span style=\"mso-no-proof: yes;\"> <img loading=\"lazy\" decoding=\"async\" src=\" https:\/\/www.simple-talk.com\/wp-content\/uploads\/2016\/09\/clip_image014.png\" width=\"443\" height=\"156\" border=\"0\" \/><\/span><\/p>\n<p>Reader <b style=\"mso-bidi-font-weight: normal;\">txn-k<\/b> starts on or after <b style=\"mso-bidi-font-weight: normal;\">txn-i<\/b>\u2019s commit statement but before its final resolution point, and ends anytime afterward.\u00a0 Before we can examine visibility, though, we need a little understanding of the <b style=\"mso-bidi-font-weight: normal;\">Validate<\/b> phase.<\/p>\n<p>The in-memory OLTP engine performs these actions during validation: check for transaction isolation level violations; wait for \u2018commit dependencies\u2019 to clear\u2014which I\u2019ll explain in a moment; and write modifications to the log upon successful outcome (depending on table durability; not covered).\u00a0\u00a0 But any of these steps can fail, although rarely.\u00a0 In this case the engine rolls back the transaction and marks it ABORTED; else, it marks it COMMITTED.\u00a0 Therefore, when <b style=\"mso-bidi-font-weight: normal;\">txn-k<\/b> reads a row version inserted by <b style=\"mso-bidi-font-weight: normal;\">txn-i<\/b> before <b style=\"mso-bidi-font-weight: normal;\">txn-i<\/b> reaches the final committed\/aborted point, although the commit timestamp is available, the row\u2019s status is unknown and the reader takes a commit dependency on the writer.<\/p>\n<p>If the writer aborts, the reader has essentially done an unsuccessful dirty read.\u00a0 Row versions with validity interval <b style=\"mso-bidi-font-weight: normal;\">&lt;C-Ts, txn-i ID&gt;<\/b> and logical read time for <b style=\"mso-bidi-font-weight: normal;\"> txn-k &gt;= C-Ts<\/b> should have been visible but weren\u2019t because the engine will put back the infinity value where the transaction ID is.\u00a0 When the engine rolls back <b style=\"mso-bidi-font-weight: normal;\">txn-i<\/b> it must also roll back <b style=\"mso-bidi-font-weight: normal;\">txn-k<\/b>.\u00a0 It will also make rows with intervals<b style=\"mso-bidi-font-weight: normal;\"> &lt;txn-i ID, <span style=\"'font-size: medium;\">\u221e<\/span>&gt; <\/b>invisible to readers by marking them for garbage collection.<\/p>\n<p>Rows visible to <b style=\"mso-bidi-font-weight: normal;\"> txn-k<\/b> instead have intervals<b style=\"mso-bidi-font-weight: normal;\"> &lt;txn-i ID, <span style=\"'font-size: medium;\">\u221e<\/span>&gt;<\/b> because <b style=\"mso-bidi-font-weight: normal;\">txn-k<\/b>\u2019s logical read time is &gt;= <b style=\"mso-bidi-font-weight: normal;\">txn-i<\/b>\u2019s commit timestamp and the engine optimistically assumes <b style=\"mso-bidi-font-weight: normal;\">txn-i<\/b> will be committed.\u00a0 After <b style=\"mso-bidi-font-weight: normal;\">txn-i<\/b> is marked committed, the commit dependency that reader <b style=\"mso-bidi-font-weight: normal;\">txn-k<\/b> had on it is released.\u00a0 If <b style=\"mso-bidi-font-weight: normal;\">txn-k<\/b> is in its own <b style=\"mso-bidi-font-weight: normal;\">Validate<\/b> phase before <b style=\"mso-bidi-font-weight: normal;\">txn-i<\/b> resolves, it may wait, however briefly, for the commit dependency to clear.<\/p>\n<p>To make a rule of this:<\/p>\n<p style=\"margin-left: .5in;\">For <b style=\"mso-bidi-font-weight: normal;\">txn-k<\/b> and <b style=\"mso-bidi-font-weight: normal;\">txn-i<\/b>: <b style=\"mso-bidi-font-weight: normal;\">txn-k<\/b> can read &lt;<b style=\"mso-bidi-font-weight: normal;\">txn-i<\/b> ID, <span style=\"'font-size: medium;\">\u221e<\/span>&gt; with commit dependency when it starts during <b style=\"mso-bidi-font-weight: normal;\">txn-i<\/b>\u2019s Validate phase<\/p>\n<p>Certainly:<\/p>\n<p>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 <b style=\"mso-bidi-font-weight: normal;\">txn-k<\/b> cannot read &lt;C-Ts, <b style=\"mso-bidi-font-weight: normal;\">txn-i<\/b> ID&gt; when it starts during <b style=\"mso-bidi-font-weight: normal;\">txn-i<\/b>\u2019s Validate phase\u00a0\u00a0\u00a0<\/p>\n<p>Commit dependencies form chains of two or more transactions; think of it as directed acyclic graphs.\u00a0 The acyclic part means no deadlocking.\u00a0 There is no deadlocking in in-memory OLTP.\u00a0 Period.<\/p>\n<h4>The Resolve Phase<\/h4>\n<p>In the final phase, reader <b style=\"mso-bidi-font-weight: normal;\">txn-k<\/b> starts on or after <b style=\"mso-bidi-font-weight: normal;\"> Txn-i<\/b>\u2019s Validate phase finishes with the COMMITTED or ABORTED result, and ends anytime afterward.\u00a0<\/p>\n<p><span style=\"mso-no-proof: yes;\"> <img decoding=\"async\" src=\" https:\/\/www.simple-talk.com\/wp-content\/uploads\/2016\/09\/clip_image016-2.png\" border=\"0\" \/><\/span><\/p>\n<p>The way I\u2019ve structured the lifecycle of a transaction, resolution begins when the transaction is committed or aborted, and (if committed) it can start releasing commit dependencies held by readers.\u00a0 It may be more common to think of the release of commit dependencies as ending the validation phase; I\u2019ve placed it here to show that a reader transaction starting in my <b style=\"mso-bidi-font-weight: normal;\">Resolve<\/b> phase doesn\u2019t take out a commit dependency on the writer because the writer\u2019s outcome is known and was made permanent.\u00a0 Only the order of operations is important: first write to log, then mark transaction as committed, then release commit dependencies.<\/p>\n<p style=\"margin-left: .5in;\">For <b style=\"mso-bidi-font-weight: normal;\">txn-k<\/b> and <b style=\"mso-bidi-font-weight: normal;\">txn-i<\/b>: <b style=\"mso-bidi-font-weight: normal;\">txn-k<\/b> can read <b style=\"mso-bidi-font-weight: normal;\">&lt;txn-i ID, <span style=\"'font-size: medium;\">\u221e<\/span>&gt; <\/b>when it starts during <b style=\"mso-bidi-font-weight: normal;\">txn-i<\/b>\u2019s Resolve phase and <b style=\"mso-bidi-font-weight: normal;\"> txn-i<\/b>\u2019s state is committed<\/p>\n<p>Also:<\/p>\n<p style=\"margin-left: .5in;\"><b style=\"mso-bidi-font-weight: normal;\">txn-k<\/b> can read <b style=\"mso-bidi-font-weight: normal;\">&lt;C-Ts, txn-i ID&gt;<\/b> when it starts during <b style=\"mso-bidi-font-weight: normal;\">txn-i<\/b>\u2019s Resolve phase and <b style=\"mso-bidi-font-weight: normal;\">txn-k<\/b>\u2019s logical read time &gt;= C-Ts and <b style=\"mso-bidi-font-weight: normal;\">txn-i<\/b>\u2019s state is aborted<\/p>\n<p>After releasing commit dependencies, the engine replaces the transaction IDs in row header fields with commit timestamps and clears the transaction ID bit flags. \u00a0(You may know this final operation as being in the post-processing or commit processing phase.)<\/p>\n<h3>The Third Interlocking Puzzle Piece<\/h3>\n<p>This model is sufficient to ensure that transactions always read the right row versions, and when failure happens, that the in-memory OLTP engine can take corrective action.\u00a0 To fulfill the implementation, the third puzzle piece is no surprise: the engine must have metadata to track logical read times, commit timestamps and transaction states.\u00a0 The data is in a table known as the global transaction table (<b style=\"mso-bidi-font-weight: normal;\">GTTbl<\/b>), and we can access it through a dynamic management view (DMV), sys.dm_db_<b style=\"mso-bidi-font-weight: normal;\">xtp_transaction<\/b>s.\u00a0 Table 1 shows sample readout with most columns elided:<\/p>\n<p><span style=\"mso-no-proof: yes;\"> <img decoding=\"async\" class=\"\" src=\" https:\/\/www.simple-talk.com\/wp-content\/uploads\/2016\/09\/clip_image018-2.png\" border=\"0\" \/><\/span><\/p>\n<p>The column meanings are as follows:<\/p>\n<ul>\n<li><b style=\"mso-bidi-font-weight: normal;\"> xtp_transaction<\/b>.\u00a0 The unique, increasing serial number assigned to the transaction.\u00a0 I\u2019ve been referring to it in examples as <b style=\"mso-bidi-font-weight: normal;\">txn-i<\/b> ID.\u00a0 This value is also unique in the DMV.<\/li>\n<li><b style=\"mso-bidi-font-weight: normal;\">begin_tsn.<\/b>\u00a0 Begin transaction serial number.\u00a0 The logical read time of the transaction.<\/li>\n<li><b style=\"mso-bidi-font-weight: normal;\">end_tsn<\/b>.\u00a0 End transaction serial number.\u00a0 The commit timestamp.\u00a0 This is 0 until the COMMIT statement is reached and resets to 0 if the transaction aborts.\u00a0 Replaces the <b style=\"mso-bidi-font-weight: normal;\">xtp_transaction <\/b>value in <b style=\"mso-bidi-font-weight: normal;\"> Begin-Ts<\/b> and <b style=\"mso-bidi-font-weight: normal;\">End-Ts<\/b> fields during the Resolve phase.<\/li>\n<li><b style=\"mso-bidi-font-weight: normal;\">state_desc<\/b>.\u00a0 This can be ACTIVE, VALIDATING, COMMITTED, and ABORTED.<\/li>\n<li><b style=\"mso-bidi-font-weight: normal;\"> result_desc<\/b>.\u00a0 Some results are as shown above; see the books online documentation for more.<\/li>\n<\/ul>\n<p>When a transaction comes to a row that otherwise meets the constraint conditions of its query or modification statement and has a transaction ID in either of the <b style=\"mso-bidi-font-weight: normal;\">Begin-T<\/b>s or <b style=\"mso-bidi-font-weight: normal;\">End-Ts<\/b> header fields, it finds the row writer\u2019s transaction in the <b style=\"mso-bidi-font-weight: normal;\">GTTbl<\/b> by matching the transaction ID to the unique <b style=\"mso-bidi-font-weight: normal;\">xtp_transaction<\/b> column.\u00a0 It then compares this transaction\u2019s end_tsn to its own logical read time and takes into account the state column in deciding if the row is visible.\u00a0 The decision, of course, is based upon the rules we\u2019ve developed.<\/p>\n<p>For example, referring back to Schedule 4 and discussion, <b style=\"mso-bidi-font-weight: normal;\">txn-k<\/b>\u2019s first <b style=\"mso-bidi-font-weight: normal;\">read[x] <\/b>accesses row version <b style=\"mso-bidi-font-weight: normal;\"> &lt;10, <span style=\"'font-size: medium;\">\u221e<\/span>&gt;<\/b> without need of the <b style=\"mso-bidi-font-weight: normal;\">GTTbl<\/b>.\u00a0 After <b style=\"mso-bidi-font-weight: normal;\">txn-i<\/b> updates x, the second <b style=\"mso-bidi-font-weight: normal;\">read[x]<\/b> finds versions <b style=\"mso-bidi-font-weight: normal;\">&lt;10, txn-i iD&gt;<\/b> and<b style=\"mso-bidi-font-weight: normal;\"> &lt;txn-i ID, <span style=\"'font-size: medium;\">\u221e<\/span>&gt;<\/b> and so matches the ID to the corresponding <b style=\"mso-bidi-font-weight: normal;\">xtp_transaction<\/b> column in the <b style=\"mso-bidi-font-weight: normal;\">GTTbl<\/b> to get <b style=\"mso-bidi-font-weight: normal;\">txn-i<\/b>\u2019s row.\u00a0 Because the <b style=\"mso-bidi-font-weight: normal;\">state_desc<\/b> column shows ACTIVE, <b style=\"mso-bidi-font-weight: normal;\">read[x]<\/b> yields the (same) <b style=\"mso-bidi-font-weight: normal;\"> &lt;10, txn-i iD&gt;<\/b> version. When <b style=\"mso-bidi-font-weight: normal;\">txn-i<\/b> commits and <b style=\"mso-bidi-font-weight: normal;\">txn-k<\/b> does the third read, there is a timing issue: either of the row interval IDs may not yet have been replaced with the C-Ts.\u00a0 In this case, <b style=\"mso-bidi-font-weight: normal;\">txn-i<\/b>\u2019s<b style=\"mso-bidi-font-weight: normal;\"> state_desc<\/b> is <b style=\"mso-bidi-font-weight: normal;\">VALIDATING<\/b> or <b style=\"mso-bidi-font-weight: normal;\">COMMITTED<\/b> (or <b style=\"mso-bidi-font-weight: normal;\">ABORTED<\/b>) and so the engine can use the <b style=\"mso-bidi-font-weight: normal;\">end_tsn<\/b> to pick the (same) visible row version.<\/p>\n<h3>Final Word<\/h3>\n<p>We\u2019ve seen how begin and end timestamps in row metadata define the row\u2019s validity interval and allow for row versioning.\u00a0 This along with transaction logical read times define visibility.\u00a0 However, by examining schedules we saw that validity intervals can be incomplete.\u00a0 The solution I used to deduce visibility centered on a three-part model of transaction processing.\u00a0 Finally, we saw a DMV that showed how SQL Server tracks transaction metadata to make the MVCC work.<\/p>\n<p>This article was all conceptual; no T-SQL were harmed during its production.\u00a0 For row visibility explained with T-SQL samples and further information, see references next.<\/p>\n<h3>References<\/h3>\n<p>Delaney, K., <i style=\"mso-bidi-font-style: normal;\">SQL Server Internals: In-Memory OLTP <span style=\"font-size: 9.0pt; line-height: 106%;\"> Inside the SQL Server 2014 Hekaton Engine<\/span><\/i>, Simple Talk Publishing, Cambridge, 2014, 215 p.<\/p>\n<p>Korotkevich, D., <i style=\"mso-bidi-font-style: normal;\"> Expert SQL Server In-Memory OLTP <span style=\"font-size: 9.0pt; line-height: 106%;\"> Revolutionizing OLTP Performance in SQL Server<\/span><\/i>, APress, New York, 2015, 248 p.<\/p>\n<p>Miranda, M., 2016, In-Memory OLTP \u2013 Part 3 Row Structure and Indexes, <i style=\"mso-bidi-font-style: normal;\">Simple Talk<\/i>, <a href=\"https:\/\/www.simple-talk.com\/sql\/database-administration\/in-memory-oltp---row-structure-and-indexes\/\"> https:\/\/www.simple-talk.com\/sql\/database-administration\/in-memory-oltp&#8212;row-structure-and-indexes\/<\/a> (August 16, 2016)<\/p>\n<p>In-Memory OLTP (In-Memory Optimization), <i style=\"mso-bidi-font-style: normal;\">Microsoft Developer Network<\/i>, <a href=\"https:\/\/msdn.microsoft.com\/en-us\/library\/dn133186.aspx\"> https:\/\/msdn.microsoft.com\/en-us\/library\/dn133186.aspx<\/a> (August 16, 2016)<\/p>\n","protected":false},"excerpt":{"rendered":"<p>SQL Server&#8217;s In-memory OLTP is fast, due to its multi-valued concurrency control (MVCC).  MVCC avoids the need for locks by arranging for each user connected to the database to see a snapshot of the rows of the tables at a point in time, No changes made by the user will be seen by other users of the database until the changes have been completed and committed. It is conceptually simple but does the user always see the correct version of a row under all circumstances?  Shel Burkow  explains.&hellip;<\/p>\n","protected":false},"author":230507,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[143527],"tags":[],"coauthors":[20272],"class_list":["post-67662","post","type-post","status-publish","format-standard","hentry","category-database-administration-sql-server"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/67662","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/users\/230507"}],"replies":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/comments?post=67662"}],"version-history":[{"count":4,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/67662\/revisions"}],"predecessor-version":[{"id":67683,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/67662\/revisions\/67683"}],"wp:attachment":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media?parent=67662"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/categories?post=67662"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/tags?post=67662"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/coauthors?post=67662"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}