When Microsoft introduced their ‘Always On’ technology, they made the statement in TechNet, which our team has since had confirmed, that they cannot support distributed or cross-database transactions (CDT) with Always-On Availability Groups because they cannot guarantee data in such circumstances. It is made clear in the article ‘Not-Supported: AGs With DTC/Cross-Database Transactions‘ by the ‘SQL Server Always On support team’ as to where the problem lies. Each database commits independently within a cross-database transaction. This means that a failover could happen after one database commits a transaction but before the other database does, even for two databases in a single availability group. Always On synchronises each database transaction log across to the secondary independently, and this is where the issue lies when Cross-Database transactions are invoked. Our team, however, wanted a better understanding of Cross-Database transactions so we could understand this problem in more detail. Did it, for example, happen with databases on the same instance, or was this a problem that only happens when transactions are managed by Microsoft Distributed Transaction Coordinator (MS DTC)?
This exercise taught us a great deal about how to go about investigating Cross-Database transactions. This should prove useful to anyone who needs to assess the impact on their current systems and the impact it would have with any new technology that is introduced in the future.
Method used to understand the process
The transaction logs are the crux of the issue with Cross-Database transaction (CDT) as this is the mechanism through which CDT works. It is via these entries in the transaction logs that it is possible to identify what is happening when a CDT is processing.
NOTE: An important point here, CDT does not use MS DTC. The transactions are controlled within SQL Server code. MS DTC is an application run through Windows O/S specifically to handle cross instance/server transactions. . This information was provided by Shaun (Senior Microsoft Engineer) who debugged the code for us and isolated how CDT was working. To prove this, if you run a trace profiler against a CDT with the MS DTC counter enabled, you can confirm for yourself that MS DTC is not invoked.
To carry out the tests we chose, for simplicity, to use SSMS where possible. Our main tools were:
- The sys.fn_dblog function to read the log. An example of the code used to view the logs:
1234use DB1goSelect SPID, [Master DBID], [Master XDESID], Operation, Context, [Transaction ID]FROM sys.fn_dblog(NULL,NULL)
- A script to back up both the database and the transaction logs. I wanted to make sure I saw all the log entries around the transaction so I used the method of backing up the database and logs to set the logs back to the 20 rows that exist when it has been backed up. This meant I could view all those entries after the 20th row as potentially being involved in the transaction, either directly or as part of the administration of the transaction (for example “LOP_SET_BITS” for allocating a new extent). Obviously, I was mainly interested in those entries that were part of the transaction as indicated by the transaction id you will see in the tests.
- Scripts to create and alter stored procedures that will execute the Cross-Database transactions for each scenario.
- SQL Server Trace Profiler.
- SQL Server Activity Monitor or script to kill SPID during some scenarios.
- LockHunter to assist in test 7.
Description of the columns returned from sys.fn_dblog
This is a brief description of each of the columns returned in these tests:
Session ID for the transaction, crosses databases, so will be the same in all database transaction logs involved in the transaction.
Each database has a DBID to identify it. 1 = Master for example. This column is used for Cross-Database transactions to identify the coordinating database
Each Cross-Database transaction has a unique Cross-Database transaction id. This is where the id is stored for Cross-Database transactions.
The type of operation the database has performed in order that it can replay or rollback each operation.
The context in which the operation occurred. For example, LCX_CLUSTERED indicates an operation against a table with a clustered index.
The local transaction id maintained independently by transaction logs. This is not used to identify Cross-Database transactions.
Single database transactions
Baseline 1 was executed against a single database. The entries from the transaction log were then read. Criteria:
- The stored procedure was run from DB1 (DBID 7).
- This transaction is on a single database, NOT a CDT.
- This is a successfully committed transaction.
This gives us the entries in the transaction log on a standard non-CDT transaction. Using this, it is possible to compare the entries for a CDT and a non-CDT transaction.
From reading these, we can see that single database transactions are straightforward; the log will start the transaction, indicated by “LOP_BEGIN_XACT”.
NOTE: LOP_BEGIN_XACT isn’t actually entered in the log until the code reaches its first IUD (Insert / Update / Delete).
Once the IUD statement has been executed, it will enter the relevant row in the transaction log to indicate the action carried out; in this case LOP_INSERT_ROWS.
Finally, when it reaches the commit statement in code, it will carry out the commit and enter LOP_COMMIT_XACT in to the transaction log. Those are the complete set of rows entered in the log for this type of transaction.
NOTE: [SPID] holds the identity of the connection used for this transaction. As SPIDs are reusable, this is not unique to this transaction.
The [Master DBID] and [Master XDESID] are currently NULL as these columns are not used in non-CDT transactions. They will become important later when looking at CDT.
Baseline 2 deals with a failed transaction and the entries made in the transaction log. Criteria:
- The stored procedure was run from DB1 (DBID 7).
- This transaction is on a single database, NOT a CDT.
- The transaction was paused before commit and the SPID killed.
- This is a failed, rolled back transaction.
Baseline 2 starts off with the same entries as Baseline 1; LOP_BEGIN_XACT and LOP_INSERT_ROWS. At this point the transaction fails and the entries in the transaction reflect this. First any statements that had run (insert rows in our case) are rolled back and then the transaction is aborted LOP_ABORT_XACT. The single database transaction rollback is important later when considering how CDT handles the same issue.
NOTE: Any failure before a commit will automatically be rolled back with one exception when dealing with CDT. This exception will be discussed later.
Now we have a basic understanding how transactions are written to the transaction log, it is time to run the tests that will help us to identify how a CDT works in the log. The tests have been designed to identify the key processes operating within the CDT process.
To begin, we know that CDT uses a two-phase commit as described in the article “Not-Supported: AGs With DTC/Cross-Database Transactions” and that one of the databases involved in the transaction will act as the coordinator. This is important to understand, because it gives us a way of spotting in the log where a CDT is taking place. Once this database is identified and we are confident we can identify it in future, possibly in different scenarios, then we can look at what entries are added and when they are added to the transaction log, and assume that a Cross-Database transaction is taking place.
The initial tests are to identify the database CDT coordinator and validate how that allocation works. My tests expand upon, and confirm, the exact meaning of the statement “SQL Server elects the lowest DBID in the transaction to be the transaction coordinator”. These tests will also identify the additional log entries made by a CDT.
- Identify the Cross-Database transactions entries in the transaction log.
- Identify how the transaction links a Cross-Database transaction logs.
- Initial understanding of how the coordinator is assigned.
In each of the above tests, the CDT stored procedure was executed from a different database. In each case, the database with the lowest DBID was selected as the coordinator. In all cases DB1 was selected. We can tell this by the additional entries that have appeared in the transaction log and by the entry in the [Master DBID] and [Master XDESID] columns in each LOP_PREP_XACT entry. Points to note:
- The transaction ID is different for each transaction log. This cannot be used to identify a Cross-Database transaction. Each log creates its own transaction ID identifier.
- The SPID is the same across all three logs as expected; this indicates the session that the CDT is running under.
- The LOP_PREP_XACT has an entry in the [Master DBID]. This is the DBID of the database that is acting as the coordinator.
- The LOP_PREP_XACT has an entry in the [Master XDESID]. This is the Cross-Database transaction id and is the same in all the logs.
- DB1 is the only database to have the LOP_FORGET_XACT which is another way of identifying it as the coordinator. This is the coordinating database and when the transaction is complete, this entry indicates that it has cleared / dropped the Cross-Database transaction.
- LOP_PREP_XACT followed by LOP_COMMIT_XACT are the entries for the two-phase commit of the Cross-Database transaction.
The two phase commit is similar conceptually to that used by a distributed transaction (see Two-Phase Commit).
The LOP_PREP_XACT is the first phase of the commit. This entry in each transaction log indicates that the database is ready to commit. Once all the transaction logs have this entry then the transaction can be committed. LOP_COMMIT_XACT will be entered in the logs as each database commits. The coordinator always commits first (the reason for this will become apparent later) followed by the cooperating databases. The coordinating database finishes the transaction with LOP_FORGET_XACT to its own log, which indicates the Cross-Database transaction has been committed and the transaction can now be dropped. DTC Transaction States sees this “forget” as a means of closing a transaction which may throw some light on exactly what is happening at this point.
NOTE: Most blogs that hold descriptions of each “LOP_*” describe LOP_FORGET_XACT as the transaction is being rolled back. This is simply not the case as my above tests prove.
NOTE: Cross-Database transactions are not the sole preserve of user databases, it is also seen in the master database quite a lot, but we will not be covering this within our discussions, but it gives us the important information that it is possible to have “LOP_PREP_XACT” entries in the user database that are related to activity coordinated by the master database. As the master database always has the lowest DBID it will always act as the coordinator in these circumstances. Look for DBID 1 in the [Master DBID] to confirm that the system Master database is coordinating a Cross-Database transaction with your user database, like this entry in DB1:
Test 2 will look at:
- The statement “SQL Server elects the lowest DBID in the transaction” and prove exactly what it means.
- From 1 above, fully identify how the coordinator is assigned.
- It will confirm what happens in the log when the database running the SP is actually not involved in the transaction; in other words it does not execute (Update/Insert/Delete) against its own database.
Notice that the stored procedure ran from DB1. DB1 did not have any transactions to process even though DB1 does have the lowest DBID. To summarize:
- DB1: No entries in the initiating database’s log. It was not involved in the transaction, even though it ran the stored procedure.
- DB2: Became the coordinator. The [Master DBID] and [Master XDESID] entries confirm this. It was the lowest DBID involved in the transaction, even though it did not own the stored procedure that was run.
- This confirms that transactions start logging when an IUD statement is run against that database.
This proves that the statement means that the lowest DBID involved in the transaction will act as the coordinator; in this case DB2 (DBID 8).
NOTE: From a transaction log perspective, each transaction log of the Cross-Database transaction is independent. The only link at this point is the SPID (session id) and it is not until the commit is invoked that Cross-Database transaction starts.
Test 3 was run as a quick confirmation that running the code past the “BEGIN TRAN” but not running any transactions would result in empty logs. This is correct, no need for a blank image here. From the point of view of the transaction logs nothing has happened and the CDT coordinator has not been invoked at this point. This is important to understand for when the logs get replayed.
The tests now move away from successfully committed transactions to look at Cross-Database transactions that fail:
- Test 4 is a partially completed transaction before the commit phase. Two of the databases will have run their IUD statements (inserting rows) while the last one will not have the chance as the session has been killed.
- Test 5 is a transaction that reaches the prepare phase but then fails when part of the way through. It does not reach the commit phase of the two-phase commit.
- Test 6, with the help of a Senior Microsoft Engineer, a failure during the commit phase.
Test 4 looks at what happens when the transactions are rolled back due to a failure (in this case I will fail the transactions by killing the SPID as I did in the baseline 2 test. The aim is to confirm:
- No Cross-Database transaction coordinator is invoked at all.
- The transactions roll back independently with no knowledge they were involved in a CDT.
This failure is before the two-phase commit is invoked. With the thought in mind that each transaction log is working independently until the commit is invoked I was pretty sure I knew what would be entered in the logs.
The SPID was killed before reaching DB3. DB3 did have a transaction that it would have run if the stored procedure had not died before it reached the code. We see the transaction log entries are only for the first two databases and there are no CDT entries in either log. With no LOP_PREP_XACT there is no entry in the [Master DBID] or [Master XDESID] column, confirming that no coordinator was invoked.
It looks exactly the same as the original Baseline 2 test result; the rollback of a single database transaction. This indicates that as far as the transaction logs are concerned there was no coordinator and the logs are rolling back the transactions independently.
- Any transaction activity up to, but not including the two phase commit, would be seen as a single database transaction and not a Cross-Database transaction.
- No coordinator is required until the commit phase and is not invoked.
- Transactions at this stage are rolled back independently.
Two Phase Commit – The heart of Cross-Database Transactions
Test 5 is in three phases. It breaks the transaction during the “prepare” phase of the two-phase commit. The method used to do this:
- Execute the transaction (involving all three databases) up to, but just before the commit.
- Stop the code using a wait in the procedure.
- Rename the transaction log for DB2 (DBID 8).
- Code continues and the transaction fails.
Test 5.0 shows the state of the logs at the pause before the commit. No CDT invoked entries, the transaction is shown as started and the rows inserted.
Test 5.1 shows the state of the logs after the commit has been run (with the log for DB2 (DBID 8) renamed:
- Both working databases (DB1 and DB3) have rolled back.
- DB2 has gone in to recovery pending as expected.
- In the logs for DB1 and DB3 we can see the “LOP_PREP_XACT” entries from the “Prepare” phase of the CDT two-phase commit process.
- The “Prepare” phase of CDT requires all transaction logs involved in the transaction to indicate they are ready to commit. It expects to receive three acknowledgements (one “LOP_PREP_XACT” in each transaction log. It only received two as the other log was not able to reply.
- The CDT coordinator then invokes a rollback as the transaction is not able to commit all transactions.
Test 5.2 shows the final result after recovery.
- DB1 and DB3 have both rolled back.
- DB2 has been recovered and the log has rolled forward, showing no entries.
- The tables in each database have no rows added.
This tells us:
- The rollback at the “Prepare” phase is carried out by the coordinator rolling back. Each database within the transaction that is capable of rolling back will follow suit and rollback
- For the coordinator to commit a transaction it must receive acknowledgement from all databases involved in the transaction that they have successfully “prepared” for a commit. The LOP_PREP_XACT entry is the acknowledgement.
- Once LOP_PREP_XACT has been entered in the database transaction log, the transaction is recognized as a Cross-Database transaction.
- The entry in the [Master DBID] column in the LOP_PREP_XACT row indicates which database is acting as the coordinator.
- The entry in the [Master XDESID] column in the LOP_PREP_XACT row indicates that the transaction id is unique across all the databases that are involved in the transaction. You can see that it is the same in both the DB1 and DB3 transaction logs.
Right up to this point, all failures are treated as rollbacks. Because rollbacks happen independently with complete safety, this means that the only part of a transaction that would cause data integrity to be compromised would be during the second phase of the two-phase commit, the commit itself.
NOTE: If any of the databases that are involved in the Cross-Database transaction have an LOP_PREP_XACT entry, they will check with the coordinator database (identified via [Master DBID] and then using the [Master XDESID] entry, find the transaction.
If the coordinator has rolled back, the cooperating databases will also rollback. I point this out because of what happens in the next and final stage, the commit part of the two-phase commit. It is important to note that once LOP_PREP_XACT entry is in a transaction log, the coordinator will always be contacted.
Test 6 confirms the behavior of the commit phase of the two-phase commit. This information was obtained by debugging the actual transaction code.
Once the coordinating database has received confirmation that all databases cooperating in the transaction are ready to commit (LOP_PREP_XACT in the transaction logs) it will go ahead and commit the Cross-Database transaction. The coordinator will commit first and then each cooperating database will commit one by one. Once all databases have committed the transaction, the coordinator will then “close / forget” the transaction (LOP_FORGET_XACT entry in the transaction log).
In the case of failure during the commit phase then the database engine will:
- Obtain the database id for the coordinating database from the transaction log (column [Master DBID], LOP_PREP_XACT row).
- Obtain the unique Cross-Database transaction id also stored in the LOP_PREP_XACT row (column [Master XDESID]) in its own transaction log.
- Connect to the coordinating database.
- Query the coordinating database transaction log using the Master XDESID to find the transaction.
- Confirm commit or rollback was carried out on the coordinating database.
- Commit or rollback based on the answer.
If the coordinating database is not contactable, then message 3429 is logged:
Recovery could not determine the outcome of a cross-database transaction %S_XID, named '%.*ls', in database '%.*ls' (database ID %d:%d). The coordinating database (database ID %d:%d) was unavailable. The transaction was assumed to be committed. If the transaction was not committed, you can retry recovery when the coordinating database is available.
The cooperating database will then assume the transaction was committed in the co-ordinating database and commit the transaction in the recovering database as in most cases the transaction should have been committed in the co-ordinating database. A review of the Microsoft knowledge bases don’t show a single case of a customer logging a call about the 3429 error so it looks to be very rare.
Conclusions on Cross-Database Transactions and transaction log entries
The conclusions I have drawn from the investigation of the transaction logs when processing Cross-Database transactions come from testing in a single instance. These findings are relevant to our current database environment.
- Cross-Database transactions are fully supported and safe on a single instance.
- Cross-Database transactions do not use MS DTC.
- Cross-Database transactions are executed within SQL server.
- Cross-Database transactions are not invoked until the commit phase.
- The commit phase is a two-phase commit (prepare – commit).
- The commit phase of the two-phase commit is the only point at which transaction may not be rolled back on an error.
- If the coordinator cannot be contacted by the cooperating databases the default action is to commit the transaction in those cooperating databases.
- Rollbacks are handled independently by each database if the transaction did not write an LOP_PREP_XACT to the transaction log.
- Only those databases that are involved in the transaction are considered as candidates for the role of transaction coordinator, regardless of where the stored procedure is run from.
- The lowest DBID that is involved in the transaction will take the role of the transaction coordinator.
Based on the above, a Cross-Database transaction can only cause loss of data integrity in the commit phase of the two-phase commit.
Below I have created a process flow for Cross-Database transactions. (please click on the image to see it full-size)
This article would not have been possible without the valuable assistance of Shaun Beasley from Microsoft. He was able to fill in blanks by debugging the SQL Server engine around the two phase commit functionality. His technical review of the article was also of great assistance