{"id":1948,"date":"2015-02-06T00:00:00","date_gmt":"2015-02-06T00:00:00","guid":{"rendered":"https:\/\/test.simple-talk.com\/uncategorized\/curing-data-obesity-in-oltp-databases\/"},"modified":"2021-08-24T13:39:38","modified_gmt":"2021-08-24T13:39:38","slug":"curing-data-obesity-in-oltp-databases","status":"publish","type":"post","link":"https:\/\/www.red-gate.com\/simple-talk\/databases\/sql-server\/database-administration-sql-server\/curing-data-obesity-in-oltp-databases\/","title":{"rendered":"Curing Data-Obesity in OLTP Databases"},"content":{"rendered":"<div id=\"pretty\">\n<h2> The Problem<\/h2>\n<p class=\"start\"> \tIt is quite common to have an OLTP database that  must store large amounts of data which pile up into hundreds of millions, even billions, of rows. What do we do in such  cases? In this article I will describe a way to deal with constant flows of OLTP data into production systems, and how  to offload this data, describing the process from beginning to end. <\/p>\n<p>Here are some assumptions for the sake of this  article:<\/p>\n<ul>\n<li>&#160;We have a 50M rows of data in a production database<\/li>\n<li>The table contains log data. There are 100,000 rows coming per day of data related to website visits)<\/li>\n<li>We are using SQL Server 2012 Enterprise edition<\/li>\n<li>We have two SQL Servers &#8211; one is for highly-available transactional databases and another one for staging  \t\tdata purposes<\/li>\n<\/ul>\n<p>How do we tackle the  problem?<\/p>\n<p>This is an illustration  of the classic problem of mixing&#160; OLTP and Analytics data. We have a database  which &#160;was designed to handle OLTP requirements, but it ends up gathering data  which is required for analytical purposes. As a general rule, OLTP databases should be light on their feet. They ought  to be small, storing only very recent data&#160; that should fit in memory so we  can easily do In-Memory-OLTP. (Please note that no fancy technology is needed for In-Memory-OLTP: anyone who is smart  enough can do In-Memory-OLTP, since memory nowadays is very cheap and the amount of relevant transactional data is  finally getting behind the available memory capacities).<\/p>\n<p>There are several ways  to deal with this situation:<\/p>\n<ul>\n<li>&#160;Ignore the problem until the data volume grows so much that it starts to jeopardize the database  \t\tbackups and restores, hence availability<\/li>\n<li>Copy some of the older data to a different location and delete it from the main database<\/li>\n<li>Leave a very small portion of the data in the production system and automate the rotation of the data and  \t\tthe copying of it to the secondary location<\/li>\n<\/ul>\n<p>Neither of the first two options are much good, but  the third one seems quite reasonable and we will explore it in this article. <\/p>\n<h2>The Solution<\/h2>\n<h3><b>The  setup:<\/b><\/h3>\n<p>For this article we can assume that we have a  database called <code>ProdLogDB<\/code>, which has a table <code>dbo.Log,<\/code> which is designed like  this:<\/p>\n<pre class=\"theme:ssms2012 lang:tsql\">CREATE TABLE [dbo].[LOG]\n&#160;&#160;&#160; (\n&#160;&#160;&#160;&#160;&#160; [column1] [VARCHAR](50) NULL ,\n&#160;&#160;&#160;&#160;&#160; [column2] [VARCHAR](50) NULL ,\n&#160;&#160;&#160;&#160;&#160; [column3] [VARCHAR](50) NULL ,\n&#160;&#160;&#160;&#160;&#160; [column4] [VARCHAR](100) NULL ,\n&#160;&#160;&#160;&#160;&#160; [notes] [VARCHAR](MAX) NULL ,\n&#160;&#160;&#160;&#160;&#160; [timestamp] [DATETIME] NOT NULL ,\n&#160;&#160;&#160;&#160;&#160; [log_id] [INT] IDENTITY(1, 1)\n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; NOT NULL ,\n&#160;&#160;&#160;&#160;&#160; CONSTRAINT [PK_LOG] PRIMARY KEY CLUSTERED ( [log_id] ASC )\n&#160;&#160;&#160; )\nON&#160; [PRIMARY] \n<\/pre>\n<p>Now we need lots of data, and&#160; good representative data too. For the sake of this article I will generate some data with the help of Redgate&#8217;s  Data Generator. After the generation of data the <code>dbo.Log<\/code>&#160; table will have data for the time between 2012-01-01 and 2015-01-20. For this time frame I have chosen to have 10  million rows and the data is 1.5Gb. I will be using the same data generation project to generate 100,000 rows for the  next few days, while we do the database management and data distribution in this article. <\/p>\n<h3><b>The plan  of action:<\/b><\/h3>\n<p>We have the data, but what should we do next?  The idea is to have only a very small portion of the data in our production database, and to offload the rest of the  data to a secondary server where the historical data will be stored. In our case we need only 7 days of data in the  production system, so we can do analysis: The rest of the data can be someplace else. Furthermore, we want to automate  the offloading process in such a way that a scheduled job will run every day and will move the oldest day&#8217;s data to the  secondary location. (The code below is flexible enough to support the rotation of 1 month&#8217;s data, and we can provide a  parameter to specify how many days of data to keep, up to 31 days.)<\/p>\n<h4>How do we achieve that? <\/h4>\n<p>Firstly, we need to &#8217;empty&#8217; the <code>dbo.Log<\/code> table and move the data to a table containing the historical data.  There are many ways to do this: We can, for example, &#160;bulk-copy the data to a  different table, and batch-delete most of the data from the primary table. This, of course, would work, but it will take  a long time and use a lot of resources, even if we do batch deletes. I would prefer the following strategy:<\/p>\n<ul>\n<li>\n<div>\n<p> We Create a new table called dbo.Log1, which has identical schema as  the current production table, but it&#160; is partitioned, and ready for data  writes from the application. <\/p>\n<p>We run the following  script to do this:<\/p>\n<pre class=\"theme:ssms2012 lang:tsql\">CREATE PARTITION FUNCTION PF_RingRecyclerByMonth (TINYINT)\nAS RANGE LEFT FOR VALUES (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15 ,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31)\n&#160;\nCREATE PARTITION SCHEME PS_RingRecyclerByMonth\nAS PARTITION PF_RingRecyclerByMonth ALL TO ([PRIMARY])\n\nCREATE TABLE [dbo].[LOG1]\n&#160;&#160;&#160; (\n&#160;&#160;&#160;&#160;&#160; [column1] [VARCHAR](50) NULL ,\n&#160;&#160;&#160;&#160;&#160; [column2] [VARCHAR](50) NULL ,\n&#160;&#160;&#160;&#160;&#160; [column3] [VARCHAR](50) NULL ,\n&#160;&#160;&#160;&#160;&#160; [column4] [VARCHAR](100) NULL ,\n&#160;&#160;&#160;&#160;&#160; [notes] [VARCHAR](MAX) NULL ,\n&#160;&#160;&#160;&#160;&#160; [timestamp] [DATETIME] NOT NULL ,\n&#160;&#160;&#160;&#160;&#160; [log_id] [INT] IDENTITY(11000000, 1)\n&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; NOT NULL ,\n&#160;&#160;&#160;&#160;&#160; [Offset] AS ( CONVERT([TINYINT], DATEPART(DAY, [timestamp])) ) PERSISTED\n&#160;&#160;&#160; )\nON&#160; PS_RingRecyclerByMonth(Offset) \n\nGO\n<\/pre>\n<p>An important point to note here is that we are creating a new table, with one additional persisted computed  column called <code>[Offset]<\/code>. This column is used to partition the table by the day number. We will use this functionality later  on to find the oldest partitions and to send them to the historical table. <\/p>\n<p>Also, it is worth noting that we have the identity column <code>[log_id],<\/code>  however the identity seed starts from a certain level. This is very important to get right, since we do not want to have  duplicate ids later on. Make sure to set the identity seed from a level higher than the current one, even if there is a  small gap.<\/p>\n<p> In my case I have 10,000,000 rows, and I am estimating that in the next few days I will have 10,300,000, so I will set  the seed to be 11,000,000. There will be some gap but better to be on the safe side.<\/p>\n<\/p><\/div>\n<\/li>\n<li>\n<div>\n<p>So far, we have one empty partitioned table (<code>dbo.Log1<\/code>),  and one constantly growing production table (<code>dbo.Log<\/code>). Now it is time to do  something buzzingly exciting: we will move data from one table to the other. In the matter of a second. Here is how:<\/p>\n<pre class=\"theme:ssms2012 lang:tsql\">BEGIN TRANSACTION;\nEXEC sp_rename 'dbo.LOG', 'LOG_History';\nEXEC sp_rename 'dbo.LOG1', 'LOG';\nCOMMIT TRANSACTION;\n<\/pre>\n<p>So here is what just happened: our application was writing to the <code>dbo.Log<\/code>  table, which was not partitioned and now, after renaming the tables, the application is writing to the newly created  partitioned table. If the application is smartly written, we won&#8217;t need to schedule for downtime or maintenance window.  The rename should take less than a second and the application should retry to write again any rows that fall in the time  of the rename.&#160; <\/p>\n<\/p><\/div>\n<\/li>\n<li>\n<div>\n<p> At this point we have two tables &#8211; <code>dbo.Log<\/code> &#8211; this one is partitioned and growing, and <code>dbo.Log_History<\/code> &#8211; this one is our large table which is not written to  anymore. <\/p>\n<\/p><\/div>\n<\/li>\n<li>\n<div>\n<p> Now it is time to offload the historical data to our secondary  server. One way to do it is to create a new database on the secondary server and bulk-copy the data. Another way to do  it is to simply backup the current production database and restore it as a historical database on the secondary server.  Either way, the result will be the same. <\/p>\n<p>After the successful copy of the data to the secondary server we can now delete the large table from our  production server. (some magic may have to be performed to regain the allocated space after removing the large table,  but this is a trivial task for a DBA)<\/p>\n<p>Also, keep in mind that your DBA will need to do some wizardry if your secondary server is not an Enterprise  edition, since the database won&#8217;t restore if it contains enterprise edition objects in it. <\/p>\n<\/p><\/div>\n<\/li>\n<li>\n<div>\n<p> At this point we have two databases on two different servers, one is  called <code>ProdLogDB<\/code> and contains the partitioned table <code>dbo.Log,<\/code> and the other is called <code>ProdLogDB_History<\/code> and contains the  \t<code>dbo.Log_History<\/code> table.  \t <\/p>\n<\/div>\n<\/li>\n<\/ul>\n<p>How do we rotate the production table, though?  \t \t<\/p>\n<p>The idea is to take the oldest partitions and to  move them to the historical database.  \t \t<\/p>\n<p>First we need to create an empty table with the  same structure so we can do partition switching. Create the same table in both databases:<\/p>\n<pre class=\"theme:ssms2012 lang:tsql\">\tCREATE TABLE [dbo].[LOG_SwitchTarget]\n\t&#160;&#160;&#160; (\n\t&#160;&#160;&#160;&#160;&#160; [column1] [VARCHAR](50) NULL ,\n\t&#160;&#160;&#160;&#160;&#160; [column2] [VARCHAR](50) NULL ,\n\t&#160;&#160;&#160;&#160;&#160; [column3] [VARCHAR](50) NULL ,\n\t&#160;&#160;&#160;&#160;&#160; [column4] [VARCHAR](100) NULL ,\n\t&#160;&#160;&#160;&#160;&#160; [notes] [VARCHAR](MAX) NULL ,\n\t&#160;&#160;&#160;&#160;&#160; [timestamp] [DATETIME] NOT NULL ,\n\t&#160;&#160;&#160;&#160;&#160; [log_id] [INT] NOT NULL ,\n\t&#160;&#160;&#160;&#160;&#160; [Offset] AS ( CONVERT([TINYINT], DATEPART(DAY, [timestamp])) ) PERSISTED\n\t&#160;&#160;&#160; )\n\tON&#160; PS_RingRecyclerByMonth(Offset) \n\t\n\tGO\n<\/pre>\n<p>Note that there is no identity specification on  the <code>[log_id]<\/code> conlumn.  \t \t<\/p>\n<p>After creating the tables, we will use them to  switch the partitions to them by using the following stored procedure:<\/p>\n<pre class=\"theme:ssms2012 lang:tsql\">\tIF OBJECT_ID('dbo.RingbufferRotate_ByMonth') IS NULL\n\t&#160;&#160;&#160; EXEC ('CREATE PROCEDURE dbo.RingbufferRotate_ByMonth AS RETURN 0;')\n\tGO\n\t\n\tALTER&#160; PROCEDURE RingbufferRotate_ByMonth\n\t&#160;&#160;&#160; @Now DATETIME = NULL ,\n\t&#160;&#160;&#160; @PartitionsToKeep INT = 7\n\tAS\n\t&#160;&#160;&#160; SET NOCOUNT ON\n\t\n\t&#160;&#160;&#160; BEGIN\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160; IF @Now IS NULL\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; SET @Now = GETDATE()\n\t&#160;\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160; DECLARE @BufferSize INT\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160; SELECT&#160; @BufferSize = COUNT(*)\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160; FROM&#160;&#160;&#160; sys.partitions P\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; JOIN sys.tables T ON P.object_id = T.object_id\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160; WHERE&#160;&#160; T.name = 'LOG'\n\t&#160;\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160; IF @PartitionsToKeep &gt; @BufferSize\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; BEGIN\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; RAISERROR ('Can''t keep more partitions than the current buffer size of: %i', 16, 1, @BufferSize)\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; RETURN\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; END\n\t&#160;\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160; DECLARE @CurrentPartition INT\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160; SET @CurrentPartition = DATEPART(DAY, @Now) \n\t&#160;\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160; DECLARE @OldestPartitionToKeep INT\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160; SET @OldestPartitionToKeep = ( @CurrentPartition + @BufferSize\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; - @PartitionsToKeep ) % @BufferSize\n\t&#160;\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; \/* Start from the next partition up from current and move forward *\/\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160; DECLARE @P INT = ( @CurrentPartition + 1 ) % @BufferSize\n\t\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160; DECLARE @I INT = 0\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160; WHILE @I &lt; @BufferSize - @PartitionsToKeep\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; BEGIN\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; DECLARE @Sql NVARCHAR(4000) = 'ALTER TABLE LOG SWITCH PARTITION &lt;p&gt; TO LOG_SwitchTarget PARTITION &lt;p&gt;'\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; SET @Sql = REPLACE(@Sql, '&lt;p&gt;', CAST(@P + 1 AS NVARCHAR)) \n\t&#160;\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; EXEC sp_executesql @Sql\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;  \n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; SET @I = @I + 1;\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; SET @P = ( @P + 1 ) % @BufferSize\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; END\n\t&#160;&#160;&#160; END\n<\/pre>\n<p>This procedure takes two parameters:  \t \t<\/p>\n<ul>\n<li><code>@Now<\/code>, which by default is set to get the current  timestamp. It can be used to set the time to a different point and rotate the partitions from there<\/li>\n<li><code>@PartitionsToKeep<\/code>, which is used to specify how many partitions (i.e. days of data) to keep in the current transactional table and  how many to move to the <code> \t\t[dbo].[LOG_SwitchTarget]<\/code>  table. This value can be up to 30, but for our case let&#8217;s choose to keep only 7 days of data and move all other data  \t\t \t\t<\/li>\n<\/ul>\n<p>At this point we don&#8217;t have that much data in  our Log table, since we renamed the tables fairly recently. But in a few days the data will pile up and our procedure  will be useful.  \t \t<\/p>\n<p>Here is a picture of how the data transfer looks  like:<\/p>\n<p class=\"illustration\">\t<img loading=\"lazy\" decoding=\"async\" height=\"189\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/imported\/2140-clip_image002.jpg\" width=\"605\" alt=\"2140-clip_image002.jpg\" \/><\/p>\n<p>Note that there are many different ways to  implement this solution, but the important part is the automated partitions switching, the bulk-copying of the data from  the&#160; <code> \t[dbo].[LOG_SwitchTarget]<\/code><\/p>\n<p>Table on the production server to the <code> \t[dbo].[LOG_SwitchTarget]<\/code> \ttable on the secondary server and merging the data to the <code>[dbo].[LOG_History] <\/code>table.  \t \t<\/p>\n<p>Here is the<code>  MERGE<\/code> procedure which should be created on the secondary server:<\/p>\n<pre class=\"theme:ssms2012 lang:tsql\">\tIF OBJECT_ID('dbo.Log_History_Merge') IS NULL\n\t&#160;&#160;&#160; EXEC ('CREATE PROCEDURE dbo.Log_History_Merge AS RETURN 0;')\n\tGO\n\t\n\tALTER PROCEDURE dbo.Log_History_Merge\n\tAS\n\t&#160;&#160;&#160; SET IDENTITY_INSERT LOG_History ON \n\t&#160;&#160;&#160; MERGE INTO LOG_History AS T\n\t&#160;&#160;&#160; USING\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160; ( SELECT&#160;&#160;&#160; [column1] ,\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; [column2] ,\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; [column3] ,\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; [column4] ,\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; [notes] ,\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; [timestamp] ,\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; [log_id]\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; FROM&#160;&#160;&#160;&#160;&#160; [dbo].[LOG_SwitchTarget]\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160; ) AS A\n\t&#160;&#160;&#160; ON A.[log_id] = T.[log_id]\n\t&#160;&#160;&#160; WHEN NOT MATCHED THEN\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160; INSERT ( [column1] ,\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; [column2] ,\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; [column3] ,\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; [column4] ,\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; [notes] ,\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; [timestamp] ,\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; [log_id]  \n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; )\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160; VALUES ( A.[column1] ,\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; A.[column2] ,\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; A.[column3] ,\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; A.[column4] ,\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; A.[notes] ,\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; A.[timestamp] ,\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; A.[log_id]  \n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160; );\n\t&#160;&#160;&#160; SET IDENTITY_INSERT LOG_History OFF \n\t&#160;&#160;&#160; TRUNCATE TABLE [LOG_SwitchTarget];\n<\/pre>\n<p>From this point on it is very easy to implement  the solution. For example, we can create an SSIS package which will run daily and will carry out the following tasks:<\/p>\n<ul>\n<li>Execute the <code>RingbufferRotate_ByMonth<\/code> procedure, which will switch all older partitions to the \t<code>LOG_SwitchTarget<\/code> table<\/li>\n<li>Bulk-copy the data from the <code>LOG_SwitchTarget<\/code> table on the primary server to the  \t<code>LOG_SwitchTarget<\/code>  table on the secondary server<\/li>\n<li>Execute the <code>Log_History_Merge<\/code> procedure on the secondary server<\/li>\n<li>Truncate the <code>LOG_SwitchTarget<\/code> tables on both servers<\/li>\n<\/ul>\n<p>In order to test the code, let&#8217;s get back to our  test scenario: we have 10 million rows in the Log_History table. Let&#8217;s generate some data in the LOG table, which will  have timestamp between 2015-01-20 and 2015-01-28. We can then run the <code>RingbufferRotate_ByMonth<\/code> procedure and take a look at the data distribution per partition:<\/p>\n<pre class=\"theme:ssms2012 lang:tsql\">\tSELECT&#160; $PARTITION.PF_RingRecyclerByMonth(OFFSET) AS PARTITION ,\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160; COUNT(*) AS [COUNT] ,\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160; CONVERT(DATE, [timestamp]) Datestamp\n\tFROM&#160;&#160;&#160; dbo.[LOG]\n\tGROUP BY $PARTITION.PF_RingRecyclerByMonth(OFFSET) ,\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160; CONVERT(DATE, [timestamp])\n\tORDER BY CONVERT(DATE, [timestamp]);\n<\/pre>\n<p>This query returns data from the <code>LOG<\/code> table, which looks like this:<\/p>\n<table class=\"MsoNormalTable\">\n<tbody>\n<tr>\n<td valign=\"bottom\">\n<p> \t\t\t<strong>PARTITION<\/strong><\/p>\n<\/td>\n<td valign=\"bottom\">\n<p> \t\t\t<strong>COUNT<\/strong><\/p>\n<\/td>\n<td valign=\"bottom\">\n<p> \t\t\t<strong>Datestamp<\/strong><\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"bottom\">\n<p> \t\t\t23<\/p>\n<\/td>\n<td valign=\"bottom\">\n<p> \t\t\t111111<\/p>\n<\/td>\n<td valign=\"bottom\">\n<p> \t\t\t2015-01-23<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"bottom\">\n<p> \t\t\t24<\/p>\n<\/td>\n<td valign=\"bottom\">\n<p> \t\t\t111111<\/p>\n<\/td>\n<td valign=\"bottom\">\n<p> \t\t\t2015-01-24<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"bottom\">\n<p> \t\t\t25<\/p>\n<\/td>\n<td valign=\"bottom\">\n<p> \t\t\t111111<\/p>\n<\/td>\n<td valign=\"bottom\">\n<p> \t\t\t2015-01-25<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"bottom\">\n<p> \t\t\t26<\/p>\n<\/td>\n<td valign=\"bottom\">\n<p> \t\t\t111111<\/p>\n<\/td>\n<td valign=\"bottom\">\n<p> \t\t\t2015-01-26<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"bottom\">\n<p> \t\t\t27<\/p>\n<\/td>\n<td valign=\"bottom\">\n<p> \t\t\t111111<\/p>\n<\/td>\n<td valign=\"bottom\">\n<p> \t\t\t2015-01-27<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"bottom\">\n<p> \t\t\t28<\/p>\n<\/td>\n<td valign=\"bottom\">\n<p> \t\t\t111111<\/p>\n<\/td>\n<td valign=\"bottom\">\n<p> \t\t\t2015-01-28<\/p>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>And the following query will show us what we  have in the <code>LOG_SwitchTarget<\/code> table:<\/p>\n<pre class=\"theme:ssms2012 lang:tsql\">\tSELECT&#160; $PARTITION.PF_RingRecyclerByMonth(OFFSET) AS PARTITION ,\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160; COUNT(*) AS [COUNT] ,\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160; CONVERT(DATE, [timestamp]) Datestamp\n\tFROM&#160;&#160;&#160; dbo.[LOG_SwitchTarget]\n\tGROUP BY $PARTITION.PF_RingRecyclerByMonth(OFFSET) ,\n\t&#160;&#160;&#160;&#160;&#160;&#160;&#160; CONVERT(DATE, [timestamp])\n\tORDER BY CONVERT(DATE, [timestamp]);\n<\/pre>\n<p>The data looks like this:<\/p>\n<table>\n<tbody>\n<tr>\n<td valign=\"top\">\n<p class=\"MsoNormal\"><b> \t\t\tPARTITION<\/b><\/p>\n<\/td>\n<td valign=\"top\">\n<p class=\"MsoNormal\"><b> \t\t\tCOUNT<\/b><\/p>\n<\/td>\n<td valign=\"top\">\n<p class=\"MsoNormal\"><b> \t\t\tDatestamp<\/b><\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\">\n<p class=\"MsoNormal\"> \t\t\t20<\/p>\n<\/td>\n<td valign=\"top\">\n<p class=\"MsoNormal\"> \t\t\t111112<\/p>\n<\/td>\n<td valign=\"top\">\n<p class=\"MsoNormal\"> \t\t\t2015-01-20<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\">\n<p class=\"MsoNormal\"> \t\t\t21<\/p>\n<\/td>\n<td valign=\"top\">\n<p class=\"MsoNormal\"> \t\t\t111111<\/p>\n<\/td>\n<td valign=\"top\">\n<p class=\"MsoNormal\"> \t\t\t2015-01-21<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td valign=\"top\">\n<p class=\"MsoNormal\"> \t\t\t22<\/p>\n<\/td>\n<td valign=\"top\">\n<p class=\"MsoNormal\"> \t\t\t111111<\/p>\n<\/td>\n<td valign=\"top\">\n<p class=\"MsoNormal\"> \t\t\t2015-01-22<\/p>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Now it is time to do a bulk-copy of the data  from the <code>LOG_SwitchTarget<\/code> table on the primary server to the<code> LOG_SwitchTarget<\/code> table on the secondary server.  \t \t<\/p>\n<p>After the bulk-copy has run, we can execute the  <code>dbo.Log_History_Merge<\/code> procedure. Now we can see that we have the new data into our  \t<code>LOG_History<\/code> table on the secondary  server.<\/p>\n<h2>Conclusions<\/h2>\n<p>It is important to remove historical data from a  fast-working OLTP database, and retain only any necessary summary and aggregate data. The requirement of transaction  processing and analysis are quite different, and it is unwise to mix the two.  \t \t<\/p>\n<p>In this article I have described a simple but  effective way to keep our log data under control by rotating the data in the production system and taking it  incrementally to a secondary server. There are plenty of other ways to do it, but the method you choose must be able to  comfortably handle the heaviest volume of data you are likely to experience.  \t \t<\/p>\n<p>This article focuses on a solution where the  primary server is using Enterprise edition, but in reality there is a possibility to do similar logic even in standard  edition, it will just take a bit more effort to code the data flows and some smart management of database file groups.  \t \t<\/p>\n<\/p><\/div>\n","protected":false},"excerpt":{"rendered":"<p>OLTP databases work best when data that becomes no longer current is then transferred to a separate database for analysis and reporting. There are many ways to do this, but Feodor describes a rapid technique that takes advantage of partitions to automates the rotation of the data and moving it to the analysis server.&hellip;<\/p>\n","protected":false},"author":221902,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[143527],"tags":[4168,4170,5438,6007,4150],"coauthors":[],"class_list":["post-1948","post","type-post","status-publish","format-standard","hentry","category-database-administration-sql-server","tag-database","tag-database-administration","tag-oltp","tag-oltp-partitions","tag-sql"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/1948","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/users\/221902"}],"replies":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/comments?post=1948"}],"version-history":[{"count":5,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/1948\/revisions"}],"predecessor-version":[{"id":92215,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/1948\/revisions\/92215"}],"wp:attachment":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media?parent=1948"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/categories?post=1948"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/tags?post=1948"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/coauthors?post=1948"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}