{"id":1326,"date":"2012-04-26T00:00:00","date_gmt":"2012-04-26T00:00:00","guid":{"rendered":"https:\/\/test.simple-talk.com\/uncategorized\/implementing-lookup-logic-in-sql-server-integration-services\/"},"modified":"2021-06-03T16:45:26","modified_gmt":"2021-06-03T16:45:26","slug":"implementing-lookup-logic-in-sql-server-integration-services","status":"publish","type":"post","link":"https:\/\/www.red-gate.com\/simple-talk\/databases\/sql-server\/bi-sql-server\/implementing-lookup-logic-in-sql-server-integration-services\/","title":{"rendered":"Implementing Lookup Logic in SQL Server Integration Services"},"content":{"rendered":"<div id=\"pretty\">\n<p class=\"start\">There might be times when developing a SQL Server Integration Services (SSIS) package that you want to perform a lookup in order to supplement or validate the data in your data flow. A lookup lets you access data related to your current dataset without having to create a special structure to support that access. <\/p>\n<p>To facilitate the ability to perform lookups, SSIS includes the <strong>Lookup<\/strong> transformation, which provides the mechanism necessary to access and retrieve data from a secondary dataset. The transformation works by joining the primary dataset (the <em>input<\/em> data) to the secondary dataset (the <em>referenced<\/em> data). SSIS attempts to perform an equi-join based on one or more matching columns in the input and referenced datasets, just like you would join two tables in in a SQL Server database. <\/p>\n<p>Because SSIS uses an equi-join, each row of the input dataset must match at least one row in the referenced dataset. The rows are considered matching if the values in the joined columns are equal. By default, if an input row cannot be joined to a referenced row, the <strong>Lookup<\/strong> transformation treats the row as an error. However, you can override the default behavior by configuring the transformation to instead redirect any rows without a match to a specific output. If an input row matches multiple rows in the referenced dataset, the transformation uses only the first row. The way in which the other rows are treated depends on how the transformation is configured. <\/p>\n<p>The <strong>Lookup<\/strong> transformation lets you access a referenced dataset either through an <strong>OLE DB<\/strong> connection manager or through a <strong>Cache<\/strong> connection manager. The <strong>Cache<\/strong> connection manager accesses the dataset held in an in-memory cache store throughout the duration of the package execution. You can also persist the cache to a cache file (.caw) so it can be available to multiple packages or be deployed to several computers. <\/p>\n<p>The best way to understand how the <strong>Lookup<\/strong> transformation works is to see it in action. In this article, we&#8217;ll work through an example that retrieves employee data from the <strong>AdventureWorks2008R2<\/strong> sample database and loads it into two comma-delimited text files. The database is located on a local instance of SQL Server 2008 R2. The referenced dataset that will be used by the <strong>Lookup<\/strong> transformation is also based on data from that database, but stored in a cache file at the onset of the package execution. <\/p>\n<p>The first step, then, in getting this example underway is to set up a new SSIS package in Business Intelligence Development Studio (BIDS), add two <strong>Data Flow<\/strong> tasks to the control flow, and connect the precedence constraint from the first <strong>Data Flow<\/strong> task to the second <strong>Data Flow<\/strong> task, as I&#8217;ve done in Figure 1. <\/p>\n<p class=\"illustration\"><img decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/imported\/1480-SsisLookup_Fig01.png\" alt=\"1480-SsisLookup_Fig01.png\" \/><\/p>\n<p class=\"caption\"><strong>Figure 1: <\/strong><strong>Adding <\/strong><strong>data flow<\/strong><strong> <\/strong><strong>components <\/strong><strong>to <\/strong><strong>your SSIS package<\/strong> <\/p>\n<p>Notice that I&#8217;ve named the fist <strong>Data Flow<\/strong> task <strong>Load data into cache<\/strong> and the second one <strong>Load data into file<\/strong>. These names should make it clear what purpose each task serves. The <strong>Data Flow<\/strong> tasks are also the only two control flow components we need to add to our package. Everything else is at the data flow level. So let&#8217;s get started. <\/p>\n<h2>Writing Data to a Cache <\/h2>\n<p>Because we&#8217;re creating a lookup based on cached data, our initial step is to configure the first data flow to retrieve the data we need from the <strong>AdventureWorks2008R2<\/strong> database and save it to a cache file. Figure 2 shows what the data flow should look like after the data flow has been configured to cache the data. As you can see, you need to include only an <strong>OLE DB<\/strong> source and a <strong>Cache<\/strong> transformation. <\/p>\n<p class=\"illustration\"><img decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/imported\/1480-SsisLookup_Fig02.png\" alt=\"1480-SsisLookup_Fig02.png\" \/><\/p>\n<p class=\"caption\"><strong>Figure 2<\/strong><strong>: <\/strong><strong>Configuring the data flow that loads data into <\/strong><strong>a cache<\/strong> <\/p>\n<p>Before I configured the <strong>OLE DB<\/strong> source, I created an <strong>OLE DB<\/strong> connection manager to connect to the <strong>AdventureWorks2008R2<\/strong> database on my local instance of SQL Server. I named the connection manager <strong>AdventureWorks2008R2<\/strong>. <\/p>\n<p>I then configured the <strong>OLE DB<\/strong> source to connect to the <strong>AdventureWorks2008R2<\/strong><strong> <\/strong>connection manager and to use the following T-SQL statement to retrieve the data to be cached: <\/p>\n<pre class=\"theme:ssms2012 lang:tsql\">USE AdventureWorks2008R2;\nGO\n\nSELECT \n&#160;&#160;BusinessEntityID, \n&#160;&#160;NationalIDNumber\nFROM \n&#160;&#160;HumanResources.Employee\nWHERE\n&#160;&#160;BusinessEntityID &lt; 250; <\/pre>\n<p>Notice that I&#8217;m retrieving a subset of data from the <strong>HumanResources.Employee<\/strong> table. The returned dataset includes two columns: <strong>BusinessEntityID<\/strong> and <strong>NationalIDNumber<\/strong>. We will use the <strong>BusinessEntityID<\/strong> column to match rows with the input dataset in order to return a <strong>NationalIDNumber<\/strong> value for each employee. Figure 3 shows what your <strong>OLE DB Source<\/strong> editor should look like after you&#8217;ve configured it with the connection manager and T-SQL statement. <\/p>\n<p class=\"illustration\"><img decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/imported\/1480-SsisLookup_Fig03.png\" alt=\"1480-SsisLookup_Fig03.png\" \/><\/p>\n<p class=\"caption\"><strong>Figure 3<\/strong><strong>: <\/strong><strong>Using an OLE DB source to retrieve data from the AdventureWorks2008R2 database<\/strong> <\/p>\n<p>You can view a sample of the data that will be cached by clicking the <strong>Preview<\/strong> button in the <strong>OLE DB Source<\/strong> editor. This launches the <strong>Preview Query Results<\/strong> dialog box, shown in Figure 4, which will display up to 200 rows of your dataset. Notice that a <strong>NationalIDNumber<\/strong> value is associated with each <strong>BusinessEntityID<\/strong> value. The two values combined will provide the cached data necessary to create a lookup in your data flow. <\/p>\n<p class=\"illustration\"><img decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/imported\/1480-SsisLookup_Fig04.png\" alt=\"1480-SsisLookup_Fig04.png\" \/><\/p>\n<p class=\"caption\"><strong>Figure 4<\/strong><strong>: <\/strong><strong>Previewing the data to be saved to a cache<\/strong> <\/p>\n<p>After I configured the <strong>OLE DB<\/strong> source, I moved on to the <strong>Cache<\/strong> transformation. As part of the process of setting up the transformation, I first configured a <strong>Cache<\/strong> connection manager. To do so, I opened the <strong>Cache Transformation<\/strong> editor and clicked the <strong>New<\/strong> button next to the <strong>Cacheconnection manager<\/strong> drop-down list. This launched the <strong>Cache Connection Manager<\/strong> editor, shown in Figure 5. <\/p>\n<p class=\"illustration\"><img decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/imported\/1480-SsisLookup_Fig05.png\" alt=\"1480-SsisLookup_Fig05.png\" \/><\/p>\n<p class=\"caption\"><strong>Figure 5<\/strong><strong>: Adding a Cache connection manager to your SSIS package<\/strong> <\/p>\n<p>I named the <strong>Cache<\/strong> connection manager <strong>NationalIdCache<\/strong>, provided a description, and selected the <strong>Use File Cache<\/strong> checkbox so my cache would be saved to a file. This, of course, isn&#8217;t necessary for a simple example like this, but having the ability to save the cache to a file is an important feature of the SSIS lookup operation, so that&#8217;s why I&#8217;ve decided to demonstrate it here. <\/p>\n<p>Next, I provided and path and file name for the .caw file, and then selected the <strong>Columns<\/strong> tab in the Cache Connection Manager editor, which is shown in Figure 6. <\/p>\n<p class=\"illustration\"><img decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/imported\/1480-SsisLookup_Fig06.png\" alt=\"1480-SsisLookup_Fig06.png\" \/><\/p>\n<p class=\"caption\"><strong>Figure 6<\/strong><strong>: <\/strong><strong>Configuring the column properties in your Cache connection manager<\/strong> <\/p>\n<p>Because I created my <strong>Cache<\/strong> connection manager from within the <strong>Cache<\/strong> transformation, the column information was already configured on the <strong>Columns<\/strong> tab. However, I had to change the <strong>Index<\/strong> <strong>Position<\/strong> value for the <strong>BusinessEntityID<\/strong> column from <strong>0<\/strong> to <strong>1<\/strong>. This column is an index column, which means it must be assigned a positive integer. If there are more than one index columns, those integers should be sequential, with the column having the most unique values being the lowest. In this case, there is only one index column, so I need only assign one value. The <strong>NationalIDNumber<\/strong> is a non-index column and as such should be configured with an <strong>Index<\/strong> <strong>Position<\/strong> value of <strong>0<\/strong>, the default value. <\/p>\n<p>When a <strong>Cache<\/strong> connection manager is used in conjunction with a <strong>Lookup<\/strong> transformation, as we&#8217;ll be doing later in this example, the index column (or columns) is the one that is mapped to the corresponding column in the input dataset. Only index columns in the referenced dataset can be mapped to columns in the input dataset. <\/p>\n<p>After I set up the <strong>Cache<\/strong> connection manager, I configured the <strong>Cache<\/strong> transformation. First, I confirmed that the <strong>Cache<\/strong> connection manager I just created is the one specified in the <strong>Cache<\/strong> connection manager drop-down list on the <strong>Connection<\/strong> <strong>Manager<\/strong> page of the <strong>Cache Transformation<\/strong> editor, as shown in Figure 7. <\/p>\n<p class=\"illustration\"><img decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/imported\/1480-SsisLookup_Fig07.png\" alt=\"1480-SsisLookup_Fig07.png\" \/><\/p>\n<p class=\"caption\"><strong>Figure 7<\/strong><strong>: <\/strong><strong>Setting up the Cache transformation in your data flow<\/strong> <\/p>\n<p>Next, I confirmed the column mappings on the <strong>Mappings<\/strong> page of the <strong>Cache Transformation<\/strong> editor. Given that I hadn&#8217;t changed any column names along with way, these mappings should have been done automatically and appear as they do in Figure 8. <\/p>\n<p class=\"illustration\"><img decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/imported\/1480-SsisLookup_Fig08.png\" alt=\"1480-SsisLookup_Fig08.png\" \/><\/p>\n<p class=\"caption\"><strong>Figure 8<\/strong><strong>: <\/strong><strong>Mapping columns in the Cache transformation<\/strong> <\/p>\n<p>That&#8217;s all there is to configuring the first data flow to cache the referenced data. I confirmed that everything was running properly by executing only this data flow and then confirming that the .caw file had been created in its designated folder. We can now move on to the second data flow. <\/p>\n<h2>Performing Lookups from Cached Data <\/h2>\n<p>The second data flow is the one in which we perform the actual lookup. We will once again retrieve employee data from the <strong>AdventureWorks2008R2<\/strong> database, look up the national ID for each employee (and adding it to the data flow), and save the data to one of two files: the first for employees who have an associated national ID and the second file for those who don&#8217;t. Figure 9 shows you what your data flow should look like once you&#8217;ve added all the components. <\/p>\n<p class=\"illustration\"><img decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/imported\/1480-SsisLookup_Fig09.png\" alt=\"1480-SsisLookup_Fig09.png\" \/><\/p>\n<p class=\"caption\"><strong>Figure 9<\/strong><strong>: <\/strong><strong>Configuring the data flow <\/strong><strong>to<\/strong><strong> load data into text files<\/strong> <\/p>\n<p>The first step I took in setting up this data flow was to add an <strong>OLE DB<\/strong> source and configure it to connect to the <strong>AdventureWorks2008R2<\/strong> database via to the <strong>AdventureWorks2008R2<\/strong> connection manager. I then specified that the source component run the following T-SQL statement in order to retrieve the necessary employee data: <\/p>\n<pre class=\"theme:ssms2012 lang:tsql\">SELECT \n&#160;&#160;BusinessEntityID,\n&#160;&#160;FirstName,\n&#160;&#160;LastName,\n&#160;&#160;JobTitle\nFROM\n&#160;&#160;HumanResources.vEmployee; <\/pre>\n<p>The data returned by this statement represents the input dataset that will be used for our lookup operation. Notice that the dataset includes the <strong>BusinessEntityID<\/strong> column, which will be used to map this dataset to the referenced dataset. Figure 10 shows you what the <strong>Connection<\/strong> <strong>Manager<\/strong> page of the <strong>OLE DB Source<\/strong> editor should look like after you&#8217;ve configured that connection manager and query. <\/p>\n<p class=\"illustration\"><img decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/imported\/1480-SsisLookup_Fig10.png\" alt=\"1480-SsisLookup_Fig10.png\" \/><\/p>\n<p class=\"caption\"><strong>Figure 1<\/strong><strong>0<\/strong><strong>: <\/strong><strong>Configuring an OLE DB source to retrieve employee data<\/strong> <\/p>\n<p>As you did with the <strong>OLE DB<\/strong> source in the first data flow, you can preview the data returned by the <strong>SELECT<\/strong> statement by clicking the <strong>Preview<\/strong> button. Your results should look similar to those shown in Figure 11. <\/p>\n<p class=\"illustration\"><img decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/imported\/1480-SsisLookup_Fig11.png\" alt=\"1480-SsisLookup_Fig11.png\" \/><\/p>\n<p class=\"caption\"><strong>Figure 1<\/strong><strong>1<\/strong><strong>: <\/strong><strong>Previewing the employee data returned by the OLE DB source<\/strong> <\/p>\n<p>My next step was to add a <strong>Lookup<\/strong> transformation to the data flow. The transformation will attempt to match the input data to the referenced data saved to cache. When you configure the transformation you can choose the cache mode and connection type. You have three options for configuring the cache mode: <\/p>\n<ul>\n<li><strong>Full cache<\/strong>: The referenced dataset is generated and loaded into cache before the <strong>Lookup<\/strong> transformation is executed.  <\/li>\n<li><strong>Partial<\/strong><strong> cache<\/strong>: The referenced dataset is generated when the <strong>Lookup<\/strong> transformation is executed, and the dataset is loaded into cache.  <\/li>\n<li><strong>No<\/strong><strong> cache<\/strong>: The referenced dataset is generated when the <strong>Lookup<\/strong> transformation is executed, but no data is loaded into cache. <\/li>\n<\/ul>\n<p>For this exercise, I selected the first option because I am generating and loading the data into cache before I run the <strong>Lookup<\/strong> transformation. Figure 12 shows the <strong>General<\/strong> page of the <strong>Lookup Transformation<\/strong> editor, with the <strong>Full Cache<\/strong> option selected. <\/p>\n<p class=\"illustration\"><img decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/imported\/1480-SsisLookup_Fig12.png\" alt=\"1480-SsisLookup_Fig12.png\" \/><\/p>\n<p class=\"caption\"><strong>Figure 1<\/strong><strong>2<\/strong><strong>: <\/strong><strong>Configuring the General page of the Lookup Transformation editor<\/strong> <\/p>\n<p>Notice that the editor also includes the <strong>Connection type<\/strong> section, which supports two options: Cache Connection Manager and <strong>OLE DB<\/strong> <strong>Connection<\/strong> <strong>Manager<\/strong>. In this case, I selected the Cache Connection Manager option because I will be retrieving data from a cache file, and this connection manager type is required to access the data in that file. <\/p>\n<p>As you can see in Figure 12, you can also choose an option from the drop-down list <strong>Specify how to handle rows with no matching entries<\/strong>. This option determines how rows in the input dataset are treated if there are no matching rows in the referenced database. By default, the unmatched rows are treated as errors. However, I selected the <strong>Redirect rows to no match output<\/strong> option so I could better control the unmatched rows, as you&#8217;ll see in a bit. <\/p>\n<p>After I configured the <strong>General<\/strong> page of the <strong>Lookup Transformation<\/strong> editor, I moved on to the <strong>Connection<\/strong> page and ensured that the <strong>Cache<\/strong> connection manager named <strong>NationalIdCache<\/strong> was selected in the Cache Connection Manager drop-down list. This is the same connection manager I used in the first data flow to save the dataset to a cache file. Figure 13 shows the <strong>Connection<\/strong> page of the <strong>Lookup Transformation<\/strong> editor with the specified <strong>Cache<\/strong> connection manager. <\/p>\n<p class=\"illustration\"><img decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/imported\/1480-SsisLookup_Fig13.png\" alt=\"1480-SsisLookup_Fig13.png\" \/><\/p>\n<p class=\"caption\"><strong>Figure 1<\/strong><strong>3<\/strong><strong>: <\/strong><strong>Configuring the Connection page of the Lookup Transformation editor<\/strong> <\/p>\n<p>Next, I configured the <strong>Columns<\/strong> page of the <strong>Lookup Transformation<\/strong> editor, shown in Figure 14. I first mapped the <strong>BusinessEntityID<\/strong> input column to the <strong>BusinessEntityID<\/strong> lookup column by dragging the input column to the lookup column. This process created the black arrow between the tables that you see in the figure. As a result, the <strong>BusinessEntityID<\/strong> columns will be the ones used to form the join between the input and referenced datasets. <\/p>\n<p class=\"illustration\"><img decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/imported\/1480-SsisLookup_Fig14.png\" alt=\"1480-SsisLookup_Fig14.png\" \/><\/p>\n<p class=\"caption\"><strong>Figure 1<\/strong><strong>4<\/strong><strong>: <\/strong><strong>Configuring <\/strong><strong>the Columns page of the Lookup T<\/strong><strong>ransformation editor<\/strong> <\/p>\n<p>Next, I selected the checkbox next to the <strong>NationalIDNumber<\/strong> column in the lookup table to indicate that this was the column that contained the lookup values I wanted to add to the data flow. I then ensured that the lookup operation defined near the bottom of the <strong>Columns<\/strong> page indicated that a new column would be added as a result of the lookup operation. The <strong>Columns<\/strong> page of your <strong>Lookup Transformation<\/strong> editor should end up looking similar to Figure 14. <\/p>\n<p>My next step was to add a <strong>Flat File<\/strong> destination to the data flow. When I connected the data path from the <strong>Lookup<\/strong> transformation to the <strong>Flat File<\/strong> destination, the <strong>Input Output Selection<\/strong> dialog box appeared, as shown in Figure 15. The dialog box let&#8217;s you chose which data flow output to send to the flat file-the matched rows or the unmatched rows. In this case, I went with the default, <strong>Lookup Match Output<\/strong>, which refers to the matched rows. <\/p>\n<p class=\"illustration\"><img decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/imported\/1480-SsisLookup_Fig15.png\" alt=\"1480-SsisLookup_Fig15.png\" \/><\/p>\n<p class=\"caption\"><strong>Figure 1<\/strong><strong>5<\/strong><strong>: Selecting an output for the Lookup transformation<\/strong> <\/p>\n<p>Next, I opened the <strong>Flat File<\/strong> <strong>Destination<\/strong> editor and clicked the <strong>New<\/strong> button next to the <strong>Flat File<\/strong> <strong>Connection<\/strong> <strong>Manager<\/strong> drop-down list. This launched the <strong>Flat File<\/strong> <strong>Connection<\/strong> <strong>Manager<\/strong> editor, shown in Figure 16. I typed the name <strong>MatchingRows<\/strong> in the <strong>Connection<\/strong> <strong>manager<\/strong> <strong>name<\/strong> text box, typed the file name <strong>C:\\DataFiles\\MatchingRows.txt<\/strong> in the <strong>File<\/strong> <strong>name<\/strong> text box, and left all other setting with their default values. <\/p>\n<p class=\"illustration\"><img decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/imported\/1480-SsisLookup_Fig16.png\" alt=\"1480-SsisLookup_Fig16.png\" \/><\/p>\n<p class=\"caption\"><strong>Figure 16<\/strong><strong>: <\/strong><strong>Setting up a Flat File connection manager<\/strong> <\/p>\n<p>After I saved my connection manager settings, I was returned to the <strong>Connection<\/strong> <strong>Manager<\/strong> page of the <strong>Flat File Destination<\/strong> editor. The <strong>MatchingRows<\/strong> connection manager was now displayed in the <strong>Flat File<\/strong> <strong>Connection<\/strong> <strong>Manager<\/strong> drop-down list, as shown in Figure 17. <\/p>\n<p class=\"illustration\"><img decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/imported\/1480-SsisLookup_Fig17.png\" alt=\"1480-SsisLookup_Fig17.png\" \/><\/p>\n<p class=\"caption\"><strong>Figure 17: Configuring a Flat File destination in your data flow<\/strong> <\/p>\n<p>I then selected the <strong>Mapping<\/strong><strong>s<\/strong> page (shown in Figure 18) to verify that the columns were properly mapped between the data flow and the file destination. One thing you&#8217;ll notice at this point is that the data flow now includes the <strong>NationalIDNumber<\/strong> column, which was added to the data flow by the <strong>Lookup<\/strong> transformation. <\/p>\n<p class=\"illustration\"><img decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/imported\/1480-SsisLookup_Fig18.png\" alt=\"1480-SsisLookup_Fig18.png\" \/><\/p>\n<p class=\"caption\"><strong>Figure 1<\/strong><strong>8<\/strong><strong>: <\/strong><strong>Verifying column mappings in your Flat File destination<\/strong> <\/p>\n<p>The next step I took in configuring the data flow was to add a second <strong>Flat File<\/strong> destination and connect the second data path from the <strong>Lookup<\/strong> transformation to the new destination. I then configured a second connection manager with the name <strong>NonM<\/strong><strong>atchingRows<\/strong> and the file name <strong>C:\\DataFiles\\<\/strong><strong>NonM<\/strong><strong>atchingRows.txt<\/strong>. All rows in the input data that do not match rows in the referenced data will be directed to this file. Refer back to Figure 9 to see what your data flow should look like at this point. <\/p>\n<h2>Running Your SSIS Package <\/h2>\n<p>The final step, of course, is to run the SSIS package in BIDS. When I ran the package, I watched the second data flow so I could monitor how many rows matched the lookup dataset and how many did not. Figure 19 shows the data flow right after I ran the package. <\/p>\n<p class=\"illustration\"><img decoding=\"async\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/imported\/1480-SsisLookup_Fig19.png\" alt=\"1480-SsisLookup_Fig19.png\" \/><\/p>\n<p class=\"caption\"><strong>Figure 1<\/strong><strong>9<\/strong><strong>: <\/strong><strong>Running your SSIS package<\/strong> <\/p>\n<p>In this case, 249 rows in the input dataset matched rows in the referenced dataset, and 41 rows did not. These are the numbers I would have expected based on my source data. I also confirmed that both text files had been created and that they contained the expected data. <\/p>\n<p>As you can see, the <strong>Lookup<\/strong> transformation makes it relatively easy to access referenced data in order to supplement the data flow. For this example, I retrieved my referenced data from the same database as I retrieved the input data. However, that referenced data could have come from any source whose data could be saved to cache, or it could have come directly from another database through an OLE DB connection. Once you understand the principles behind the <strong>Lookup<\/strong> transformation, you can apply them to your particular situation in order to create an effective lookup operation. <\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>With SSIS, you can perform a lookup on data in the course of a task, using referenced data from any OLE DB source. It is a useful feature that  enables you to check on the validity of data, or interpret it before proceeding.  Robert Sheldon explains.&hellip;<\/p>\n","protected":false},"author":221841,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[143528],"tags":[5645,4179,4150,4151,4306],"coauthors":[],"class_list":["post-1326","post","type-post","status-publish","format-standard","hentry","category-bi-sql-server","tag-lookup-logic","tag-source-control","tag-sql","tag-sql-server","tag-ssis"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/1326","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/users\/221841"}],"replies":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/comments?post=1326"}],"version-history":[{"count":5,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/1326\/revisions"}],"predecessor-version":[{"id":91231,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/1326\/revisions\/91231"}],"wp:attachment":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media?parent=1326"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/categories?post=1326"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/tags?post=1326"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/coauthors?post=1326"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}